The advent of "big data" in areas such as internet search and social media has disrupted existing industries, created new ones, and led to the extraordinary success of companies such as Google and Facebook. Big data opens up a particularly productive range of possibilities in education, because data that reflects cognition is structurally different from the data generated by user activity around web pages, social profiles, and online purchasing habits.
One feature that distinguishes the data produced by students from that of, for example, consumers shopping online or using social media is that academic study requires a prolonged period of engagement; students thus remain on the platform for an extended length of time. Furthermore, there is a focus, intention, and intensity to students' activity: they are engaged in high-stakes situations, such as taking a course for credit, trying to improve their future, or expanding their range of skills. The sustained intensity of these efforts generates vast quantities of meaningful data that can be harnessed continuously to power personalized learning for each individual.
Another feature that distinguishes the data produced by students is the very high degree of correlation among educational data points, together with the aggregated effect of all those correlations. If, for example, a student has demonstrated mastery of fractions, algorithms can reveal how likely it is that the student will also demonstrate mastery of exponentiation, and how best to introduce that concept. If a student has demonstrated mastery of various grammatical concepts (say, subjects, verbs, and clauses), educational data can be used to optimize the student's learning path, so that different sentence patterns will "click" as quickly as possible.
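To make the fractions example concrete, the sketch below estimates how likely a student is to master one concept given mastery of another, by simple counting over past records. It is only an illustration: the `records` data and the `conditional_mastery` function are hypothetical, and a real system would presumably use far richer statistical models than a raw frequency.

```python
from typing import Dict, List, Optional

# Hypothetical records: one dict per student, mapping concept -> mastered?
records: List[Dict[str, bool]] = [
    {"fractions": True, "exponentiation": True},
    {"fractions": True, "exponentiation": True},
    {"fractions": True, "exponentiation": False},
    {"fractions": False, "exponentiation": False},
    {"fractions": False, "exponentiation": True},
]


def conditional_mastery(
    records: List[Dict[str, bool]], given: str, target: str
) -> Optional[float]:
    """Estimate P(target mastered | given mastered) by counting students."""
    # Keep only students who have mastered the 'given' concept.
    with_given = [r for r in records if r.get(given, False)]
    if not with_given:
        return None  # no evidence available for this concept
    # Fraction of those students who also mastered the 'target' concept.
    both = sum(1 for r in with_given if r.get(target, False))
    return both / len(with_given)


p = conditional_mastery(records, "fractions", "exponentiation")
print(f"Estimated P(exponentiation | fractions) = {p:.2f}")
```

With more students and more concepts, the same kind of estimate can be computed for every pair of concepts, which is where the aggregated effect of the correlations comes from.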
The hierarchical nature of educational concepts means that they can be organized in a graph-like structure. As a result, the flow of students from concept to concept can be optimized over time, as Knewton learns more and more about the relationships between concepts through data. Every student action and response around each content item ripples out and affects the system's understanding of all the content in the system and all the students in the network.
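One way to picture such a graph-like structure is as a set of concepts linked by prerequisite relationships. The sketch below is a minimal illustration under that assumption, not a description of Knewton's actual system: the concept names, the `prerequisites` mapping, and the `learning_order` function are all hypothetical. It uses the standard-library `graphlib` module (Python 3.9+) to derive an order in which every concept appears after its prerequisites.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical concept graph: each concept maps to the set of concepts
# that should be mastered before it.
prerequisites: Dict[str, Set[str]] = {
    "multiplication": set(),
    "fractions": {"multiplication"},
    "exponentiation": {"multiplication"},
    "scientific_notation": {"fractions", "exponentiation"},
}


def learning_order(prereqs: Dict[str, Set[str]]) -> List[str]:
    """Return the concepts in an order that respects every prerequisite."""
    # static_order() yields each node only after all of its predecessors.
    return list(TopologicalSorter(prereqs).static_order())


order = learning_order(prerequisites)
print(order)
```

In this simplified picture the edges are fixed by hand. In a data-driven system, the relationships themselves would be revised as student responses accumulate.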