
Heavy Duty Infrastructure for the Adaptive World

Posted in CEO Jose Ferreira on November 22, 2013



The adaptive learning landscape has changed dramatically since Knewton was founded nearly six years ago. Back then, my pedantic lecturing about adaptive learning was met mostly with blank stares. The term itself was unused in the market.

In April, I wrote a post predicting that in the next few years all learning materials will become digital and adaptive. Knewton is premised on this revolution. We envision a world of adaptive learning apps, a world where every app maker is by definition an adaptive app maker.

But there’s a potential obstacle to such a world. While it’s relatively straightforward to make simple differentiated learning apps, it’s extremely difficult and expensive to make proficiency-based adaptive learning apps. There are many apps on the market today that offer rich learning experiences, with wonderful instructional design, content, and pedagogy. But without proficiency-based adaptivity, these apps are severely limited.

The difference is data. Specifically, each student’s concept-level proficiency data.

By that I mean something pretty specific that goes way beyond “observable data” like test scores or time taken. Capturing a student’s performance on a test or assignment does not take into account the difficulty of the material, the concepts it relates to, or a student’s prior experience with similar content. A true model of proficiency can estimate what students know, how prepared they are for further instruction or assessment, and how their abilities evolve over time.

Concept-level proficiency data is not what a student did, but what we are confident that they know, at a granular level. Extracting what students know from what students do is extremely difficult and absolutely critical.
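
To make that distinction concrete, here’s a minimal sketch of how a simple one-parameter IRT (“Rasch”) model can separate what a student did from what they likely know. Everything here is illustrative (the item difficulties are invented, and this is not Knewton’s model), but it shows why two students with the same raw score can warrant very different proficiency estimates:

    import math

    def p_correct(theta, difficulty):
        # Rasch model: probability that a student with proficiency `theta`
        # answers an item of the given difficulty correctly.
        return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

    def estimate_proficiency(responses, difficulties):
        # Maximum-likelihood estimate of proficiency via a simple grid search
        # over theta in [-4, 4]. `responses` is a list of 0/1 outcomes aligned
        # with `difficulties`.
        grid = [x / 100.0 for x in range(-400, 401)]
        def log_likelihood(theta):
            total = 0.0
            for r, b in zip(responses, difficulties):
                p = p_correct(theta, b)
                total += math.log(p if r else 1.0 - p)
            return total
        return max(grid, key=log_likelihood)

    # Hypothetical normed item difficulties (negative = easy, positive = hard).
    easy_items = [-1.5, -1.0, -0.8, -0.5, 0.0]
    hard_items = [0.5, 0.8, 1.0, 1.5, 2.0]

    # Both students answered 4 of 5 correctly: identical "observable data."
    student_a = estimate_proficiency([1, 1, 1, 1, 0], easy_items)
    student_b = estimate_proficiency([1, 1, 1, 1, 0], hard_items)

    print(f"Student A proficiency estimate: {student_a:+.2f}")  # modest
    print(f"Student B proficiency estimate: {student_b:+.2f}")  # much higher

Same score, different evidence: the gap between those two estimates is exactly the information a raw test score throws away.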

To get this kind of proficiency data, it is essential to have large pools of “normed” content. To get normed content, you need infrastructure to passively, algorithmically, and inexpensively norm content at scale. Then you need infrastructure to make sense of and take action on the resulting data. There are no shortcuts.
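
What “norming” buys is calibrated item parameters. As a toy illustration of the idea (not Knewton’s pipeline, and with a data set thousands of times too small), here is a sketch that alternately re-estimates student proficiencies and item difficulties from a matrix of right/wrong responses:

    import math

    def norm_items(response_matrix, n_iters=50, step=0.1):
        # Toy joint calibration under a Rasch-style model: alternate small
        # gradient steps on each student's proficiency and each item's
        # difficulty until the values settle. Real calibration pipelines are
        # far more careful (anchoring, priors, missing data, drift, etc.).
        n_students, n_items = len(response_matrix), len(response_matrix[0])
        theta = [0.0] * n_students   # student proficiencies
        b = [0.0] * n_items          # item difficulties (the "normed" output)

        def p(t, d):
            return 1.0 / (1.0 + math.exp(-(t - d)))

        for _ in range(n_iters):
            for s in range(n_students):
                grad = sum(response_matrix[s][i] - p(theta[s], b[i]) for i in range(n_items))
                theta[s] += step * grad
            for i in range(n_items):
                grad = sum(p(theta[s], b[i]) - response_matrix[s][i] for s in range(n_students))
                b[i] += step * grad
        return theta, b

    # Tiny invented response matrix (6 students x 4 items). In practice the
    # rows number in the millions and arrive passively as students work.
    responses = [
        [1, 1, 1, 0],
        [1, 1, 0, 0],
        [1, 1, 1, 1],
        [1, 0, 0, 0],
        [1, 1, 1, 0],
        [0, 1, 0, 0],
    ]
    abilities, difficulties = norm_items(responses)
    print("Normed item difficulties:", [round(d, 2) for d in difficulties])

The hard part isn’t the arithmetic; it’s that this calibration has to happen passively and continuously, across huge numbers of students and items, before any proficiency estimate is worth acting on.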

Without each of those infrastructures, you have, at best, good guesses.¹ For instance, some apps have a pre-determined decision tree, with a simple hurdle rate made up by some content editor, that says something like, “Students who get 8 out of 10 questions right on this algebra quiz can move on; otherwise give them more algebra questions.” There are a number of problems with simple rules-based systems like this, such as: they can’t control for differences in English language skill; they’re fundamentally arbitrary; they’re often used for endless drilling rather than learning. But the biggest problem is that there is no infrastructure involved that can produce any actual student proficiency data. It’s all unnormed practice questions and guesswork. That content editor might make some pretty good guesses, some not so good, but either way the error rate of those guesses compounds exponentially² from one to the next.
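
For contrast, the entire “adaptivity” of the rules-based approach quoted above fits in a few lines (the threshold below is the editor’s guess from the example, not a derived quantity):

    def rules_based_next_step(num_correct, num_attempted, hurdle=0.8):
        # A hand-authored hurdle rate applied to raw counts: no notion of item
        # difficulty, concept coverage, or confidence in the estimate.
        if num_attempted and num_correct / num_attempted >= hurdle:
            return "move on"
        return "serve more algebra questions"

    print(rules_based_next_step(8, 10))  # "move on"
    print(rules_based_next_step(7, 10))  # one question flips the entire path

Nothing in that function produces or consumes proficiency data; it just branches on a guess.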

This isn’t to say that there aren’t terrific apps like this out there today. I just wouldn’t call such apps “adaptive learning.”

To me, the term “adaptive learning” can only mean learning that’s based on rigorous estimates of each student’s proficiency on each concept. After all, “adaptive learning” grew out of, and is a play on, “adaptive testing,” which is based on the coarser, but similar notion of construct-level proficiency data. Instead, I would call proficiency-less apps “differentiated learning” — an instructional designer somewhere has made some (hopefully) intelligent guesses that will differentiate each student’s path based on observable data.

Note that you don’t necessarily need the Knewton platform to make a true adaptive learning app. Before we built our platform, Knewton produced an adaptive learning GMAT prep application. We wrote the questions ourselves and normed them by paying randomized students (via Amazon’s “Mechanical Turk” and Craigslist) to answer each question and then cleaning the resulting data with successive rounds of testing, analysis, and evaluation.

We know firsthand that producing a self-contained adaptive learning app is painful, expensive, functionally constrained, and unscalable. But building a platform to scalably norm assessment items, extract proficiency data, confidently infer cascades of additional data, and optimize learning based on those data is vastly more complex and expensive, with hundreds of different critical components all of which have to be built just right and interact with each other in exactly the right way.

Once you have normed items, you need infrastructures that can use those items to generate insights about students and content. And then you need infrastructures to turn that insight into great product features to help students, teachers, and parents — features like recommendations, predictive analytics, and unified learning histories across apps. Building such a platform requires that you know exactly what you’re building before you even start, and have the world’s top data scientists and software developers to build it. It simply makes no sense for any one company to build all of that just to power its own apps.

I created Knewton to solve this conundrum. Knewton has built the necessary infrastructures to gather, multiply, process, and act on student proficiency data. Anyone who wants to build true adaptive learning apps, without doing all the painful and expensive work on a one-off basis, can plug into our network and build on top of our infrastructures.

Subtle Art vs. Heavy Machinery

Each third-party app we power brings its own core competencies in the subtle arts of content creation, pedagogy, and user experience, while outsourcing the heavy machinery of its personalization infrastructure to Knewton. Today, the Knewton platform comprises three main parts:

Data Collection Infrastructure: Collects and processes huge amounts of proficiency data.

  • Adaptive ontology: Maps the relationships between individual concepts, then integrates desired taxonomies, objectives, and student interactions (sketched below).
  • Model computation engine: Processes data from real-time streams and parallel distributed cluster computations for later use.
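
One way to picture the adaptive ontology is as a directed graph of concept-to-concept prerequisite relationships with hooks for external taxonomies. The structure and field names below are a hypothetical sketch, not Knewton’s actual schema:

    from dataclasses import dataclass, field

    @dataclass
    class Concept:
        name: str
        prerequisites: list = field(default_factory=list)  # concepts this one builds on
        taxonomy_tags: list = field(default_factory=list)  # external standards it maps to

    fractions   = Concept("fractions", taxonomy_tags=["grade-4 numbers & operations"])
    ratios      = Concept("ratios", prerequisites=[fractions])
    proportions = Concept("proportions", prerequisites=[ratios])

    def upstream(concept):
        # Everything a student should be solid on before tackling `concept`.
        seen, stack = [], list(concept.prerequisites)
        while stack:
            c = stack.pop()
            if c not in seen:
                seen.append(c)
                stack.extend(c.prerequisites)
        return [c.name for c in seen]

    print(upstream(proportions))  # ['ratios', 'fractions']

Once student interactions attach to nodes like these, evidence about one concept becomes evidence about its neighbors, which is what the inference infrastructure below exploits.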

Inference Infrastructure: Further increases the data set and generates insights from collected data.

  • Psychometrics engine: Evaluates student proficiencies, content parameters, efficacy, and more. Exponentially increases each student’s data set through inference (sketched below).
  • Learning strategy engine: Evaluates students’ sensitivities to changes in teaching, assessment, pacing, and more.
  • Feedback engine: Unifies this data and feeds results back into the adaptive ontology.
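
As a toy picture of how inference can multiply a student’s data set: suppose the only direct evidence is strong performance on “ratios.” Propagating over the ontology’s prerequisite links yields tempered estimates for concepts the student has never been tested on. The propagation rule and all numbers below are invented for illustration:

    prereqs = {
        "proportions": ["ratios"],
        "ratios": ["fractions"],
        "fractions": [],
    }
    dependents = {}
    for concept, reqs in prereqs.items():
        for r in reqs:
            dependents.setdefault(r, []).append(concept)

    direct = {"ratios": 0.9}  # observed: strong evidence of proficiency on ratios

    def infer(concept, depth=3):
        # Direct evidence wins. Otherwise: success on a concept that builds on
        # this one implies this one was likely mastered; evidence on a
        # prerequisite gives only a discounted head start. Returns None when
        # there is no evidence within reach.
        if concept in direct:
            return direct[concept]
        if depth == 0:
            return None
        from_above = [v for c in dependents.get(concept, [])
                      if (v := infer(c, depth - 1)) is not None]
        if from_above:
            return max(from_above)
        from_below = [v for c in prereqs.get(concept, [])
                      if (v := infer(c, depth - 1)) is not None]
        if from_below:
            return 0.8 * min(from_below)
        return None

    print(infer("fractions"))    # ~0.9: mastering ratios implies fractions are solid
    print(infer("proportions"))  # ~0.72: a head start, not a certainty

That is the sense in which a single observation can multiply a student’s data set: it touches every concept connected to it, at reduced confidence.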

Personalization Infrastructure: Harnesses the combined data power of the entire network to find the optimal strategy for each student for every concept she learns.

  • Recommendations engine: Provides ranked suggestions of what a student should do next, balancing goals, student strengths and weaknesses, engagement, and other factors (sketched below).
  • Predictive analytics engine: Predicts student metrics such as the rate and likelihood of achieving instructor-created goals (e.g., how likely is a student to pass an upcoming test with at least a 70%?), expected score, proficiency on concepts and taxonomies (e.g., state standards), and more.
  • Unified learning history: A private account that enables students to connect learning experiences across disparate learning apps, subject areas, and time gaps to allow for a “hot start” in any subsequent Knewton-powered app.
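
To make the recommendations-engine bullet above concrete, here is a minimal ranking sketch. The fields and weights are invented; a production engine balances far more signals (including the predicted goal probabilities the analytics engine produces):

    from dataclasses import dataclass

    @dataclass
    class Candidate:
        item_id: str
        concept: str
        difficulty: float      # normed difficulty of the item
        goal_relevance: float  # how much this concept matters for the current goal
        engagement: float      # e.g., historical completion rate for this item type

    def rank_recommendations(candidates, proficiency, weights=(0.5, 0.3, 0.2)):
        # Toy ranking: prefer items near the student's current proficiency on the
        # concept (neither trivial nor hopeless), weighted by goal relevance and
        # engagement.
        w_fit, w_goal, w_fun = weights
        def score(c):
            fit = 1.0 - abs(proficiency.get(c.concept, 0.0) - c.difficulty)
            return w_fit * fit + w_goal * c.goal_relevance + w_fun * c.engagement
        return sorted(candidates, key=score, reverse=True)

    student_proficiency = {"fractions": 1.2, "ratios": 0.1}
    pool = [
        Candidate("q101", "fractions", difficulty=1.1, goal_relevance=0.3, engagement=0.7),
        Candidate("q205", "ratios",    difficulty=0.2, goal_relevance=0.9, engagement=0.6),
        Candidate("q310", "ratios",    difficulty=1.8, goal_relevance=0.9, engagement=0.4),
    ]
    for c in rank_recommendations(pool, student_proficiency):
        print(c.item_id, c.concept)
    # q205 ranks first: on-goal and well matched to the student's current level.

The interesting work is in the inputs, not the arithmetic: every number fed into a scorer like this has to come out of the norming and inference infrastructures above.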

Schools used to do everything themselves: teachers, materials, cafeterias (if any), technology (if any), etc. Similarly, factories used to do everything on site. Eventually people realized that an ecosystem of just-in-time parts providers resulted in far better quality and lower cost manufacturing. Schools today are increasingly moving in that direction, with outsourced content, content management, food services, academic services, etc. This way schools can do what they do best — teach, administer, offer next-step guidance, and foster community.

In a much smaller way, Knewton is trying to contribute to that ecosystem. We don’t do the sexy stuff — content, instructional design, pedagogy, etc. — but we do help the creative geniuses who excel at those arts to bring the most powerful vision of their products to life.


  1. Strictly speaking, even normed content yields only estimates of proficiency — they’re just estimates that one can have a lot of confidence in, and that get better with more data. 

  2. Because every cluster of concepts has other clusters dependent upon it, suboptimizing the learning around any one cluster suboptimizes that student’s learning around every dependent cluster.