Last month, I addressed the most common arguments people use today against big data in education. As Knewton emerges as a global leader in big data for education, it is increasingly our responsibility to explain both the benefits and the limits of data. Ever since Knewton was founded, we’ve been as careful as possible with student data (so much so that we don’t even hold any Personally Identifiable Information for our partners’ students[1]).
Over the years, we’ve carefully arrived at the standards below, which we hold ourselves to. Given the recent debate over data in edtech, it occurred to us that others still formulating their own standards may be interested in learning about ours. One problem with standards is that they can quickly become outdated and stifle progress in ways that were never intended. Another is that they target today’s problems (real or perceived), and those problems will change. We’ve tried hard to make our standards comprehensive, attainable, and future-proof.
First, five principles we consider inviolable:
1) Student data belongs to the student.
Everyone else is only a custodian. Parents should continue to have the authority to make decisions on their children’s behalf until those children reach the age of consent. Schools and publishers tend to be proprietary about data — they think it’s theirs. It isn’t their data. (They couldn’t sell it to Facebook or Coca-Cola, for instance.) It’s the student’s data. The rest of us have only been given the right to use that data to help improve student learning outcomes.
2) Student data should never be sold or shared without explicit permission.
Education companies should never sell or share personal student data without the student’s explicit approval — period. (Knewton will never sell a student’s data, and we’ll only share it if the student asks us to.)
3) Student data should only be used to improve learning outcomes.
It shouldn’t be used for “gotcha” purposes. It should never be used to embarrass anyone — whether students, teachers, schools, or education companies. It should only be used to help students learn.
4) Student data should be easy to manage.
Some consumer web companies deliberately make it hard for users to understand and set privacy preferences — in large part because these companies’ ad-supported models mean they can only make money by monetizing your data. For those companies, obscure privacy settings are a deliberate marketing tactic. Edtech companies, which charge subscription fees rather than selling advertising, have no such conflict of interest.[2] For us, privacy settings should be as transparent as possible. Managing permissions (e.g., choosing whether to allow teachers and education companies to see your learning history so they can better help you) should be a simple, easy process.
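A permission model like the one described above can be sketched as a small, explicit data structure. This is a hypothetical illustration, not Knewton’s actual system; all names (`PrivacySettings`, the individual permission fields) are assumptions. The point is that every toggle is a plainly named setting that defaults to the most restrictive choice:

```python
from dataclasses import dataclass


@dataclass
class PrivacySettings:
    """Hypothetical per-student privacy preferences.

    Each permission is a plainly named boolean that defaults to the
    most restrictive setting -- nothing is buried in obscure menus.
    """
    share_with_teachers: bool = False    # teachers may view learning history
    share_with_publishers: bool = False  # education companies may view it

    def grant(self, permission: str) -> None:
        # Fail loudly on unknown names instead of silently ignoring them.
        if not hasattr(self, permission):
            raise ValueError(f"unknown permission: {permission}")
        setattr(self, permission, True)

    def revoke(self, permission: str) -> None:
        if not hasattr(self, permission):
            raise ValueError(f"unknown permission: {permission}")
        setattr(self, permission, False)


# A student opts in to sharing with teachers, and nothing else changes.
settings = PrivacySettings()
settings.grant("share_with_teachers")
```

Defaulting every permission to off means a student who never touches the settings shares nothing, which is the opposite of the ad-supported default.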
5) Student data should be very carefully protected.
Banks use the highest levels of encryption when storing and transmitting financial information. Society rightly expects a higher standard of care from entities that manage financial data or personal health information. Education data deserves the same protection.
The following standards describe tasks that are either difficult to implement or can always be improved upon. They are ongoing goals rather than all-or-nothing requirements.
6) Student data should be clear and comprehensible.
Data should be presented to students in a way that is as understandable and useful as possible. Students should be empowered to manage their own data.
7) Students should be able to consolidate their data.
Students should be able to merge their data from different sources into one learning history. A unified set of data is greater than the sum of its parts. Maintaining a learning history with their own unique, secure profile will help students (and their teachers) improve learning outcomes much more than keeping it isolated in separate applications.
8) Student data should be portable.
If students so choose, they should be able to take their relevant learning history out of any one system and add it to another system, whether that is from one course to another, one grade to another, one school to another, one publisher to another, or one technology platform to another.
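Portability of the kind described above ultimately comes down to a documented, vendor-neutral export format that any system can read back in. As a minimal sketch (the format name and field layout here are invented for illustration, not an actual standard):

```python
import json


def export_history(events: list) -> str:
    """Serialize a student's learning history to a hypothetical
    vendor-neutral JSON format they can carry to another system."""
    return json.dumps({"format": "learning-history/v1", "events": events})


def import_history(payload: str) -> list:
    """Read a portable history back in, rejecting formats we don't know."""
    data = json.loads(payload)
    if data.get("format") != "learning-history/v1":
        raise ValueError("unrecognized export format")
    return data["events"]
```

Because the payload is plain JSON with a declared format version, a receiving platform can validate it before import rather than guessing at another vendor’s internal schema.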
9) Student data analysis should be completely stoppable — and recoverable.
Students should be able to turn all data analysis off. They should have confidence that the systems they’re using will totally ignore them if they wish. Data should remain recoverable, since students may one day want their learning history back (e.g., if they find themselves struggling in school, or when they reach the age of majority). Destroying a student’s data outright may simply leave them at a permanent disadvantage relative to other students. A better solution is to offer the option to turn all data analysis off while keeping the data recoverable.
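The “stoppable and recoverable” distinction above separates two concerns: whether records are *kept* and whether they are *analyzed*. A minimal sketch of that separation (class and method names are hypothetical, not a real Knewton interface):

```python
class LearningHistory:
    """Hypothetical store illustrating 'stoppable, recoverable' analysis.

    Turning analysis off makes the system ignore the student entirely,
    but the underlying records are retained so the student can opt back
    in later without having lost their history.
    """

    def __init__(self) -> None:
        self._events = []
        self.analysis_enabled = True

    def record(self, event: dict) -> None:
        # Events are always stored; analysis is a separate switch.
        self._events.append(event)

    def disable_analysis(self) -> None:
        self.analysis_enabled = False

    def enable_analysis(self) -> None:
        self.analysis_enabled = True

    def analyze(self) -> list:
        # While analysis is off, the system behaves as if the
        # student's history were empty -- but nothing is destroyed.
        if not self.analysis_enabled:
            return []
        return self._events
```

Contrast this with outright deletion: here, re-enabling analysis restores the full history, whereas a delete would leave the returning student permanently behind.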
Lastly, we have an important standard that isn’t directly about students, but rather about institutional partnerships, which we believe will become increasingly common as education publishers come to look more like tech companies.
10) Institutional IP should be protected.
Schools compete with other schools; education companies compete with other education companies. In doing so, they often try new strategies and pedagogies to find an advantage. If any of this experimentation results in notably improved learning outcomes, the institution that tried it has invented a new technique. While in the short term it would be better for students if such approaches were shared widely right away, those approaches would never have come into existence if institutions’ IP weren’t protected.
[1] Students using our partners’ Knewton-powered products are by default anonymous in our system. Only with explicit student consent (or parental consent for minors) can we link pieces of a learning history to a specific person. Even then, we never really know who a student is. This ensures student data is safe even in the unlikely case of a security breach.
[2] Edtech companies (Knewton among them) choose subscription models deliberately to avoid this conflict.