About Jonathan Goldman

Jonathan is a computer science major at Carnegie Mellon University.

Algorithmically Generating Questions

Developing assessment materials is a lot of work. Teachers spend countless hours scavenging textbooks and developing original exercises for practice worksheets, homework problems, remedial materials, and exams. To prevent cheating, many teachers write several versions of each test, multiplying the work required for creating problems (not to mention grading!). The issue is exacerbated by the rising popularity of MOOCs, in which tens of thousands of students may be enrolled in the same course.

Furthermore, access to a large, varied pool of assessment items is a key to the success of many adaptive courses of study. Without a large pool of items, proficiency estimates can be compromised and personalized courses of study can become less effective than they would be otherwise.

Machine Generated Questions

Machine generated questions have been studied for decades as a component of intelligent tutoring systems. Most research falls into two categories: solution-oriented approaches, and template-based approaches.

Solution-Oriented Approach*

In this approach, questions are generated based on the set of skills and concepts required to solve them. For example, skills related to addition includes adding single digit numbers, adding multi-digit numbers, adding three or more numbers, and carrying digits.

A recent paper, entitled “A Trace-Based Framework for Analyzing and Synthesizing Educational Progressions,” describes an interesting implementation of solution-oriented question generation. On page three, the authors write out pseudocode for the standard, classroom addition procedure. They then annotate the code with symbols representing skills (for example, C for “carry digit”). Thus, by running the pseudocode and keeping track of the symbols, one can obtain a “trace” that categorizes each possible addition problem.

Because solution-oriented approaches group problems based on skills, they lend themselves well to adaptivity. As a student answers questions, one can identify skills he or she is struggling with, and then recommend material reinforcing those skills. However, a major drawback of solution-oriented approaches is that developing questions even for a topic as simple as addition requires a fair amount of labor and domain expertise.

Template-Based Approach*

In this approach, a question template is used to represent a potentially large class of problems. For example, consider a familiar question:

Find all roots of _x+ _x + _.

The underlines are “holes” that must be filled in by the question generator. A template might also specify valid ways to fill in the holes. For example, maybe each hole can only be filled in by the integers one through 10, leading to 10^3= 1000 possible questions. The instructor may wish to further restrict the template to only permit quadratics with real, distinct roots.

The biggest advantage of this approach is that it is accessible to a majority of instructors, provided there is an intuitive and robust templating language. In addition, template-based approaches are easily generalizable, capable of representing entire domains. A disadvantage of templates is that they tend to group problems based on appearance, not skills.

This summer, I set out to create an assessments generator engine that would be both accessible and expressive enough to generate a wide variety of problems. For these reasons, a templating approach made the most sense. Furthermore, Knewton already has infrastructure in place that will enable us to achieve adaptivity even with a template-oriented approach.

The Art of Templating

My first task was to devise a templating language. I decided that it would be a good exercise to define a domain specific language (DSL) that formalizes the space of possible templates. This DSL must let instructors specify the following:

  • Which variables in the question can be adjusted?

  • What values are these variables allowed to take on?

  • How is the correct answer computed?

  • How are the incorrect answers computed? (for multiple choice questions)

After several iterations, I came up with a format general enough to cover many (if not most) of the questions used by Knewton Math Readiness. I’ll go through some examples, beginning with simple addition, that illustrate the main features of the templating language.

Adding Two Numbers

The below template is used to generate questions of the form _ +_ = ?

template "add 2 numbers" {
  question is "<x>+<y>=?"
  answer is "<sum>"
  variables {
    x, y are integers between 1 and 9
    sum=x+y
  }
}

The question and answer are simply strings with variable names denoting the “holes.” Variables come in two flavors: generated (x and y) and derived (sum). Generated variables are bound to a sample set, which could be a range of integers, numbers with 2 decimal places, or even the set of fibonacci numbers. Derived variables are defined by mathematical expressions.

Finding Quadratic Roots

Let’s return to the earlier question of finding the roots of a quadratic. Following the addition example, we might try:

template "quadratic roots" {
  question is "<a>x^2 + <b>x + <c> = 0. Solve for x."
  answer is "x = <root1>, <root2>"
  variables {
    a, b, c are integers between -10 and 10
    discriminant = b^2 - 4*a*c
    root1 = -b + sqrt(discriminant)/(2*a)
    root2 = -b - sqrt(discriminant)/(2*a)
  }
}

Here, we are generating each coefficient (a, b, c) from the range [-10, 10]. However, the below table illustrates an issue with this template. For certain coefficients, the solutions can be integral, fractions, irrational numbers, or even imaginary numbers.

(a,b,c)

Solutions

(1, -10, 16)

2, 8

(-4, 7, -3)

0.75, 1

(3, 6, 2)

-1.577…, -0.422…

(1, 2, 1)

-1

(1, 4, 8)

-2 + 2i, -2 -2i

Because of this, the template can represent questions requiring different skill sets and mastery levels. It is important to give instructors the ability to maintain a consistent difficulty level, and to control the concepts covered by a single template. This is achieved using constraints.

template "quadratic roots" {
  question is "<a>x^2 + <b>x + <c> = 0. Solve for x."
  answer is "x = <root1>, <root2>"
  variables {
    a, b, c are integers between -10 and 10
    discriminant = b^2 - 4*a*c
    root1 = -b + sqrt(discriminant)/(2*a)
    root2 = -b - sqrt(discriminant)/(2*a)
  }
  constraints {
    root1, root2 must be integers
    root1 <> root2
    gcd(a, b, c) = 1
  }
}

In general, constraints are useful for narrowing the skill set covered by the template, and to ensure that instantiations of the template are sufficiently varied. In this example, requiring unique, integer roots is used to control difficulty level, while requiring gcd(a, b, c) = 1 ensures that no two questions will be multiples of one another.

Car Distance

Another important feature of the templating language is the ability to specify wrong answers.

template "car distance" {
  question is "How far does a car travel in <m> minutes traveling <r> miles/hour?"
  answer is "<dist>miles"
  variables {
    m is an integer between 30 and 90 divisible by 10
    r is an integer between 40 and 65 divisible by 5
    dist = rate*time/60
    wrong1 = rate*time
    wrong2 = rate/time
    wrong3 = rate/time/60
  }
  wrong answers {
    "<wrong1>miles"
    "<wrong2>miles"
    "<wrong3>miles"
  }
}

Wrong answers can be either static or variable. What’s powerful about this feature is that each wrong answer might be associated with a particular deficiency or misconception. In the example, a student might choose “rate/time/60” because she doesn’t know the correct distance formula, while another student might choose “rate*time” because she has trouble converting units. This is another source of information that Knewton can use to provide more targeted recommendations.

The Problem Generation Algorithm

Great, so we have a template. Now how do we actually generate questions? My first inclination was to start with the simplest possible algorithm:

  1. Go down the list of variables, selecting values for the generated variables uniformly at random from the sample sets and using the formulas to compute the derived variables

  2. If the variables satisfy all of the constraints, add it to the list of questions.

  3. Repeat.

This naive algorithm performs nicely given one key assumption: a large enough fraction of the sample space (the set of all possible questions, i.e. the cartesian product of the sample sets) must meet the constraints specified in the template. For instance, if 100 questions are desired and the algorithm can handle 100,000 iterations, roughly 1/1000 questions need to be valid. This isn’t too daunting: as long as we offer an expressive library of sample sets and constraints, instructors can be expected to provide templates meeting this requirement.

It is a very difficult to come up with a more efficient approach. For some problems, algorithms do exist to generate solutions (see Euclid’s method for pythagorean triples), but for others it is mathematically impossible (see Hilbert’s tenth problem). In many cases, introducing heuristics may improve the random algorithm. For instance, it may be possible to identify a large chunk of the sample space that leads to solutions that are too large, non integral, negative, etc.

The Prototype

I chose to implement the assessment generator in Scala for several reasons:

  • Scala’s interoperability with Java made integrating with the rest of the Knewton code base an easy task.

  • Scala’s powerful Parser Combinators library made implementing the template DSL straightforward. Because of their low overhead, I also used parser combinators for converting math expressions like “sqrt(x^3 + 5)” into my internal data model. While there are many existing Java/Scala libraries that accomplish this, I was unable to find any capable of manipulating general mathematical objects, such as fractions, matrices, polynomials, polygons, and groups.

  • Scala’s parallel collections allowed running iterations of the problem generator routine in parallel. Doing so only required swapping out a Map with a ParMap, and appending “.par” to the main program loop.

Here is a screenshot of the prototype in action.

*For examples of solution-oriented approaches in the literature, see https://act.org/research/researchers/reports/pdf/ACT_RR93-09.pdf (survey) and http://research.microsoft.com/en-us/um/people/sumitg/pubs/chi13.pdf (grade school math).

*For examples of template-based approaches in the literature, see http://www.eecs.berkeley.edu/~dsadigh/WESE12.pdf (embedded systems).