Greg Tyler

Generative Quizzing

Published on 4th August 2019

I’m a big fan of Sporcle. It’s a quiz website whose content is entirely user-submitted, so there’s a really wide array of quizzes on subjects like Fernando Torres’s 2010/11 goal record or US President birth states.

Part of my love of Sporcle is its simplicity. It harkens back to the early internet: where sites were run for love and not profit, where users provided data because they wanted to, where creating an account was an option rather than a requirement.

That Web 1.0 simplicity alludes to the fact that Sporcle is totally independent. Unlike many of the big websites, it’s not a subsidiary of Amazon or Google, and it’s not owned by some umbrella corporation. Parent company Sporcle Inc. has three offices, a conference and a bar.

I’m not sure what Sporcle’s profitability model looks like (I assume it’s mostly ad-based) and maybe that’s dissuaded potential buyers. But, like many recent acquisitions, I think its real value is the data.

Feeding the beast

As mentioned earlier, Sporcle’s quizzes are all written by users. Some of these are spur of the moment or strangely specific, others are generic, popular and well-maintained.

Take, for example, the quizzes to name every Disney movie or every president. Both of these quizzes ultimately contain structured and maintained data which is useful for many data-driven companies. Voice assistant questions like “when was Aladdin released” and “who was President in 1932” are answerable on tap.

Users are readily submitting and maintaining this information for others’ enjoyment, and that value could be multiplied by a savvy tech company (at a very real reputational risk).

Beasting the feed

Using user-submitted data for new purposes is cool, but I’m more interested in the inverse. Voice assistants can already answer those questions today because the data is already available, so can we reverse the process and turn that data into a quiz?

At its simplest level, this could be taking Wikipedia categories and turning them into simple how-many-can-you-name quizzes. For that specific purpose, I’m pleased to announce Spork Hell, my first attempt at generative quizzing.

Spork Hell finds a Wikipedia category at random (completely at random, which can produce interesting results) and asks you how many member pages you can list.

[Screenshot: a quiz asking players to name “County seats in California”. There are 58 possible answers, but only three have been identified so far.]
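To make the idea concrete, here is a minimal sketch of a category-to-quiz pipeline. It is not Spork Hell's actual code: the class and function names are hypothetical, and it assumes the public MediaWiki API on English Wikipedia (`list=random` restricted to namespace 14 to pick a category, `list=categorymembers` to list its pages). Fetching and result continuation are left out so the quiz logic can be read and run on its own.

```python
import unicodedata
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def random_category_url() -> str:
    """Build an API URL that asks for one random page in the Category namespace (14)."""
    params = {"action": "query", "list": "random", "rnnamespace": 14,
              "rnlimit": 1, "format": "json"}
    return f"{API_URL}?{urlencode(params)}"


def category_members_url(category: str, limit: int = 500) -> str:
    """Build an API URL that lists the pages belonging to a category."""
    params = {"action": "query", "list": "categorymembers",
              "cmtitle": f"Category:{category}",
              "cmtype": "page",  # skip subcategories and files
              "cmlimit": limit, "format": "json"}
    return f"{API_URL}?{urlencode(params)}"


def normalise(text: str) -> str:
    """Drop a trailing "(disambiguator)", strip accents and ignore case."""
    text = text.split(" (")[0]
    decomposed = unicodedata.normalize("NFKD", text)
    plain = "".join(c for c in decomposed if not unicodedata.combining(c))
    return plain.casefold().strip()


class CategoryQuiz:
    """A how-many-can-you-name quiz over the page titles of one category."""

    def __init__(self, category, titles):
        self.category = category
        self.answers = {normalise(title): title for title in titles}
        self.found = set()

    @classmethod
    def from_api_response(cls, category, response):
        """Create a quiz from an already-decoded categorymembers response."""
        titles = [member["title"] for member in response["query"]["categorymembers"]]
        return cls(category, titles)

    def guess(self, text):
        """Return the matched page title, or None for a miss or a repeat."""
        key = normalise(text)
        if key in self.answers and key not in self.found:
            self.found.add(key)
            return self.answers[key]
        return None

    def progress(self):
        return f"{len(self.found)}/{len(self.answers)}"
```

A guess only counts if it matches a page title after normalisation, so a title such as “Sacramento, California” would still need the full name. Matching rules like that are one reason generated quizzes end up harder than hand-written ones.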

Spork Hell is random, variously interesting and consistently hard. I think it perfectly captures the amazing possibilities and serious challenges of generative quizzing.

I also made a version of Spork Hell which uses Wikipedia’s “List of…” pages, which is even more random and way more challenging to implement.

Structured data

So far I’ve only looked at “flat” data: I take a list of items and just replay them back to the user. I’d like to take this further and generate quizzes from structured data, which would allow much more complex quizzes and massively swell the number of possibilities.

As a basic example: imagine I have some information on countries and cities, including which country each city is in. Some quizzes I could extract are (square brackets represent variables):

Not only could I manually extract these quizzes, but I think I could reasonably expect a computer program to do so for me just by looking at the data available.
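As a rough sketch of what “just by looking at the data” might mean, the snippet below scans every field of a set of records and proposes one quiz per repeated value. The sample data, the `discover_quizzes` name and the question wording are all hypothetical illustrations; a real generator would need nicer phrasing per field.

```python
from collections import defaultdict

# Hand-written sample records: each city knows which country it is in.
CITIES = [
    {"name": "Paris", "country": "France"},
    {"name": "Lyon", "country": "France"},
    {"name": "Berlin", "country": "Germany"},
    {"name": "Munich", "country": "Germany"},
    {"name": "Hamburg", "country": "Germany"},
    {"name": "Vienna", "country": "Austria"},
]


def discover_quizzes(records, noun, label="name", min_answers=2):
    """Propose a quiz for every (field, value) pair shared by enough records.

    Returns a dict mapping the quiz title to its sorted list of answers.
    """
    groups = defaultdict(list)
    for record in records:
        for field, value in record.items():
            if field != label:
                groups[(field, value)].append(record[label])
    return {
        f"Name every {noun} where {field} is [{value}]": sorted(answers)
        for (field, value), answers in groups.items()
        if len(answers) >= min_answers  # a one-answer quiz is not much fun
    }


if __name__ == "__main__":
    for title, answers in discover_quizzes(CITIES, "city").items():
        print(title, answers)
```

Because the function never names a field explicitly, adding a new column to the records produces new quizzes without any code changes.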

New data would then lead to many more possibilities: adding language information could make quizzes like “Name the most populous [German] speaking countries”; economic information could give us “Name the biggest exporters of [bananas]”.
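A quiz like the language one combines a filter (speaks the language) with a ranking (population). Here is a small sketch under stated assumptions: the field names and the function are hypothetical, and the population figures are rough, rounded values included only for illustration.

```python
# Rough, rounded populations in millions; illustrative only, not authoritative.
COUNTRIES = [
    {"name": "Germany", "population_m": 83.0, "languages": {"German"}},
    {"name": "France", "population_m": 68.0, "languages": {"French"}},
    {"name": "Belgium", "population_m": 11.7, "languages": {"Dutch", "French", "German"}},
    {"name": "Austria", "population_m": 9.1, "languages": {"German"}},
    {"name": "Switzerland", "population_m": 8.8,
     "languages": {"German", "French", "Italian", "Romansh"}},
]


def most_populous_speaking(countries, language, top=3):
    """Return (quiz title, answers): the `top` largest countries using `language`."""
    matching = [c for c in countries if language in c["languages"]]
    ranked = sorted(matching, key=lambda c: c["population_m"], reverse=True)
    title = f"Name the {top} most populous [{language}] speaking countries"
    return title, [c["name"] for c in ranked[:top]]


if __name__ == "__main__":
    print(most_populous_speaking(COUNTRIES, "German"))
```

Swapping the filter field or the ranking field gives a different family of quizzes from the same records, which is where the number of possibilities starts to swell.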

And all of this can be further subdivided into more specific quizzes:

Finally, I’d hope that a computer program would continue to generate the weird stuff.

Hopefully this is the first in a series of posts playing with generative quizzing, but for now please enjoy my early, extremely challenging attempts: