I had the luxury of dashing out to California last week for the San Francisco Turk Meetup hosted by Dolores Labs (see the O’Reilly Radar blog summary). The TurKit application presented by Greg Little really caught my eye. You, too, can read the pre-publication paper:
- Little, Greg, Lydia B. Chilton, Robert C. Miller, and Max Goldman. 2009. TurKit: Tools for iterative tasks on Mechanical Turk. Not yet published.
TurKit’s an open-source package that lets you submit a single task to multiple Turkers in a novel way. Rather than having them all do the task independently, the Turkers work on the task sequentially. That is, the first Turker does it from scratch, then the second Turker works to improve the first Turker’s answer, and so on down the line. The example in the talk involved transcribing a very messily written note. It’s amazing to see the progression from the first Turker to the eighth Turker.
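To make the sequential idea concrete, here is a minimal sketch in plain JavaScript. It does not use TurKit or the Turk API at all: `iterateTask` and the `simulatedWorkers` functions are hypothetical stand-ins I made up for illustration, with each function playing the role of one Turker who receives the previous answer and returns an improved one.

```javascript
// Hypothetical sketch: each "worker" is a plain function standing in for a Turker.
// In real use, each call would be a task posted to Mechanical Turk.
function iterateTask(workers) {
  const history = [];
  let answer = ""; // the first worker starts from scratch
  for (const worker of workers) {
    answer = worker(answer); // each worker improves on the previous answer
    history.push(answer);
  }
  return { answer, history };
}

// Simulated Turkers transcribing a messy note; "?" marks unreadable letters.
const simulatedWorkers = [
  () => "the qu??k br?wn f?x",
  (text) => text.replace("qu??k", "quick"),
  (text) => text.replace("br?wn", "brown"),
  (text) => text.replace("f?x", "fox"),
];

const result = iterateTask(simulatedWorkers);
console.log(result.history.join("\n"));
```

The `history` array holds every intermediate answer, which is what lets you watch the progression from the first pass to the last.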
TurKit has a beautiful programming paradigm which lets you write jobs imperatively. The base language is JavaScript (with JSON-based serialization). The beautiful part is that TurKit lets the programmer write loops, the inner operations of which call out to Turk annotation jobs. This is completely natural, and completely unsupported by the Turk API, which is your usual REST-type stateless application. It reminds me of Python’s yield-based iterator implementations.
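To illustrate the yield analogy (and only the analogy), here is a small sketch using JavaScript generators, which postdate this post. It is not TurKit’s API and makes no claim about how TurKit works internally. The names `improveIteratively`, `run`, and `fakePostJob` are all hypothetical, and `fakePostJob` merely simulates posting a task and getting an answer back.

```javascript
// Generator: reads like an ordinary imperative loop, but pauses at each
// `yield` until the driver hands back the result of the requested job.
function* improveIteratively(rounds) {
  let text = "";
  for (let i = 0; i < rounds; i++) {
    // Hand a job description to the driver; resume with the worker's answer.
    text = yield { round: i + 1, prompt: "Improve this transcription", current: text };
  }
  return text;
}

// Driver: the only place that talks to the (here simulated) job backend.
function run(generator, postJob) {
  let step = generator.next();
  while (!step.done) {
    const answer = postJob(step.value);
    step = generator.next(answer);
  }
  return step.value;
}

// Hypothetical stand-in for posting a task and waiting for a response.
function fakePostJob(job) {
  return job.current + "[pass " + job.round + "]";
}

const finalText = run(improveIteratively(3), fakePostJob);
console.log(finalText);
```

The loop body never needs to know how or when a job completes. That knowledge lives in the driver, which is what makes the imperative style feel natural on top of a stateless service.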
June 15, 2009 at 4:54 pm
I will second that.
I really liked the idea of having users do multiple passes on the same task, until the output reaches the point where no easy improvement is possible. The iterative handwriting recognition and the “rewrite this paragraph” examples are really nice.
I especially like the fact that the idea allows users to submit low-quality, unstructured answers that end up being useful as bases for building something better. “When life gives you lemons, make lemonade.”