I went down to Johns Hopkins to teach information extraction for a day at the NAACL summer school. There were 28 students, with an hour-long morning presentation and a 3.5-hour lab in the afternoon. The only constraint was that I was to return them in good condition and preferably a bit more learned in the ways of LingPipe and information extraction. The students ranged in experience from undergraduates to senior graduate students.
I decided that a good lab project would be for them to reprocess the results returned by a search engine. So I loaded the excellent open source search engine Lucene with 1300 FBIS articles from back in the TIDES days and set the problem of helping intelligence analysts sort through a day's worth of fresh intelligence about Iraq. Their task was to find better ways of presenting the returned results. In the morning presentation I covered the basic input/output setup I was giving them and the source code for using LingPipe to do sentence ranking with language models and extraction of named entities up to the level of coreference.
After the morning presentation, they broke up into 6 groups of about 4 people each and hatched a plan over lunch. At 1:30 we started the lab, and Bob showed up to lend a helping hand. All the groups briefed Bob or me on what they were doing, and we helped them get started. Lots of interesting ideas were floated, and a steady hum built in the lab as we got working.
Once they got going, I slipped out and procured a bottle of Moet Champagne (the real stuff, none of this California malarkey) as first prize. Bob noted that I was perhaps as interested in teaching them about quality wine as linguistics…
The whole session was a blur, but in the end we saw interesting applications using entity detection for node/link visualization, a few efforts linking locations to Google Maps (which is not very detailed for Iraq), and an effort to recognize sentences of future intent using tense.
We ended with votes after brief presentations, and a group of students slipped out in search of an ice bucket.
Lessons learned: 3.5 hours is not much time, and we should perhaps have structured things more. The project would have been much better set as a week-long effort. It is really fun to work with smart, motivated students. The task's limitations had more to do with project management than coding skills.
Thanks to Roy Tromble, our TA, and to Jason Eisner and David Yarowsky, who invited us.