Matthew Wilkens Evaluates POS Taggers


There’s a fascinating thread on Matthew Wilkens’s blog Work Product. Matthew’s a humanities postdoc studying literature. His thread starts in a 25 October 2008 entry with this quote:

I surely just need to test a bunch of them [part of speech taggers] is some semi-systematic way, but is there any existing consensus about what works best for literary material?

I would highly recommend reading from the first post in the thread forward. It’s a great fly-on-the-wall view of a non-specialist coming to grips with natural language processing.

Over the past three months, Matthew’s evaluated a whole bunch of different part-of-speech taggers looking for something that’ll satisfy his accuracy, speed, and licensing needs. The data is literary English, for now, mostly culled from Project Gutenberg.

The current entry, Evaluating POS Taggers: Conclusions, dated 27 January 2009, starts with:

OK, I’m as done as I care to be with the evaluation stage of this tagging business, which has taken the better part of three months of intermittent work. This for a project that I thought would take a week or two. There’s a lesson here, surely, …

Amen, brother. That’s one reason why Kevin Cohen’s organizing the software workshops at ACL (this year Marc Light co-chairs, but I’m still on the PC). So I suggested to Matthew that he submit this diary of his work to the NAACL 2009 Software Workshop, which is explicitly calling for just such case studies.

2 Responses to “Matthew Wilkens Evaluates POS Taggers”

  1. Matthew Wilkens Says:

    Howdy Bob,

    Thanks for the link – I hope my series of posts is of interest to someone, somewhere.

    For those who know NLP better than I do (a frighteningly low bar, alas), the stuff I’ve done should probably be seen as the product of a native informant. Imagine an Oxbridge anthropologist’s accent from decades past: “Ahh, so this is where they’re coming from over in the English department. Well, they do have a bit to learn, don’t they?”

    But yes, I’ll certainly look into the NAACL workshop. Intimidating, but I’m happy to be someone’s case study :).

  2. lingpipe Says:

    I should’ve included this paper as an example. Andrew’s talk was very informative.

    Andrew B. Clegg and Adrian J. Shepherd. 2005. Evaluating and Integrating Treebank Parsers on a Biomedical Corpus. 2005 ACL Software Workshop.

    There’s actually a whole conference, LREC, devoted to language resources and evaluation.
