ACE ’08 X-Doc Coref Bakeoff

The ACE ’08 site is up. In addition to the schedule, you’ll want to check out the evaluation plan.

Anyone can play, but there’s a gag order on any result comparisons and the data’s strictly proprietary (from LDC).

The evaluation plan is particularly interesting as a case study in specifying an ontology and tagging standard for a complicated problem. If you’ve never thought about something like this, I’d recommend it highly. Don’t spend too much time worrying about their evaluation metrics.

One interesting note: they’ll be using Breck Baldwin and Amit Bagga’s B-Cubed measure for cross-document coreference scoring. I still like relational scoring myself, especially as it allows uncertainty about coreference to be “integrated out”. But I’ve never been able to convince anyone else of its usefulness.
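For anyone who hasn’t seen B-Cubed before, here’s a minimal sketch of the standard per-mention formulation with equal mention weights. It is not the ACE ’08 scorer, which may weight or align mentions differently; the function name `b_cubed` and the dict-based input format are assumptions made for illustration.

```python
from collections import defaultdict


def b_cubed(key, response):
    """Return (precision, recall, f1) under equally weighted B-Cubed.

    key and response each map a mention id to a cluster label, and must
    cover exactly the same mentions (an assumption of this sketch).
    """
    if key.keys() != response.keys():
        raise ValueError("key and response must cover the same mentions")
    if not key:
        raise ValueError("need at least one mention")

    # Invert the label maps into cluster label -> set of mentions.
    key_clusters = defaultdict(set)
    response_clusters = defaultdict(set)
    for mention, label in key.items():
        key_clusters[label].add(mention)
    for mention, label in response.items():
        response_clusters[label].add(mention)

    precision_sum = 0.0
    recall_sum = 0.0
    for mention in key:
        gold = key_clusters[key[mention]]
        guess = response_clusters[response[mention]]
        overlap = len(gold & guess)
        # Per-mention precision: how much of the guessed cluster is right.
        precision_sum += overlap / len(guess)
        # Per-mention recall: how much of the gold cluster was recovered.
        recall_sum += overlap / len(gold)

    n = len(key)
    precision = precision_sum / n
    recall = recall_sum / n
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: the gold standard has clusters {a, b, c} and {d, e};
# the system response wrongly moves c into the second cluster.
gold_labels = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2}
system_labels = {"a": "X", "b": "X", "c": "Y", "d": "Y", "e": "Y"}
precision, recall, f1 = b_cubed(gold_labels, system_labels)
```

In the toy example, precision and recall both work out to 11/15. Note that the score is computed from a single first-best clustering, which is exactly why it can’t reflect uncertainty about coreference.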

We (Alias-i) probably won’t have time to participate. Our research energy’s going to be mostly directed toward our NIH grant, which itself has a cross-document coreference and entity extraction component, but one focused on high recall and database linkage rather than first-best clustering.

If I had my research druthers and decided to take on this problem, I’d focus on Dirichlet process clusterers. In particular, I’d like to see a truly Bayesian version of something like Haghighi and Klein (2007) that used posterior sampling. Even more fun would be to integrate (pun intended) with a Bayesian tagger, like the one described in Finkel et al. (2005). In fact, it looks like Finkel et al. (2006) are already thinking along these lines in other arenas.
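To make the Dirichlet process idea a bit more concrete, here’s a small sketch of drawing a clustering from the Chinese restaurant process, the prior over partitions that a Dirichlet process induces. It is only the prior: a real coreference clusterer would also multiply in a likelihood for each mention given its cluster and resample assignments (for example with Gibbs sampling) to get posterior samples. The function name `crp_partition` and the concentration parameter name `alpha` are assumptions for illustration, not anything taken from the papers cited above.

```python
import random


def crp_partition(n, alpha, rng):
    """Draw cluster assignments for n items from a Chinese restaurant process.

    alpha > 0 is the concentration parameter; larger values favor more
    clusters. Returns a list whose i-th entry is the cluster index of item i.
    """
    if n < 0 or alpha <= 0:
        raise ValueError("need n >= 0 and alpha > 0")

    assignments = []
    sizes = []  # sizes[k] is the number of items currently in cluster k
    for _ in range(n):
        # Join existing cluster k with probability proportional to sizes[k],
        # or open a new cluster with probability proportional to alpha.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)
        else:
            sizes[k] += 1
        assignments.append(k)
    return assignments


# Example draw: 20 mentions, concentration 1.0, fixed seed for repeatability.
sample = crp_partition(20, 1.0, random.Random(0))
```

The appeal for cross-document coreference is that the number of entities doesn’t have to be fixed in advance, and that a set of posterior samples, rather than one first-best clustering, carries the uncertainty along to whatever uses the clusters downstream.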

I’ve been fascinated by cascading processing, and in particular online disambiguation, ever since grad school, where I was encouraged in this pursuit by Mark Steedman (who knew computational linguists had Wikipedia entries?). Online processing and disparate information integration for disambiguation was even the subject of my job talk at Carnegie Mellon way back in 1989. It was what I was working on at the end of my time at Bell Labs, which spun out into my last ACL publication, Collins et al. (2004).