NYU is hosting HLT-NAACL this year (thanks especially to Satoshi Sekine for local organizer duties). The opening reception is tonight (Sunday, 4 June) in the neighborhood I share with NYU (Greenwich Village). After tutorials today, the conference starts in earnest tomorrow.

The main conference is in Brooklyn, home of Alias-i. We’ll be there with hot-out-of-the-burner LingPipe 2.2.2 CDs to pass out.

As usual, the workshops look more interesting than the main conference. I have to agree with Ken Church’s “editorial” in the Computational Linguistics journal — the acceptance rates are ridiculously low. I also took his point about the burden it puts on reviewers. I declined all invitations to review for the main conference.

For better or worse, the workshops are looking more and more like annual organized group meetings than ad-hoc workshops. Standing conferences posing as workshops include CoNLL-X, DUC, SIGPHON, BioNLP and ScaNaLU (about half of the total), with others announced as “First International”. We really wish Martin Jansche’s software workshop from last year were being held again this year. I guess my mom and dad in the audience didn’t tip the balance in Ann Arbor last year.

In any case, we’re really looking forward to BioNLP ’06. Both Breck and I were on the program committee and there are a lot of very interesting looking papers. The resources available are staggering, and best yet, mostly free (as in beer and as in speech).

The workshop on joint inference should also be interesting: we saw a preview of Heng Ji, Cynthia Rudin and Ralph Grishman’s paper “Re-ranking algorithms for name tagging” at the NYU seminar on Friday, and it is some of the niftiest work on information extraction that we’ve seen.

  1. alexsagemorgan Says:

    I am not sure that having a low acceptance rate is a bad thing. Unfortunately, as noted in the Church editorial, the acceptance standards are also highly conservative and favor incremental changes. I don’t have a simple solution, but might it be the case that papers which have a high variance in review scores become better candidates – especially if the reviewers refuse to change their scores in the second stage? Shouldn’t controversy be a strong positive feature? There are so many smaller conferences that I think Church’s plan D (more conferences) is already in effect (particularly in the Bio Text mining area), so that something can probably get published somewhere if the submitter is patient and tries different venues.

    You’ve mentioned standing groups, and I’d definitely like there to be a standing BioNLP group of some kind, in my mind ideally a child of ICSB, but with some strong formal associations to ACL and AMIA.

    -Alex Morgan

  2. carp Says:

    Most people agree with Alex and think a low acceptance rate is good. I think that presupposes that the “best” papers are selected by the selection committee, but this is basically what Church says. This ties into the people who’ve told me that it’s important for tenure review, which rather makes sense given that the CL journal is less representative of the field than the ACL conference and most of the top ACL papers are never converted into journal articles. I think this argues for a better journal, but it’s a vicious circle.

    I think more focused workshops would be a good idea, but not more big generic conferences. For those, I think getting everyone to go to the same one is a good idea.

    I like the annual meeting feel of ICSLP/Eurospeech, where acceptance rates are in the 80s and multiple papers are allowed and everyone in the field shows up.

    The border case is highly variant. Let’s say a paper gets a 5,5,4 and is accepted. We could use a bootstrap-style analysis to say that the chance of a reviewer giving it a 5 is 2/3 and the chance of a 4 is 1/3 (let’s not worry about smoothing). What’s the chance of the paper getting accepted if 5,5,4 is the minimum score?

    Winning outcomes:

    5,5,5 (2/3)**3 = 8/27
    5,5,4 (2/3)**2 (1/3) = 4/27
    5,4,5 (2/3)**2 (1/3) = 4/27
    4,5,5 (2/3)**2 (1/3) = 4/27
    Total Chance: 20/27, or 74%!!

    Now what about a paper with a 4,4,5 that would’ve been rejected? Just swap 2/3 for 1/3 above and the result is that it actually had a 7/27 (about 26%) chance of being accepted. This means that we’re at about 75% F measure even given the very conservative model. If we smoothed or used subsampling, the variance estimate would be higher.

    Also note that the mean is pretty meaningless to this kind of analysis. A 5,5,1 paper has a low mean, but still has a (2/3)**3 = 8/27 (about 30%) chance of being accepted by the bootstrap-style analysis. Even a 5,1,1 paper has a (1/3)**3 = 1/27 chance of being accepted.
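
The bootstrap-style arithmetic in the comment above can be checked with a short Python sketch. It is only an illustration under stated assumptions: the function name `acceptance_chance` is made up here, and the acceptance rule (a resampled set of three scores, once sorted, must match or beat 5,5,4 position by position) is one reading of "5,5,4 is the minimum score", not something the comment spells out. As in the comment, there is no smoothing.

```python
from fractions import Fraction
from itertools import product


def acceptance_chance(scores, minimum=(5, 5, 4)):
    """Exact chance that three scores resampled from `scores` meet `minimum`.

    Assumption: a draw is accepted when its sorted scores are each at least
    the corresponding sorted score in `minimum`.
    """
    bar = sorted(minimum)
    # Each ordered draw with replacement is equally likely.
    weight = Fraction(1, len(scores)) ** len(bar)
    total = Fraction(0)
    # Enumerate every ordered draw with replacement from the observed scores.
    for draw in product(scores, repeat=len(bar)):
        if all(got >= need for got, need in zip(sorted(draw), bar)):
            total += weight
    return total


if __name__ == "__main__":
    for observed in [(5, 5, 4), (4, 4, 5), (5, 5, 1), (5, 1, 1)]:
        chance = acceptance_chance(observed)
        print(observed, chance, f"{float(chance):.0%}")
```

Under these assumptions the four cases come out to 20/27, 7/27, 8/27 and 1/27, which matches the figures worked out by hand in the comment.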

