Sampling, Modeling and Measurement Error in Inference from Clinical Text


Here’s a link to the slides of the talk I presented recently at ICML:

It’s basically a list of the kinds of things that can go wrong and introduce error (bias and noise) into inferences. Although the examples are mostly clinical (with one on baseball and one on cancer clusters), the point is generally applicable.

Small, Focused Workshops

I really like small, focused workshops, and this one was very good, with lots of presentations on people’s practical experiences launching systems in hospitals and working on fascinating text mining problems from clinical notes.

Thanks to the Organizers

Thanks again to the organizers, especially Faisal Farooq, who handled all the paperwork. It’s a pretty thankless job in my experience, and having done it myself, I can really appreciate how much work it takes to run something that comes off smoothly.

I don’t know how long the page will last, but here’s a link to the workshop itself:

Unintended (Beneficial) Consequences

When Noémie Elhadad invited me to give a talk, I met with her to see if there was a topic I could talk about. During that meeting, she mentioned how hard it had been to hire an NLP programmer in biomedical informatics (it’s just as hard if not harder at a small company). The upshot is that Mitzi got a new job at Columbia into the bargain. In a way, it’s too bad, because I miss talking to Mitzi about her work in genomics, about which I know relatively little compared to NLP.

2 Responses to “Sampling, Modeling and Measurement Error in Inference from Clinical Text”

  1. Dave Kincaid (@davekincaid) Says:

    Any chance there is some video around of this talk? The content looks excellent. I’d love more explanation to understand it better.

  2. Bob Carpenter Says:

    Sorry, but the session wasn’t videotaped. You can find discussions of sampling error in any stats textbook (my first example came from Gelman et al.’s Bayesian Data Analysis). Texts on survey sampling seem to treat the problem most generally. You’ll also find measurement error models in survey sampling books.

    Model specification error is the dirty little secret of both Bayesian and frequentist stats. You often see this tested (for instance, generating data with one model and fitting with another), but rarely discussed in any generality.
