I recently received a request from someone asking how to tune their undergrad curriculum for computational linguistics. As I’ll recount later, I have considerable experience in designing curricula for computational linguistics at all levels of study. Here’s what I’d recommend.
My main piece of advice is to make sure you’re very solid in at least one of the component fields, which I take to be (1) computer science, (2) statistics, (3) linguistics, and (4) cognitive psychology. The biggest danger of an interdisciplinary education is becoming a jack of all trades, master of none.
For CS, it’s mainly the software side that’s relevant, including programming languages and data structures, algorithm analysis, and automata theory/logic/discrete math.
These days, you’ll need a strong stats background for the machine learning component of the field, and I’d recommend a core math stats sequence, a course on linear regression, and a course on Bayesian data analysis if one’s available.
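For a concrete taste of what a Bayesian data analysis course starts with (a minimal sketch of my own, not drawn from any particular syllabus), here’s the textbook beta-binomial conjugate update:

```python
# Beta-binomial conjugate updating: with a Beta(alpha, beta) prior on a
# binomial success probability, observing `heads` successes in `n` trials
# yields a Beta(alpha + heads, beta + n - heads) posterior.

def beta_binomial_posterior(alpha, beta, heads, n):
    """Return the posterior (alpha, beta) after observing binomial data."""
    return alpha + heads, beta + (n - heads)

def posterior_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Start from a uniform Beta(1, 1) prior and observe 7 heads in 10 flips.
a, b = beta_binomial_posterior(1.0, 1.0, 7, 10)
print(a, b)                  # 8.0 4.0
print(posterior_mean(a, b))  # 8/12, i.e. about 0.667
```

The same update-your-prior pattern, with fancier likelihoods and numerical rather than closed-form posteriors, is the core of the whole subject.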
As background to stats and CS theory, you’ll need math through calculus, differential equations and matrices, though any kind of more advanced math would help, whether analysis or abstract algebra. Numerical methods are extremely helpful.
For linguistics, what you want to learn for computational linguistics is the good old-fashioned, data-intensive, detail-oriented descriptive linguistics. That’s still practiced among the laboratory or acoustic phonetics folks, but by few others. Even so, you’ll want to take phonetics/phonology, syntax, semantics and pragmatics. Sociolinguistics is good, too. The more empirical data you get to play with, the better.
It helps to get some basic cognitive psych and psycholinguistics if the latter’s available. This isn’t so important from an engineering perspective, but it really helps to have some basic ideas about how the brain works. And there are some really really nifty experiments. Sometimes you can count these as your social science requirements.
If you can take other social sci requirements in something quantitative like micro-economics, all the better, especially if you can get beyond maximizing convex functions into things like behavioral econ and decision theory.
Then there are the interdisciplinary courses like machine learning, artificial intelligence, and computational linguistics itself. By all means, take these classes.
Other interdisciplinary studies include speech recognition, which is often taught in electrical engineering. Speech recognition’s great because it gives you some continuous math as a basis and also has efficiency issues beyond what you see with text.
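To illustrate the kind of continuous math I mean (my own sketch, not tied to any particular course): classic HMM-based recognizers score acoustic feature vectors with Gaussian densities, and they do it in log space so that products over thousands of frames don’t underflow:

```python
import math

# Log-density of a diagonal-covariance multivariate Gaussian, the classic
# emission score in HMM-based acoustic models. Working in log space avoids
# the underflow you'd get multiplying many tiny densities over an utterance.

def diag_gaussian_log_density(x, mean, var):
    """log N(x; mean, diag(var)) for a feature vector x."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return log_p

# A standard normal evaluated at its mean: density 1/sqrt(2*pi),
# so the log density is -0.5 * log(2*pi).
print(diag_gaussian_log_density([0.0], [0.0], [1.0]))
```

Real systems sum these scores over Gaussian mixtures and run dynamic programming over them, which is where the efficiency issues beyond text come in.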
Information retrieval is also very useful if you can find a class on it in either a CS department or a library science school.
Genomics sequence analysis would also be a great thing to take as the algorithms and math are very similar to much of comp ling and it’s a really fun area.
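The overlap is easy to see in code. The same dynamic-programming recurrence behind Needleman–Wunsch sequence alignment in genomics computes the edit distance used for spelling correction and string matching in comp ling; here’s a minimal sketch with unit costs:

```python
# Levenshtein edit distance via dynamic programming -- the same recurrence
# underlies Needleman-Wunsch alignment of DNA sequences and spelling
# correction / string matching in computational linguistics.

def edit_distance(s, t):
    """Minimum number of insertions, deletions, and substitutions
    needed to turn s into t."""
    # prev[j] holds the distance from s[:i-1] to t[:j] (previous row).
    prev = list(range(len(t) + 1))
    for i, sc in enumerate(s, start=1):
        curr = [i]
        for j, tc in enumerate(t, start=1):
            cost = 0 if sc == tc else 1
            curr.append(min(prev[j] + 1,          # delete sc
                            curr[j - 1] + 1,      # insert tc
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

print(edit_distance("kitten", "sitting"))   # 3
print(edit_distance("GATTACA", "GCATGCA"))  # 3
```

Swap the unit costs for biologically motivated gap and substitution scores and you have sequence alignment; swap in probabilities and you’re most of the way to noisy-channel spelling correction.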
You can do worse than follow Chris Manning’s Stanford course on NLP. It was the first version of this course in 1995 that got me into statistical NLP. Why doesn’t Michael Collins have an MIT OpenCourseWare version?
Software to Study
There are also lots of software packages out there distributed with source. Steve Bird’s NLTK is designed explicitly for teaching and is based on Python. Our toolkit, LingPipe, isn’t explicitly designed for teaching, but contains a large number of tutorials. The two other big toolkits out there, Mallet and MinorThird, are much harder to understand without already knowing the field.
Books to Read
I put together a comprehensive computational linguistics reading list on Amazon and occasionally update it (it’s up to date as of now).
Do Some Real Research
If at all possible, do some real research. It’s the single biggest factor a grad school will look at if you have the grades and test scores to make their basic cuts.
I’d also highly recommend browsing recent ACL proceedings online to get a feeling for what the field’s really like. And I’d suggest going to one of the ACL meetings. Next year, University of Colorado Boulder’s hosting NAACL 2009.
Another great opportunity for undergrads is the Johns Hopkins Summer Workshops. This is a phenomenal opportunity to work with good and diverse teams. There’s nothing like applying to grad school with a publication and references from top researchers.
Almost everyone in this field seems to offer internships. They’ll be harder to land as an undergrad unless you have an advisor hooked into the field who can set you up. Try to get an internship that’s different from what you do in school. The best thing you can get is product programming experience on real projects with teams of more than one person.
Blogs to Read
You’re already here, so you can presumably read our links. This’ll give you more of a feel for the day-to-day in the field than the textbooks.
What the Profs Think
You can see what some of the professors in the field think of curricula and teaching by looking at the proceedings of this year’s ACL workshop on teaching comp ling.
How’d I Get Here?
I started back in 6th grade (1973-74) when my parents got me a Digi-Comp I from Edmund Scientific as a present. The reproduction manufacturers rightly conclude that “… Digi-Comp is an ingenious, transparent Logical Gizmo that can teach anyone about binary numbers and Boolean algebra …”. I don’t know about transparent, but the books explained Boolean algebra, binary arithmetic, and game trees, concepts even an elementary school student can grasp.
In 12th grade (1980-81), I read Hofstadter’s inspiring book Gödel, Escher, Bach, which made artificial intelligence in general, and learning logic in particular, sound fascinating.
As an undergraduate math major in Michigan State University’s Honors College (1981-1984), I created a computational linguistics curriculum for myself without knowing it. I took philosophy of language, analytic philosophy and non-standard logics as my humanities classes, I took developmental psych, micro-econ, cognitive psych and psycholinguistics as social science classes, and split the rest of my classes between computer science and (mostly discrete!) math.
As a Ph.D. student at the University of Edinburgh (1984-1987; go to the U.K. for speed), I found myself in another computational linguistics degree, this time masquerading as cognitive science in the School of Epistemics’ Centre for Cognitive Science. Our four qualifying exams were in syntax, semantics, computational linguistics and psycholinguistics, which is not exactly a general cognitive science curriculum.
After a brief stint writing my thesis while hanging out at Stanford’s Center for the Study of Language and Information (1987-1988), I landed a faculty gig in Carnegie Mellon’s Computational Linguistics program (1988-1996), which is now part of the Language Technologies Institute. We introduced new M.S. and B.S. programs while I was there, and I had a significant hand in designing (and teaching) the undergraduate and graduate curricula in computational linguistics. Sitting in on Chris Manning’s 1995/96 class on statistical NLP was the last straw that sent me down the statistical NLP path.
I learned how to program professionally at SpeechWorks (2000–2002). For that, you need a real project, a good team, and a great mentor, like Sasha Caskey.