Programmers Expect Dog Like Consistency in Software, but NLP is more like Cats

Typical Startup Staffing Choices

Photo Credit hoangnam_nguyen at

From the perspective of a traditionally trained programmer, one of the hardest things to appreciate when working with Natural Language Processing (NLP) systems is the approximate, unpredictable quality of the answers that NLP modules provide. I liken it to getting dog people to learn to work with cats.

For example, a standard NLP module could decide whether “Bill Jones” is mentioned in the sentence “An announcement was made by Sir Bill Jones on the eve of his retirement…” In database terms, the task is to satisfy the query “SELECT person FROM text”; written as a string function, it is “detectPerson(sentence)”.

A range of answers is possible. They include:

  • Getting it right. “Bill Jones” is definitely a person in the sentence.
  • Getting it right, but hedging. “Bill Jones” has a 75% likelihood of being a person in the sentence. The person could also be “Sir Bill” with 15% likelihood or “Jones on” with 1% likelihood, followed by some other low-confidence estimates.
  • Getting it dead wrong. No “Bill Jones” person found at all. It turns out that the system had never seen “Sir” as an honorific in training and had other issues. Sorry, maybe next time. Really, the system gets a 95% F-measure, so it won’t happen often.
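The three outcomes above can be written down as plain data, which makes it easier to see what the calling code has to cope with. This is only a sketch: the `Mention` class, the `span` helper and the variable names are made up for illustration and do not come from any particular NLP library. The confidence figures are the ones in the list above.

```python
from dataclasses import dataclass

SENTENCE = "An announcement was made by Sir Bill Jones on the eve of his retirement"


@dataclass(frozen=True)
class Mention:
    """A candidate person span (hypothetical type, for illustration only)."""
    text: str
    start: int          # character offset where the span begins
    end: int            # character offset just past the span
    confidence: float   # the module's own estimate, from 0.0 to 1.0


def span(text: str, confidence: float) -> Mention:
    """Build a Mention by locating `text` inside SENTENCE."""
    start = SENTENCE.index(text)
    return Mention(text, start, start + len(text), confidence)


# Outcome 1: getting it right, a single certain answer.
right = [span("Bill Jones", 1.0)]

# Outcome 2: getting it right but hedging, a list of weighted guesses.
hedged = [span("Bill Jones", 0.75), span("Sir Bill", 0.15), span("Jones on", 0.01)]

# Outcome 3: getting it dead wrong. An empty list is a normal return value,
# not an exception, so the caller has to check for it.
wrong = []

for label, result in [("right", right), ("hedged", hedged), ("wrong", wrong)]:
    best = max(result, key=lambda m: m.confidence, default=None)
    print(label, "->", best.text if best else "no person found")
```

Note that even the "right" answer comes back as a list with a confidence attached. A dog would just hand you the stick.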

Obviously getting it right is what we want. The programmer throws the stick and the program dutifully fetches the information like a dog. Provide a pat on the head and some who’s-a-good-program positive reinforcement and all is well.

Getting it right but hedging is more of a problem. The existence of a probability complicates things a bit. What do you do with probabilities that are low or tied? What about close cases where the spans are overlapping? It’s like throwing a ball and getting a chew toy, a bit of a stick and some regurgitated food.
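One way to answer those questions is to pick a policy and write it down. The sketch below is one possible policy, not a standard algorithm: drop anything under a minimum confidence, keep the best candidate among overlapping spans, and set aside near-ties for a later decision instead of guessing. The `Candidate` type, the `resolve` function and both thresholds are assumptions chosen for illustration.

```python
from typing import List, NamedTuple, Tuple


class Candidate(NamedTuple):
    """A hedged guess: text, character offsets and confidence (hypothetical)."""
    text: str
    start: int
    end: int
    confidence: float


def overlaps(a: Candidate, b: Candidate) -> bool:
    """True when the two spans share at least one character."""
    return a.start < b.end and b.start < a.end


def resolve(candidates: List[Candidate],
            min_confidence: float = 0.5,
            tie_margin: float = 0.05) -> Tuple[List[Candidate], List[Candidate]]:
    """Split candidates into (accepted, ambiguous); everything else is dropped."""
    accepted: List[Candidate] = []
    ambiguous: List[Candidate] = []
    # Visit the most confident guesses first so they win any overlap.
    for cand in sorted(candidates, key=lambda c: c.confidence, reverse=True):
        if cand.confidence < min_confidence:
            break  # everything after this point is even less confident
        rivals = [a for a in accepted if overlaps(a, cand)]
        if not rivals:
            accepted.append(cand)
        elif any(a.confidence - cand.confidence <= tie_margin for a in rivals):
            ambiguous.append(cand)  # too close to call automatically
    return accepted, ambiguous


# Offsets are character positions in the example sentence.
guesses = [
    Candidate("Bill Jones", 32, 42, 0.75),
    Candidate("Sir Bill", 28, 36, 0.15),
    Candidate("Jones on", 37, 45, 0.01),
]
accepted, ambiguous = resolve(guesses)
print([c.text for c in accepted], [c.text for c in ambiguous])
```

With the figures from the example, only “Bill Jones” survives. Set the two top guesses to the same confidence and the second one lands in the ambiguous pile, which is exactly the extra case a dog person never had to write.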

Getting it dead wrong is really a wrench in the works because it is just so friggin’ obvious that the right answer is sitting right there, pristine. How can the program ignore it? Just go get it, pick it up, and bring it back, you stoooopid program. What is this 95% pedigree worth if it can’t handle that? Try playing catch with a cat and you will have exactly the right experience. But it is not the cat’s fault. You can’t expect a cat to act like a dog, and cats are a useful tool if you want to catch mice or amusingly pursue a laser pointer. Like NLP, cats have to be influenced to do what you want, not commanded.
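It helps to see what a 95% pedigree does and does not promise. The sketch below assumes that “F-measure” means the balanced F1 score, the harmonic mean of precision and recall. The counts are invented to make the arithmetic easy and are not results from any real system.

```python
def f_measure(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Balanced F-measure (F1): the harmonic mean of precision and recall."""
    if true_pos == 0:
        return 0.0  # nothing found correctly, so the score is zero
    precision = true_pos / (true_pos + false_pos)  # how many finds were real
    recall = true_pos / (true_pos + false_neg)     # how many real ones were found
    return 2 * precision * recall / (precision + recall)


# Invented counts: 100 real person mentions, 95 found, 5 missed, 5 false alarms.
score = f_measure(true_pos=95, false_pos=5, false_neg=5)
print(f"F-measure: {score:.2f}")
```

Under those assumed counts, a system with this score still misses 5 of every 100 people, and the sentence about Sir Bill Jones can easily be one of them. The score describes the litter, not the individual cat.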

Working with Cats

The challenges of NLP require that the developer work with the strengths of the technology and factor in its shortcomings. It is a different mindset that tends to require more complicated data structures. Multiple possible answers also slow things down with extra decision-making loops. Systems often need extensive tuning to work with the capabilities of the cat^h^h^h NLP technology. Once that is mastered, the additional flexibility and tolerance of uncertainty in the data result in a very different capability. Systems can degrade more gracefully, apply to more situations, and bring a more human flexibility to text processing. But you’ve got to scratch the cat just the right way.
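As one illustration of degrading gracefully, the caller can trust the model when it is confident, fall back to something dumber when it is not, and admit defeat when neither works. The sketch below is a made-up example: `find_person`, the `Detector` signature, the list of known names and the threshold are all assumptions, and the two detectors are stand-ins for a real NLP module.

```python
from typing import Callable, List, Optional, Tuple

# A detector maps a sentence to (name, confidence) guesses. Hypothetical signature.
Detector = Callable[[str], List[Tuple[str, float]]]


def find_person(sentence: str,
                detector: Detector,
                known_names: List[str],
                min_confidence: float = 0.5) -> Optional[Tuple[str, str]]:
    """Return (name, source) where source says who produced the answer."""
    guesses = detector(sentence)
    if guesses:
        name, confidence = max(guesses, key=lambda g: g[1])
        if confidence >= min_confidence:
            return name, "model"
    # The model found nothing it trusts: fall back to an exact-match lookup.
    for name in known_names:
        if name in sentence:
            return name, "dictionary"
    return None  # an honest "don't know" beats a bad guess


def hedging_detector(sentence: str) -> List[Tuple[str, float]]:
    """Stand-in for a model that gets it right but hedges."""
    return [("Bill Jones", 0.75), ("Sir Bill", 0.15)]


def confused_detector(sentence: str) -> List[Tuple[str, float]]:
    """Stand-in for a model that never saw 'Sir' in training."""
    return []


sentence = "An announcement was made by Sir Bill Jones on the eve of his retirement"
print(find_person(sentence, hedging_detector, ["Bill Jones"]))
print(find_person(sentence, confused_detector, ["Bill Jones"]))
print(find_person(sentence, confused_detector, []))
```

Returning the source along with the name lets downstream code treat a dictionary hit differently from a model hit, which is part of the extra bookkeeping this mindset asks for.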

Breck (who loves dogs and cats)

P.S. If you want to get a start on NLP programming check out our tutorials.

2 Responses to “Programmers Expect Dog Like Consistency in Software, but NLP is more like Cats”

  1. L Says:

    This is spot-on, I’m always saying something along these same lines, and our company, as a pure NLP-house that often hires programmers with no NLP background, often has to explain this to new employees. Additionally, we find that there are many non-NLP programmers who just can’t take it. They might try, but will eventually reach some point where they just cannot work with NLP anymore.

    Like getting overly frustrated that the cat won’t pick up the ball that is sitting right in front of it.

  2. Wojtek Says:

    Great post! It is exactly what NLP is like.
