Hunter/Gatherer vs Farming with Respect to Information Access


Antonio Valderrabanos gave a talk at NYU today in which he compared the current search/NLP strategies of information providers to humanity’s hunter/gatherer stage and offered a vision of information farming. I kept having images of Google’s web spider out digging roots and chasing animals with a pointed stick, wearing a grubby little loincloth. Then I would switch to images of a farm-stocked supermarket with well-organized shelves, helpful clerks and lots of choice.

The analogy brought out a strong bias that I have in applying natural language processing (NLP) to real-world problems: I generally assume that the software must encounter text as it occurs in the “wild”. After all, that is what humans do so well, and we are in the business of emulating human language processing, right?

Nope, not on the farm we’re not. On the farm we use NLP to help enhance information with things that were never a part of standard written form. We use NLP to suggest and assign meta tags, connect entities to databases of concepts and create new database entries for new entities. Humans are horrible at these things, but they are excellent at choosing from NLP-driven suggestions, and NLP is pretty good at suggestions. So NLP is helping create the tools to index and cross-reference, at the concept level, all the information in the supermarket. Humans function as filters of what is correct. At least initially.
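To make the division of labor concrete, here is a minimal Python sketch of that suggest-then-filter loop. Everything in it is an assumption made for illustration: the `CONCEPTS` table, the `Suggestion` record, and the capitalized-word heuristic, which stands in for whatever real entity recognizer a farm would use.

```python
from dataclasses import dataclass

# Hypothetical concept database: lowercased surface form -> concept id.
CONCEPTS = {"nyu": "org:new_york_university", "google": "org:google"}


@dataclass
class Suggestion:
    mention: str      # the text as it appeared in the document
    concept_id: str   # existing concept, or a proposed new one
    is_new: bool      # True when no database entry exists yet


def suggest(text, concepts):
    """Toy suggester: treat capitalized words as entity mentions.

    Known mentions are linked to the concept database; unknown ones
    become proposals for new database entries.
    """
    suggestions, seen = [], set()
    for token in text.split():
        word = token.strip(".,;:!?\"'()")
        if not word or not word[0].isupper():
            continue
        key = word.lower()
        if key in seen:
            continue
        seen.add(key)
        if key in concepts:
            suggestions.append(Suggestion(word, concepts[key], False))
        else:
            suggestions.append(Suggestion(word, f"new:{key}", True))
    return suggestions


def review(suggestions, accept, concepts):
    """Human filter: keep accepted suggestions and store accepted new entities."""
    kept = []
    for s in suggestions:
        if accept(s):
            if s.is_new:
                concepts[s.mention.lower()] = s.concept_id
            kept.append(s)
    return kept


if __name__ == "__main__":
    db = dict(CONCEPTS)
    proposed = suggest("Antonio spoke at NYU about Google.", db)
    # The lambda plays the human reviewer; here it accepts everything.
    for s in review(proposed, lambda s: True, db):
        print(s.mention, "->", s.concept_id)
```

The `accept` callback is where the human sits: the software only proposes, and nothing reaches the concept database without passing through that filter.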

As the information supermarket gets bigger, the quality of the (machine-learning-based) NLP will get better, perhaps good enough to start automatically bringing in “wild” information with decent concept indexing and meta tagging. A keyword index is a crude yet effective tool, but an inventory system it is not, and an inventory system is what we need to advance to the next level of information access.
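The gap between a keyword index and an inventory can be shown with a small Python sketch. The two documents and their concept tags are made up for illustration, and the tags are assumed to have come out of a suggest-and-review process like the one described above.

```python
from collections import defaultdict

# Two made-up documents that mention the same organization differently.
docs = {
    1: "NYU hosted a talk on search",
    2: "New York University hosted a talk on farming",
}

# Hypothetical concept tags, as if suggested by NLP and confirmed by a human.
concept_tags = {
    1: {"org:new_york_university"},
    2: {"org:new_york_university"},
}


def build_keyword_index(documents):
    """Map each lowercased word to the ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def build_concept_index(tags):
    """Map each concept id to the ids of the documents tagged with it."""
    index = defaultdict(set)
    for doc_id, concepts in tags.items():
        for concept in concepts:
            index[concept].add(doc_id)
    return index


keyword_index = build_keyword_index(docs)
concept_index = build_concept_index(concept_tags)

print(sorted(keyword_index["nyu"]))                      # [1]
print(sorted(concept_index["org:new_york_university"]))  # [1, 2]
```

The keyword lookup finds only the document that happens to use the string "NYU", while the concept lookup finds both, which is the inventory-style behavior the keyword index cannot provide on its own.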

