Archive for the ‘Business’ Category

Canceled: Help Build a Watson Clone–Participants Sought for LingPipe Code Camp

January 10, 2014

Code Camp is canceled. We are late delivering a LingPipe recipes book to our publisher, and that will have to be our project for March. But we could have testers/reviewers come and hang out. Less fun, I think.

Apologies.
Breck

—————————

 

Dates: To be absolutely clear, we will be in Driggs, Idaho from 3/2/2014 to 3/31/2014. You can work remotely; we will also be doing setup work beforehand.

New: We have set up a GitHub repository. The URL is https://github.com/watsonclone/jeopardy-.git

————————

Every year we go out west for a month of coding and skiing. Last year it was Salt Lake City, Utah; this year it is Driggs, Idaho, for access to Grand Targhee and Jackson Hole. The house is rented for the month of March and the project is selected. This year we will create a Watson clone.

I have blogged about Watson before and I totally respect what they have done. But the coverage is getting a bit breathless with the latest billion-dollar effort. So how about assembling a scrappy bunch of developers and seeing how close we can come to recreating the Jeopardy! beast?

How this works:

  • We have a month of focused development. We tend to code mornings and ski afternoons. The house has 3 bedrooms if you want to come stay; prior arrangements must be made. The house is paid for. Nothing else is.
  • The code base will be open source. We are moving LingPipe to the AGPL, so maybe that license, or we could just Apache-license it. We want folks to be comfortable contributing.
  • You don’t have to be present to contribute.
  • We had a very fun time last year. We worked very hard on an email app that didn’t quite get launched; the lesson learned was to start with the project defined.

If you are interested in participating, let us know at watsonclone@lingpipe.com. Tell us your background, what you want to do, whether you expect to stay with us, etc. No visa situations please, and we don’t have any funds to support folks. Obviously we have limited space, physically and mentally, so we may say no, but we will do our best to be inclusive. Step 1: transcribe some Jeopardy! shows.

Ask questions in the comments so all can benefit, and please check the comments before asking. I’ll answer the first one that is on everyone’s mind:

Q: Are you frigging crazy???

A: Why, yes, yes we are. But we are also really good computational linguists….

Breck

Natural Language Generation for Spam

March 31, 2012

In a recent comment on an earlier post on licensing, we got this spam comment. I know it’s spam because of the links and the URL.

It makes faculty adage what humans can do with it. We’ve approved to beacon bright of that with LingPipe’s authorization — we artlessly can’t allow the attorneys to adapt our own arbitrary royalty-free license! It was advised to accept some AGPL-like restrictions (though we’d never heard of AGPL). At atomic with the (A)GPL, there are FAQs that I can about understand.

ELIZA, all over Again

What’s cool is how they used ELIZA-like technology to read a bit of the post and insert it into some boilerplate-style generation. There are so many crazy and disfluent legitimate comments that, with a little more work, this would be hard to filter out automatically. Certainly the WordPress spam filter, Akismet, didn’t catch it, despite the embedded links.
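Just to illustrate the mechanics (this is a toy reconstruction, not the spammers’ actual code), boilerplate plus a synonym table gets you surprisingly far. Everything here, including the synonym list, is made up:

    import java.util.List;
    import java.util.Map;
    import java.util.Random;

    // Toy ELIZA-style spam generator: take boilerplate lifted from the
    // target post and swap words for near-synonyms so the result is no
    // longer an exact duplicate that filters can match.
    public class SpamMutator {

        static final Map<String, List<String>> SYNONYMS = Map.of(
            "license", List.of("authorization", "warrant"),
            "lawyers", List.of("attorneys", "advocates"),
            "simply", List.of("artlessly", "plainly"));

        static String mutate(String text, Random rng) {
            StringBuilder out = new StringBuilder();
            for (String tok : text.split(" ")) {
                List<String> alts = SYNONYMS.get(tok.toLowerCase());
                out.append(alts == null ? tok : alts.get(rng.nextInt(alts.size())));
                out.append(' ');
            }
            return out.toString().trim();
        }

        public static void main(String[] args) {
            String boilerplate = "we simply can't allow the lawyers to adapt our own license";
            System.out.println(mutate(boilerplate, new Random(42)));
        }
    }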

Black Hat NLP is Going to Get Worse

It would be really easy to improve on this technology with a little topic modeling, better word spotting (though they seem to do an OK job of that), and better language modeling for generation, plus better filtering a la modern machine translation systems.

The really nasty applications of such light processing and random regeneration will be in auto-generating reviews and even whole social media streams. It will certainly complicate sentiment analysis at scale. You can just create blogs full of this stuff, link them all up like a good SEO practitioner, and off you go.

YapMap: Breck’s Fun New Project to Improve Search

January 27, 2012

I have signed on as chief scientist at YapMap. It is a part-time position that grew out of my being on their advisory board for the past 3 years. Try the search interfaces for the forums below:

Automotive Forums

YapMap search for Low Carb Breakfast on Diabetes Daily

A screen shot of the interface:

UI for YapMap's search results

What I like about the user interface is that threads can be browsed easily. I have spent hours on remote-controlled-airplane forums reading every post, because it is quite difficult to find relevant information within a thread. The color coding and summary views are quite helpful in eliminating irrelevant posts.

My first job is to get query spell checking rolling. Next is search optimized for the challenges of thread-based postings. The fact that the relevance of a post to a query is a function of the whole thread is very interesting. I will hopefully get to do some discourse analysis as well.
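For the spell checking, the natural starting point is LingPipe’s noisy-channel spell checker, trained on the forum’s own text so domain terms stay in-vocabulary. Here is a minimal sketch along the lines of the LingPipe spell tutorial; the training strings and edit-distance weights are placeholders:

    import com.aliasi.lm.NGramProcessLM;
    import com.aliasi.spell.CompiledSpellChecker;
    import com.aliasi.spell.FixedWeightEditDistance;
    import com.aliasi.spell.TrainSpellChecker;
    import com.aliasi.tokenizer.IndoEuropeanTokenizerFactory;
    import com.aliasi.util.AbstractExternalizable;

    public class QuerySpellCheck {
        public static void main(String[] args) throws Exception {
            NGramProcessLM lm = new NGramProcessLM(5); // 5-gram character LM
            FixedWeightEditDistance dist =
                new FixedWeightEditDistance(0.0, -4.0, -4.0, -4.0, -4.0);
            TrainSpellChecker trainer = new TrainSpellChecker(
                lm, dist, IndoEuropeanTokenizerFactory.INSTANCE);

            // Train on the forum's own posts; two stand-in sentences here.
            trainer.handle("low carb breakfast ideas for diabetics");
            trainer.handle("best low carb breakfast recipes");

            CompiledSpellChecker checker =
                (CompiledSpellChecker) AbstractExternalizable.compile(trainer);
            System.out.println(checker.didYouMean("low carb brekfast"));
        }
    }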

I will continue to run Alias-i/LingPipe. The YapMap involvement is just too fun a project to pass up given that I get to build a fancy search and discovery tool.

Breck

“Academic” Licenses, GPL, and “Free” Software

November 3, 2011

[This post repeats a long comment I posted about licensing in response to Brendan O'Connor's blog entry, End-to-End NLP Packages. Brendan's post goes over some packages for NLP and singles out LingPipe as being only “quasi free.”]

Restrictive “Academic-Only” Licenses

Some of those other packages, like C&C Tools and Senna, are in the same “quasi free” category as LingPipe in the sense that they’re released under what their authors call “non-commercial” licenses. For instance, none of the Senna, C&C, or LingPipe licenses are compatible with GPL-ed code. Senna goes so far as to prohibit derived works altogether.

The LingPipe License

The intent for the LingPipe royalty-free license was a little different from the “academic use only” licenses in that we didn’t single out academia as a special class of users. We allow free use for research purposes by industrialists and academics alike. We also provide a “developers” license that explicitly grants this right, which makes some users’ organizations feel better.

Truly Free NLP Software

The other tools, like NLTK, Mallet, OpenNLP, and GATE, are released under more flexible licenses (LGPL, Apache or BSD), which I really do think of as being truly “free”. Mahout’s also in this category, though not mentioned by Brendan, whereas packages like TreeTagger are more like Senna or C&C in their restrictive “academic only” licensing.

Stanford and the GPL

Stanford NLP’s license sounds like it was written by someone who didn’t quite understand the GPL. Their page says (the link is also theirs):

The Stanford CoreNLP code is licensed under the full GPL, which allows its use for research purposes, free software projects, software services, etc., but not in distributed proprietary software.

Technically, what they say is true. It would’ve been clearer if they’d replaced “research” with “research and non-research” and “free” with “free and for-profit”. Instead, their choice of examples suggests that “free” or “research” uses have some special status under the GPL, which they don’t. With my linguist hat on, I’d say their text leads the reader to a false implicature. The terms “research” and “academia” don’t even show up in the GPL, and although “free” does, GNU and others clarify elsewhere that this means “free as in free speech”, not “free as in free beer”.

Understanding the GPL

The key to understanding the GPL lies behind the page Stanford links to, which explains what counts as proprietary software. Here, proprietary doesn’t have to do with ownership, but rather with closed source. Basically, if you redistribute source code or an application based on GPL-ed code, you also have to release your code under the GPL, which is why it’s called a “copyleft” or “viral” license. In some cases, you can get away with using a less restrictive license like the LGPL or BSD for your mods or interacting libraries, though you can’t change the underlying GPL-ed source’s license.

GPL Applies to Academics, Too

There’s no free ride for academics here: you can’t take GPL-ed code, use it to build a research project for your thesis, then give an executable away for free without also distributing your code under a compatible license. Nor can you restrict the license to research-only use. Similarly, you couldn’t roll a GPL-ed library into Senna, C&C, or LingPipe and redistribute them under their own licenses. Academics often violate these terms because they somehow think “research use only” is special.

Services Based on GPL-ed Software and the AGPL

You can also set up a software service, for example on Amazon’s Elastic Compute Cloud (EC2) or on your own servers, that’s entirely driven by GPL-ed software, like say Stanford NLP or Weka, and then charge users for accessing it. Because you’re not redistributing the software itself, you can modify it any way you like and write code around it without releasing your own software. GNU introduced the Affero GPL (AGPL), a license even more restrictive than the GPL that tries to close this server loophole for the basic GPL.

Charging for GPL-ed Code

You can charge for GPL-ed code if you can find someone to pay you. That’s what Red Hat is doing with Linux, what Revolution Analytics is doing with R, and what Enthought is doing with Python.

LingPipe’s Business Model is Like MySQL’s

Note that this is not what MySQL did with MySQL (before they sold it to Oracle), nor is it what we do with LingPipe. In both those cases, the company owns all the intellectual property and copyrights and thus is able to release the code under multiple licenses. This strategy is known as dual licensing.

We license LingPipe under custom licenses as well as our royalty-free license. These custom licenses include all sorts of additional restrictions (like only using some of the modules on so many servers) and additional guarantees (like indemnification and maintenance); don’t ask me about the details, that’s Breck’s bailiwick. Suffice it to say most companies don’t like to get involved with copyleft, be it from the GPL or LingPipe’s royalty-free license. So we let them pay us extra for an unencumbered license, so they can do what they want with LingPipe and not have to share their code. We’ve had more than one customer buy a commercial license for LingPipe who wouldn’t even tell us what they were going to do with our software.

Free “Academic” Software

Also, keep in mind that as an academic, your university (or lab) probably has a claim to intellectual property you develop using their resources. GNU offers some advice on that front.

Oracle buys Endeca, HP buys Autonomy, Microsoft buys FAST

October 28, 2011

The news that Oracle’s buying Endeca sounds awfully familiar. But this time it cuts a little closer to home, because we’re an Endeca technology partner. Endeca has been a great customer to work with — we’ve been really impressed with their engineers at every turn.

Clean Sweep

I believe this makes it almost a clean sweep of the medium-to-large independent search companies. Maybe Vivisimo will be next. Of course, there are still small companies delivering search via Apache Lucene and Solr, such as Sematext and Lucid Imagination. I imagine they will be delighted that yet another competitor was snapped up by a tech giant.

2008: Microsoft buys FAST

By combining the innovation and agility of FAST with the discipline and resources of Microsoft, our customers get the best of both worlds: market-leading products from a trusted technology partner. … Enterprise Search from Microsoft offers best-in-class technologies…

from: Microsoft fact sheet

2011: HP buys Autonomy

Autonomy brings to HP higher value business solutions that will help customers manage the explosion of information. Together with Autonomy, we plan to reinvent how both unstructured and structured data is processed, analyzed, optimized, automated and protected. … this bold action will squarely position HP in software and information to create the next-generation Information Platform, and thereby, create significant value for our shareholders.

from: HP press release

2011: Oracle buys Endeca

Combination [of Oracle and Endeca] provides best-in-class technology and applications for unstructured data management, business intelligence, and web commerce. … The convergence of structured and unstructured information is driving the need for a common data management and analytics platform.

from: Oracle Press Release

Maybe the tough economic climate has made it hard for small-to-medium-sized tech companies to survive without the deep pockets of a successful large tech company. Maybe Oracle made them an offer that was too good to refuse. Far better to sell when you can get a good price than to suffer the fate of Yahoo! (or SpeechWorks, for that matter).

Make us an Offer?

On that note, feel free to make us an offer for LingPipe.

 

IBM’s Watson and the State of NLP

June 14, 2011

Aditya Kalyanpur presented an overview of the Jeopardy!-winning Watson computer system on June 6 at TheLadders.com in New York for the New York Semantic Web Meetup. I was asked to present a three-minute overview of the state of natural language processing (NLP). In this post I want to place the Watson system in the context of the state of the art, since it didn’t make sense to do that at the meetup, where I presented first.

The State of NLP According to Breck

Conveying the state of the art in three minutes is quite a challenge, so let’s run with an analogy to aviation for ease of comprehension. So where is NLP?


It Flies!

We have achieved the analog of basic powered flight. No doubt.


Yikes! and away

But in no sense have we gotten to this level of performance.


Amelia Earhart's Lockheed Vega

My best guess is that we are at the point of a reasonable commercial foundation as an industry, with some changes to come that we don’t know about yet, not unlike aviation in the mid-1920s. Perhaps the beginning of the Golden Age of NLP.


And in no sense are we in the reliable, high technology commercial space that modern air transport provides.

Boeing 777


Where Does Watson Fit in the Analogy?

Watson fits perfectly in the example of the red 1928 Lockheed Vega above for the following reasons:

  • The Vega was Amelia Earhart’s actual plane, used to break records (crossing the Atlantic solo), generate publicity, and deliver a stunning success for a nascent industry.
  • While inspirational, the Vega’s success had little to do with advancing the underlying technology. What would I consider an advance in technology? Frank Whittle patented the turbojet in 1930.
  • Watson shows how a 20-person team working for 4 years can win a very challenging game with skill, effort, and daring, much the way aviation records were broken with the same. Don’t think careers were not on the line with the Watson effort. I think IBM ceased termination by firing squad in the ’70s, so Earhart had more on the line. But what are the prospects of a mid-level ex-IBM exec in today’s economy? Perhaps the firing squad would be a kindness.

But Watson is Playing a Game

There is one issue that seriously concerns me: Watson won a question-answering game with the trivial twist that the answers must be phrased as questions. So the clue “First President of the US” is answered with “Who is George Washington?” But Watson is not a general-purpose question-answering system. What is the difference?

Another analogy: the game of chess is based on medieval battles, but even though Deep Blue beats the best human players, no one would consider using Deep Blue to manage an actual battle. Real war is messy, approximate, and without clear rules, which makes chess algorithms totally inappropriate.

Real-world question answering has qualities similar to real war: it is messy, approximate, and without clear rules. The game of Jeopardy! is built on the existence of a unique, easily understood and verified answer given the clue. Take one of the examples from the talk:

In 1698, this comet discoverer took a ship called the Paramour Pink on the first purely scientific sea voyage

The correct “question” is “Who is Edmond Halley?”, of Halley’s Comet fame. The example is used to work through an impressive system diagram that resembles a well-developed model train set (thanks to Prof. Mark Steedman for the simile). Much is done to generate the correct answer while avoiding distractors like Peter Sellers of the Pink Panther movies. But run the same clue past Google with “-watson -jeopardy” appended (to eliminate pages discussing this publicized example), and the first result is a page of Halley’s Comet stamps whose first sentence mentions the correct answer.

There is still an impressive amount of work in extracting the correct name, but the answer was sitting there waiting to be found precisely because this is a game: the answer is unambiguous, well known, and well selected given the clue.

What does Real World Question Answering Look Like?

What kinds of questions have I approached a search engine with?

What is the current 30 year FHA mortgage rate?

This question is a disaster from the uniqueness-of-answer perspective. My initial search results were pretty low quality and did not provide accurate rate information for what I knew the answer to be.

When is it best to ski in Chile?

This went better. There was a FAQ on the first page of results, but the answer just went on and on: “The season runs from mid-June to mid-October. Although every year is different, and it comes down to Mother Nature, the best time for dry powder is mid-June, July, August, and up to the 2nd week in September. After that,….” Again we have a non-unique answer, because my question was not that specific in the first place.

What is the Reputation of LingPipe?

This is a question that a group of Columbia MBA students took on for us in their small business program, which I recommend, by the way.

This question was hopeless for search because there is no page out there waiting to be found with our reputation nicely summarized. Answering the question requires distillation across many resources, even if the information were restricted to the web only.

Welcome to the real world; question answering is hell.

Where Might Watson Flourish Outside of Jeopardy! Tournaments?

Jeopardy! is a game of finding the uniquely obvious given indirect clues; otherwise it would not be a game that can be judged and played. What else in the world has this quality? The Watson team is now approaching medical diagnosis, a real-world use case that might match the Jeopardy! format, with symptoms as clues and a diagnosis as the answer. Uniqueness is not guaranteed in diagnosis, but Watson can handle multiple answers. This is an area where computer systems from the 1970s, e.g. MYCIN, outperformed experts, though they didn’t have an NLP component. Medical diagnosis, once symptoms are recognized, is a game-like problem.

In the end, Watson is an engineering achievement, but in no way have the skills of a good reference librarian been replicated.

While writing this blog post, I came across an interesting article by Michael Lind on information technology and its role in productivity. Interestingly, he puts information technology in the same time bracket as I do.

Breck

Pro Bono Projects and Internships with LingPipe

April 4, 2011

We have initiated a pro bono program that teams a LingPipe employee with a programmer wanting to learn LingPipe, in order to serve a non-profit or researcher needing help with a text analytics/NLP/computational linguistics project.

So far it is going pretty well, with one classification project underway. I worked with an intern, a solid NLP programmer, to help a Georgia Tech researcher get a classification system up and running. Now the intern is running the project with some feedback from me, though not much is required at this point.

We are taking applications from interns wanting to learn how to code with LingPipe. We will also take applications for projects; non-profits or academic research only, please. Just send breck@lingpipe.com an email outlining your skills or needs and we will see what we can do.

Breck

LingPipe Classifiers and Chunkers for Endeca Extend Partner Program

November 3, 2009

A couple of weeks ago, Endeca put out a press release announcing its Extend partner program.

The “leading text analytics software vendors” are us (props to Breck for naming us with an “A”), Basis Technology, Lexalytics, MetaCarta, NetOwl, Nstein, Semantia and Temis. But wait, that’s not all. A slew of text analytics companies had either joined earlier or announced joining now, including ChoiceStream, BayNote, Lexalytics, Coremetrics, NStein, and Searchandise.

It’s no surprise that we’re all thinking Endeca has quite a bit of potential as a channel partner.

After the usual marketing blather (e.g. “leveraging the extensibility of the McKinley platform”, “lower cost of ownership”, “value-added capabilities”, etc.) and vague promises (e.g. “unrestricted exploration of unstructured content”), the third paragraph of Endeca’s press release explains what it’s all about in allowing Endeca’s search customers to

… run their data through an Endeca Extend partner solution, extract additional meta-data elements from the text, and append that meta-data to the original content

Endeca Records

Endeca stores documents in record data structures, which associate string keys with lists of string values. This is the same rough structure as is found in a Lucene Document.

One striking difference is that Endeca’s API is cleaner and better documented. Overall, I’m very impressed with Endeca’s API. Looking at their API reminds me of the APIs we built at SpeechWorks, where wiser heads prevailed on me to forgo complex controls designed for control-freak grad students in favor of making easy things easy.

Another striking difference is that Lucene’s document structure is much richer, allowing for binary blobs to be stored by those trying to use Lucene as a database. Lucene also allows both documents as a whole and fields within a document to be boosted, adding a multiplier to their search scores for matching queries.
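For illustration only, a record of this shape is essentially a map from string keys to lists of string values. Here is a minimal stand-in (not Endeca’s actual class) that the sketches below can work against:

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Stand-in for an Endeca-style record: string keys mapped to lists
    // of string values. Not Endeca's API, just the rough shape.
    public class Record {
        private final Map<String, List<String>> props = new HashMap<>();

        public void addValue(String key, String value) {
            props.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }

        public List<String> values(String key) {
            return props.getOrDefault(key, List.of());
        }
    }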

Manipulator Extensions

Endeca’s produced an API for extensions. An extension visits records, modifies them, and writes them back to the index. It can also write into its own scratch space on the file system and generate all new records.

An extension consists of three components: configuration, factory, and runtime.

Class 1. Configuration

The bean-like configuration class provides setters and getters for strings, booleans, integers, and doubles. These are labeled with attributes and accessed through reflection. There’s then a method to validate a configuration that returns a list of errors as structured objects. I’m a big fan of immutable objects, so working with beans drives me crazy. They could use some more doc on concurrency and lifecycle order; as is, I was conservative and programmed defensively against changes in config.

Configuration is handled through an administrator interface. As I said, it’s bean-like.

Class 2. Factory

There is then a factory class with a method that returns the config class (so the admin interface can tell what kind of config to build for it). It also contains a method that takes an Endeca application context and a configuration and produces a runtime application. The context provides services like logging, a path to local file space, and a hook into a pipe into which modified records may be sent.

Class 3. Runtime

The runtime simply provides a record visitor method. To write out changes, you grab the output channel from the context provided to the factory. There are also some lifecycle methods used as callbacks: processing interrupted, processing of records complete, and final cleanup. You can still write out records during the completion callback.
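Putting the three components together, the overall shape is roughly as follows. All class, method, and parameter names here are hypothetical stand-ins, not Endeca’s actual interfaces; Record is the toy class above:

    // Hypothetical sketch of the three extension classes described above.
    // Names and signatures are stand-ins, not Endeca's actual API.

    class SentimentConfig {               // 1. bean-like configuration
        private String modelPath;
        public String getModelPath() { return modelPath; }
        public void setModelPath(String path) { modelPath = path; }
    }

    class SentimentFactory {              // 2. factory
        Class<SentimentConfig> configClass() {
            return SentimentConfig.class; // so the admin UI can build a config
        }
        SentimentRuntime create(Object context, SentimentConfig config) {
            // context would supply logging, scratch space, and the output pipe
            return new SentimentRuntime();
        }
    }

    class SentimentRuntime {              // 3. runtime
        void visit(Record record) {
            // inspect or modify the record, then send it down the output pipe
        }
    }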

Endeca’s Demo Manipulator Extension

Endeca has great programmers and their Java API design was really clear. I love it when vendors follow standard patterns and idioms in their API designs. Especially when they use generics usefully.

The PDF developer doc’s still in progress, but their Javadoc’s mostly in place. What was really sweet is that they gave us a working demo extension program with all of its configuration, tests, and even mock objects for JUnit-testing the entire framework without a complete install of Endeca’s platform. I’m so happy when someone sends me a Java package that unpacks and then compiles with Ant without griping.

LingPipe Classifier CAS Manipulator Extension

The first extension I wrote is configured with a path to a serialized text classifier on the classpath. I then configured a list of field names from which to collect text (only strings are available, so I went with comma-separated values) and a field name into which to write the result of classification. [Correction, 5 Nov 2009: Endeca let me know that they had this covered; if I declare the variables in the bean-like configuration to be list-like values, the reflection-based config system will figure it out. This is awesome. I always hate rolling ad hoc little parsing “languages” like CSV into config. It’s just sooo hard to doc and code correctly.]
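Stripped of the Endeca plumbing, the core of the classifier extension looks roughly like this, using LingPipe’s classification interfaces (the Record class is the toy stand-in above, and the field names are made up):

    import java.io.File;

    import com.aliasi.classify.BaseClassifier;
    import com.aliasi.classify.Classification;
    import com.aliasi.util.AbstractExternalizable;

    // Core of the classifier extension, minus the Endeca plumbing:
    // read a serialized classifier, collect text from the configured
    // input fields, and write the best category into the output field.
    public class ClassifyFields {

        @SuppressWarnings("unchecked")
        public static void classify(Record record, File modelFile,
                                    String[] inputFields, String outputField)
                throws Exception {
            BaseClassifier<CharSequence> classifier =
                (BaseClassifier<CharSequence>)
                    AbstractExternalizable.readObject(modelFile);
            StringBuilder text = new StringBuilder();
            for (String field : inputFields)
                for (String value : record.values(field))
                    text.append(value).append(' ');
            Classification c = classifier.classify(text);
            record.addValue(outputField, c.bestCategory());
        }
    }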

LingPipe Chunker CAS Manipulator Extension

The second extension is a chunker. It requires a path to a serialized chunker. Optionally, a sentence detector can be configured for preprocessing (most of our chunkers work better at the sentence level). Also optionally, a dictionary (and tokenizer factory) can be specified to override the chunks found by the chunker. Then comes a list of field names from which to read text. The output gets written into chunk-type-specific fields. Because a given field name can contain multiple values, the resulting spans stay separate.
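Again minus the plumbing (and skipping the optional sentence detection and dictionary overrides), the heart of it is just LingPipe’s Chunker interface run over each field value, with spans written into per-type fields; the “chunk_” field-name prefix is my own invention for the sketch:

    import com.aliasi.chunk.Chunk;
    import com.aliasi.chunk.Chunker;
    import com.aliasi.chunk.Chunking;

    // Core of the chunker extension: run the chunker over each value of
    // the configured input fields and write each span into a field named
    // after the chunk type.
    public class ChunkFields {

        public static void chunk(Record record, Chunker chunker,
                                 String[] inputFields) {
            for (String field : inputFields) {
                for (String text : record.values(field)) {
                    Chunking chunking = chunker.chunk(text);
                    for (Chunk c : chunking.chunkSet()) {
                        String span = text.substring(c.start(), c.end());
                        record.addValue("chunk_" + c.type(), span);
                    }
                }
            }
        }
    }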

Endeca’s Faceting

Endeca’s big on faceted search. You may be familiar with it from two of the best online stores, NewEgg and Amazon.

It’s easy to treat our classifier plugin’s output as a facet. For instance, classify documents by sentiment and now sentiment’s a facet. Do a search, and you’ll get a summary of how many positive and how many negative documents there are, with an option to restrict the search to either subset.

It’s also easy to treat our chunker output as a facet. For instance, if you include a company name chunker, you’ll be able to use companies as facets (e.g. as NewEgg does with manufacturers, though documents may contain references to more than one company).

Buying Plugins

Drop Breck a line.

Now that I have my head around the bigger picture, it’s pretty easy to build these kinds of extensions. So if there’s something you’d like integrated into Endeca and you’re willing to pay for it, let us know.

Has Open Source Failed Commercially?

July 29, 2009

I just read, in Business Week‘s Technology at Work blog, a post by Peter Yared, The Failure of Commercial Open Source.

Yared argues that companies aren’t making money from open source, because:

  1. Open source only works for commodities (e.g. operating systems)
  2. Open source is as expensive as proprietary software to develop
  3. Selling software is miserable
  4. Customers are switching to software as a service (SaaS)

He argues that open source is only successful when it’s free and supported by a broad community of developers, and that only a handful of commercial software companies based on open source have had “liquidity events” (did I mention this was in Business Week?).

I replied (perhaps into the ether, given their moderation):

We are operating under a MySQL-like model, but with even more restrictions on our source code. We have no marketing or sales department (but then we’re only 2.5 people full time with an additional 0-2 contractors at any given time). A paycheck and a job I love with lots of flexibility is the only “liquidity event” I’m looking for.

It helps having the source out there, even if it’s not covered by an official open source license, because then developers can kick the tires, go beyond the doc, and help fix bugs. And users can try it and inspect it before they buy.

We don’t need help patching the bugs, but we do need help finding them. As much as we unit test, bugs wind up in production code, and users find them. When they report source code locations and suggest patches, it really does save us time.

We’ve dealt with several Fortune 500 companies, but only the U.S. government has put us through the “approved vendor” mill, and only the U.S. defense department was ever interested in security audits (and then mostly to make sure you weren’t sending packets out over the net).

A big issue for sales is that companies want indemnification against lawsuits, mainly for patent infringement.

Are SaaS businesses really taking off? All the big companies we talk to are too paranoid to let their data offsite.

How Breck approaches new projects in natural language processing

March 8, 2009

A skill developers typically don’t pick up in school is how to frame problems in terms of the messy, approximate world of heuristic- and machine-learning-driven natural language processing. This blog entry should shed some light on what remains a mostly self-taught black art. This is not the only way to do things, just my preferred way.

At the top level I seek three things:

  1. Human-annotated data that directly encodes the intended output of the NLP program.
  2. A brain-dead, completely simple instance of a program that connects all inputs to the intended output.
  3. An evaluation setup that takes 1) and 2) and produces a score for how good a job the system did. That score should map to a management-approved objective. (A minimal sketch of points 2 and 3 together follows this list.)
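To make points 2 and 3 concrete, here is about the smallest possible instance: a majority-class baseline scored for accuracy against gold annotations. The data is made up for illustration:

    import java.util.Map;

    // Brain-dead baseline (always predict the majority label) scored
    // against gold-standard annotations. Data is made up.
    public class BaselineEval {

        public static void main(String[] args) {
            Map<String, String> gold = Map.of(
                "great product", "pos",
                "total junk", "neg",
                "works fine", "pos");

            String majorityLabel = "pos"; // the entire "model"

            int correct = 0;
            for (String label : gold.values())
                if (majorityLabel.equals(label))
                    correct++;
            System.out.printf("accuracy: %.2f%n",
                              (double) correct / gold.size());
        }
    }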

Once I have the above, I can turn my attention to improving the score without worrying about whether I am solving the right problem (points 1 and 2 handle this) or whether I have sorted out access to the raw data and a rough architecture that makes sense (point 2 handles this). Some more details on each point:

Human Annotated Data

If a human cannot carry out the task you expect the computer to do (given that we are doing NLP), then the project is extremely likely to fail. Humans are the best NLP systems in the world. They are just amazing at it, and they fail to appreciate the sophistication of what they do with zero effort. I almost always ask customers to provide annotated data before accepting work. What does this provide?

  • Disambiguation: Annotated data forces a decision on what the NLP system is supposed to do, and it communicates that decision clearly to all involved parties. It also keeps the project from morphing away from what is being developed without an explicit renegotiation of the annotation.
  • Buy-in by relevant parties: It is amazing what happens when you sit management, UI developers, and business development folks in a room and have them annotate a text document together. Disagreements that would otherwise surface at the end of a project surface immediately; people know what they are buying, and they get a sense that it might be hard. The majority of the hand-waving (“Oh, just have the NLP do the right thing”) goes away. Bonus points if you have multiple people annotate the same document independently and compare the results (a simple agreement computation is sketched after this list). If the agreement rate is low, how can you expect a piece of software to do it?
  • Evaluation: The annotated data is the starting place for evaluation. You have gold-standard data to compare against. Without it you are flying blind.
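Raw percent agreement is the first sanity check (chance-corrected measures like kappa come next). A minimal computation over made-up labels:

    import java.util.List;

    // Percent agreement between two annotators labeling the same items.
    // Labels are made up for illustration.
    public class Agreement {

        public static void main(String[] args) {
            List<String> annotator1 = List.of("pos", "neg", "pos", "pos");
            List<String> annotator2 = List.of("pos", "neg", "neg", "pos");

            int agree = 0;
            for (int i = 0; i < annotator1.size(); i++)
                if (annotator1.get(i).equals(annotator2.get(i)))
                    agree++;
            System.out.printf("agreement: %.2f%n",
                              (double) agree / annotator1.size());
        }
    }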

Simple implementation that connects the bits

I am what Bob calls a “thin-wire developer” because I prefer to reduce project risk by making sure early on that all the bits of software and information can talk to each other. I have been amazed at how difficult access to data/logs/programs can be in enterprise setups. Some judgment is required here: I want to hit the places where there are likely blocks that could force completely different approaches (e.g., access to search engine logs for dynamic updates, or to lists of names that should be tracked in the data). Once again this forces decisions early in development rather than late. Unfortunately, it takes experience to know which bits are likely to be difficult to get and valuable in the end system.

Evaluation

An evaluation setup will truly save the day. It is very frustrating to build a system where the evaluation consists of “eyeballing data by hand” (I actually said this at my PhD defense, to the teasing delight of Michael Niv, a fellow graduate student, who to this day ponders my ocularly enhanced appendages). Some of the benefits are:

  • Developers like a goal and like to see performance improve. It gets addictive and can be quite fun. You will get a better system as a result.
  • If the evaluation numbers map well to the business objective, then the NLP efforts are well aligned with what the business wants. (For graduate students, the business objective can be to win an academic bake-off.)
  • Looks great to management. Tuning systems for better performance can be a long and opaque process to management. I once got some good advice: always link the quality of the GUI (graphical user interface) to the quality of the underlying software, to communicate the state of the project transparently. An evaluation score that is better than last month’s communicates the same thing, especially if management helped design the evaluation metric.

I will likely continue this blog thread, picking up the above points in greater detail. Perhaps some use cases would be informative.

Breck

