Lucene Tutorial updated for Lucene 3.6

by

[Update: 8 Mar 2014. I've just written a quick introduction to Lucene 4:

The contents of this introduction are excerpted from the Lucene chapter in my new book:

This chapter covers search, indexing, and how to use Lucene for simple text classification tasks. A bonus feature is a quick reference guide to Lucene's search query syntax.]

The current Apache Lucene Java version is 3.6, released in April of 2012. We’ve updated the Lucene 3 tutorial and the accompanying source code to bring it in line with the current API so that it doesn’t use any deprecated methods and my, there are a lot of them. Bob blogged about this tutorial back in February 2011, shortly after Lucene Java rolled over to version 3.0.

Like other 3.x minor releases, Lucene 3.6 introduces performance enhancements, bug fixes, new analyzers, and changes that bring the Lucene API in line with Solr. In addition, Lucene 3.6 anticipates Lucene 4, billed as “the next major backwards-incompatible release.”

Significant changes since version 3.0

  • IndexReader delete methods are deprecated and will be removed entirely in Lucene 4. All deletes and updates are done via an IndexWriter.
  • There is a single IndexWriter constructor that takes two arguments: the index directory and an IndexWriterConfig object. The latter was introduced in Lucene 3.1. It holds configuration information that was previously specified directly as additional arguments to the constructor.
  • IndexWriter optimize methods are deprecated. The merge method(s) supply this functionality.

Building the Source

The ant build file is in the file src/applucene/build.xml and should be run from that directory. The book’s distribution is organized this way so that each chapter’s demo code is roughly standalone, but they are able to share libs. There are some minor dependencies on LingPipe in the example (jar included), but those are just for I/O and could be easily removed or replicated. As an added bonus, the source code now includes the data used in the examples throughout the tutorial, the venerable Federalist Papers from Project Gutenberg.

3 Responses to “Lucene Tutorial updated for Lucene 3.6”

  1. jeffye Says:

    Lucene4.0 aphla has been released. looking forwards to a new tutorial.

  2. Apache Lucene 3.0 Tutorial « LingPipe Blog Says:

    [...] Lucene 3.6 Tutorial [...]

  3. Volltextsuche mit Lucene | queo blog Says:

    [...] sich weitergehend mit dem Thema beschäftigen möchte, dem sei dieses umfassende Tutorial ans Herz [...]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


Follow

Get every new post delivered to your Inbox.

Join 810 other followers