Apache Lucene 3.0 Tutorial


[Update: 10 Feb 2014. Much has changed in Lucene since 3.0. An extensive tutorial for Lucene 4 is now available as a chapter in the book Text Processing in Java. This chapter covers search, indexing, and how to use Lucene for simple text classification tasks. A bonus feature is a quick reference guide to Lucene's search query syntax.]

[Update: 24 July 2012. The tutorial has been updated for Lucene 3.6. See:]


With this release of the LingPipe Book, I created a standalone version of the tutorial for version 3 of the Apache Lucene search library.

It contains about 20 pages covering the basics of analysis, indexing and search. It’s distributed with sample code and an Ant build file with targets to run the demos.

Building the Source

The Ant build file is src/applucene/build.xml, and Ant should be run from that directory. The book’s distribution is organized this way so that each chapter’s demo code is roughly standalone while still sharing libraries. The example has some minor dependencies on LingPipe (jar included), but those are just for I/O and could easily be removed or replicated.

More In-Depth Info on Lucene

The standard reference for Lucene is not its own site or javadoc, which offer little in the way of tutorials, but rather the recently released (as of February 2011) book by three Lucene committers:

Looking at the Manning Press page for the book (linked above), I just realized they blurbed one of my previous blog posts, a review of Lucene in Action!

But wait, there’s more

If you’re interested in natural language, or just need a tutorial on character encodings and Java strings and I/O, you can find the rest of the LingPipe book at its home page:

Enjoy. And as always, let me know if you have any comments, either here or directly at carp@lingpipe.com.

4 Responses to “Apache Lucene 3.0 Tutorial”

  1. Apache Lucene 3.0 tutorial with source code | Javier Murillo Blanco Says:

    [...] Tutorial: http://lingpipe-blog.com/2011/02/11/apachelucene-3-0-tutorial/ [...]

  2. OMG Says:

    If you are writing a tutorial for Lucene 3.0… then you shouldn’t use deprecated functions that are pre-3.0…
    If you add code snippets… a project file would be nice…
    Summarized: This tutorial SUCKS!
    THIS WEBSITE WAS A WASTE OF TIME!

    • Bob Carpenter Says:

      1. It’s compiling with Lucene 3.0.1. Are there specific deprecated features you’re worried about?

      2. The code’s linked from the post above. There’s an Ant build file.

  3. Lucene Tutorial updated for Lucene 3.6 « LingPipe Blog Says:

    [...] API so that it doesn’t use any deprecated methods and my, there are a lot of them. Bob blogged about this tutorial back in February 2011, shortly after Lucene Java rolled over to version [...]
