Tuesday, 18 November 2008

Lucene Analyzers

These are the conclusions of several benchmarks of different document analyzers with Lucene.

Tested analyzers:
(If you are not familiar with information retrieval metrics, look here.)

For the precision metric, the best analyzer was the Standard Analyzer.
For the recall metric, the best analyzer was the Double Metaphone Analyzer.
For the R-Precision and K-Precision metrics, the best analyzer was the Snowball Analyzer with stop words.
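
For readers who want to see what these four metrics measure, here is a minimal sketch in plain Java. It is not the benchmark code used for the results above. The class name RetrievalMetrics, the method names and the document ids are all made up for illustration, and I am assuming that "K-Precision" means precision over the top K results and that the ranked list contains no duplicates.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class, for illustration only.
class RetrievalMetrics {

    // Counts the relevant documents among the first `cutoff` ranked results.
    public static int hits(List<String> ranked, Set<String> relevant, int cutoff) {
        int limit = Math.min(cutoff, ranked.size());
        int count = 0;
        for (int i = 0; i < limit; i++) {
            if (relevant.contains(ranked.get(i))) {
                count++;
            }
        }
        return count;
    }

    // Precision: fraction of the returned documents that are relevant.
    public static double precision(List<String> ranked, Set<String> relevant) {
        if (ranked.isEmpty()) {
            return 0.0;
        }
        return (double) hits(ranked, relevant, ranked.size()) / ranked.size();
    }

    // Recall: fraction of the relevant documents that were returned.
    public static double recall(List<String> ranked, Set<String> relevant) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        return (double) hits(ranked, relevant, ranked.size()) / relevant.size();
    }

    // Precision at K (assumed meaning of "K-Precision"): precision over the top K results.
    public static double precisionAtK(List<String> ranked, Set<String> relevant, int k) {
        if (k <= 0) {
            return 0.0;
        }
        return (double) hits(ranked, relevant, k) / k;
    }

    // R-Precision: precision at R, where R is the number of relevant documents.
    public static double rPrecision(List<String> ranked, Set<String> relevant) {
        return precisionAtK(ranked, relevant, relevant.size());
    }

    public static void main(String[] args) {
        // Made-up query result: six returned documents, four relevant ones (d9 was never returned).
        List<String> ranked = Arrays.asList("d1", "d2", "d3", "d4", "d5", "d6");
        Set<String> relevant = new HashSet<>(Arrays.asList("d1", "d3", "d6", "d9"));

        System.out.println("precision   = " + precision(ranked, relevant));       // 3 of 6 returned
        System.out.println("recall      = " + recall(ranked, relevant));          // 3 of 4 relevant
        System.out.println("R-precision = " + rPrecision(ranked, relevant));      // 2 of the top 4
        System.out.println("P@3         = " + precisionAtK(ranked, relevant, 3)); // 2 of the top 3
    }
}
```

Precision and recall look at the whole result list, while R-Precision and precision at K only look at the top of the ranking, which is usually what a user actually reads.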

If you are developing an information retrieval system with Lucene, I recommend using the Snowball Analyzer with a good set of stop words. In these benchmarks it gave the best R-Precision and K-Precision, so your users should see better results near the top of the list.