Apache Lucene Flexible Query Parser

X-Git-Url: https://git.mdrn.pl/pylucene.git/blobdiff_plain/a2e61f0c04805cfcb8706176758d1283c7e3a55c..aaeed5504b982cf3545252ab528713250aa33eed:/lucene-java-3.4.0/lucene/contrib/queryparser/src/java/overview.html?ds=sidebyside

diff --git a/lucene-java-3.4.0/lucene/contrib/queryparser/src/java/overview.html b/lucene-java-3.4.0/lucene/contrib/queryparser/src/java/overview.html
deleted file mode 100644
index 63ebc13..0000000
--- a/lucene-java-3.4.0/lucene/contrib/queryparser/src/java/overview.html
+++ /dev/null
@@ -1,147 +0,0 @@



This contrib project contains the new Lucene query parser implementation, which matches the syntax of the core QueryParser but offers a more modular architecture to enable customization.


It is currently divided into two main packages:

{@link org.apache.lucene.queryParser.core}: it contains the query parser API classes, which should be extended by query parser implementations.
{@link org.apache.lucene.queryParser.standard}: it contains the current Lucene query parser implementation using the new query parser API.


Features


Full support for boolean logic (not enabled)
QueryNode Trees - support for several syntaxes that can be converted into similar QueryNode trees.
QueryNode Processors - optimize, validate, and rewrite the QueryNode trees.
Processor Pipelines - select the processors you want and build a processor pipeline to implement the features you need.
Config Interfaces - allow the consumer of the query parser to implement different Config Handler objects to suit their needs.
Standard Builders - convert QueryNodes into several Lucene representations. The supported conversion uses Lucene 2.4 compatible logic.
QueryNode trees can be converted to a Lucene 2.4 syntax string using toQueryString.


Design

This new query parser was designed with a very generic architecture, so that it can be easily used for different products with varying query syntaxes. This code is much more flexible and extensible than the Lucene query parser in 2.4.X.

The goal of the new query parser is to separate the syntax and the semantics of a query. For example, 'a AND b', '+a +b', and 'AND(a,b)' could be different syntaxes for the same query. The parser distinguishes the semantics of the different query components, e.g. whether and how to tokenize/lemmatize/normalize the different terms, or which Query objects to create for the terms. This makes it possible to write a parser with a new syntax as quickly as possible while reusing the underlying semantics.

The query parser has three layers and its core is what we call the QueryNode tree. It is a tree that initially represents the syntax of the original query, e.g. for 'a AND b':

      AND
     /   \
    A     B
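As a rough illustration of the tree above, the following minimal sketch models 'a AND b' with a leaf type and an inner type. The names `TreeSketch`, `Node`, `Term`, and `And` are hypothetical and only stand in for the idea; they are not the Lucene {@link org.apache.lucene.queryParser.core.nodes.QueryNode} classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; these are NOT the Lucene QueryNode classes.
class TreeSketch {

    // Minimal node contract: children plus a way to print the node back as text.
    interface Node {
        List<Node> children();
        String toQueryString();
    }

    // Leaf node holding a single term such as "a".
    static final class Term implements Node {
        private final String text;
        Term(String text) { this.text = text; }
        public List<Node> children() { return Collections.emptyList(); }
        public String toQueryString() { return text; }
    }

    // Inner node that requires all of its children to match.
    static final class And implements Node {
        private final List<Node> children;
        And(Node... children) {
            this.children = Collections.unmodifiableList(
                new ArrayList<Node>(Arrays.asList(children)));
        }
        public List<Node> children() { return children; }
        public String toQueryString() {
            StringBuilder sb = new StringBuilder();
            for (Node child : children) {
                if (sb.length() > 0) sb.append(" AND ");
                sb.append(child.toQueryString());
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) {
        // Build the tree from the diagram: AND with the two leaves a and b.
        Node tree = new And(new Term("a"), new Term("b"));
        System.out.println(tree.toQueryString()); // prints "a AND b"
    }
}
```

Because the tree only records structure, a different text syntax such as '+a +b' could produce the same `And(Term, Term)` shape.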

The three layers are:

QueryParser: This layer is the text parsing layer, which simply transforms the query text string into a {@link org.apache.lucene.queryParser.core.nodes.QueryNode} tree. Every text parser must implement the interface {@link org.apache.lucene.queryParser.core.parser.SyntaxParser}. Lucene's default implementation implements it using JavaCC.
QueryNodeProcessor: The query node processors do most of the work. This layer is in fact a configurable chain of processors. Each processor can walk the tree and modify nodes or even the tree's structure. That makes it possible, for example, to optimize the query before it is executed or to tokenize terms.
QueryBuilder: The third layer is a configurable map of builders, which maps each {@link org.apache.lucene.queryParser.core.nodes.QueryNode} type to the specific builder that will transform the QueryNode into a Lucene Query object.
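The three layers can be sketched end to end in plain Java. This is only an illustration under stated assumptions: every type below (`PipelineSketch`, `Node`, `Processor`, `Builder`, and so on) is hypothetical, the toy text parser understands nothing but "x AND y", and a plain String stands in for the Lucene Query object that a real builder would produce.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical three-layer sketch; none of these types are Lucene classes.
class PipelineSketch {

    static class Node {
        final List<Node> children = new ArrayList<Node>();
    }
    static class Term extends Node {
        final String text;
        Term(String text) { this.text = text; }
    }
    static class And extends Node { }

    // Layer 1: text parser. This toy version only understands "x AND y AND ...".
    static Node parse(String queryText) {
        String[] parts = queryText.trim().split("\\s+AND\\s+");
        if (parts.length == 1) return new Term(parts[0]);
        And and = new And();
        for (String part : parts) and.children.add(new Term(part));
        return and;
    }

    // Layer 2: a processor receives a tree and returns a (possibly new) tree.
    interface Processor {
        Node process(Node tree);
    }

    // Example processor: lower-cases every term while walking the tree.
    // It assumes every non-term node is an And, which holds for this sketch.
    static final class LowercaseProcessor implements Processor {
        public Node process(Node tree) {
            if (tree instanceof Term) {
                return new Term(((Term) tree).text.toLowerCase(Locale.ROOT));
            }
            And copy = new And();
            for (Node child : tree.children) copy.children.add(process(child));
            return copy;
        }
    }

    // Layer 3: one builder per node type; a String stands in for a Query object.
    interface Builder {
        String build(Node node, Map<Class<?>, Builder> all);
    }

    static Map<Class<?>, Builder> defaultBuilders() {
        Map<Class<?>, Builder> builders = new HashMap<Class<?>, Builder>();
        builders.put(Term.class, new Builder() {
            public String build(Node node, Map<Class<?>, Builder> all) {
                return "term(" + ((Term) node).text + ")";
            }
        });
        builders.put(And.class, new Builder() {
            public String build(Node node, Map<Class<?>, Builder> all) {
                StringBuilder sb = new StringBuilder("must(");
                boolean first = true;
                for (Node child : node.children) {
                    if (!first) sb.append(", ");
                    first = false;
                    // Look up the child's builder by its node type.
                    sb.append(all.get(child.getClass()).build(child, all));
                }
                return sb.append(")").toString();
            }
        });
        return builders;
    }

    // Runs the three layers in order: parse, process, build.
    static String run(String queryText, List<Processor> pipeline) {
        Node tree = parse(queryText);
        for (Processor p : pipeline) tree = p.process(tree);
        Map<Class<?>, Builder> builders = defaultBuilders();
        return builders.get(tree.getClass()).build(tree, builders);
    }

    public static void main(String[] args) {
        List<Processor> pipeline = new ArrayList<Processor>();
        pipeline.add(new LowercaseProcessor());
        System.out.println(run("Apache AND Lucene", pipeline));
    }
}
```

Swapping the `parse` method for another syntax, or adding a processor to the list, leaves the other two layers untouched, which is the point of the separation.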


Furthermore, the query parser uses flexible configuration objects. It also uses message classes that allow resource bundles to be attached. This makes it possible to translate messages, which is an important feature of a query parser.
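To illustrate why keeping message text in resource bundles helps with translation, here is a minimal sketch that uses only JDK classes (`ListResourceBundle`, `MessageFormat`). It does not use Lucene's own message classes; the class names and the `INVALID_SYNTAX` key are invented for the example, and the bundle is picked by hand rather than through `ResourceBundle.getBundle`.

```java
import java.text.MessageFormat;
import java.util.ListResourceBundle;
import java.util.Locale;
import java.util.ResourceBundle;

// Hypothetical sketch built on JDK classes only; not Lucene's message classes.
class MessageSketch {

    // English texts. The key INVALID_SYNTAX is invented for this example.
    static final class EnglishMessages extends ListResourceBundle {
        protected Object[][] getContents() {
            return new Object[][] { { "INVALID_SYNTAX", "Syntax error near ''{0}''" } };
        }
    }

    // German texts for the same key.
    static final class GermanMessages extends ListResourceBundle {
        protected Object[][] getContents() {
            return new Object[][] { { "INVALID_SYNTAX", "Syntaxfehler bei ''{0}''" } };
        }
    }

    // Picks a bundle by language; a real setup would usually rely on
    // ResourceBundle.getBundle and its locale lookup instead.
    static ResourceBundle bundleFor(Locale locale) {
        if ("de".equals(locale.getLanguage())) return new GermanMessages();
        return new EnglishMessages();
    }

    // The parser only knows the key and the arguments; the text comes from the bundle.
    static String localize(Locale locale, String key, Object... args) {
        String pattern = bundleFor(locale).getString(key);
        return new MessageFormat(pattern, locale).format(args);
    }

    public static void main(String[] args) {
        System.out.println(localize(Locale.ENGLISH, "INVALID_SYNTAX", "AND"));
        System.out.println(localize(Locale.GERMAN, "INVALID_SYNTAX", "AND"));
    }
}
```

The code that reports the error never contains user-visible text, so adding a language means adding a bundle rather than touching the parser.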

This design makes it possible to develop different query syntaxes very quickly.


StandardQueryParser and QueryParserWrapper


The standard (default) Lucene query parser is located under org.apache.lucene.queryParser.standard.

To make the new query parser simpler to use, the class {@link org.apache.lucene.queryParser.standard.StandardQueryParser} may be helpful, especially for people who do not want to extend the query parser. It uses the default Lucene query processors, text parser and builders, so you don't need to worry about dealing with those.

{@link org.apache.lucene.queryParser.standard.StandardQueryParser} usage:

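      import org.apache.lucene.analysis.WhitespaceAnalyzer;
      import org.apache.lucene.queryParser.standard.StandardQueryParser;
      import org.apache.lucene.search.Query;

      StandardQueryParser qpHelper = new StandardQueryParser();
      // The configuration setters are available directly on StandardQueryParser.
      qpHelper.setAllowLeadingWildcard(true);
      qpHelper.setAnalyzer(new WhitespaceAnalyzer());
      // parse(...) throws QueryNodeException if the query text cannot be parsed.
      Query query = qpHelper.parse("apache AND lucene", "defaultField");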

To make it easy for people who are using Lucene's current query parser to switch to the new one, there is a {@link org.apache.lucene.queryParser.standard.QueryParserWrapper} under org.apache.lucene.queryParser.standard that keeps the old query parser interface but uses the new query parser infrastructure.
