Apache Lucene Flexible Query Parser

X-Git-Url: https://git.mdrn.pl/pylucene.git/blobdiff_plain/a2e61f0c04805cfcb8706176758d1283c7e3a55c..aaeed5504b982cf3545252ab528713250aa33eed:/lucene-java-3.5.0/lucene/contrib/queryparser/src/java/overview.html

diff --git a/lucene-java-3.5.0/lucene/contrib/queryparser/src/java/overview.html b/lucene-java-3.5.0/lucene/contrib/queryparser/src/java/overview.html
new file mode 100644
index 0000000..63ebc13
--- /dev/null
+++ b/lucene-java-3.5.0/lucene/contrib/queryparser/src/java/overview.html
@@ -0,0 +1,147 @@

Apache Lucene Flexible Query Parser

+ +

+This contrib project contains the new Lucene query parser implementation, which matches the syntax of the core QueryParser but offers a more modular architecture to enable customization. +

+ +

+It is currently divided into two main packages: +

{@link org.apache.lucene.queryParser.core}: it contains the query parser API classes, which should be extended by query parser implementations.
{@link org.apache.lucene.queryParser.standard}: it contains the current Lucene query parser implementation using the new query parser API.

+ +

Features

+ +

Full support for boolean logic (not enabled)
QueryNode Trees - support for several syntaxes, + that can be converted into similar syntax QueryNode trees.
QueryNode Processors - Optimize, validate, rewrite the + QueryNode trees
Processor Pipelines - Select the processors you want + and build a processor pipeline to implement the features you need
Config Interfaces - Allow the consumer of the Query Parser to implement + different Config Handler objects to suit their needs.
Standard Builders - convert QueryNodes into several Lucene + representations. The supported conversion uses Lucene 2.4 compatible logic.
QueryNode trees can be converted to a Lucene 2.4 syntax string using toQueryString.

+ +

Design

+This new query parser was designed to have a very generic +architecture, so that it can be easily used for different +products with varying query syntaxes. This code is much more +flexible and extensible than the Lucene query parser in 2.4.X. +

+The goal of the new query parser is to separate the syntax and semantics of a query. E.g. 'a AND +b', '+a +b', 'AND(a,b)' could be different syntaxes for the same query. +It distinguishes the semantics of the different query components, e.g. +whether and how to tokenize/lemmatize/normalize the different terms or +which Query objects to create for the terms. It allows a parser with a new syntax to be +written as quickly as possible, while reusing the underlying +semantics. +
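The separation of syntax from semantics can be illustrated with a minimal sketch. The class and method names below are hypothetical and are not part of the Lucene API; each toy parser accepts one of the three syntaxes mentioned above and reduces it to the same list of required terms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy parsers for illustration only; none of this is Lucene API.
class SyntaxVsSemanticsSketch {

    // Infix syntax: "a AND b"
    static List<String> parseInfix(String query) {
        return new ArrayList<>(Arrays.asList(query.trim().split("\\s+AND\\s+")));
    }

    // Prefix-operator syntax: "+a +b"
    static List<String> parsePlus(String query) {
        List<String> required = new ArrayList<>();
        for (String token : query.trim().split("\\s+")) {
            if (!token.startsWith("+")) {
                throw new IllegalArgumentException("expected '+' before: " + token);
            }
            required.add(token.substring(1));
        }
        return required;
    }

    // Function-call syntax: "AND(a,b)"
    static List<String> parseFunction(String query) {
        String q = query.trim();
        if (!q.startsWith("AND(") || !q.endsWith(")")) {
            throw new IllegalArgumentException("expected AND(...): " + q);
        }
        List<String> required = new ArrayList<>();
        for (String term : q.substring(4, q.length() - 1).split(",")) {
            required.add(term.trim());
        }
        return required;
    }

    public static void main(String[] args) {
        // All three calls yield the same list of required terms.
        System.out.println(parseInfix("a AND b"));
        System.out.println(parsePlus("+a +b"));
        System.out.println(parseFunction("AND(a,b)"));
    }
}
```

Because every syntax ends in the same intermediate form, anything that operates on that form (analysis, query construction) only has to be written once.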

+The query parser has three layers and its core is what we call the +QueryNode tree. It is a tree that initially represents the syntax of the +original query, e.g. for 'a AND b': +

+      AND
+     /   \
+    A     B
+
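As a rough sketch, the tree above could be modelled with a small node hierarchy. The types here are simplified, hypothetical stand-ins and do not reproduce the real QueryNode interface; the toQueryString method only mirrors the idea of converting a tree back into a syntax string.

```java
import java.util.Arrays;
import java.util.List;

// Simplified, hypothetical node types; not Lucene's QueryNode API.
class QueryNodeTreeSketch {

    interface Node {
        String toQueryString();
    }

    static final class TermNode implements Node {
        final String text;

        TermNode(String text) {
            this.text = text;
        }

        public String toQueryString() {
            return text;
        }
    }

    static final class AndNode implements Node {
        final List<Node> children;

        AndNode(Node... children) {
            this.children = Arrays.asList(children);
        }

        // Joins the children with " AND " to rebuild the infix syntax.
        public String toQueryString() {
            StringBuilder sb = new StringBuilder();
            for (Node child : children) {
                if (sb.length() > 0) {
                    sb.append(" AND ");
                }
                sb.append(child.toQueryString());
            }
            return sb.toString();
        }
    }

    // Builds the tree drawn above: an AND node with two term children.
    static Node example() {
        return new AndNode(new TermNode("a"), new TermNode("b"));
    }

    public static void main(String[] args) {
        System.out.println(example().toQueryString()); // prints: a AND b
    }
}
```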

+The three layers are: +

QueryParser: +This layer is the text parsing layer which simply transforms the +query text string into a {@link org.apache.lucene.queryParser.core.nodes.QueryNode} tree. Every text parser +must implement the interface {@link org.apache.lucene.queryParser.core.parser.SyntaxParser}. +Lucene's default implementation implements it using JavaCC. +
QueryNodeProcessor: The query node processors do most of the work. This layer is in fact a +configurable chain of processors. Each processor can walk the tree and +modify nodes or even the tree's structure. That makes it possible, for example, to +optimize the query before it is executed or to tokenize +terms. +
QueryBuilder: +The third layer is a configurable map of builders, which maps each {@link org.apache.lucene.queryParser.core.nodes.QueryNode} type to the specific +builder that transforms the QueryNode into a Lucene Query object. +
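How the three layers fit together can be sketched in plain Java. Everything below is a hypothetical stand-in written for illustration, not the Lucene classes: the parser only understands "x AND y", the single processor lowercases terms, and the builders produce strings rather than Query objects.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;
import java.util.function.UnaryOperator;

// Hypothetical stand-ins for the three layers; not Lucene classes.
class PipelineSketch {

    static class Node {
        final List<Node> children = new ArrayList<>();
    }

    static class TermNode extends Node {
        String text;

        TermNode(String text) {
            this.text = text;
        }
    }

    static class AndNode extends Node { }

    // Layer 2: an ordered chain of tree-rewriting processors.
    final List<UnaryOperator<Node>> processors = new ArrayList<>();
    // Layer 3: node type -> builder. Builders emit strings here, not Query objects.
    final Map<Class<? extends Node>, Function<Node, String>> builders = new HashMap<>();

    // Layer 1: a toy text parser that only understands "x AND y AND ...".
    Node parse(String text) {
        AndNode and = new AndNode();
        for (String term : text.trim().split("\\s+AND\\s+")) {
            and.children.add(new TermNode(term));
        }
        return and;
    }

    // Looks up the builder registered for the node's concrete type.
    String build(Node node) {
        Function<Node, String> builder = builders.get(node.getClass());
        if (builder == null) {
            throw new IllegalStateException("no builder for " + node.getClass().getSimpleName());
        }
        return builder.apply(node);
    }

    // Runs the three layers in order: parse, process, build.
    String run(String text) {
        Node tree = parse(text);
        for (UnaryOperator<Node> processor : processors) {
            tree = processor.apply(tree);
        }
        return build(tree);
    }

    // Example processor: lowercases every term in place.
    static Node lowercase(Node node) {
        if (node instanceof TermNode) {
            TermNode term = (TermNode) node;
            term.text = term.text.toLowerCase(Locale.ROOT);
        }
        for (Node child : node.children) {
            lowercase(child);
        }
        return node;
    }

    static PipelineSketch example() {
        PipelineSketch pipeline = new PipelineSketch();
        pipeline.processors.add(PipelineSketch::lowercase);
        pipeline.builders.put(TermNode.class, n -> ((TermNode) n).text);
        pipeline.builders.put(AndNode.class, n -> {
            StringBuilder sb = new StringBuilder();
            for (Node child : n.children) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append('+').append(pipeline.build(child));
            }
            return sb.toString();
        });
        return pipeline;
    }

    public static void main(String[] args) {
        System.out.println(example().run("Apache AND Lucene")); // prints: +apache +lucene
    }
}
```

Swapping the syntax only means replacing parse; adding a feature means appending a processor or registering another builder, which is the kind of flexibility the layered design aims for.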

+ +

+Furthermore, the query parser uses flexible configuration objects. It also uses message classes that +allow resource bundles to be attached. This makes it possible to translate +messages, which is an important feature of a query parser. +
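The idea of translatable messages can be sketched with the JDK's own ResourceBundle and MessageFormat classes. The message key, the texts and the helper names below are made up for illustration and are not the query parser's message classes; real code would more likely load bundles from properties files through ResourceBundle.getBundle, whereas this sketch builds them in memory to stay self-contained.

```java
import java.text.MessageFormat;
import java.util.HashMap;
import java.util.ListResourceBundle;
import java.util.Locale;
import java.util.Map;
import java.util.ResourceBundle;

// Hypothetical message catalog; the key and texts are invented for illustration.
class TranslatedMessagesSketch {

    static final String INVALID_SYNTAX = "INVALID_SYNTAX";

    // Creates an in-memory bundle holding one pattern for INVALID_SYNTAX.
    static ResourceBundle bundle(final String pattern) {
        return new ListResourceBundle() {
            @Override
            protected Object[][] getContents() {
                return new Object[][] { { INVALID_SYNTAX, pattern } };
            }
        };
    }

    // One bundle per language code.
    static final Map<String, ResourceBundle> BUNDLES = new HashMap<>();
    static {
        BUNDLES.put("en", bundle("Syntax error near ''{0}''"));
        BUNDLES.put("de", bundle("Syntaxfehler bei ''{0}''"));
    }

    // Resolves the key in the bundle for the locale, falling back to English.
    static String localize(String key, Locale locale, Object... args) {
        ResourceBundle bundle = BUNDLES.get(locale.getLanguage());
        if (bundle == null) {
            bundle = BUNDLES.get("en");
        }
        return new MessageFormat(bundle.getString(key), locale).format(args);
    }

    public static void main(String[] args) {
        System.out.println(localize(INVALID_SYNTAX, Locale.ENGLISH, "AND"));
        System.out.println(localize(INVALID_SYNTAX, Locale.GERMAN, "AND"));
    }
}
```

The code that reports the error only refers to the key and its arguments; the wording is chosen later, per locale.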

+This design allows different query syntaxes to be developed very quickly. +

+ +

StandardQueryParser and QueryParserWrapper

+ +

+The standard (default) Lucene query parser is located under +org.apache.lucene.queryParser.standard. +

+To make the new query parser simpler to use, +the class {@link org.apache.lucene.queryParser.standard.StandardQueryParser} may be helpful, +especially for people who do not want to extend the query parser. +It uses the default Lucene query processors, text parser and builders, so +you don't need to deal with those yourself. + +{@link org.apache.lucene.queryParser.standard.StandardQueryParser} usage: + +

+      // The configuration setters are called on StandardQueryParser itself.
+      StandardQueryParser qpHelper = new StandardQueryParser();
+      qpHelper.setAllowLeadingWildcard(true);
+      qpHelper.setAnalyzer(new WhitespaceAnalyzer(Version.LUCENE_35));
+      // parse() throws QueryNodeException if the query text is invalid.
+      Query query = qpHelper.parse("apache AND lucene", "defaultField");
+

+To make it easy for people who are using Lucene's current query parser to switch to +the new one, there is a {@link org.apache.lucene.queryParser.standard.QueryParserWrapper} under org.apache.lucene.queryParser.standard +that keeps the old query parser interface, but uses the new query parser infrastructure. +

+ + +