lucene-java-3.4.0/lucene/contrib/queryparser/src/java/overview.html

   1 <!doctype html public "-//w3c//dtd html 4.0 transitional//en">
   2 <!--
   3  Licensed to the Apache Software Foundation (ASF) under one or more
   4  contributor license agreements.  See the NOTICE file distributed with
   5  this work for additional information regarding copyright ownership.
   6  The ASF licenses this file to You under the Apache License, Version 2.0
   7  (the "License"); you may not use this file except in compliance with
   8  the License.  You may obtain a copy of the License at
   9
  10      http://www.apache.org/licenses/LICENSE-2.0
  11
  12  Unless required by applicable law or agreed to in writing, software
  13  distributed under the License is distributed on an "AS IS" BASIS,
  14  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  15  See the License for the specific language governing permissions and
  16  limitations under the License.
  17 -->
  18 <html>
  19 <head>
  20    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
  21    <title>Apache Lucene Flexible Query Parser</title>
  22 </head>
  23 <body>
  24
  25 <h2>Apache Lucene Flexible Query Parser</h2>
  26
  27 <p>
  28 This contrib project contains the new Lucene query parser implementation, which matches the syntax of the core QueryParser but offers a more modular architecture to enable customization.
  29 </p>
  30
  31 <p>
  32 It is currently divided into two main packages:
  33 <ul>
  34 <li>{@link org.apache.lucene.queryParser.core}: contains the query parser API classes, which query parser implementations should extend.</li>
  35 <li>{@link org.apache.lucene.queryParser.standard}: contains the current Lucene query parser implementation, built on the new query parser API.</li>
  36 </ul>
  37 </p>
  38
  39 <h3>Features</h3>
  40
  41     <ol>
  42         <li>Full support for boolean logic (not enabled)</li>
  43         <li>QueryNode Trees - support for several query syntaxes,
  44             all of which can be converted into similar QueryNode trees.</li>
  45         <li>QueryNode Processors - optimize, validate and rewrite the
  46             QueryNode trees</li>
  47         <li>Processor Pipelines - select the processors you want and chain
  48             them into a pipeline that implements the features you need</li>
  49         <li>Config Interfaces - allow the consumer of the query parser to implement
  50             different config handler objects to suit their needs.</li>
  51         <li>Standard Builders - convert QueryNodes into several Lucene
  52             representations. The supported conversion uses Lucene 2.4 compatible logic</li>
  53         <li>QueryNode trees can be converted to a Lucene 2.4 syntax string using toQueryString</li>
  54     </ol>
  55
  56 <h3>Design</h3>
  57 <p>
  58 This new query parser was designed with a very generic
  59 architecture, so that it can easily be used by different
  60 products with varying query syntaxes. This code is much more
  61 flexible and extensible than the Lucene query parser in 2.4.X.
  62 </p>
  63 <p>
  64 The goal of the new query parser is to separate the syntax and semantics of a query. For example, 'a AND
  65 b', '+a +b' and 'AND(a,b)' could be different syntaxes for the same query.
  66 The parser handles the semantics of the different query components separately, e.g.
  67 whether and how to tokenize/lemmatize/normalize the different terms, or
  68 which Query objects to create for the terms. This makes it possible to
  69 write a parser with a new syntax as quickly as possible, while reusing
  70 the underlying semantics.
  71 </p>
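<p>
The separation of syntax from semantics can be pictured with a small, self-contained sketch.
The classes below (<code>SyntaxSketch</code>, <code>Node</code>, <code>parseInfix</code>, <code>parsePlus</code>)
are hypothetical names invented for this illustration and are not part of the Lucene API.
The sketch only assumes that each syntax has its own tiny text parser and that both parsers produce the same kind of tree.
</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy illustration only: these classes are hypothetical and are not the Lucene API. */
class SyntaxSketch {

    /** Minimal query node: a label ("AND" or a term) plus child nodes. */
    static final class Node {
        final String label;
        final List<Node> children;

        Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }

        /** Renders the tree in prefix form, e.g. AND(a,b). */
        @Override
        public String toString() {
            if (children.isEmpty()) {
                return label;
            }
            StringBuilder sb = new StringBuilder(label).append('(');
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) {
                    sb.append(',');
                }
                sb.append(children.get(i));
            }
            return sb.append(')').toString();
        }
    }

    /** Parses the infix syntax "a AND b": terms separated by the keyword AND. */
    static Node parseInfix(String text) {
        List<Node> terms = new ArrayList<Node>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.equals("AND")) {
                terms.add(new Node(token));
            }
        }
        return new Node("AND", terms.toArray(new Node[0]));
    }

    /** Parses the syntax "+a +b"; assumes every token starts with '+'. */
    static Node parsePlus(String text) {
        List<Node> terms = new ArrayList<Node>();
        for (String token : text.trim().split("\\s+")) {
            terms.add(new Node(token.substring(1)));
        }
        return new Node("AND", terms.toArray(new Node[0]));
    }

    public static void main(String[] args) {
        System.out.println(parseInfix("a AND b")); // prints AND(a,b)
        System.out.println(parsePlus("+a +b"));    // prints AND(a,b)
    }
}
```

<p>
Because both parsers end in the same tree, everything that runs after parsing can be shared between the two syntaxes.
</p>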
  72 <p>
  73 The query parser has three layers and its core is what we call the
  74 QueryNode tree. It is a tree that initially represents the syntax of the
  75 original query, e.g. for 'a AND b':
  76 </p>
  77 <pre>
  78       AND
  79      /   \
  80     A     B
  81 </pre>
  82 <p>
  83 The three layers are:
  84 </p>
  85 <dl>
  86 <dt>QueryParser</dt>
  87 <dd>
  88 This layer is the text parsing layer, which simply transforms the
  89 query text string into a {@link org.apache.lucene.queryParser.core.nodes.QueryNode} tree. Every text parser
  90 must implement the interface {@link org.apache.lucene.queryParser.core.parser.SyntaxParser}.
  91 Lucene's default implementation uses JavaCC.
  92 </dd>
  93
  94 <dt>QueryNodeProcessor</dt>
  95 <dd>The query node processors do most of the work. This layer is in fact a
  96 configurable chain of processors. Each processor can walk the tree and
  97 modify nodes or even the tree's structure. That makes it possible,
  98 for example, to optimize the query before it is executed or to tokenize
  99 terms.
 100 </dd>
 101
 102 <dt>QueryBuilder</dt>
 103 <dd>
 104 The third layer is a configurable map of builders, which maps each {@link org.apache.lucene.queryParser.core.nodes.QueryNode} type to its specific
 105 builder, which in turn transforms the QueryNode into a Lucene Query object.
 106 </dd>
 107
 108 </dl>
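<p>
The second and third layers described above can be sketched in a few lines of plain Java.
This is a toy model under stated assumptions, not the Lucene implementation: all type names
(<code>PipelineSketch</code>, <code>Processor</code>, <code>Builder</code>, <code>TermNode</code>, <code>AndNode</code>)
are hypothetical, the tree is built by hand instead of by a text parser, and the builders produce a plain string rather than a Lucene Query object.
</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy illustration only: hypothetical types, not the Lucene API. */
class PipelineSketch {

    /** Base node type; the concrete node classes are the keys of the builder map. */
    static abstract class Node {
        final List<Node> children = new ArrayList<Node>();
    }

    static final class TermNode extends Node {
        String text;
        TermNode(String text) { this.text = text; }
    }

    static final class AndNode extends Node {
        AndNode(Node... kids) { children.addAll(Arrays.asList(kids)); }
    }

    /** Layer 2: a processor takes a tree and returns a possibly rewritten tree. */
    interface Processor {
        Node process(Node node);
    }

    /** Normalizes terms: lower-cases the text of every TermNode in the tree. */
    static final class LowercaseProcessor implements Processor {
        public Node process(Node node) {
            if (node instanceof TermNode) {
                TermNode term = (TermNode) node;
                term.text = term.text.toLowerCase(Locale.ROOT);
            }
            for (Node child : node.children) {
                process(child);
            }
            return node;
        }
    }

    /** Optimizes structure: replaces an AndNode that has a single child by that child. */
    static final class SingleChildProcessor implements Processor {
        public Node process(Node node) {
            for (int i = 0; i < node.children.size(); i++) {
                node.children.set(i, process(node.children.get(i)));
            }
            boolean redundant = node instanceof AndNode && node.children.size() == 1;
            return redundant ? node.children.get(0) : node;
        }
    }

    /** Layer 3: a builder converts one node type into the target form (a String here). */
    interface Builder {
        String build(Node node, Map<Class<?>, Builder> builders);
    }

    /** Maps each node class to the builder responsible for it. */
    static Map<Class<?>, Builder> defaultBuilders() {
        Map<Class<?>, Builder> builders = new HashMap<Class<?>, Builder>();
        builders.put(TermNode.class, (node, all) -> ((TermNode) node).text);
        builders.put(AndNode.class, (node, all) -> {
            StringBuilder sb = new StringBuilder();
            for (Node child : node.children) {
                String built = all.get(child.getClass()).build(child, all);
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append('+').append(child instanceof AndNode ? "(" + built + ")" : built);
            }
            return sb.toString();
        });
        return builders;
    }

    /** Runs every processor in order, then builds the result from the final tree. */
    static String run(Node tree, List<Processor> pipeline, Map<Class<?>, Builder> builders) {
        for (Processor processor : pipeline) {
            tree = processor.process(tree);
        }
        return builders.get(tree.getClass()).build(tree, builders);
    }

    public static void main(String[] args) {
        Node tree = new AndNode(new TermNode("Apache"), new AndNode(new TermNode("LUCENE")));
        List<Processor> pipeline =
            Arrays.<Processor>asList(new LowercaseProcessor(), new SingleChildProcessor());
        System.out.println(run(tree, pipeline, defaultBuilders())); // prints +apache +lucene
    }
}
```

<p>
Adding a feature in this model means adding a processor to the list or registering another builder in the map; the parsing step is not involved.
</p>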
 109
 110 <p>
 111 Furthermore, the query parser uses flexible configuration objects. It also uses message classes to which
 112 resource bundles can be attached. This makes it possible to translate
 113 messages, which is an important feature of a query parser.
 114 </p>
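<p>
The idea of attaching resource bundles to messages can be illustrated with the standard
<code>java.util.ResourceBundle</code> and <code>java.text.MessageFormat</code> classes.
The sketch below is not the query parser's message API: the class <code>MessageSketch</code>,
the key <code>INVALID_SYNTAX</code> and both message texts are made up for this example, and the
bundles are kept in memory instead of in properties files.
</p>

```java
import java.text.MessageFormat;
import java.util.HashMap;
import java.util.ListResourceBundle;
import java.util.Locale;
import java.util.Map;
import java.util.ResourceBundle;

/** Toy illustration only: hypothetical names, not the Lucene message classes. */
class MessageSketch {

    /** Message key shared by all bundles; the key name is invented for this sketch. */
    static final String INVALID_SYNTAX = "INVALID_SYNTAX";

    /** Creates an in-memory bundle that holds a single message pattern. */
    static ResourceBundle bundle(final String pattern) {
        return new ListResourceBundle() {
            @Override
            protected Object[][] getContents() {
                return new Object[][] { { INVALID_SYNTAX, pattern } };
            }
        };
    }

    /** One bundle per language code; a real setup would usually load properties files. */
    static final Map<String, ResourceBundle> BUNDLES = new HashMap<String, ResourceBundle>();
    static {
        BUNDLES.put("en", bundle("Syntax error: cannot parse ''{0}''"));
        BUNDLES.put("de", bundle("Syntaxfehler: ''{0}'' kann nicht geparst werden"));
    }

    /** Looks up the pattern for the locale (falling back to English) and fills in the arguments. */
    static String localize(String key, Locale locale, Object... args) {
        ResourceBundle rb = BUNDLES.get(locale.getLanguage());
        if (rb == null) {
            rb = BUNDLES.get("en");
        }
        return new MessageFormat(rb.getString(key), locale).format(args);
    }

    public static void main(String[] args) {
        System.out.println(localize(INVALID_SYNTAX, Locale.ENGLISH, "a AND"));
        System.out.println(localize(INVALID_SYNTAX, Locale.GERMAN, "a AND"));
    }
}
```

<p>
The code that reports the error only refers to the key and the arguments, so a translation can be added without touching the parser.
</p>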
 115 <p>
 116 This design makes it possible to develop different query syntaxes very quickly.
 117 </p>
 118
 119 <h3>StandardQueryParser and QueryParserWrapper</h3>
 120
 121 <p>
 122 The standard (default) Lucene query parser is located under
 123 org.apache.lucene.queryParser.standard.
 124 <p>
 125 To make the new query parser simpler to use,
 126 the class {@link org.apache.lucene.queryParser.standard.StandardQueryParser} may be helpful,
 127 especially for people who do not want to extend the query parser.
 128 It uses the default Lucene query processors, text parser and builders, so
 129 you don't need to deal with them yourself.
 130
 131 {@link org.apache.lucene.queryParser.standard.StandardQueryParser} usage:
 132
 133 <pre class="prettyprint">
 134       StandardQueryParser qpHelper = new StandardQueryParser();
 135       // getQueryConfigHandler() returns the generic QueryConfigHandler, so options are set on the helper itself
 136       qpHelper.setAllowLeadingWildcard(true);
 137       qpHelper.setAnalyzer(new WhitespaceAnalyzer());
 138       Query query = qpHelper.parse("apache AND lucene", "defaultField"); // throws QueryNodeException on invalid input
 139 </pre>
 140 <p>
 141 To make it easy for people who are using Lucene's current query parser to switch to
 142 the new one, there is a {@link org.apache.lucene.queryParser.standard.QueryParserWrapper} under org.apache.lucene.queryParser.standard
 143 that keeps the old query parser interface but uses the new query parser infrastructure.
 144 </p>
 145
 146 </body>
 147 </html>