+++ /dev/null
-<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
-<!--
- Licensed to the Apache Software Foundation (ASF) under one or more
- contributor license agreements. See the NOTICE file distributed with
- this work for additional information regarding copyright ownership.
- The ASF licenses this file to You under the Apache License, Version 2.0
- (the "License"); you may not use this file except in compliance with
- the License. You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-<html>
-<body>
-This is an another highlighter implementation.
-
-<h2>Features</h2>
-<ul>
-<li>fast for large docs</li>
-<li>support N-gram fields</li>
-<li>support phrase-unit highlighting with slops</li>
-<li>need Java 1.5</li>
-<li>highlight fields need to be TermVector.WITH_POSITIONS_OFFSETS</li>
-<li>take into account query boost to score fragments</li>
-<li>support colored highlight tags</li>
-<li>pluggable FragListBuilder</li>
-<li>pluggable FragmentsBuilder</li>
-</ul>
-
-<h2>Algorithm</h2>
-<p>To explain the algorithm, let's use the following sample text
- (to be highlighted) and user query:</p>
-
-<table border=1>
-<tr>
-<td><b>Sample Text</b></td>
-<td>Lucene is a search engine library.</td>
-</tr>
-<tr>
-<td><b>User Query</b></td>
-<td>Lucene^2 OR "search library"~1</td>
-</tr>
-</table>
-
-<p>The user query is a BooleanQuery that consists of TermQuery("Lucene")
-with boost of 2 and PhraseQuery("search library") with slop of 1.</p>
-<p>For your convenience, here is the offsets and positions info of the
-sample text.</p>
-
-<pre>
-+--------+-----------------------------------+
-| | 1111111111222222222233333|
-| offset|01234567890123456789012345678901234|
-+--------+-----------------------------------+
-|document|Lucene is a search engine library. |
-+--------*-----------------------------------+
-|position|0 1 2 3 4 5 |
-+--------*-----------------------------------+
-</pre>
-
-<h3>Step 1.</h3>
-<p>In Step 1, Fast Vector Highlighter generates {@link org.apache.lucene.search.vectorhighlight.FieldQuery.QueryPhraseMap} from the user query.
-<code>QueryPhraseMap</code> consists of the following members:</p>
-<pre class="prettyprint">
-public class QueryPhraseMap {
- boolean terminal;
- int slop; // valid if terminal == true and phraseHighlight == true
- float boost; // valid if terminal == true
- Map<String, QueryPhraseMap> subMap;
-}
-</pre>
-<p><code>QueryPhraseMap</code> has subMap. The key of the subMap is a term
-text in the user query and the value is a subsequent <code>QueryPhraseMap</code>.
-If the query is a term (not phrase), then the subsequent <code>QueryPhraseMap</code>
-is marked as terminal. If the query is a phrase, then the subsequent <code>QueryPhraseMap</code>
-is not a terminal and it has the next term text in the phrase.</p>
-
-<p>From the sample user query, the following <code>QueryPhraseMap</code>
-will be generated:</p>
-<pre>
- QueryPhraseMap
-+--------+-+ +-------+-+
-|"Lucene"|o+->|boost=2|*| * : terminal
-+--------+-+ +-------+-+
-
-+--------+-+ +---------+-+ +-------+------+-+
-|"search"|o+->|"library"|o+->|boost=1|slop=1|*|
-+--------+-+ +---------+-+ +-------+------+-+
-</pre>
-
-<h3>Step 2.</h3>
-<p>In Step 2, Fast Vector Highlighter generates {@link org.apache.lucene.search.vectorhighlight.FieldTermStack}. Fast Vector Highlighter uses {@link org.apache.lucene.index.TermFreqVector} data
-(must be stored {@link org.apache.lucene.document.Field.TermVector#WITH_POSITIONS_OFFSETS})
-to generate it. <code>FieldTermStack</code> keeps the terms in the user query.
-Therefore, in this sample case, Fast Vector Highlighter generates the following <code>FieldTermStack</code>:</p>
-<pre>
- FieldTermStack
-+------------------+
-|"Lucene"(0,6,0) |
-+------------------+
-|"search"(12,18,3) |
-+------------------+
-|"library"(26,33,5)|
-+------------------+
-where : "termText"(startOffset,endOffset,position)
-</pre>
-<h3>Step 3.</h3>
-<p>In Step 3, Fast Vector Highlighter generates {@link org.apache.lucene.search.vectorhighlight.FieldPhraseList}
-by reference to <code>QueryPhraseMap</code> and <code>FieldTermStack</code>.</p>
-<pre>
- FieldPhraseList
-+----------------+-----------------+---+
-|"Lucene" |[(0,6)] |w=2|
-+----------------+-----------------+---+
-|"search library"|[(12,18),(26,33)]|w=1|
-+----------------+-----------------+---+
-</pre>
-<p>The type of each entry is <code>WeightedPhraseInfo</code> that consists of
-an array of terms offsets and weight. The weight (Fast Vector Highlighter uses query boost to
-calculate the weight) will be taken into account when Fast Vector Highlighter creates
-{@link org.apache.lucene.search.vectorhighlight.FieldFragList} in the next step.</p>
-<h3>Step 4.</h3>
-<p>In Step 4, Fast Vector Highlighter creates <code>FieldFragList</code> by reference to
-<code>FieldPhraseList</code>. In this sample case, the following
-<code>FieldFragList</code> will be generated:</p>
-<pre>
- FieldFragList
-+---------------------------------+
-|"Lucene"[(0,6)] |
-|"search library"[(12,18),(26,33)]|
-|totalBoost=3 |
-+---------------------------------+
-</pre>
-<h3>Step 5.</h3>
-<p>In Step 5, by using <code>FieldFragList</code> and the field stored data,
-Fast Vector Highlighter creates highlighted snippets!</p>
-</body>
-</html>