X-Git-Url: https://git.mdrn.pl/pylucene.git/blobdiff_plain/a2e61f0c04805cfcb8706176758d1283c7e3a55c..aaeed5504b982cf3545252ab528713250aa33eed:/lucene-java-3.5.0/lucene/src/java/org/apache/lucene/document/package.html?ds=sidebyside diff --git a/lucene-java-3.5.0/lucene/src/java/org/apache/lucene/document/package.html b/lucene-java-3.5.0/lucene/src/java/org/apache/lucene/document/package.html new file mode 100644 index 0000000..e497184 --- /dev/null +++ b/lucene-java-3.5.0/lucene/src/java/org/apache/lucene/document/package.html @@ -0,0 +1,56 @@ + + + + + + + +

The logical representation of a {@link org.apache.lucene.document.Document} for indexing and searching.

The document package provides the user-level logical representation of content to be indexed and searched. The package also provides utilities for working with {@link org.apache.lucene.document.Document}s and {@link org.apache.lucene.document.Fieldable}s.

Document and Fieldable

A {@link org.apache.lucene.document.Document} is a collection of {@link org.apache.lucene.document.Fieldable}s. A {@link org.apache.lucene.document.Fieldable} is a logical representation of a user's content that needs to be indexed or stored. {@link org.apache.lucene.document.Fieldable}s have a number of properties that tell Lucene how to treat the content (indexed, tokenized, stored, and so on). See the {@link org.apache.lucene.document.Field} implementation of {@link org.apache.lucene.document.Fieldable} for specifics on these properties.

Note: it is common to refer to {@link org.apache.lucene.document.Document}s having {@link org.apache.lucene.document.Field}s, even though technically they have {@link org.apache.lucene.document.Fieldable}s.

Working with Documents

First and foremost, a {@link org.apache.lucene.document.Document} is something created by the user application. It is your job to create Documents based on the content of the files you are working with in your application (Word, txt, PDF, Excel, or any other format). How this is done is completely up to you. That said, many tools available in other projects can simplify the process of converting a file into a Lucene {@link org.apache.lucene.document.Document}. To see an example of this, take a look at the Lucene demo and the associated source code for extracting content from HTML.

{@link org.apache.lucene.document.DateTools} is a utility class that makes dates and times searchable (remember, Lucene only searches text). {@link org.apache.lucene.document.NumericField} is a special helper class that simplifies indexing of numeric values (and also dates) for fast range queries with {@link org.apache.lucene.search.NumericRangeQuery} (using a special sortable string representation of numeric values).
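The idea of making dates and numbers searchable as text can be sketched with the standard library alone. The class below is a minimal illustration, not Lucene's implementation: SortableKeysSketch, dateKey, and longKey are hypothetical names, and both encodings are simplified assumptions chosen only so that plain string comparison agrees with chronological or numeric order. Lucene's own helpers use their own formats.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical sketch; none of these names belong to Lucene.
final class SortableKeysSketch {

    // Encode a date as fixed-width GMT text so that, for four-digit years,
    // string order equals chronological order.
    static String dateKey(Date date) {
        // SimpleDateFormat is not thread-safe, so a new instance is created per call.
        SimpleDateFormat format = new SimpleDateFormat("yyyyMMddHHmmss", Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        return format.format(date);
    }

    // Encode a long as 16 hex digits. Flipping the sign bit places negative
    // values below positive ones when the results are compared as text.
    static String longKey(long value) {
        return String.format(Locale.ROOT, "%016x", value ^ Long.MIN_VALUE);
    }

    public static void main(String[] args) {
        String first = dateKey(new Date(0L));
        String second = dateKey(new Date(86_400_000L)); // one day later
        System.out.println(first + " sorts before " + second + ": "
                + (first.compareTo(second) < 0));
        System.out.println(longKey(-5L) + " sorts before " + longKey(3L) + ": "
                + (longKey(-5L).compareTo(longKey(3L)) < 0));
    }
}
```

Because every key has the same width, a range query over such terms reduces to a lexicographic range over strings, which is the property the text above relies on.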

The {@link org.apache.lucene.document.FieldSelector} interface provides a mechanism to tell Lucene how to load Documents from storage. If no FieldSelector is used, all Fieldables on a Document will be loaded. As an example of FieldSelector usage, consider the common use case of displaying search results on a web page and then having users click through to see the full document. In this scenario, there are often many small fields and one or two large fields (containing the contents of the original file). Before the FieldSelector, the full Document had to be loaded, including the large fields, in order to display the results. Now, using the FieldSelector, one can {@link org.apache.lucene.document.FieldSelectorResult#LAZY_LOAD} the large fields, so that they are loaded only when a user clicks on the actual link to view the original content.
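The click-through scenario above can be modeled in a few lines of plain Java. This is a sketch of the mechanism only, under stated assumptions: LazyLoadSketch, LoadPolicy, Selector, and StoredField are hypothetical stand-ins rather than Lucene classes, and storage is simulated with in-memory suppliers. In Lucene the corresponding roles are played by a FieldSelector implementation and the FieldSelectorResult constants.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical model of selective and lazy field loading; not Lucene's API.
final class LazyLoadSketch {

    enum LoadPolicy { LOAD, LAZY_LOAD, NO_LOAD }

    // Decides, per field name, how a stored field should be loaded.
    interface Selector {
        LoadPolicy accept(String fieldName);
    }

    // A stored field whose value is read from storage at most once.
    static final class StoredField {
        private final Supplier<String> reader;
        private String value;
        private boolean loaded;

        StoredField(Supplier<String> reader, boolean eager) {
            this.reader = reader;
            if (eager) {
                stringValue(); // read immediately for LOAD
            }
        }

        String stringValue() {
            if (!loaded) {
                value = reader.get(); // the simulated storage access
                loaded = true;
            }
            return value;
        }

        boolean isLoaded() {
            return loaded;
        }
    }

    // Builds a document view: LOAD reads now, LAZY_LOAD defers, NO_LOAD skips.
    static Map<String, StoredField> load(Map<String, Supplier<String>> storage,
                                         Selector selector) {
        Map<String, StoredField> document = new LinkedHashMap<>();
        for (Map.Entry<String, Supplier<String>> entry : storage.entrySet()) {
            LoadPolicy policy = selector.accept(entry.getKey());
            if (policy == LoadPolicy.NO_LOAD) {
                continue;
            }
            document.put(entry.getKey(),
                    new StoredField(entry.getValue(), policy == LoadPolicy.LOAD));
        }
        return document;
    }

    public static void main(String[] args) {
        Map<String, Supplier<String>> storage = new LinkedHashMap<>();
        storage.put("title", () -> "Quarterly report");
        storage.put("contents", () -> "...the full text of the original file...");

        // Small fields load eagerly; the large field is deferred.
        Map<String, StoredField> hit = load(storage,
                name -> name.equals("contents") ? LoadPolicy.LAZY_LOAD : LoadPolicy.LOAD);

        System.out.println("contents loaded for result list: "
                + hit.get("contents").isLoaded());
        hit.get("contents").stringValue(); // the user clicks through
        System.out.println("contents loaded after click: "
                + hit.get("contents").isLoaded());
    }
}
```

The design point is that the selector is consulted once per field at load time, while the cost of reading a deferred field is paid only if and when its value is actually requested.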
