X-Git-Url: https://git.mdrn.pl/pylucene.git/blobdiff_plain/a2e61f0c04805cfcb8706176758d1283c7e3a55c..aaeed5504b982cf3545252ab528713250aa33eed:/lucene-java-3.5.0/lucene/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart/package.html
diff --git a/lucene-java-3.5.0/lucene/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart/package.html b/lucene-java-3.5.0/lucene/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart/package.html
new file mode 100644
index 0000000..31ea96e
--- /dev/null
+++ b/lucene-java-3.5.0/lucene/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart/package.html
@@ -0,0 +1,46 @@
+
+
+
+
+
+
+
+
+Analyzer for Simplified Chinese, which indexes words.
+
+
+@lucene.experimental
+
+
+Three analyzers are provided for Chinese, each of which treats Chinese text in a different way.
+
+ - StandardAnalyzer: Index unigrams (individual Chinese characters) as tokens.
+
+ - CJKAnalyzer (in the analyzers/cjk package): Index bigrams (overlapping groups of two adjacent Chinese characters) as tokens.
+
+ - SmartChineseAnalyzer (in this package): Index words (attempt to segment Chinese text into words) as tokens.
+
+
+Example phrase: "我是中国人"
+
+ - StandardAnalyzer: 我－是－中－国－人
+ - CJKAnalyzer: 我是－是中－中国－国人
+ - SmartChineseAnalyzer: 我－是－中国－人
+
+
+
+
+
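
The unigram and bigram strategies described in the patch above can be illustrated without Lucene. The following is a minimal sketch in plain Java, not the code of StandardAnalyzer or CJKAnalyzer: the class name `ChineseNgramSketch` and its methods are hypothetical, and it ignores everything the real analyzers handle beyond splitting (non-Chinese text, offsets, lowercasing, stop words). The word segmentation performed by SmartChineseAnalyzer is not reproduced here, since it cannot be reduced to a few lines.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: hypothetical helper, not part of Lucene.
class ChineseNgramSketch {

    // Unigrams: one token per character (Unicode code point).
    static List<String> unigrams(String text) {
        List<String> tokens = new ArrayList<>();
        text.codePoints()
            .forEach(cp -> tokens.add(new String(Character.toChars(cp))));
        return tokens;
    }

    // Bigrams: overlapping pairs of adjacent characters.
    // Simplification: a single-character input yields no token here.
    static List<String> bigrams(String text) {
        List<String> chars = unigrams(text);
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i + 1 < chars.size(); i++) {
            tokens.add(chars.get(i) + chars.get(i + 1));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String phrase = "我是中国人";
        System.out.println(unigrams(phrase)); // [我, 是, 中, 国, 人]
        System.out.println(bigrams(phrase));  // [我是, 是中, 中国, 国人]
    }
}
```

For the example phrase, the two methods produce the same token sequences that the patch lists for StandardAnalyzer and CJKAnalyzer. Note that the bigram output contains pairs such as 是中 and 国人 that straddle word boundaries, which is the noise that word-based segmentation aims to avoid.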