# This file defines a stopword set for Japanese.
#
# This set is made up of hand-picked frequent terms from segmented Japanese Wikipedia.
# Punctuation characters and frequent kanji have mostly been left out. See LUCENE-3745
# for frequency lists, etc. that can be useful for making your own set (if desired).
#
# Note that these stopwords overlap with the terms stopped by the
# JapanesePartOfSpeechStopFilter when the two are used in combination. When editing
# this file, note that comments are not allowed on the same line as stopwords.
#
# Also note that stopping is done in a case-insensitive manner. Change your StopFilter
# configuration if you need case-sensitive stopping. Lastly, note that stopping is done
# using the same character width as the entries in this file. Since this StopFilter is
# normally applied after a CJKWidthFilter in your chain, you would usually want your romaji
# entries to be in half-width and your kana entries to be in full-width.
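#
# As a rough illustration only, the filter ordering described above might look like the
# following in a Solr-style analyzer chain. This is a sketch under assumptions, not
# necessarily your configuration: the words path is hypothetical, and your schema may
# name or order its filters differently.
#
#   <filter class="solr.CJKWidthFilterFactory"/>
#   <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ja.txt"/>
#
# In such a setup, ignoreCase="false" would be the setting to change for case-sensitive
# stopping.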