doc/solr-conf/lang/stopwords_ja.txt

#
# This file defines a stopword set for Japanese.
#
# This set is made up of hand-picked frequent terms from segmented Japanese Wikipedia.
# Punctuation characters and frequent kanji have mostly been left out.  See LUCENE-3745
# for frequency lists and other material that can be useful for building your own set,
# if desired.
#
# Note that these stopwords overlap with the terms removed by the
# JapanesePartOfSpeechStopFilter when the two filters are used together.  When editing
# this file, note that comments are not allowed on the same line as stopwords.
#
# Also note that stopping is done in a case-insensitive manner.  Change your StopFilter
# configuration if you need case-sensitive stopping.  Lastly, note that entries are
# matched at the character width used in this file.  Since this StopFilter normally
# runs after a CJKWidthFilter in your chain, you would usually want your romaji entries
# to be in half-width and your kana entries to be in full-width.
#
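# A minimal sketch of an analyzer chain consistent with the notes above.  It assumes
# the stock Solr factory class names and that this file is installed as
# lang/stopwords_ja.txt; the field type name "text_ja" and the stoptags path are
# illustrative assumptions, not taken from this file.
#
#   <fieldType name="text_ja" class="solr.TextField" positionIncrementGap="100">
#     <analyzer>
#       <tokenizer class="solr.JapaneseTokenizerFactory" mode="search"/>
#       <!-- Part-of-speech stopping; overlaps with some entries in this file -->
#       <filter class="solr.JapanesePartOfSpeechStopFilterFactory" tags="lang/stoptags_ja.txt"/>
#       <!-- Width normalization: romaji to half-width, kana to full-width -->
#       <filter class="solr.CJKWidthFilterFactory"/>
#       <!-- ignoreCase="true" stops case-insensitively; use "false" for case-sensitive -->
#       <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ja.txt"/>
#     </analyzer>
#   </fieldType>
#
# Placing the StopFilter after the CJKWidthFilter is what lets half-width romaji and
# full-width kana entries match the normalized tokens.
#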
の
に
は
を
た
が
で
て
と
し
れ
さ
ある
いる
も
する
から
な
こと
として
い
や
れる
など
なっ
ない
この
ため
その
あっ
よう
また
もの
という
あり
まで
られ
なる
へ
か
だ
これ
によって
により
おり
より
による
ず
なり
られる
において
ば
なかっ
なく
しかし
について
せ
だっ
その後
できる
それ
う
ので
なお
のみ
でき
き
つ
における
および
いう
さらに
でも
ら
たり
その他
に関する
たち
ます
ん
なら
に対して
特に
せる
及び
これら
とき
では
にて
ほか
ながら
うち
そして
とともに
ただし
かつて
それぞれ
または
お
ほど
ものの
に対する
ほとんど
と共に
といった
です
とも
ところ
ここ
##### End of file