 | From svn.tartarus.org/snowball/trunk/website/algorithms/english/stop.txt
 | This file is distributed under the BSD License.
 | See http://snowball.tartarus.org/license.php
 | Also see http://www.opensource.org/licenses/bsd-license.html
 |  - Encoding was converted to UTF-8.
 |  - This notice was added.

 | An English stop word list. Comments begin with a vertical bar. Each stop
 | word is at the start of a line.

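The format described above (a vertical bar starts a comment, and each stop word sits at the start of its line) can be read with a few lines of code. The following is a minimal sketch, not the loader that any particular library uses; the function name `parse_stop_words` and the sample text are hypothetical and chosen only for illustration. It assumes one stop word per line.

```python
def parse_stop_words(text):
    """Return the stop words found in `text`, in file order.

    Assumption: everything from the first vertical bar to the end of a
    line is a comment, and at most one stop word appears per line.
    """
    words = []
    for line in text.splitlines():
        # Keep only the part of the line before the first '|'.
        entry = line.split("|", 1)[0].strip()
        if entry:  # skip blank and comment-only lines
            words.append(entry)
    return words


# Hypothetical sample in the same layout as this file.
sample = """\
 | a comment-only line
i              | subject
 |will
would
"""

print(parse_stop_words(sample))  # ['i', 'would']
```

Commented-out entries such as ` |will` are skipped because nothing precedes the bar on those lines.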
 | Many of the forms below are quite rare (e.g. "yourselves") but included for
 |  completeness.

           | PRONOUN FORMS
             | 1st person singular

i              | subject, always in upper case of course

me             | object
my             | possessive adjective
               | the possessive pronoun `mine' is best suppressed, because of the
               | sense of coal-mine etc.
myself         | reflexive
             | 1st person plural
we             | subject

| us           | object
               | care is required here because US = United States. It is usually
               | safe to remove it if it is in lower case.
our            | possessive adjective
ours           | possessive pronoun
ourselves      | reflexive
             | second person (archaic `thou' forms not included)
you            | subject and object
your           | possessive adjective
yours          | possessive pronoun
yourself       | reflexive (singular)
yourselves     | reflexive (plural)
             | third person singular
he             | subject
him            | object
his            | possessive adjective and pronoun
himself        | reflexive

she            | subject
her            | object and possessive adjective
hers           | possessive pronoun
herself        | reflexive

it             | subject and object
its            | possessive adjective
itself         | reflexive
             | third person plural
they           | subject
them           | object
their          | possessive adjective
theirs         | possessive pronoun
themselves     | reflexive
             | other forms (demonstratives, interrogatives)
what
which
who
whom
this
that
these
those

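The note on `us` in the pronoun list above suggests removing that word only when it appears in lower case, since `US` may mean the United States. One way to apply that rule is sketched below. This is an illustration under stated assumptions, not part of the stop list itself: the names `STOP_WORDS`, `CASE_SENSITIVE_STOP_WORDS` and `remove_stop_words` are hypothetical, and `STOP_WORDS` holds only a small subset of the entries in this file.

```python
# Small illustrative subset of the list; matched case-insensitively.
STOP_WORDS = {"we", "our", "the"}

# Assumption: words in this set are removed only on an exact,
# lower-case match, so that "US" (United States) survives.
CASE_SENSITIVE_STOP_WORDS = {"us"}


def remove_stop_words(tokens):
    """Drop stop words from `tokens`, treating 'us' case-sensitively."""
    kept = []
    for token in tokens:
        if token in CASE_SENSITIVE_STOP_WORDS:
            continue  # exact lower-case match only
        if token.lower() in STOP_WORDS:
            continue  # ordinary case-insensitive match
        kept.append(token)
    return kept


print(remove_stop_words(["The", "US", "sent", "us", "our", "mail"]))
# ['US', 'sent', 'mail']
```

The sketch depends on filtering before any lower-casing step; once tokens have been folded to lower case, `US` and `us` can no longer be told apart.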
           | VERB FORMS (using F.R. Palmer's nomenclature)
             | BE
am             | 1st person, present
is             | -s form (3rd person, present)
are            | present
was            | 1st person, past
were           | past
be             | infinitive
been           | past participle
being          | -ing form
             | HAVE
have           | simple
has            | -s form
had            | past
having         | -ing form
             | DO
do             | simple
does           | -s form
did            | past
doing          | -ing form

 | The forms below are, I believe, best omitted, because of the significant
 | homonym forms:

 |  He made a WILL
 |  old tin CAN
 |  merry month of MAY
 |  a smell of MUST
 |  fight the good fight with all thy MIGHT

 | would, could, should and ought might, however, be included

 |          | AUXILIARIES
 |            | WILL
 |will

would

 |            | SHALL
 |shall

should

 |            | CAN
 |can

could

 |            | MAY
 |may
 |might
 |            | MUST
 |must
 |            | OUGHT

ought

           | COMPOUND FORMS, increasingly encountered nowadays in 'formal' writing
              | pronoun + verb

i'm
you're
he's
she's
it's
we're
they're
i've
you've
we've
they've
i'd
you'd
he'd
she'd
we'd
they'd
i'll
you'll
he'll
she'll
we'll
they'll

              | verb + negation

isn't
aren't
wasn't
weren't
hasn't
haven't
hadn't
doesn't
don't
didn't

              | auxiliary + negation

won't
wouldn't
shan't
shouldn't
can't
cannot
couldn't
mustn't

             | miscellaneous forms

let's
that's
who's
what's
here's
there's
when's
where's
why's
how's

              | rarer forms

 | daren't needn't

              | doubtful forms

 | oughtn't mightn't

           | ARTICLES
a
an
the

           | THE REST (Overlap among prepositions, conjunctions, adverbs etc. is so
           | high that classification is pointless.)
and
but
if
or
because
as
until
while

of
at
by
for
with
about
against
between
into
through
during
before
after
above
below
to
from
up
down
in
out
on
off
over
under

again
further
then
once

here
there
when
where
why
how

all
any
both
each
few
more
most
other
some
such

no
nor
not
only
own
same
so
than
too
very

 | Just for the record, the following words are among the commonest in English

    | one
    | every
    | least
    | less
    | many
    | now
    | ever
    | never
    | say
    | says
    | said
    | also
    | get
    | go
    | goes
    | just
    | made
    | make
    | put
    | see
    | seen
    | whether
    | like
    | well
    | back
    | even
    | still
    | way
    | take
    | since
    | another
    | however
    | two
    | three
    | four
    | five
    | first
    | second
    | new
    | old
    | high
    | long
316     | high
317     | long