<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License.  You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->
<TITLE>Benchmarking Lucene By Tasks</TITLE>
Benchmarking Lucene By Tasks.
This package provides "task based" performance benchmarking of Lucene.
One can use the predefined benchmarks, or create new ones.
<table border=1 cellpadding=4>
 <tr>
   <td><b>Package</b></td>
   <td><b>Description</b></td>
 </tr>
 <tr>
   <td><a href="stats/package-summary.html">stats</a></td>
   <td>Statistics maintained when running benchmark tasks.</td>
 </tr>
 <tr>
   <td><a href="tasks/package-summary.html">tasks</a></td>
   <td>Benchmark tasks.</td>
 </tr>
 <tr>
   <td><a href="feeds/package-summary.html">feeds</a></td>
   <td>Sources for benchmark inputs: documents and queries.</td>
 </tr>
 <tr>
   <td><a href="utils/package-summary.html">utils</a></td>
   <td>Utilities used for the benchmark, and for the reports.</td>
 </tr>
 <tr>
   <td><a href="programmatic/package-summary.html">programmatic</a></td>
   <td>Sample performance test written programmatically.</td>
 </tr>
</table>
<h2>Table Of Contents</h2>
<ol>
  <li><a href="#concept">Benchmarking By Tasks</a></li>
  <li><a href="#usage">How to use</a></li>
  <li><a href="#algorithm">Benchmark "algorithm"</a></li>
  <li><a href="#tasks">Supported tasks/commands</a></li>
  <li><a href="#properties">Benchmark properties</a></li>
  <li><a href="#example">Example input algorithm and the result benchmark
      report</a></li>
  <li><a href="#recsCounting">Results record counting clarified</a></li>
</ol>
<a name="concept"></a>
<h2>Benchmarking By Tasks</h2>
Benchmark Lucene using task primitives.
A benchmark is composed of some predefined tasks, allowing for creating an
index, adding documents, optimizing, searching, generating reports, and more.
A benchmark run takes an "algorithm" file
that contains a description of the sequence of tasks making up the run, and some
properties defining a few additional characteristics of the benchmark run.
<a name="usage"></a>
<h2>How to use</h2>
The easiest way to run a benchmark is with the predefined ant task:
<ul>
  <li>ant run-task
    <br>- would run the <code>micro-standard.alg</code> "algorithm".
  </li>
  <li>ant run-task -Dtask.alg=conf/compound-penalty.alg
    <br>- would run the <code>compound-penalty.alg</code> "algorithm".
  </li>
  <li>ant run-task -Dtask.alg=[full-path-to-your-alg-file]
    <br>- would run <code>your perf test</code> "algorithm".
  </li>
  <li>java org.apache.lucene.benchmark.byTask.programmatic.Sample
    <br>- would run a performance test programmatically - without using an alg
    file. This is less readable, and less convenient, but possible.
  </li>
</ul>
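<p>
For the programmatic route, the same task tree that an .alg file describes can
be built directly in Java. The following is a minimal sketch in the spirit of
the Sample class (illustrative only - constructor signatures may differ
slightly between versions):
</p>
<pre>
import java.util.Properties;

import org.apache.lucene.benchmark.byTask.PerfRunData;
import org.apache.lucene.benchmark.byTask.tasks.AddDocTask;
import org.apache.lucene.benchmark.byTask.tasks.CloseIndexTask;
import org.apache.lucene.benchmark.byTask.tasks.CreateIndexTask;
import org.apache.lucene.benchmark.byTask.tasks.RepSumByNameTask;
import org.apache.lucene.benchmark.byTask.tasks.TaskSequence;
import org.apache.lucene.benchmark.byTask.utils.Config;

public class MiniBenchmark {
  public static void main(String[] args) throws Exception {
    // properties that would normally sit in the .alg file header
    Properties p = new Properties();
    p.setProperty("directory", "RAMDirectory");
    Config config = new Config(p);
    PerfRunData runData = new PerfRunData(config);

    // top-level serial sequence, equivalent to the outermost { } in an .alg file
    TaskSequence top = new TaskSequence(runData, null, null, false);
    top.addTask(new CreateIndexTask(runData));

    // { AddDoc } : 500
    TaskSequence adds = new TaskSequence(runData, "AddDocs", top, false);
    top.addTask(adds);                 // order matters: add a sequence to its parent first
    adds.addTask(new AddDocTask(runData));
    adds.setRepetitions(500);

    top.addTask(new CloseIndexTask(runData));
    top.addTask(new RepSumByNameTask(runData)); // print a summary report

    System.out.println(top);  // print the algorithm, as the .alg runner does
    top.doLogic();            // execute it
  }
}
</pre>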
You may find existing tasks sufficient for defining the benchmark <i>you</i>
need; otherwise, you can extend the framework to meet your needs, as explained
below.
Each benchmark run has a DocMaker and a QueryMaker. These two should usually
match, so that "meaningful" queries are used for a certain collection.
Properties set at the header of the alg file define which "makers" should be
used. You can also specify your own makers, extending DocMaker and implementing
QueryMaker.
<br><br>
<b>Note:</b> since 2.9, DocMaker is a concrete class which accepts a
ContentSource. In most cases, you can use the DocMaker class to create
Documents, while providing your own ContentSource implementation. For
example, the current Benchmark package includes ContentSource
implementations for the TREC, Enwiki and Reuters collections, as well as
others like LineDocSource, which reads a 'line' file produced by
WriteLineDocTask.
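<p>
As a rough sketch of such a ContentSource (against the 2.9-era API; the names
and details below are illustrative and worth checking against your version), a
source that feeds synthetic documents might look like:
</p>
<pre>
import java.io.IOException;

import org.apache.lucene.benchmark.byTask.feeds.ContentSource;
import org.apache.lucene.benchmark.byTask.feeds.DocData;
import org.apache.lucene.benchmark.byTask.feeds.NoMoreDataException;

/** Illustrative ContentSource that produces simple synthetic documents. */
public class SyntheticContentSource extends ContentSource {

  private int docID = 0;
  private static final int NUM_DOCS = 1000; // arbitrary limit for the example

  @Override
  public synchronized DocData getNextDocData(DocData docData)
      throws NoMoreDataException, IOException {
    if (docID >= NUM_DOCS) {
      // signals "exhausted" to exhaustive loops such as { AddDoc } : *
      throw new NoMoreDataException();
    }
    docData.clear();
    docData.setName("doc" + docID);
    docData.setTitle("Synthetic document " + docID);
    docData.setBody("Body text of synthetic document " + docID);
    docID++;
    return docData;
  }

  @Override
  public void close() throws IOException {
    // nothing to release for an in-memory source
  }
}
</pre>
<p>
The source would then be hooked up in the .alg header, e.g. with
<font color="#FF0066">content.source=SyntheticContentSource</font> (using the
fully qualified class name).
</p>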
A benchmark .alg file contains the benchmark "algorithm". The syntax is described
below. Within the algorithm, you can specify groups of commands, assign them
names, specify commands that should be repeated, run commands serially or in
parallel, and control the rate at which the commands are "fired".
This allows, for instance, specifying that an index should be opened for
update; that documents should be added to it one by one, but no faster than
20 docs a minute; and that, in parallel with this, some N queries should be
run against that index, at no more than 2 queries a second.
You can have the searches all share an index reader,
or have each of them open its own reader and close it afterwards.
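<br>For instance, using the syntax described below, such a scenario could be
expressed roughly as
<font color="#FF0066">[ { AddDoc } : 5000 : 20/min { Search } : 100 : 2 ]</font>
- two serial sequences running in parallel: 5000 adds at no more than 20
docs/min, and 100 searches at no more than 2 queries/sec (an illustrative
snippet, assuming suitable doc and query makers are configured).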
If the commands available for use in the algorithm do not meet your needs,
you can add commands by adding a new task under
org.apache.lucene.benchmark.byTask.tasks -
you should extend the PerfTask abstract class.
Make sure that your new task class name is suffixed by Task.
Assume you added the class "WonderfulTask"; doing so also enables the
command "Wonderful" to be used in the algorithm, as sketched below.
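<p>
A minimal sketch of such a task (the body here is illustrative; only the
extends/constructor/doLogic skeleton is prescribed by the framework):
</p>
<pre>
import org.apache.lucene.benchmark.byTask.PerfRunData;
import org.apache.lucene.benchmark.byTask.tasks.PerfTask;

/** Enables the "Wonderful" command in .alg files. */
public class WonderfulTask extends PerfTask {

  public WonderfulTask(PerfRunData runData) {
    super(runData);
  }

  @Override
  public int doLogic() throws Exception {
    // the actual work of the task goes here (this body is just an example)
    System.out.println("doing something wonderful...");
    return 1; // number of work "records" done, used for recs/s statistics
  }
}
</pre>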
<u>External classes</u>: It is sometimes useful to invoke the benchmark
package with your external alg file that configures the use of your own
doc/query maker and/or html parser. You can do this without
modifying the benchmark package code, by passing your class path
with the benchmark.ext.classpath property:
<ul>
  <li>ant run-task -Dtask.alg=[full-path-to-your-alg-file]
      <font color="#FF0000">-Dbenchmark.ext.classpath=/mydir/classes
      </font> -Dtask.mem=512M</li>
</ul>
<u>External tasks</u>: When writing your own tasks under a package other than
<b>org.apache.lucene.benchmark.byTask.tasks</b>, specify that package through the
<font color="#FF0066">alt.tasks.packages</font> property.
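<br>Example (with a hypothetical package name) - add
<font color="#FF0066">alt.tasks.packages = com.mycompany.mytasks</font>
to the .alg file header, and task classes from that package can then be used
as commands just like the built-in ones.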
<a name="algorithm"></a>
<h2>Benchmark "algorithm"</h2>
The following is an informal description of the supported syntax.
<b>Measuring</b>: When a command is executed, statistics for the elapsed
execution time and memory consumption are collected.
At any time, those statistics can be printed, using one of the
available ReportTasks.
<b>Comments</b> start with '<font color="#FF0066">#</font>'.
<b>Serial</b> sequences are enclosed within '<font color="#FF0066">{ }</font>'.
<b>Parallel</b> sequences are enclosed within
'<font color="#FF0066">[ ]</font>'.
<b>Sequence naming:</b> To name a sequence, put
'<font color="#FF0066">"name"</font>' just after
'<font color="#FF0066">{</font>' or '<font color="#FF0066">[</font>'.
<br>Example - <font color="#FF0066">{ "ManyAdds" AddDoc } : 1000000</font> -
would name the sequence of 1M add docs "ManyAdds", and this name would later
appear in statistic reports.
If you don't specify a name for a sequence, it is given one: you can see it as
the algorithm is printed just before benchmark execution starts.
To repeat sequence tasks N times, add '<font color="#FF0066">: N</font>' just
after the sequence closing tag - '<font color="#FF0066">}</font>' or
'<font color="#FF0066">]</font>' or '<font color="#FF0066">></font>'.
<br>Example - <font color="#FF0066">[ AddDoc ] : 4</font> - would do 4 addDoc
in parallel, spawning 4 threads at once.
<br>Example - <font color="#FF0066">[ AddDoc AddDoc ] : 4</font> - would do
8 addDoc in parallel, spawning 8 threads at once.
<br>Example - <font color="#FF0066">{ AddDoc } : 30</font> - would do addDoc
30 times in a row.
<br>Example - <font color="#FF0066">{ AddDoc AddDoc } : 30</font> - would do
addDoc 60 times in a row.
<br><b>Exhaustive repeating</b>: use <font color="#FF0066">*</font> instead of
a number to repeat exhaustively.
This is sometimes useful, for adding as many files as a doc maker can create,
without iterating over the same file again, especially when the exact
number of documents is not known in advance - for instance, TREC files extracted
from a zip file. Note: when using this, you must also set
<font color="#FF0066">doc.maker.forever</font> to false.
<br>Example - <font color="#FF0066">{ AddDoc } : *</font> - would add docs
until the doc maker is "exhausted".
<b>Command parameter</b>: a command can optionally take a single parameter.
If a certain command does not support a parameter, or if the parameter is of
the wrong type, reading the algorithm will fail with an exception and the test
will not start. Currently the following tasks take optional parameters:
<ul>
  <li><b>AddDoc</b> takes a numeric parameter, indicating the required size of
      the added document. Note: if the DocMaker implementation used in the test
      does not support makeDoc(size), an exception will be thrown and the test
      will not start.
  </li>
  <li><b>DeleteDoc</b> takes a numeric parameter, indicating the docid to be
      deleted. The latter is not very useful for loops, since the docid is
      fixed, so for deletion in loops it is better to use the
      <code>doc.delete.step</code> property.
  </li>
  <li><b>SetProp</b> takes a mandatory <code>name,value</code> parameter,
      with ',' as the separator.
  </li>
  <li><b>SearchTravRetTask</b> and <b>SearchTravTask</b> take a numeric
      parameter, indicating the required traversal size.
  </li>
  <li><b>SearchTravRetLoadFieldSelectorTask</b> takes a string
      parameter: a comma separated list of Fields to load.
  </li>
  <li><b>SearchTravRetHighlighterTask</b> takes a string
      parameter: a comma separated list of parameters to define highlighting.
      See that task's javadocs for more information.
  </li>
</ul>
<br>Example - <font color="#FF0066">AddDoc(2000)</font> - would add a document
of size 2000 (~bytes).
<br>See conf/task-sample.alg for how this can be used, for instance, to check
which is faster: adding many smaller documents, or fewer larger documents.
Next candidates for supporting a parameter may be the Search tasks,
for controlling the query size.
<b>Statistic recording elimination</b>: a sequence can also end with
'<font color="#FF0066">></font>',
in which case child tasks would not store their statistics.
This can be useful to avoid exploding stats data, when adding say 1M docs.
<br>Example - <font color="#FF0066">{ "ManyAdds" AddDoc > : 1000000</font> -
would add a million docs, measure that total, but not save stats for each addDoc.
<br>Notice that the granularity of System.currentTimeMillis() (which is used
here) is system dependent,
and on some systems an operation that takes 5 ms to complete may show 0 ms
latency time in performance measurements.
Therefore it is sometimes more accurate to look at the elapsed time of a larger
sequence, as demonstrated here.
To set a rate (ops/sec or ops/min) for a sequence, add
'<font color="#FF0066">: N : R</font>' just after the sequence closing tag.
This specifies a repetition of N with a rate of R operations/sec.
Use '<font color="#FF0066">R/sec</font>' or
'<font color="#FF0066">R/min</font>'
to explicitly specify whether the rate is per second or per minute.
The default is per second.
<br>Example - <font color="#FF0066">[ AddDoc ] : 400 : 3</font> - would do 400
addDoc in parallel, starting up to 3 threads per second.
<br>Example - <font color="#FF0066">{ AddDoc } : 100 : 200/min</font> - would
do 100 addDoc serially, waiting before starting the next add if the rate
would otherwise exceed 200 adds/min.
<b>Disable Counting</b>: Each task executed contributes to the records count.
This count is reflected in reports under recs/s and under recsPerRun.
Most tasks count 1, some count 0, and some count more.
(See <a href="#recsCounting">Results record counting clarified</a> for more details.)
It is possible to disable counting for a task by preceding it with <font color="#FF0066">-</font>.
<br>Example - <font color="#FF0066"> -CreateIndex </font> - would count 0, while
the default behavior of CreateIndex is to count 1.
<b>Command names</b>: Each class "AnyNameTask" in the
package org.apache.lucene.benchmark.byTask.tasks
that extends PerfTask is supported as a command "AnyName" that can be
used in the benchmark "algorithm" description.
This allows adding new commands simply by adding such classes.
<a name="tasks"></a>
<h2>Supported tasks/commands</h2>
Existing tasks can be divided into a few groups:
regular index/search work tasks, report tasks, and control tasks.
<b>Report tasks</b>: There are a few Report commands for generating reports.
Only task runs that were completed are reported.
(The 'Report tasks' themselves are not measured and not reported.)
<ul>
  <li><font color="#FF0066">RepAll</font> - all (completed) task runs.</li>
  <li><font color="#FF0066">RepSumByName</font> - all statistics,
      aggregated by name. So, if AddDoc was executed 2000 times,
      only 1 report line would be created for it, aggregating all those
      2000 statistic records.</li>
  <li><font color="#FF0066">RepSelectByPref prefixWord</font> - all
      records for tasks whose name starts with
      <font color="#FF0066">prefixWord</font>.</li>
  <li><font color="#FF0066">RepSumByPref prefixWord</font> - all
      records for tasks whose name starts with
      <font color="#FF0066">prefixWord</font>,
      aggregated by their full task name.</li>
  <li><font color="#FF0066">RepSumByNameRound</font> - all statistics,
      aggregated by name and by <font color="#FF0066">Round</font>.
      So, if AddDoc was executed 2000 times in each of 3
      <font color="#FF0066">rounds</font>, 3 report lines would be created,
      aggregating all those 2000 statistic records in each round.
      See more about rounds in the <font color="#FF0066">NewRound</font>
      command description below.</li>
  <li><font color="#FF0066">RepSumByPrefRound prefixWord</font> -
      similar to <font color="#FF0066">RepSumByNameRound</font>,
      except that only tasks whose name starts with
      <font color="#FF0066">prefixWord</font> are included.</li>
</ul>
If needed, additional reports can be added by extending the abstract class
ReportTask, and by manipulating the statistics data in Points and TaskStats.
<b>Control tasks</b>: A few of the tasks control the benchmark algorithm
overall:
<ul>
  <li><font color="#FF0066">ClearStats</font> - clears the entire statistics.
      Further reports would only include task runs that start after this
      call.</li>
  <li><font color="#FF0066">NewRound</font> - virtually start a new round of
      the performance test.
      Although this command can be placed anywhere, it mostly makes sense at
      the end of an outermost sequence.
      <br>This increments a global "round counter". All task runs that start
      from this point on record the new, updated round counter as their round
      number, and this appears in reports.
      In particular, see <font color="#FF0066">RepSumByNameRound</font> above.
      <br>An additional effect of NewRound is that numeric and boolean
      properties defined (at the head of the .alg file) as a sequence of
      values, e.g. <font color="#FF0066">merge.factor=mrg:10:100:10:100</font>,
      would increment (cyclically) to the next value.
      Note: this is also reflected in the reports, in this case under a
      column named "mrg".</li>
  <li><font color="#FF0066">ResetInputs</font> - DocMaker and the various
      QueryMakers would reset their counters to the start.
      The way these Maker interfaces work, each call to makeDocument()
      or makeQuery() creates the next document or query
      that it "knows" to create.
      If that pool is "exhausted", the "maker" starts over again.
      The ResetInputs command therefore allows making the rounds comparable.
      It is hence useful to invoke ResetInputs together with NewRound.</li>
  <li><font color="#FF0066">ResetSystemErase</font> - reset all index
      and input data and call gc.
      Does NOT reset statistics. This includes ResetInputs.
      All writers/readers are nullified, deleted, closed.
      Index and directory are erased.
      You would have to call CreateIndex once this was called...</li>
  <li><font color="#FF0066">ResetSystemSoft</font> - reset all
      index and input data and call gc.
      Does NOT reset statistics. This includes ResetInputs.
      All writers/readers are nullified, closed.
      Index is NOT erased.
      Directory is NOT erased.
      This is useful for testing performance on an existing index,
      for instance if the construction of a large index
      took a very long time and now you would like to test
      its search or update performance.</li>
</ul>
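<br>For instance, a typical round structure combining these (an illustrative
snippet, assuming a suitably configured doc maker) -
<font color="#FF0066">{ "Rounds" { AddDoc > : 1000 NewRound ResetInputs } : 4</font>
- would run 4 comparable rounds, each adding 1000 docs, advancing the round
counter and resetting the inputs at the end of each round.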
Other existing tasks are quite straightforward and are only
briefly described here:
<ul>
  <li><font color="#FF0066">CreateIndex</font> and
      <font color="#FF0066">OpenIndex</font> both leave the
      index open for later update operations;
      <font color="#FF0066">CloseIndex</font> closes it.</li>
  <li><font color="#FF0066">OpenReader</font>, similarly, would
      leave an index reader open for later search operations.
      But this has further semantics:
      if a Read operation is performed while an open reader exists,
      that reader is used.
      Otherwise, the read operation opens its own reader
      and closes it when the read operation is done.
      This allows testing various scenarios - sharing a reader,
      searching with a "cold" reader, with a "warmed" reader, etc.
      The read operations affected by this are:
      <font color="#FF0066">Warm</font>,
      <font color="#FF0066">Search</font>,
      <font color="#FF0066">SearchTrav</font> (search and traverse),
      and <font color="#FF0066">SearchTravRet</font> (search
      and traverse and retrieve).
      Notice that each of the 3 search task types maintains
      its own QueryMaker instance.</li>
  <li><font color="#FF0066">CommitIndex</font> and
      <font color="#FF0066">Optimize</font> can be used to commit
      changes to the index and/or optimize the index created thus far.</li>
  <li><font color="#FF0066">WriteLineDoc</font> prepares a 'line'
      file where each line holds a document with <i>title</i>,
      <i>date</i> and <i>body</i> elements, separated by [TAB].
      A line file is useful if one wants to measure pure indexing
      performance, without the overhead of parsing the data.<br>
      You can use LineDocSource as a ContentSource over a 'line'
      file.</li>
  <li><font color="#FF0066">ConsumeContentSource</font> consumes
      a ContentSource. Useful e.g. for testing a ContentSource's
      performance, without the overhead of preparing a Document
      out of it.</li>
</ul>
<a name="properties"></a>
<h2>Benchmark properties</h2>
Properties are read from the header of the .alg file, and
define several parameters of the performance test.
As mentioned above for the <font color="#FF0066">NewRound</font> task,
numeric and boolean properties that are defined as a sequence
of values, e.g. <font color="#FF0066">merge.factor=mrg:10:100:10:100</font>,
would increment (cyclically) to the next value
when NewRound is called, and would also
appear as a named column in the reports (the column
name would be "mrg" in this example).
Some of the currently defined properties are:
<ul>
  <li><font color="#FF0066">analyzer</font> - full
      class name of the analyzer to use.
      The same analyzer would be used in the entire test.</li>
  <li><font color="#FF0066">directory</font> - tells which directory
      implementation to use for the performance test; valid values
      include FSDirectory and RAMDirectory.</li>
</ul>
<b>Index work parameters</b>:
Multi int/boolean values are iterated over by calls to NewRound.
They are also added as columns in the reports; the first string in the
sequence is the column name.
(Make sure it is no shorter than any value in the sequence.)
<ul>
  <li><font color="#FF0066">max.buffered</font>
      <br>Example: max.buffered=buf:10:10:100:100 -
      this would define using maxBufferedDocs of 10 in iterations 0 and 1,
      and 100 in iterations 2 and 3.</li>
  <li><font color="#FF0066">merge.factor</font> - the
      merge factor to use while indexing.</li>
  <li><font color="#FF0066">compound</font> - whether the index
      uses the compound format or not. Valid values are "true" and "false".</li>
</ul>
Here is a list of currently defined properties:
<ol>
  <li><b>Root directory for data and indexes:</b>
    <ul><li>work.dir (default is System property "benchmark.work.dir" or "work".)
    </li></ul>
  </li>
  <li><b>Docs and queries creation:</b>
    <ul><li>doc.maker.forever
    </li><li>doc.tokenized
    </li><li>doc.term.vector
    </li><li>doc.term.vector.positions
    </li><li>doc.term.vector.offsets
    </li><li>doc.store.body.bytes
    </li><li>file.query.maker.file
    </li><li>file.query.maker.default.field
    </li><li>search.num.hits
    </li></ul>
  </li>
  <li><b>Logging</b>:
    <ul><li>log.step.[class name]Task, e.g. log.step.DeleteDoc (or log.step.Wonderful for the WonderfulTask example above).
    </li><li>task.max.depth.log
    </li></ul>
  </li>
  <li><b>Index writing</b>:
    <ul><li>merge.factor
    </li><li>max.buffered
    </li><li>ram.flush.mb
    </li></ul>
  </li>
  <li><b>Doc deletion</b>:
    <ul><li>doc.delete.step
    </li></ul>
  </li>
  <li><b>Task alternative packages</b>:
    <ul><li>alt.tasks.packages
      - comma separated list of additional packages where task classes will be looked for
      when not found in the default package (that of PerfTask). If the same task class
      appears in more than one package, the package indicated first in this list will be used.
    </li></ul>
  </li>
</ol>
For sample use of these properties see the *.alg files under conf.
<a name="example"></a>
<h2>Example input algorithm and the result benchmark report</h2>
The following example is in conf/sample.alg:
<pre>
<font color="#003333"># --------------------------------------------------------
# Sample: what is the effect of doc size on indexing time?
#
# There are two parts in this test:
# - PopulateShort adds 2N documents of length  L
# - PopulateLong  adds  N documents of length 2L
# Which one would be faster?
# The comparison is done twice.
#
# --------------------------------------------------------</font>

<font color="#990066"># -------------------------------------------------------------------------------------
# multi val params are iterated by NewRound's, added to reports, start with column name.
merge.factor=mrg:10:20
max.buffered=buf:100:1000

analyzer=org.apache.lucene.analysis.standard.StandardAnalyzer
directory=FSDirectory

doc.term.vector=false

doc.maker=org.apache.lucene.benchmark.byTask.feeds.SimpleDocMaker
query.maker=org.apache.lucene.benchmark.byTask.feeds.SimpleQueryMaker

# task at this depth or less would print when they start
# -------------------------------------------------------------------------------------</font>

<font color="#3300FF">{

    { AddDoc(4000) > : 20000

    { AddDoc(8000) > : 10000

    RepSelectByPref Populate</font>
</pre>
The command line for running this sample:
<br><code>ant run-task -Dtask.alg=conf/sample.alg</code>
The output report from running this test contains the following:
<pre>
Operation     round mrg  buf  runCnt  recsPerRun      rec/s  elapsedSec  avgUsedMem  avgTotalMem
PopulateShort     0  10  100       1       20003      119.6      167.26  12,959,120   14,241,792
PopulateLong -  - 0  10  100 -  -  1 -  -  10003 -  -  74.3 -  - 134.57  17,085,208   20,635,648
PopulateShort     1  20 1000       1       20003      143.5      139.39  63,982,040   94,756,864
PopulateLong -  - 1  20 1000 -  -  1 -  -  10003 -  -  77.0 -  - 129.92  87,309,608  100,831,232
</pre>
<a name="recsCounting"></a>
<h2>Results record counting clarified</h2>
Two columns in the results table indicate records counts: records-per-run and
records-per-second. What do they mean?
<p>
Almost every task gets 1 in this count just for being executed.
Task sequences aggregate the counts of their child tasks,
plus their own count of 1.
So, a task sequence containing 5 other task sequences, each running a single
other task 10 times, would have a count of 1 + 5 * (1 + 10) = 56.
The traverse and retrieve tasks "count" more: a traverse task
adds 1 for each traversed result (hit), and a retrieve task
additionally adds 1 for each retrieved doc. So, a regular Search would
count 1, a SearchTrav that traverses 10 hits would count 11, and a
SearchTravRet task that retrieves (and traverses) 10 would count 21.
Confusing? This might help: always examine the <code>elapsedSec</code> column,
and always compare "apples to apples", i.e., it is interesting to check how the
<code>rec/s</code> changed for the same task (or sequence) between two
different runs, but it is not very useful to know how the <code>rec/s</code>
differs between the <code>Search</code> and <code>SearchTrav</code> tasks. For
the latter, <code>elapsedSec</code> would bring more insight.