X-Git-Url: https://git.mdrn.pl/pylucene.git/blobdiff_plain/a2e61f0c04805cfcb8706176758d1283c7e3a55c..aaeed5504b982cf3545252ab528713250aa33eed:/lucene-java-3.4.0/lucene/contrib/benchmark/CHANGES.txt?ds=sidebyside
diff --git a/lucene-java-3.4.0/lucene/contrib/benchmark/CHANGES.txt b/lucene-java-3.4.0/lucene/contrib/benchmark/CHANGES.txt
deleted file mode 100644
index af480da..0000000
--- a/lucene-java-3.4.0/lucene/contrib/benchmark/CHANGES.txt
+++ /dev/null
@@ -1,418 +0,0 @@
-Lucene Benchmark Contrib Change Log
-
-The Benchmark contrib package contains code for benchmarking Lucene in a variety of ways.
-
-For more information on past and future Lucene versions, please see:
-http://s.apache.org/luceneversions
-
-05/25/2011
- LUCENE-3137: ExtractReuters supports out-dir param suffixed by a slash. (Doron Cohen)
-
-03/24/2011
- LUCENE-2977: WriteLineDocTask now automatically detects how to write -
- GZip or BZip2 or Plain-text - according to the output file extension.
- Property bzip.compression of WriteLineDocTask was canceled. (Doron Cohen)
-
-03/23/2011
- LUCENE-2980: Benchmark's ContentSource no more requires lower case file suffixes
- for detecting file type (gzip/bzip2/text). As part of this fix worked around an
- issue with gzip input streams which were remaining open (See COMPRESS-127).
- (Doron Cohen)
-
-03/22/2011
- LUCENE-2978: Upgrade benchmark's commons-compress from 1.0 to 1.1 as
- the move of gzip decompression in LUCENE-1540 from Java's GZipInputStream
- to commons-compress 1.0 made it 15 times slower. In 1.1 no such slow-down
- is observed. (Doron Cohen)
-
-03/21/2011
- LUCENE-2958: WriteLineDocTask improvements - allow to emit line docs also for empty
- docs, and be flexible about which fields are added to the line file. For this, a header
- line was added to the line file. That header is examined by LineDocSource. Old line
- files which have no header line are handled as before, imposing the default header.
- (Doron Cohen, Shai Erera, Mike McCandless)
-
-03/21/2011
- LUCENE-2964: Allow benchmark tasks from alternative packages,
- specified through a new property "alt.tasks.packages".
- (Doron Cohen, Shai Erera)
-
-03/20/2011
- LUCENE-2963: Easier way to run benchmark, by calling Benmchmark.exec(alg-file).
- (Doron Cohen)
-
-03/10/2011
- LUCENE-2961: Removed lib/xml-apis.jar, since JVM 1.5+ already contains the
- JAXP 1.3 interface classes it provides.
-
-02/03/2011
- LUCENE-1540: Improvements to contrib.benchmark for TREC collections.
- ContentSource can now process plain text files, gzip files, and bzip2 files.
- TREC doc parsing now handles the TREC gov2 collection and TREC disks 4&5-CR
- collection (both used by many TREC tasks). (Shai Erera, Doron Cohen)
-
-01/31/2011
- LUCENE-1591: Rollback to xerces-2.9.1-patched-XERCESJ-1257.jar to workaround
- XERCESJ-1257, which we hit on current Wikipedia XML export
- (ENWIKI-20110115-pages-articles.xml) with xerces-2.10.0.jar. (Mike McCandless)
-
-01/26/2011
- LUCENE-929: ExtractReuters first extracts to a tmp dir and then renames. That
- way, if a previous extract attempt failed, "ant extract-reuters" will still
- extract the files. (Shai Erera, Doron Cohen, Grant Ingersoll)
-
-01/24/2011
- LUCENE-2885: Add WaitForMerges task (calls IndexWriter.waitForMerges()).
- (Mike McCandless)
-
-10/10/2010
- The locally built patched version of the Xerces-J jar introduced
- as part of LUCENE-1591 is no longer required, because Xerces
- 2.10.0, which contains a fix for XERCESJ-1257 (see
- http://svn.apache.org/viewvc?view=revision&revision=554069),
- was released last year. Upgraded
- xerces-2.9.1-patched-XERCESJ-1257.jar and xml-apis-2.9.0.jar
- to xercesImpl-2.10.0.jar and xml-apis-2.10.0.jar. (Steven Rowe)
-
-4/27/2010
- LUCENE-2416: WriteLineDocTask now supports multi-threading. Also,
- StringBufferReader was renamed to StringBuilderReader and works on
- StringBuilder now. In addition, LongToEnglishContentSource starts from 0
- (instead of Long.MIN_VAL+10) and wraps around to MIN_VAL (if you ever hit
- Long.MAX_VAL). (Shai Erera)
-
-4/07/2010
- LUCENE-2377: Enable the use of NoMergePolicy and NoMergeScheduler by
- CreateIndexTask. (Shai Erera)
-
-3/28/2010
- LUCENE-2353: Fixed bug in Config where Windows absolute path property values
- were incorrectly handled (Shai Erera)
-
-3/24/2010
- LUCENE-2343: Added support for benchmarking collectors. (Grant Ingersoll, Shai Erera)
-
-2/21/2010
- LUCENE-2254: Add support to the quality package for running
- experiments with any combination of Title, Description, and Narrative.
- (Robert Muir)
-
-1/28/2010
- LUCENE-2223: Add a benchmark for ShingleFilter. You can wrap any
- analyzer with ShingleAnalyzerWrapper and specify shingle parameters
- with the NewShingleAnalyzer task. (Steven Rowe via Robert Muir)
-
-1/14/2010
- LUCENE-2210: TrecTopicsReader now properly reads descriptions and
- narratives from trec topics files. (Robert Muir)
-
-1/11/2010
- LUCENE-2181: Add a benchmark for collation. This adds NewLocaleTask,
- which sets a Locale in the run data for collation to use, and can be
- used in the future for benchmarking localized range queries and sorts.
- Also add NewCollationAnalyzerTask, which works with both JDK and ICU
- Collator implementations. Fix ReadTokensTask to not tokenize fields
- unless they should be tokenized according to DocMaker config. The
- easiest way to run the benchmark is to run 'ant collation'
- (Steven Rowe via Robert Muir)
-
-12/22/2009
- LUCENE-2178: Allow multiple locations to add to the class path with
- -Dbenchmark.ext.classpath=... when running "ant run-task" (Steven
- Rowe via Mike McCandless)
-
-12/17/2009
- LUCENE-2168: Allow negative relative thread priority for BG tasks
- (Mike McCandless)
-
-12/07/2009
- LUCENE-2106: ReadTask does not close its Reader when
- OpenReader/CloseReader are not used. (Mark Miller)
-
-11/17/2009
- LUCENE-2079: Allow specifying delta thread priority after the "&";
- added log.time.step.msec to print per-time-period counts; fixed
- NearRealTimeTask to print reopen times (in msec) of each reopen, at
- the end. (Mike McCandless)
-
-11/13/2009
- LUCENE-2050: Added ability to run tasks within a serial sequence in
- the background, by appending "&". The tasks are stopped & joined at
- the end of the sequence. Also added Wait and RollbackIndex tasks.
- Genericized NearRealTimeReaderTask to only reopen the reader
- (previously it spawned its own thread, and also did searching).
- Also changed the API of PerfRunData.getIndexReader: it now returns a
- reference, and it's your job to decRef the reader when you're done
- using it. (Mike McCandless)
-
-11/12/2009
- LUCENE-2059: allow TrecContentSource not to change the docname.
- Previously, it would always append the iteration # to the docname.
- With the new option content.source.excludeIteration, you can disable this.
- The resulting index can then be used with the quality package to measure
- relevance. (Robert Muir)
-
-11/12/2009
- LUCENE-2058: specify trec_eval submission output from the command line.
- Previously, 4 arguments were required, but the third was unused. The
- third argument is now the desired location of submission.txt (Robert Muir)
-
-11/08/2009
- LUCENE-2044: Added delete.percent.rand.seed to seed the Random instance
- used by DeleteByPercentTask. (Mike McCandless)
-
-11/07/2009
- LUCENE-2043: Fix CommitIndexTask to also commit pending IndexReader
- changes (Mike McCandless)
-
-11/07/2009
- LUCENE-2042: Added print.hits.field, to print each hit from the
- Search* tasks. (Mike McCandless)
-
-11/04/2009
- LUCENE-2029: Added doc.body.stored and doc.body.tokenized; each
- falls back to the non-body variant as its default. (Mike McCandless)
-
-10/28/2009
- LUCENE-1994: Fix thread safety of EnwikiContentSource and DocMaker
- when doc.reuse.fields is false. Also made docs.reuse.fields=true
- thread safe. (Mark Miller, Shai Erera, Mike McCandless)
-
-8/4/2009
- LUCENE-1770: Add EnwikiQueryMaker (Mark Miller)
-
-8/04/2009
- LUCENE-1773: Add FastVectorHighlighter tasks. This change is a
- non-backwards compatible change in how subclasses of ReadTask define
- a highlighter. The methods doHighlight, isMergeContiguousFragments,
- maxNumFragments and getHighlighter are no longer used and have been
- mark deprecated and package protected private so there's a compile
- time error. Instead, the new getBenchmarkHighlighter method should
- return an appropriate highlighter for the task. The configuration of
- the highlighter tasks (maxFrags, mergeContiguous, etc.) is now
- accepted as params to the task. (Koji Sekiguchi via Mike McCandless)
-
-8/03/2009
- LUCENE-1778: Add support for log.step setting per task type. Perviously, if
- you included a log.step line in the .alg file, it had been applied to all
- tasks. Now, you can include a log.step.AddDoc, or log.step.DeleteDoc (for
- example) to control logging for just these tasks. If you want to ommit logging
- for any other task, include log.step=-1. The syntax is "log.step." together
- with the Task's 'short' name (i.e., without the 'Task' part).
- (Shai Erera via Mark Miller)
-
-7/24/2009
- LUCENE-1595: Deprecate LineDocMaker and EnwikiDocMaker in favor of
- using DocMaker directly, with content.source = LineDocSource or
- EnwikiContentSource. NOTE: with this change, the "id" field from
- the Wikipedia XML export is now indexed as the "docname" field
- (previously it was indexed as "docid"). Additionaly, the
- SearchWithSort task now accepts all types that SortField can accept
- and no longer falls back to SortField.AUTO, which has been
- deprecated. (Mike McCandless)
-
-7/20/2009
- LUCENE-1755: Fix WriteLineDocTask to output a document if it contains either
- a title or body (or both). (Shai Erera via Mark Miller)
-
-7/14/2009
- LUCENE-1725: Fix the example Sort algorithm - auto is now deprecated and no longer works
- with Benchmark. Benchmark will now throw an exception if you specify sort fields without
- a type. The example sort algorithm is now typed. (Mark Miller)
-
-7/6/2009
- LUCENE-1730: Fix TrecContentSource to use ISO-8859-1 when reading the TREC files,
- unless a different encoding is specified. Additionally, ContentSource now supports
- a content.source.encoding parameter in the configuration file.
- (Shai Erera via Mark Miller)
-
-6/26/2009
- LUCENE-1716: Added the following support:
- doc.tokenized.norms: specifies whether to store norms
- doc.body.tokenized.norms: special attribute for the body field
- doc.index.props: specifies whether DocMaker should index the properties set on
- DocData
- writer.info.stream: specifies the info stream to set on IndexWriter (supported
- values are: SystemOut, SystemErr and a file name). (Shai Erera via Mike McCandless)
-
-6/23/09
- LUCENE-1714: WriteLineDocTask incorrectly normalized text, by replacing only
- occurrences of "\t" with a space. It now replaces "\r\n" in addition to that,
- so that LineDocMaker won't fail. (Shai Erera via Michael McCandless)
-
-6/17/09
- LUCENE-1595: This issue breaks previous external algorithms. DocMaker has been
- replaced with a concrete class which accepts a ContentSource for iterating over
- a content source's documents. Most of the old DocMakers were changed to a
- ContentSource implementation, and DocMaker is now a default document creation impl
- that provides an easy way for reusing fields. When [doc.maker] is not defined in
- an algorithm, the new DocMaker is the default. If you have .alg files which
- specify a DocMaker (like ReutersDocMaker), you should change the [doc.maker] line to:
- [content.source=org.apache.lucene.benchmark.byTask.feeds.ReutersContentSource]
-
- i.e.
- doc.maker=org.apache.lucene.benchmark.byTask.feeds.ReutersDocMaker
- becomes
- content.source=org.apache.lucene.benchmark.byTask.feeds.ReutersContentSource
-
- doc.maker=org.apache.lucene.benchmark.byTask.feeds.SimpleDocMaker
- becomes
- content.source=org.apache.lucene.benchmark.byTask.feeds.SingleDocSource
-
- Also, PerfTask now logs a message in tearDown() rather than each Task doing its
- own logging. A new setting called [log.step] is consulted to determine how often
- to log. [doc.add.log.step] is no longer a valid setting. For easy migration of
- current .alg files, rename [doc.add.log.step] to [log.step] and [doc.delete.log.step]
- to [delete.log.step].
-
- Additionally, [doc.maker.forever] should be changed to [content.source.forever].
- (Shai Erera via Mark Miller)
-
-6/12/09
- LUCENE-1539: Added DeleteByPercentTask which enables deleting a
- percentage of documents and searching on them. Changed CommitIndex
- to optionally accept a label (recorded as userData=