diff --git a/lucene-java-3.4.0/lucene/contrib/benchmark/README.enwiki b/lucene-java-3.4.0/lucene/contrib/benchmark/README.enwiki
deleted file mode 100644
index f9d4930..0000000
--- a/lucene-java-3.4.0/lucene/contrib/benchmark/README.enwiki
+++ /dev/null
@@ -1,33 +0,0 @@
-Support exists for downloading, parsing, and loading the English
-version of Wikipedia (enwiki).
-
-The build file can automatically try to download the most current
-enwiki dataset (pages-articles.xml.bz2) from the "latest" directory,
-http://download.wikimedia.org/enwiki/latest/. However, this file
-doesn't always exist, depending on where Wikipedia is in the dump
-process and whether prior dumps have succeeded. If this file doesn't
-exist, you can sometimes find an older or in-progress version by
-looking in the dated directories under
-http://download.wikimedia.org/enwiki/. For example, as of this
-writing, there is a pages-articles file in
-http://download.wikimedia.org/enwiki/20070402/. You can download that
-file manually and put it in the temp directory. Note that the file
-you download will probably have the date in its name, e.g.,
-http://download.wikimedia.org/enwiki/20070402/enwiki-20070402-pages-articles.xml.bz2.
-When you put it in temp, rename it to
-enwiki-latest-pages-articles.xml.bz2.
-
-After that, running ant enwiki should process the dataset and run a
-load test. The ant targets get-enwiki, expand-enwiki, and
-extract-enwiki can also be used to download, decompress, and extract
-the dataset (to individual files under work/enwiki), respectively.
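-
-For example, a complete manual session might look like this (a
-minimal sketch, assuming the 20070402 dump, a POSIX shell with wget
-available, and that commands are run from the contrib/benchmark
-directory; adjust the date to match the dump you find):
-
-  mkdir -p temp
-  wget http://download.wikimedia.org/enwiki/20070402/enwiki-20070402-pages-articles.xml.bz2
-  mv enwiki-20070402-pages-articles.xml.bz2 temp/enwiki-latest-pages-articles.xml.bz2
-  ant enwiki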