Versions tested: | 3.2 |
Optimal parameters: | -l6 -b32 -m32 |
Links: | http://sourceforge.n... wrt40-eng.rar |
Authors: | Przemyslaw Skibinski, Igor Pavlov, Dmitry Shkarin, Matt Mahoney |
Algorithms: | DICT+LZ/PPM/CM |
Notable performances: | - |
XWRT (XML-WRT) is a high-performance XML compressor. It transforms XML into a more compressible form and uses zlib (default), LZMA, PPMVC, or lpaq 6 as the back-end compressor. The idea is based on the well-known XML compressor XMill. In addition, XML-WRT creates a semi-dynamic dictionary and replaces frequently used words with shorter codes. According to the program documentation, additional techniques improve the compression ratio:
- word alphabet can consist of start tags (like '<tag>'), urls, e-mails
- special model for encoding numbers
- input XML file is split into containers
- there are special containers for dates, time, pages and fractional numbers
- end tags ('</tag>') are replaced with a single char
- end tags + EOL symbols can also be replaced with a single char
- spaceless words model
- very effective methods for preserving white space
- quotes modeling ('="' and '">' replaced with a single char)
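The dictionary substitution and end-tag replacement described above can be illustrated with a toy transform. This is only a sketch of the general idea, not XWRT's actual format: the code points (`\x01` for end tags, U+0080 upward for dictionary entries), the function names, and the `max_words` and `min_count` parameters are all assumptions made for the example, and it only handles ASCII, well-formed XML whose tags carry no attributes.

```python
import re
from collections import Counter

NAME = r"[A-Za-z][A-Za-z0-9_.:-]*"
ENCODE_RE = re.compile(rf"</{NAME}>|<{NAME}>|[A-Za-z]+|.", re.DOTALL)
DECODE_RE = re.compile(rf"<{NAME}>|.", re.DOTALL)
START_RE = re.compile(rf"<({NAME})>")

END_TAG = "\x01"    # assumed single-character stand-in for every end tag
FIRST_CODE = 0x80   # assumed: dictionary entries get code points from U+0080 up


def encode(xml, max_words=64, min_count=2):
    """Toy transform: returns (dictionary, transformed text)."""
    if not xml.isascii() or END_TAG in xml:
        raise ValueError("sketch only supports ASCII input without \\x01")
    tokens = ENCODE_RE.findall(xml)
    # Pass 1: count candidate entries (words of 2+ letters and start tags).
    counts = Counter(t for t in tokens if len(t) > 1 and not t.startswith("</"))
    dictionary = [t for t, n in counts.most_common(max_words) if n >= min_count]
    codes = {t: chr(FIRST_CODE + i) for i, t in enumerate(dictionary)}
    # Pass 2: substitute short codes and collapse each end tag to one character.
    out = []
    for t in tokens:
        out.append(END_TAG if t.startswith("</") else codes.get(t, t))
    return dictionary, "".join(out)


def decode(dictionary, data):
    """Inverse of encode(); end tags are rebuilt from a stack of open tags."""
    out, stack = [], []
    for t in DECODE_RE.findall(data):
        if t == END_TAG:
            out.append(f"</{stack.pop()}>")
            continue
        if len(t) == 1 and ord(t) >= FIRST_CODE:
            t = dictionary[ord(t) - FIRST_CODE]
        m = START_RE.fullmatch(t)
        if m:
            stack.append(m.group(1))
        out.append(t)
    return "".join(out)


xml = "<log><item>error error</item><item>error warning</item></log>"
dictionary, packed = encode(xml)
restored = decode(dictionary, packed)
```

The dictionary here is built from the input itself in a first pass and has to travel with the transformed text, which is the sense in which it is "semi-dynamic" in this sketch. The transformed text would then be handed to a back-end compressor; that step is omitted.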
On March 31, 2009, the author submitted the following configurations:
a) semi-dynamic dictionary:
-l2 -b32 -m32
-l6 -b32 -m32
-l9 -b32 -m32
-l12 -b32 -m32
b) static dictionary (requires wrt-eng.dic in the same directory as xwrt.exe; download it from http://www.ii.uni.wroc.pl/~inikep/research/wrt40-eng.rar):
-l2 +d -f65535 -m32
-l6 +d -f65535 -m32
-l9 +d -f65535 -m32
-l12 +d -f65535 -m32
Version 3.2 has a verify error on Image1.
Options | Ver | Rating | CPR | DPR | S.E. | R.E. | Ratio | C. kB/s | D. kB/s
---|---|---|---|---|---|---|---|---|---
-l2 +d -f65535 -m32 | 3.2 | 7 | 7 | 6 | 78 | 0 | 2.706 | 7251 | 70084
-l2 -b32 -m32 | 3.2 | 7 | 7 | 6 | 76 | 0 | 2.712 | 6971 | 70397
-l6 +d -f65535 -m32 | 3.2 | 85 | 83 | 125 | 52 | 10 | 3.846 | 1399 | 24961
-l6 -b32 -m32 | 3.2 | 85 | 83 | 126 | 52 | 10 | 3.848 | 1398 | 25028

Options | Ver | Rating | CPR | DPR | S.E. | R.E. | Ratio | C. kB/s | D. kB/s
---|---|---|---|---|---|---|---|---|---
-l12 +d -f65535 -m32 | 3.2 | 467 | 431 | 505 | 125 | 190 | 3.588 | 664 | 693
-l12 -b32 -m32 | 3.2 | 400 | 369 | 432 | 112 | 153 | 3.564 | 604 | 629
-l9 +d -f65535 -m32 | 3.2 | 689 | 633 | 748 | 339 | 118 | 3.263 | 2319 | 2438
-l9 -b32 -m32 | 3.2 | verify error (img1) | | | | | | |