NOTES ON THE LZRW2 ALGORITHM ============================ Author : Ross Williams. Date : 29-Jun-1991. Abstract -------- This file announces the release of my LZRW2 algorithm which was invented on 25-Nov-1990. The LZRW2 is a descendant of the LZRW1-A algorithm. LZRW2 runs a little more slowly and uses a little more memory than LZRW1-A but yields better compression and is deterministic. LZRW2 is not patented and is of the LZ77 class and so is unlikely to be subject to a patent challenge. The exact figures for the Calgary corpus on C implementations on my Macintosh are (percentage remaining, compression speed, decompression speed, memory required during compression and decompression): PerRem ComK/S DecK/S ComMem DecMem LZRW1-A 55.1 % 17 K/s 69 K/s 16 K 0 K LZRW2 51.5 % 18 K/s 54 K/s 24 K 16 K LZRW2 has been written and released mainly to demonstrate the phrase table idea but may also be useful to those who are unhappy with LZRW1-A's compression performance. Availability ------------ The only implementation available is in C. It can be found in the following archive within a couple of days of 29-Jun-1991: FTP Archive Access: Machine : sirius.itd.adelaide.edu.au [IP=129.127.40.3] Directory : ~pub/compression Files : lzrw2.tex - This file. lzrw2.c - An implementation in C. Motivation for LZRW2 -------------------- LZ77 algorithms derive their high decompression speed from the simplicity of the semantics of the code they use. To process each item, the decompressor has to either copy a single literal byte or convert an offset into an address and perform a multi-byte copy. Because there is no dictionary to look up, the offset/length scheme is extremely simple and fast to decode. However, length/offsets schemes waste a lot of code space. For example: 1) Many offsets refer to the same string values. 2) The length range may be too long for some strings. More sophisticated algorithms (e.g. LZ78 class algorithms) are more miserly with their code space and construct elaborate dictionaries to map between individual strings and the code space values. Although it may seem that one has to choose between a simple mapping and a complex dictionary, I believe that there is a grey region of extremely simple hardly-dictionary structures that bridge the gap. LZRW2 uses such a structure which I call a phrase table. To increase speed LZRW1-A (and LZRW1) only updates its hash table at the end of each phrase (where here a phrase is defined to be either a single-byte literal item or a copy item). This technique works well because it means that the work that the algorithm puts into updating its table is inversely proportional to how well it is compressing. When it is compressing badly, it will be moving one literal byte at a time and updating its table frequently. When compressing well, it will be moving at a few (copy) bytes at a time, only updating its table once every few bytes. The technique loses just a few percent compression, but increases speed significantly. A consequence of the technique is that MOST of the offset space is rendered unusable because the compression algorithm will never attempt to match at a particular offset unless that offset (or a corresponding pointer) appears in the hash table, and if the hash able is updated only every few bytes then most offsets won't make it into the table. To exploit the redundant offset space, a phrase table mechanism can be employed. A phrase table is a table of pointers to the first byte of the most recent 4096 phrases processed by the algorithm. The phrase table is organized as a circular array with a pointer to the next position. Both the compressor and decompressor maintain the phrase table which is very cheap to update, requiring only that once per phrase a pointer to the start of the phrase just encoded/decoded be overwritten at the next position in the phrase table and the position pointer incremented circularly. As in LZRW1-A, the compressor maintains a hash table (the decompressor doesn't have to), but in LZRW2, the hash table entries are indexes to the phrase table. LZRW2 is identical to LZRW1-A except that instead of transmitting a length and an offset, LZRW2 transmits a length and a phrase table index. The effect of the phrase table is not to fill in the gaps in the history positions to which the compressor can point, but rather to accept the gaps as good for speed and use the phrase table to increase the reach of the compressor in pointing far back. LZRW1 and LZRW1-A can only reach back 4095 BYTES whereas LZRW2 can reach back 4096 PHRASES which is a lot more, particularly as the average length of phrases can get up in the 4 to 6 byte range. Whether the circular phrase table idea is new or not (and as far as I know, it is), the LZRW2 algorithm acts as a good example of its application. For more details, see the code itself. Benchmark --------- Here are the results of applying LZRW2.C compiled under THINK C 4.0 and running on a Mac-SE (8MHz 68000) to the standard Calgary corpus. +----------------------------------------------------------------+ | DATA COMPRESSION TEST | | ===================== | | Time of run : Sat 29-Jun-1991 01:24PM | | Timing accuracy : One part in 100 | | Context length : 262144 bytes (= 256.0000K) | | Test suite : Calgary Corpus Suite | | Files in suite : 14 | | Algorithm : LZRW2 | | Note: All averages are calculated from the un-rounded values. | +----------------------------------------------------------------+ | File Name Length CxB ComLen %Remn Bits Com K/s Dec K/s | | ---------- ------ --- ------ ----- ---- ------- ------- | | rpus:Bib.D 111261 1 58726 52.8 4.22 16.98 52.36 | | us:Book1.D 768771 3 491413 63.9 5.11 14.82 47.04 | | us:Book2.D 610856 3 331932 54.3 4.35 17.10 51.28 | | rpus:Geo.D 102400 1 84118 82.1 6.57 10.81 41.67 | | pus:News.D 377109 2 215652 57.2 4.57 15.20 50.68 | | pus:Obj1.D 21504 1 13032 60.6 4.85 13.13 50.15 | | pus:Obj2.D 246814 1 119078 48.2 3.86 17.81 55.01 | | s:Paper1.D 53161 1 28269 53.2 4.25 17.16 51.92 | | s:Paper2.D 82199 1 46789 56.9 4.55 16.58 49.96 | | rpus:Pic.D 513216 2 123311 24.0 1.92 33.17 71.63 | | us:Progc.D 39611 1 19959 50.4 4.03 17.65 53.36 | | us:Progl.D 71646 1 28571 39.9 3.19 22.63 59.13 | | us:Progp.D 49379 1 19544 39.6 3.17 22.52 59.45 | | us:Trans.D 93695 1 35364 37.7 3.02 22.87 60.89 | +----------------------------------------------------------------+ | Average 224401 1 115411 51.5 4.12 18.46 53.89 | +----------------------------------------------------------------+ ----