Merge pull request #367 from jrudolph/docs-table-examples

Add FSE decoding tables for pred. distributions to spec for reference + clarify bitstream reading
dev
Yann Collet 2016-09-14 20:15:13 +02:00 committed by GitHub
commit 6f358f5925
1 changed files with 198 additions and 8 deletions

View File

@ -1049,15 +1049,23 @@ by reading the required `Number_of_Bits`, and adding the specified `Baseline`.
#### Bitstream
All sequences are stored in a single bitstream, read _backward_.
It is therefore necessary to know the bitstream size,
which is deducted from compressed block size.
FSE bitstreams are read in reverse direction than written. In zstd,
the compressor writes bits forward into a block and the decompressor
must read the bitstream _backwards_.
The last useful bit of the stream is followed by an end-bit-flag.
Highest bit of last byte is this flag.
It does not belong to the useful part of the bitstream.
Therefore, last byte has 0-7 useful bits.
Note that it also means that last byte cannot be `0`.
To find the start of the bitstream it is therefore necessary to
know the offset of the last byte of the block which can be found
by counting `Block_Size` bytes after the block header.
After writing the last bit containing information, the compressor
writes a single `1`-bit and then fills the byte with 0-7 `0` bits of
padding. The last byte of the compressed bitstream cannot be `0` for
that reason.
When decompressing, the last byte containing the padding is the first
byte to read. The decompressor needs to skip 0-7 initial `0`-bits and
the first `1`-bit it occurs. Afterwards, the useful part of the bitstream
begins.
##### Starting states
@ -1164,6 +1172,188 @@ __`Content`__ : The rest of the dictionary is its content.
[compressed blocks]: #the-format-of-compressed_block
Appendix A - Decoding tables for predefined codes
-------------------------------------------------
This appendix contains FSE decoding tables for the predefined literal length, match length, and offset
codes. The tables have been constructed using the algorithm as given above in the
"from normalized distribution to decoding tables" chapter. The tables here can be used as examples
to crosscheck that an implementation implements the decoding table generation algorithm correctly.
#### Literal Length Code:
| State | Symbol | Number_Of_Bits | Base |
| ----- | ------ | -------------- | ---- |
| 0 | 0 | 4 | 0 |
| 1 | 0 | 4 | 16 |
| 2 | 1 | 5 | 32 |
| 3 | 3 | 5 | 0 |
| 4 | 4 | 5 | 0 |
| 5 | 6 | 5 | 0 |
| 6 | 7 | 5 | 0 |
| 7 | 9 | 5 | 0 |
| 8 | 10 | 5 | 0 |
| 9 | 12 | 5 | 0 |
| 10 | 14 | 6 | 0 |
| 11 | 16 | 5 | 0 |
| 12 | 18 | 5 | 0 |
| 13 | 19 | 5 | 0 |
| 14 | 21 | 5 | 0 |
| 15 | 22 | 5 | 0 |
| 16 | 24 | 5 | 0 |
| 17 | 25 | 5 | 32 |
| 18 | 26 | 5 | 0 |
| 19 | 27 | 6 | 0 |
| 20 | 29 | 6 | 0 |
| 21 | 31 | 6 | 0 |
| 22 | 0 | 4 | 32 |
| 23 | 1 | 4 | 0 |
| 24 | 2 | 5 | 0 |
| 25 | 4 | 5 | 32 |
| 26 | 5 | 5 | 0 |
| 27 | 7 | 5 | 32 |
| 28 | 8 | 5 | 0 |
| 29 | 10 | 5 | 32 |
| 30 | 11 | 5 | 0 |
| 31 | 13 | 6 | 0 |
| 32 | 16 | 5 | 32 |
| 33 | 17 | 5 | 0 |
| 34 | 19 | 5 | 32 |
| 35 | 20 | 5 | 0 |
| 36 | 22 | 5 | 32 |
| 37 | 23 | 5 | 0 |
| 38 | 25 | 4 | 0 |
| 39 | 25 | 4 | 16 |
| 40 | 26 | 5 | 32 |
| 41 | 28 | 6 | 0 |
| 42 | 30 | 6 | 0 |
| 43 | 0 | 4 | 48 |
| 44 | 1 | 4 | 16 |
| 45 | 2 | 5 | 32 |
| 46 | 3 | 5 | 32 |
| 47 | 5 | 5 | 32 |
| 48 | 6 | 5 | 32 |
| 49 | 8 | 5 | 32 |
| 50 | 9 | 5 | 32 |
| 51 | 11 | 5 | 32 |
| 52 | 12 | 5 | 32 |
| 53 | 15 | 6 | 0 |
| 54 | 17 | 5 | 32 |
| 55 | 18 | 5 | 32 |
| 56 | 20 | 5 | 32 |
| 57 | 21 | 5 | 32 |
| 58 | 23 | 5 | 32 |
| 59 | 24 | 5 | 32 |
| 60 | 35 | 6 | 0 |
| 61 | 34 | 6 | 0 |
| 62 | 33 | 6 | 0 |
| 63 | 32 | 6 | 0 |
#### Match Length Code:
| State | Symbol | Number_Of_Bits | Base |
| ----- | ------ | -------------- | ---- |
| 0 | 0 | 6 | 0 |
| 1 | 1 | 4 | 0 |
| 2 | 2 | 5 | 32 |
| 3 | 3 | 5 | 0 |
| 4 | 5 | 5 | 0 |
| 5 | 6 | 5 | 0 |
| 6 | 8 | 5 | 0 |
| 7 | 10 | 6 | 0 |
| 8 | 13 | 6 | 0 |
| 9 | 16 | 6 | 0 |
| 10 | 19 | 6 | 0 |
| 11 | 22 | 6 | 0 |
| 12 | 25 | 6 | 0 |
| 13 | 28 | 6 | 0 |
| 14 | 31 | 6 | 0 |
| 15 | 33 | 6 | 0 |
| 16 | 35 | 6 | 0 |
| 17 | 37 | 6 | 0 |
| 18 | 39 | 6 | 0 |
| 19 | 41 | 6 | 0 |
| 20 | 43 | 6 | 0 |
| 21 | 45 | 6 | 0 |
| 22 | 1 | 4 | 16 |
| 23 | 2 | 4 | 0 |
| 24 | 3 | 5 | 32 |
| 25 | 4 | 5 | 0 |
| 26 | 6 | 5 | 32 |
| 27 | 7 | 5 | 0 |
| 28 | 9 | 6 | 0 |
| 29 | 12 | 6 | 0 |
| 30 | 15 | 6 | 0 |
| 31 | 18 | 6 | 0 |
| 32 | 21 | 6 | 0 |
| 33 | 24 | 6 | 0 |
| 34 | 27 | 6 | 0 |
| 35 | 30 | 6 | 0 |
| 36 | 32 | 6 | 0 |
| 37 | 34 | 6 | 0 |
| 38 | 36 | 6 | 0 |
| 39 | 38 | 6 | 0 |
| 40 | 40 | 6 | 0 |
| 41 | 42 | 6 | 0 |
| 42 | 44 | 6 | 0 |
| 43 | 1 | 4 | 32 |
| 44 | 1 | 4 | 48 |
| 45 | 2 | 4 | 16 |
| 46 | 4 | 5 | 32 |
| 47 | 5 | 5 | 32 |
| 48 | 7 | 5 | 32 |
| 49 | 8 | 5 | 32 |
| 50 | 11 | 6 | 0 |
| 51 | 14 | 6 | 0 |
| 52 | 17 | 6 | 0 |
| 53 | 20 | 6 | 0 |
| 54 | 23 | 6 | 0 |
| 55 | 26 | 6 | 0 |
| 56 | 29 | 6 | 0 |
| 57 | 52 | 6 | 0 |
| 58 | 51 | 6 | 0 |
| 59 | 50 | 6 | 0 |
| 60 | 49 | 6 | 0 |
| 61 | 48 | 6 | 0 |
| 62 | 47 | 6 | 0 |
| 63 | 46 | 6 | 0 |
#### Offset Code:
| State | Symbol | Number_Of_Bits | Base |
| ----- | ------ | -------------- | ---- |
| 0 | 0 | 5 | 0 |
| 1 | 6 | 4 | 0 |
| 2 | 9 | 5 | 0 |
| 3 | 15 | 5 | 0 |
| 4 | 21 | 5 | 0 |
| 5 | 3 | 5 | 0 |
| 6 | 7 | 4 | 0 |
| 7 | 12 | 5 | 0 |
| 8 | 18 | 5 | 0 |
| 9 | 23 | 5 | 0 |
| 10 | 5 | 5 | 0 |
| 11 | 8 | 4 | 0 |
| 12 | 14 | 5 | 0 |
| 13 | 20 | 5 | 0 |
| 14 | 2 | 5 | 0 |
| 15 | 7 | 4 | 16 |
| 16 | 11 | 5 | 0 |
| 17 | 17 | 5 | 0 |
| 18 | 22 | 5 | 0 |
| 19 | 4 | 5 | 0 |
| 20 | 8 | 4 | 16 |
| 21 | 13 | 5 | 0 |
| 22 | 19 | 5 | 0 |
| 23 | 1 | 5 | 0 |
| 24 | 6 | 4 | 16 |
| 25 | 10 | 5 | 0 |
| 26 | 16 | 5 | 0 |
| 27 | 28 | 5 | 0 |
| 28 | 27 | 5 | 0 |
| 29 | 26 | 5 | 0 |
| 30 | 25 | 5 | 0 |
| 31 | 24 | 5 | 0 |
Version changes
---------------