From c05b270edc293cfd337176f9efdf9d16904229b8 Mon Sep 17 00:00:00 2001
From: "W. Felix Handte"
Date: Wed, 17 Jul 2019 17:30:09 -0400
Subject: [PATCH] [doc] Remove Limitation that Compressed Block is Smaller than Uncompressed Content

This changes the size limit on compressed blocks to match that of the
other block types: they may not be larger than the
`Block_Maximum_Decompressed_Size`, which is the smaller of the
`Window_Size` and 128 KB. It removes the additional restriction that had
been placed on `Compressed_Block`s, that they be smaller than the
decompressed content they represent.

Several things motivate removing this restriction. On the one hand, the
restriction is not useful for decoders: the decoder must nonetheless be
prepared to accept compressed blocks that are the full
`Block_Maximum_Decompressed_Size`. On the other, the bound is
artificially limiting. If block representations were entirely
independent, a compressed representation of a block that is larger than
the contents of the block would be ipso facto useless, and it would be
strictly better to send it as a `Raw_Block`. However, blocks are not
entirely independent, and it can make sense to pay the cost of encoding
custom entropy tables in a block, even if that pushes the block size
over the size of the data it represents, because those tables can be
re-used by subsequent blocks.

Finally, as far as I can tell, this restriction in the spec is not
currently enforced in any Zstandard implementation, nor has it ever
been. This change should therefore be safe to make.
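
For illustration, the limit that now applies uniformly to every block
type can be sketched in C as follows. This is a minimal sketch under
stated assumptions, not code from any Zstandard implementation: the
names `BLOCK_SIZE_CAP`, `block_max_decompressed_size`,
`block_size_is_valid` and `example_checks` are hypothetical, and 128 KB
is assumed to mean 128 * 1024 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: "128 KB" in the spec text means 128 * 1024 bytes. */
#define BLOCK_SIZE_CAP ((uint64_t)128 * 1024)

/* Hypothetical helper: Block_Maximum_Decompressed_Size is the smaller
 * of Window_Size and 128 KB. */
static uint64_t block_max_decompressed_size(uint64_t window_size)
{
    return window_size < BLOCK_SIZE_CAP ? window_size : BLOCK_SIZE_CAP;
}

/* Hypothetical check: with this change a Compressed_Block is bounded
 * only by Block_Maximum_Decompressed_Size, like the other block types.
 * It is no longer compared against the size of the content it decodes to. */
static int block_size_is_valid(uint64_t block_size, uint64_t window_size)
{
    return block_size <= block_max_decompressed_size(window_size);
}

/* Small usage example: a 64 KB window caps blocks at 64 KB, while an
 * 8 MB window caps them at 128 KB. */
static void example_checks(void)
{
    assert(block_size_is_valid(64 * 1024, 64 * 1024));
    assert(!block_size_is_valid(64 * 1024 + 1, 64 * 1024));
    assert(block_size_is_valid(128 * 1024, 8 * 1024 * 1024));
    assert(!block_size_is_valid(128 * 1024 + 1, 8 * 1024 * 1024));
}
```

Note that the check takes no decompressed-size argument at all, which is
the point of the change: a compressed block that is larger than its
content but still within the limit is accepted.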
---
 doc/zstd_compression_format.md | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/doc/zstd_compression_format.md b/doc/zstd_compression_format.md
index ed758cf5..ad5a61ec 100644
--- a/doc/zstd_compression_format.md
+++ b/doc/zstd_compression_format.md
@@ -390,9 +390,7 @@ A block can contain any number of bytes (even zero), up to
 - Window_Size
 - 128 KB
-A `Compressed_Block` has the extra restriction that `Block_Size` is always
-strictly less than the decompressed size.
-If this condition cannot be respected,
+If this condition cannot be respected when generating a `Compressed_Block`,
 the block must be sent uncompressed instead (`Raw_Block`).