Commit Graph

2263 Commits (1d78fdea11b0aa5656d3eb0f8e02dfa48159b2bb)

Author SHA1 Message Date
Przemyslaw Skibinski 5fc901d1ce images/ moved to doc/images/ 2016-10-25 10:05:20 +02:00
Yann Collet 7b5948cca7 Merge pull request #426 from terrelln/fixes
Fix various {A, M}SAN bugs
2016-10-24 23:42:26 -07:00
Yann Collet 37d130031d updated comments on context re-use 2016-10-24 17:22:12 -07:00
Nick Terrell b2c39a22b0 Fix compiler narrowing warning 2016-10-24 14:50:13 -07:00
Nick Terrell f698ad6deb Merge remote-tracking branch 'upstream/dev' into fixes
* upstream/dev:
  added doc\zstd_manual.html
  added contrib\gen_html
  zstd_compression_format.md moved to doc/
  Fix small bug in ZSTD_execSequence()
  improved ZSTD_compressBlock_opt_extDict_generic
  protect ZSTD_decodeFrameHeader() from invalid usage, as suggested by @spaskob
  zstd_opt.h: small improvement in compression ratio
  improved dicitonary segment merge
  use implicit rules to compile zstd_decompress.c
  detect early impossible decompression scenario in legacy decoder v0.5
  no repeat mode in legacy v0.5
  fixed invalid invocation of dictionary in legacy decoder v0.5
  fix edge case
  fix command line interpretation
  fixed minor corner case
  zstd.h: added the Introduction section
  fixed clang 3.5 warnings
  zstd.h: updated comments
2016-10-24 13:10:13 -07:00
Yann Collet 4239a207dd Merge pull request #425 from inikep/dev11
Doc
2016-10-24 11:11:40 -07:00
Nick Terrell f9c9af3c2e Reject dictionaries with incomplete entropy tables
If a dictionary specifies that a symbol has probability zero in its
`matchLength`, `literalLength`, or `offset` FSE table, but the symbol
appears when compressing input, the compressor fails.

Ensure that dictionaries support all `matchLength`, and `literalLength`
codes.  They must also support all of the `offset` codes required to
represent every possible offset that can appear in the first block.
2016-10-24 10:42:44 -07:00
Przemyslaw Skibinski 86d9424c81 added doc\zstd_manual.html 2016-10-24 16:07:53 +02:00
Przemyslaw Skibinski 984b66cd72 added contrib\gen_html 2016-10-24 15:59:51 +02:00
Przemyslaw Skibinski 3ee94a7600 zstd_compression_format.md moved to doc/ 2016-10-24 15:58:07 +02:00
Yann Collet 97611611a3 Merge pull request #423 from terrelln/exec-seq-patch
Fix small bug in ZSTD_execSequence()
2016-10-21 17:02:06 -07:00
Nick Terrell ae1cb3b3d0 Fix small bug in ZSTD_execSequence()
`memmove(op, match, sequence.matchLength)` is not the desired behavior.
Overlap is allowed, and handled as if we did `*op++ = *match++`, which
is not how `memmove()` handles overlap.

Only triggered if both of the following conditions are met:
* The match spans extDict & currentPrefixSegment
* `oLitEnd <= oend_w < oLitEnd + length1 < oMatchEnd <= oend`.

These two conditions imply that the block is less than 15 bytes long.
This bug isn't triggered by the streaming API, because it allocates
enough space for the window size + the block size, so there cannot be
a match that is within 8 bytes of the end and overlaps with itself.
It cannot be triggered by the block decompression API because all of
the decompressed data is in the currentPrefixSegment.

Introduced by commit 7158584399
2016-10-21 12:13:44 -07:00
Przemyslaw Skibinski 4732074a71 improved ZSTD_compressBlock_opt_extDict_generic 2016-10-21 11:19:00 +02:00
Yann Collet da3bd8b6de protect ZSTD_decodeFrameHeader() from invalid usage, as suggested by @spaskob 2016-10-20 20:11:00 -07:00
Przemyslaw Skibinski d365ae3497 zstd_opt.h: small improvement in compression ratio 2016-10-20 11:49:02 +02:00
Przemyslaw Skibinski 575ab00db7 Merge remote-tracking branch 'refs/remotes/facebook/dev' into dev11 2016-10-20 11:01:52 +02:00
Przemyslaw Skibinski dafd66c7ac Merge remote-tracking branch 'refs/remotes/origin/dev' into dev11 2016-10-20 10:54:39 +02:00
Nick Terrell d760529a05 Fix stack buffer overrun when weightTotal == 0
If `weightTotal == 0`, then `BIT_highbit32(weightTotal)` is
undefined behavior in the case that it calls `__builtin_clz()`.
If `tableLog == HUF_TABLELOG_ABSOLUTEMAX` then we will access one
byte beyond the end of the buffer.
2016-10-19 11:39:11 -07:00
Nick Terrell bb68062c59 Unitialized memory read in ZSTD_decodeSeqHeaders()
Caused by two things:
1. Not checking that `ip` is in range except for the first byte.
2. `ZSTDv0{5,6}_decodeLiteralsBlock()` could return a value larger than `srcSize`.
2016-10-18 16:41:33 -07:00
Yann Collet 52c1bf93fe improved dicitonary segment merge 2016-10-18 16:34:58 -07:00
Yann Collet a7a4690b0a use implicit rules to compile zstd_decompress.c 2016-10-18 16:01:03 -07:00
Nick Terrell 7b06ad7a05 Backport fix from commit 125d817
This fixes a read of unitialized memory.
Full commit hash: 125d81774f.
2016-10-18 14:52:34 -07:00
Nick Terrell f45b157d95 Backport fix from commit 9e8b09a
Fixes uninitialized memory reads.
Full commit hash: 9e8b09a7bd
2016-10-18 14:22:49 -07:00
Yann Collet f7906d5955 detect early impossible decompression scenario in legacy decoder v0.5 2016-10-18 13:48:32 -07:00
Yann Collet 9313c8d953 no repeat mode in legacy v0.5 2016-10-18 13:36:15 -07:00
Yann Collet 83d7bdee4b fixed invalid invocation of dictionary in legacy decoder v0.5 2016-10-18 12:25:43 -07:00
Yann Collet 197a55ee7b fix edge case 2016-10-18 11:27:52 -07:00
Nick Terrell fd98087047 Fix stack buffer overflow in HUF_readCTable()
If `w ==0` on line 153, then `CTable[n].nbBits == tableLog + 1`.
Then `nbPerRank[CTable[n].nbBits]` and `valPerRank[CTable[n].nbBits]`
are stack buffer overflows.
2016-10-17 18:16:59 -07:00
Yann Collet 33fdd099bb fix command line interpretation 2016-10-17 17:48:48 -07:00
Yann Collet 06573e17be fixed minor corner case 2016-10-17 17:28:28 -07:00
Nick Terrell bfd943ace5 Fix buffer overrun in ZSTD_loadDictEntropyStats()
The table log set by `FSE_readNCount()` was not checked in
`ZSTD_loadDictEntropyStats()`.  This caused `FSE_buildCTable()`
to stack/heap overflow in a few places.

The benchmarks look good, there is no obvious compression performance regression:

  > ./zstds/zstd.opt.0 -i10 -b1 -e10 ~/bench/silesia.tar
   1#silesia.tar       : 211988480 ->  73656930 (2.878), 271.6 MB/s , 716.8 MB/s
   2#silesia.tar       : 211988480 ->  70162842 (3.021), 204.8 MB/s , 671.1 MB/s
   3#silesia.tar       : 211988480 ->  66997986 (3.164), 156.8 MB/s , 658.6 MB/s
   4#silesia.tar       : 211988480 ->  66002591 (3.212), 136.4 MB/s , 665.3 MB/s
   5#silesia.tar       : 211988480 ->  65008480 (3.261),  98.9 MB/s , 647.0 MB/s
   6#silesia.tar       : 211988480 ->  62979643 (3.366),  65.2 MB/s , 670.4 MB/s
   7#silesia.tar       : 211988480 ->  61974560 (3.421),  44.9 MB/s , 688.2 MB/s
   8#silesia.tar       : 211988480 ->  61028308 (3.474),  32.4 MB/s , 711.9 MB/s
   9#silesia.tar       : 211988480 ->  60416751 (3.509),  21.1 MB/s , 718.1 MB/s
  10#silesia.tar       : 211988480 ->  60174239 (3.523),  22.2 MB/s , 721.8 MB/s

  > ./compress_zstds/zstd.opt.1 -i10 -b1 -e10 ~/bench/silesia.tar
   1#silesia.tar       : 211988480 ->  73656930 (2.878), 273.8 MB/s , 722.0 MB/s
   2#silesia.tar       : 211988480 ->  70162842 (3.021), 203.2 MB/s , 666.6 MB/s
   3#silesia.tar       : 211988480 ->  66997986 (3.164), 157.4 MB/s , 666.5 MB/s
   4#silesia.tar       : 211988480 ->  66002591 (3.212), 132.1 MB/s , 661.9 MB/s
   5#silesia.tar       : 211988480 ->  65008480 (3.261),  96.8 MB/s , 641.6 MB/s
   6#silesia.tar       : 211988480 ->  62979643 (3.366),  63.1 MB/s , 677.0 MB/s
   7#silesia.tar       : 211988480 ->  61974560 (3.421),  44.3 MB/s , 678.2 MB/s
   8#silesia.tar       : 211988480 ->  61028308 (3.474),  33.1 MB/s , 708.9 MB/s
   9#silesia.tar       : 211988480 ->  60416751 (3.509),  21.5 MB/s , 710.1 MB/s
  10#silesia.tar       : 211988480 ->  60174239 (3.523),  21.9 MB/s , 723.9 MB/s
2016-10-17 16:55:52 -07:00
Nick Terrell 4db751668f Fix buffer overrun in ZSTD_loadEntropy()
The table log set by `FSE_readNCount()` was not checked in
`ZSTD_loadEntropy()`.  This caused `FSE_buildDTable(dctx->MLTable, ...)`
to overwrite the beginning of `dctx->hufTable`.

The benchmarks look good, there is no obvious performance regression:

  > ./zstds/zstd.opt.0 -i10 -b1 -e5 ~/bench/silesia.tar
   1#silesia.tar       : 211988480 ->  73656930 (2.878), 268.2 MB/s , 701.0 MB/s
   2#silesia.tar       : 211988480 ->  70162842 (3.021), 199.5 MB/s , 666.9 MB/s
   3#silesia.tar       : 211988480 ->  66997986 (3.164), 154.9 MB/s , 655.6 MB/s
   4#silesia.tar       : 211988480 ->  66002591 (3.212), 128.9 MB/s , 648.4 MB/s
   5#silesia.tar       : 211988480 ->  65008480 (3.261),  98.4 MB/s , 633.4 MB/s

  > ./zstds/zstd.opt.2 -i10 -b1 -e5 ~/bench/silesia.tar
   1#silesia.tar       : 211988480 ->  73656930 (2.878), 266.1 MB/s , 703.7 MB/s
   2#silesia.tar       : 211988480 ->  70162842 (3.021), 199.0 MB/s , 666.6 MB/s
   3#silesia.tar       : 211988480 ->  66997986 (3.164), 156.2 MB/s , 656.2 MB/s
   4#silesia.tar       : 211988480 ->  66002591 (3.212), 133.2 MB/s , 647.4 MB/s
   5#silesia.tar       : 211988480 ->  65008480 (3.261),  96.3 MB/s , 633.3 MB/s
2016-10-17 15:51:15 -07:00
Nick Terrell ccfcc643da Check if dict is empty before reading first byte 2016-10-17 11:46:03 -07:00
Yann Collet 2b361cf2f1 minor opt 2016-10-14 16:09:07 -07:00
Nick Terrell 8c6c686d0a [pzstd] Fix lantent bug in WorkQueue::push() 2016-10-14 15:26:56 -07:00
Nick Terrell baa152e56e [pzstd] Add Logger class 2016-10-14 15:26:55 -07:00
Nick Terrell e9e151ce31 [pzstd] Reuse ZSTD_{C,D}Stream 2016-10-14 15:26:55 -07:00
Nick Terrell 48294b57c3 [pzstd] Put ErrorHolder into SharedState 2016-10-14 15:26:55 -07:00
Nick Terrell 9b603ee284 [pzstd] Run the reading thread separately 2016-10-14 15:26:55 -07:00
Nick Terrell 4cb5e90a5c [pzstd] Add asan and tsan tests to travis
gcc-6 tsan is buggy.
It fails to use the correct linker.
It is also broken with `-pie` with linux kernels newer than 4.1, but previous versions require `-pie`...
2016-10-14 15:26:55 -07:00
Nick Terrell 96e0702c00 [pzstd] Print the correct width ints 2016-10-14 15:26:55 -07:00
Nick Terrell 8b4e84249b [pzstd] Fix Makefile 2016-10-14 15:26:50 -07:00
Yann Collet 70077bc9bb refactor for long commands 2016-10-14 14:41:17 -07:00
Yann Collet d7b120ab5c added long commands --memory= and --memlimit-decompress= 2016-10-14 14:22:32 -07:00
Yann Collet 1122349ac2 added long comment --memlimit= 2016-10-14 14:07:11 -07:00
Yann Collet 7933434fdf Merge branch 'dev' of github.com:facebook/zstd into dev 2016-10-14 13:32:35 -07:00
Yann Collet d4cda27b63 new command -M#, to limit memory usage during decompression (#403) 2016-10-14 13:32:20 -07:00
Yann Collet c8b1ecf4ba Merge pull request #417 from terrelln/ubsan-failures
Fix ubsan failures (pass NULL to memcpy)
2016-10-13 03:37:22 -07:00
Nick Terrell 3b9cdf9220 Fix ubsan failures (pass NULL to memcpy) 2016-10-12 20:54:42 -07:00
Yann Collet 5d919e7ac3 added ZSTD_error_frameParameter_windowTooLarge (#403) 2016-10-12 17:29:24 -07:00