
271 Commits (b8d4a3887fc12f4fcc769d6ce116e716e792139c)

Author SHA1 Message Date
Sean Purcell ba2ad9f25c ZSTD_decompress now handles multiple frames 2017-02-08 14:50:10 -08:00
Sean Purcell 4e709712e1 Decompressed size functions now handle multiframes and distinguish cases
- Add ZSTD_findDecompressedSize
    - Traverses multiple frames to find total output size
- Add ZSTD_getFrameContentSize
    - Gets the decompressed size of a single frame by reading header
- Deprecate ZSTD_getDecompressedSize
2017-02-08 14:50:10 -08:00
Yann Collet bb0027405a fixed zstdmt corruption issue when enabling overlapped sections
see Asana board for detailed explanation on why and how to fix it
2017-01-25 16:25:38 -08:00
Sean Purcell 57d423c5df Don't create dict in streaming apis if dictSize == 0 2017-01-17 14:31:35 -08:00
Sean Purcell 834ab50fa3 Fixed decompress_usingDict not propagating corrupted dictionary error 2017-01-11 17:31:34 -08:00
Yann Collet aca113f4f5 fixed ZSTD_sizeof_?Dict() 2016-12-23 22:25:03 +01:00
Yann Collet 0819abe3c1 added ZSTD_createDDict_byReference() body 2016-12-21 19:25:15 +01:00
Yann Collet 4e5eea61a8 added ZSTD_createDDict_byReference() 2016-12-21 16:44:35 +01:00
Nick Terrell 8157a4c3cc Fix dictionary loading bug causing an MSAN failure
Offset rep codes must be in the range `[1, dictSize)`.
Fix dictionary loading to reject `0` as an offset rep code.
2016-12-20 10:47:52 -08:00
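The range rule stated in this commit message can be sketched as a small validation routine. This is only an illustration of the `[1, dictSize)` check, not the code from the commit: the function name `rep_codes_valid` and the constant `REP_NUM` (assumed here to be 3 repeat offsets) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REP_NUM 3  /* assumption: number of repeat offsets stored in a dictionary */

/* Hypothetical helper: returns 1 when every repeat offset lies in
 * [1, dictSize), and 0 otherwise. A value of 0 is rejected explicitly. */
static int rep_codes_valid(const uint32_t rep[REP_NUM], size_t dictSize)
{
    assert(rep != NULL);
    for (int i = 0; i < REP_NUM; i++) {
        if (rep[i] == 0 || rep[i] >= dictSize) {
            return 0;  /* a loader would treat the dictionary as corrupted here */
        }
    }
    return 1;
}
```

A loader would call such a check right after reading the repeat offsets from the dictionary header and before using them as match distances.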
Yann Collet 35168679bd Merge pull request #478 from terrelln/wildcopy-ub
Fix execSequence wildcopy undefined behavior
2016-12-13 11:33:00 +01:00
Nick Terrell 064a143520 Fix execSequence wildcopy undefined behavior
execSequence relied on pointer overflow to handle cases where
`sequence.matchLength < 8`.  Instead of passing a `size_t` to
wildcopy, pass a `ptrdiff_t`.
2016-12-12 19:01:23 -08:00
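To illustrate why a signed length matters here, the following is a minimal sketch of an 8-bytes-at-a-time copy that takes a `ptrdiff_t`. The name `wildcopy_sketch` and the loop shape are assumptions made for illustration, not the library's implementation. With an unsigned length, `matchLength - 8` would wrap to a huge value when `matchLength < 8`; with a signed length it is simply negative and the loop stops after its first step.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch of a "wild" copy: it copies in 8-byte steps and always
 * performs at least one step, so it can write past dst + length. The caller
 * must guarantee enough slack at the end of both buffers. */
static void wildcopy_sketch(void *dst, const void *src, ptrdiff_t length)
{
    const unsigned char *ip = (const unsigned char *)src;
    unsigned char *op = (unsigned char *)dst;
    ptrdiff_t copied = 0;

    assert(dst != NULL && src != NULL);
    do {
        memcpy(op + copied, ip + copied, 8);
        copied += 8;
    } while (copied < length);  /* signed compare: false at once when length <= 0 */
}
```

Calling it with a negative length copies exactly one 8-byte chunk; no out-of-range end pointer is ever formed.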
Nick Terrell e474aa55b4 Fix decompression buffer overrun
Allows an adversary to write up to 3 bytes beyond the end of the buffer.
Occurs if the match overlaps the `extDict` and `currentPrefix`, and the
match length in the `currentPrefix` is less than `MINMATCH`, and
`op-(16-MINMATCH) >= oMatchEnd > op-16`.
2016-12-12 18:05:30 -08:00
Yann Collet 8f8e2b0b4a fixed initialization warning 2016-12-05 18:00:50 -08:00
Yann Collet e7a41a5955 added : dictID retrieval functions.
added : unit tests for dictID retrieval functions
2016-12-05 16:21:06 -08:00
Yann Collet 9ffbeea875 API : changed : streaming decompression : implicit reset on starting new frames 2016-12-02 18:37:38 -08:00
Yann Collet ff504de391 minor decompression speed improvement 2016-11-29 17:42:46 -08:00
Yann Collet a56ac2815c restored normal decoder speed 2016-11-29 15:30:23 -08:00
Yann Collet 37870d7a66 fixed minor visual warning 2016-11-29 14:31:57 -08:00
Yann Collet 4f5350f610 long matches support overflow 2016-11-29 13:12:24 -08:00
Yann Collet 52e136ed3d long decoder compatible with round and separate buffers 2016-11-28 19:59:11 -08:00
Yann Collet ce3527ca0c combined normal and long decoder 2016-11-28 18:38:52 -08:00
Yann Collet 8993bee997 restored normal mode 2016-11-28 16:11:30 -08:00
Yann Collet 764e70a4f3 added decodeSequencesLong 2016-11-28 15:50:16 -08:00
Yann Collet 73f88a66f1 added prefetch 2016-11-23 15:43:30 -08:00
Yann Collet 50524bf0da delayed decompression 2016-11-23 15:11:07 -08:00
Nick Terrell 4359d21ad7 Merge two memset() calls into one 2016-11-14 17:52:51 -08:00
Nick Terrell 24701de877 Fix uninitialized memory read 2016-11-14 13:57:05 -08:00
Yann Collet 179b19776f fileio.c no longer needs ZSTD_LEGACY_SUPPORT and no longer depends on zstd_legacy.h
Added : ZSTD_isFrame() in experimental section
2016-11-02 17:30:49 -07:00
Yann Collet 31e660e7aa more accurate default maximum window size 2016-10-29 03:56:45 -07:00
Yann Collet 2115724c22 Merge pull request #430 from terrelln/exec-sequences
ZSTD_execSequence() accepts match in last 7 bytes
2016-10-28 10:45:05 -07:00
Nick Terrell 10bfd0c0d5 Fix ZSTD_execSequence() performance regression
Commit ae1cb3b3d0 caused the regression.
It is an instruction alignment issue, because if it is `U64 i` instead
of `U32 i`, the regression returns.  This patch fixes the regression
in gcc, but only gets some of the clang performance back.

Benchmarks:
Run on `silesia.tar`.  I only show levels 1-5 because the performance
regression was uniform across all levels.  I did one run on levels
1-19 and it looked good.

| Build | Level | Before | While | After |
|-------|-------|-------:|------:|------:|
| gcc   |     1 |  931.4 | 904.4 | 932.8 |
| gcc   |     2 |  849.1 | 822.6 | 851.2 |
| gcc   |     3 |  815.6 | 790.6 | 818.9 |
| gcc   |     4 |  794.1 | 770.7 | 798.0 |
| gcc   |     5 |  785.7 | 760.7 | 788.8 |
| clang |     1 |  705.5 | 683.2 | 693.8 |
| clang |     2 |  670.0 | 649.2 | 660.7 |
| clang |     3 |  659.6 | 639.8 | 651.4 |
| clang |     4 |  652.5 | 634.7 | 645.9 |
| clang |     5 |  646.9 | 625.5 | 637.7 |
2016-10-27 16:19:57 -07:00
Nick Terrell eb7873a048 ZSTD_execSequence() accepts match in last 7 bytes
The zstd reference compressor will not emit a match in the last 7
bytes of a block.  The decompressor will also not accept a match
in the last 7 bytes.  This patch makes the decompressor accept a
match in the last 7 bytes.
2016-10-25 21:24:15 -07:00
Yann Collet 335ad5d4d4 added ZSTD_initDStream_usingDDict() .
slightly optimized ZSTD_initDStream() when no dictionary .
fixed ZSTD_sizeof_CStream() .
2016-10-25 17:47:02 -07:00
Nick Terrell f698ad6deb Merge remote-tracking branch 'upstream/dev' into fixes
* upstream/dev:
  added doc\zstd_manual.html
  added contrib\gen_html
  zstd_compression_format.md moved to doc/
  Fix small bug in ZSTD_execSequence()
  improved ZSTD_compressBlock_opt_extDict_generic
  protect ZSTD_decodeFrameHeader() from invalid usage, as suggested by @spaskob
  zstd_opt.h: small improvement in compression ratio
  improved dictionary segment merge
  use implicit rules to compile zstd_decompress.c
  detect early impossible decompression scenario in legacy decoder v0.5
  no repeat mode in legacy v0.5
  fixed invalid invocation of dictionary in legacy decoder v0.5
  fix edge case
  fix command line interpretation
  fixed minor corner case
  zstd.h: added the Introduction section
  fixed clang 3.5 warnings
  zstd.h: updated comments
2016-10-24 13:10:13 -07:00
Yann Collet 4239a207dd Merge pull request #425 from inikep/dev11
Doc
2016-10-24 11:11:40 -07:00
Przemyslaw Skibinski 3ee94a7600 zstd_compression_format.md moved to doc/ 2016-10-24 15:58:07 +02:00
Yann Collet 97611611a3 Merge pull request #423 from terrelln/exec-seq-patch
Fix small bug in ZSTD_execSequence()
2016-10-21 17:02:06 -07:00
Nick Terrell ae1cb3b3d0 Fix small bug in ZSTD_execSequence()
`memmove(op, match, sequence.matchLength)` is not the desired behavior.
Overlap is allowed, and handled as if we did `*op++ = *match++`, which
is not how `memmove()` handles overlap.

Only triggered if both of the following conditions are met:
* The match spans extDict & currentPrefixSegment
* `oLitEnd <= oend_w < oLitEnd + length1 < oMatchEnd <= oend`.

These two conditions imply that the block is less than 15 bytes long.
This bug isn't triggered by the streaming API, because it allocates
enough space for the window size + the block size, so there cannot be
a match that is within 8 bytes of the end and overlaps with itself.
It cannot be triggered by the block decompression API because all of
the decompressed data is in the currentPrefixSegment.

Introduced by commit 7158584399
2016-10-21 12:13:44 -07:00
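The difference between `memmove()` and the `*op++ = *match++` semantics described in this commit message can be shown with a small self-contained sketch. The helper names below are hypothetical and the buffers are toy data; the point is only that a forward byte copy re-reads bytes it has just written, which replicates a short pattern, while `memmove()` copies the source as it was before the call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Forward byte-by-byte copy, equivalent to `*op++ = *match++`.
 * When match < op and the ranges overlap, bytes written earlier in the
 * loop are read again later. */
static void overlap_copy(unsigned char *op, const unsigned char *match, size_t length)
{
    assert(op != NULL && match != NULL);
    while (length--) {
        *op++ = *match++;
    }
}

/* Expands the 2-byte seed "ab" by 6 bytes using an offset-2 match. */
static void expand_with_overlap_copy(unsigned char out[9])
{
    memcpy(out, "ab", 2);
    overlap_copy(out + 2, out, 6);   /* pattern "ab" is replicated */
    out[8] = '\0';
}

/* Same operation with memmove(): the source range is taken as a snapshot. */
static void expand_with_memmove(unsigned char out[9])
{
    memset(out, '.', 8);             /* filler so the difference is visible */
    memcpy(out, "ab", 2);
    memmove(out + 2, out, 6);        /* copies "ab...." as it was before the call */
    out[8] = '\0';
}
```

With these helpers the forward copy yields `abababab`, while the `memmove()` version yields `abab....`, which is why `memmove()` is not a drop-in replacement for an overlapping match copy.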
Yann Collet da3bd8b6de protect ZSTD_decodeFrameHeader() from invalid usage, as suggested by @spaskob 2016-10-20 20:11:00 -07:00
Nick Terrell bb68062c59 Uninitialized memory read in ZSTD_decodeSeqHeaders()
Caused by two things:
1. Not checking that `ip` is in range except for the first byte.
2. `ZSTDv0{5,6}_decodeLiteralsBlock()` could return a value larger than `srcSize`.
2016-10-18 16:41:33 -07:00
Yann Collet 06573e17be fixed minor corner case 2016-10-17 17:28:28 -07:00
Nick Terrell 4db751668f Fix buffer overrun in ZSTD_loadEntropy()
The table log set by `FSE_readNCount()` was not checked in
`ZSTD_loadEntropy()`.  This caused `FSE_buildDTable(dctx->MLTable, ...)`
to overwrite the beginning of `dctx->hufTable`.

The benchmarks look good, there is no obvious performance regression:

  > ./zstds/zstd.opt.0 -i10 -b1 -e5 ~/bench/silesia.tar
   1#silesia.tar       : 211988480 ->  73656930 (2.878), 268.2 MB/s , 701.0 MB/s
   2#silesia.tar       : 211988480 ->  70162842 (3.021), 199.5 MB/s , 666.9 MB/s
   3#silesia.tar       : 211988480 ->  66997986 (3.164), 154.9 MB/s , 655.6 MB/s
   4#silesia.tar       : 211988480 ->  66002591 (3.212), 128.9 MB/s , 648.4 MB/s
   5#silesia.tar       : 211988480 ->  65008480 (3.261),  98.4 MB/s , 633.4 MB/s

  > ./zstds/zstd.opt.2 -i10 -b1 -e5 ~/bench/silesia.tar
   1#silesia.tar       : 211988480 ->  73656930 (2.878), 266.1 MB/s , 703.7 MB/s
   2#silesia.tar       : 211988480 ->  70162842 (3.021), 199.0 MB/s , 666.6 MB/s
   3#silesia.tar       : 211988480 ->  66997986 (3.164), 156.2 MB/s , 656.2 MB/s
   4#silesia.tar       : 211988480 ->  66002591 (3.212), 133.2 MB/s , 647.4 MB/s
   5#silesia.tar       : 211988480 ->  65008480 (3.261),  96.3 MB/s , 633.3 MB/s
2016-10-17 15:51:15 -07:00
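The class of bug described in this commit message, a table log that is read from input and then used to size a table build without a bounds check, can be sketched with a toy context. Everything below is hypothetical: `toy_dctx`, `build_table_checked`, and the value of `ML_TABLE_LOG_MAX` are assumptions for illustration and do not reproduce the library's structures or limits.

```c
#include <assert.h>
#include <stddef.h>

#define ML_TABLE_LOG_MAX 9  /* assumed maximum; the real limit is defined by the library */

/* Toy context: two tables stored back to back, as in the scenario described. */
typedef struct {
    unsigned short mlTable[1 + (1 << ML_TABLE_LOG_MAX)];
    unsigned short hufTable[16];  /* stands in for the field that was overwritten */
} toy_dctx;

/* Hypothetical builder: returns 0 on success, -1 when tableLog is too large
 * for mlTable. Without the check, the loop would run past mlTable and into
 * hufTable. */
static int build_table_checked(toy_dctx *dctx, unsigned tableLog)
{
    assert(dctx != NULL);
    if (tableLog > ML_TABLE_LOG_MAX) {
        return -1;  /* reject before writing anything */
    }
    size_t const cells = (size_t)1 << tableLog;
    dctx->mlTable[0] = (unsigned short)tableLog;
    for (size_t i = 0; i < cells; i++) {
        dctx->mlTable[1 + i] = 0;
    }
    return 0;
}
```

The design point is that the limit is enforced where the untrusted value enters the decoder, so later table construction can assume it fits.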
Yann Collet 7933434fdf Merge branch 'dev' of github.com:facebook/zstd into dev 2016-10-14 13:32:35 -07:00
Yann Collet d4cda27b63 new command -M#, to limit memory usage during decompression (#403) 2016-10-14 13:32:20 -07:00
Nick Terrell 3b9cdf9220 Fix ubsan failures (pass NULL to memcpy) 2016-10-12 20:54:42 -07:00
Yann Collet 5d919e7ac3 added ZSTD_error_frameParameter_windowTooLarge (#403) 2016-10-12 17:29:24 -07:00
Nick Terrell 7158584399 Fix ZSTD_execSequence() edge case 2016-10-12 10:05:26 -07:00
Yann Collet 2f2639438a zstreamtest can fuzztest pledgedSrcSize 2016-09-26 14:06:08 +02:00
Yann Collet 51f4d566c2 small decompression speed boost for very small data 2016-09-22 15:57:28 +02:00
Yann Collet d7c6589df8 support ZSTD_sizeof_*() on NULL
added ZSTD_sizeof_CDict()
2016-09-15 02:57:27 +02:00
Yann Collet e91c4b4cef introduced ZSTD_resetDStream() .
added : ZSTD_sizeof_DDict()
2016-09-14 16:55:44 +02:00
Yann Collet d092d77cfc minor variable renaming 2016-09-14 16:14:57 +02:00
Yann Collet 26ec254066 new strategy for faster DDict decompression 2016-09-13 16:52:16 +02:00
Yann Collet b3060f7a9e changed streaming decoder behavior : once the whole compressed frame has been consumed, decompression is complete and the regenerated data is fully flushed. 2016-09-09 16:44:16 +02:00
Yann Collet 95d07d7447 introduced CHECK_E 2016-09-06 16:38:51 +02:00
Yann Collet 3e21ec5b01 introduced CHECK_F 2016-09-06 15:36:19 +02:00
Yann Collet 5c956d593c FORCE_INLINE common definition 2016-09-06 15:05:19 +02:00
Yann Collet 7c83dfd5c2 ZSTD_frameHeaderSize_prefix (#340), as result of ZSTD_initStream 2016-09-05 19:47:43 +02:00
Yann Collet 1563bfeabc fixing FORCE_INLINE for older compilers (#330) 2016-09-02 11:44:21 -07:00
David Lam e10f7f3dcb merge 2016-08-30 12:03:36 -07:00
Yann Collet 4ded9e591c added boilerplate 2016-08-30 11:06:28 -07:00
David Lam da9d3b7057 Cleanup some errors in typedef comments and remove duplicated HOWTO from zbuff_decompress.c 2016-08-29 17:31:51 -07:00
Yann Collet 23b6e05d8e ZSTD_malloc() and ZSTD_free(), to simplify customMem 2016-08-28 21:05:43 -07:00
Yann Collet 5f53b0335e fixed continuation context 2016-08-28 10:00:49 -07:00
Yann Collet 767d8f66fa legacy contexts can be re-used 2016-08-28 08:19:47 -07:00
Yann Collet 4bf317dd00 first version supporting legacy streams (transparent decoding) 2016-08-28 07:43:34 -07:00
inikep a3a47ec4d0 Merge remote-tracking branch 'refs/remotes/Cyan4973/dev' into Other 2016-08-24 21:25:49 +02:00
inikep e416e30019 remove unnecessary comments 2016-08-24 17:32:09 +02:00
Yann Collet 17e482efdd added ZSTD_setDStreamParameter() 2016-08-23 16:58:10 +02:00
Yann Collet 3071c3e303 STREAM_WINDOW_MAX : protect streaming from unreasonable memory requirements 2016-08-23 01:34:34 +02:00
Yann Collet 70e3b31306 fixed playtests on os-x 2016-08-23 01:18:06 +02:00
Yann Collet cb3276329a added sizeof CStream and DStream 2016-08-23 00:31:59 +02:00
Yann Collet 8baf78a291 minor coding style 2016-08-20 13:04:20 +02:00
Yann Collet 1bee2d5e08 slight decompression speed improvement 2016-08-20 02:59:04 +02:00
Yann Collet 18442c1482 minor refactoring 2016-08-18 01:40:32 +02:00
Yann Collet 53e17fbd5e updated streaming API 2016-08-17 01:39:22 +02:00
Yann Collet 104e5b072d added : streaming decompression API 2016-08-16 15:11:28 +02:00
inikep 038d1497c9 fixed compilation with Visual Studio 2005 2016-08-10 14:30:10 +02:00
Yann Collet 917fe188f1 Implemented repOffset "minus 1" on ll==0 2016-07-31 04:01:57 +02:00
Yann Collet 66f69e58d2 restore decompression speed on fizzle 2016-07-30 15:32:47 +02:00
Yann Collet f714f59c16 fixed visual warning 2016-07-30 12:05:28 +02:00
Yann Collet 761f8dbbd2 back to normal table cell copy 2016-07-30 11:43:53 +02:00
Yann Collet 3c6b808870 minor decompression speed gains 2016-07-30 03:20:47 +02:00
Yann Collet c00d30fbe4 Merge pull request #264 from inikep/dev08
Dev08
2016-07-29 17:42:30 +02:00
Yann Collet 4c5bbf64f9 fixed : frame concatenation without checksum 2016-07-28 20:30:25 +02:00
Yann Collet 60ba31c570 zbuff uses ZSTD_compressEnd() 2016-07-28 19:55:09 +02:00
Yann Collet c991cc1828 new frame end, 32-bits checksums 2016-07-28 00:55:43 +02:00
inikep 003c7a8568 optimal parser: removed ZSTD_REP_INIT 2016-07-27 11:07:13 +02:00
Eric Biggers 0a55e7a0bb ZSTD_decompressFrame(): use remainingSize instead of iend - ip
Same behavior, but no need to have redundant variables.
2016-07-26 13:22:27 -07:00
Eric Biggers aa6c70bf60 ZSTD_decompressFrame(): pass up error code from ZSTD_decodeFrameHeader() 2016-07-26 13:22:27 -07:00
Eric Biggers e4d0265ea9 Replace remaining references to "direct mode" with "single segment mode" 2016-07-26 13:22:27 -07:00
Yann Collet cbc5e9dc19 fixes oob read 2016-07-24 18:02:04 +02:00
Yann Collet 7ed5e33b89 minor comment changes 2016-07-24 14:26:11 +02:00
Yann Collet 10b9c13d07 fixed doc on cLevel default, reported by Oliver Lange 2016-07-24 01:21:53 +02:00
Yann Collet f8e7b5363f unified encoding types 2016-07-23 16:31:49 +02:00
Yann Collet c2e1a68d81 changed streamNb order to 1-4-4-4 2016-07-22 17:30:52 +02:00
Yann Collet 772d912c2f more complete support for literals repeat mode 2016-07-22 15:04:25 +02:00
Yann Collet 9f2d82d4a4 fixed : big-endian decoding 2016-07-22 14:37:10 +02:00
Yann Collet 32faf6c8e7 fixed conversion warnings 2016-07-22 14:37:09 +02:00
Yann Collet 5e45a5fbb3 force loop-align to 32 for zstd_decompress 2016-07-22 14:37:09 +02:00
Yann Collet 5288ac0cb7 changed field order 2016-07-22 14:37:09 +02:00
Yann Collet 198e6aac44 Literals header fields use little endian convention 2016-07-22 14:37:09 +02:00
Yann Collet 6fa05a2371 cBlockSize uses little-endian convention 2016-07-22 14:37:09 +02:00
Yann Collet cf05b9d477 ZSTD_getBlockSizeMax() 2016-07-18 16:52:10 +02:00
Yann Collet 972e5806ee fixed : premature frame end on zero-sized raw block - reported by @ebiggers 2016-07-17 15:39:24 +02:00
Yann Collet d158c35e9f added ZSTD_estimateDCtxSize() 2016-07-11 13:46:25 +02:00
Yann Collet 8e0ee681b8 added ZSTD_sizeofDCtx() 2016-07-11 13:09:52 +02:00
Yann Collet 3ae543ce75 added ZSTD_estimateCCtxSize() 2016-07-11 03:12:17 +02:00
Yann Collet 722e14bb65 fixed compilation error in decompression module 2016-07-08 19:22:16 +02:00
Yann Collet bd10607063 updated spec 2016-07-08 19:16:57 +02:00
Yann Collet c5fb5b7fcd support offset > 128 MB 2016-07-08 13:13:37 +02:00
Yann Collet 19c27d27f1 simplified legacy functions, no longer need magic number 2016-07-07 14:40:13 +02:00
Yann Collet f323bf7d32 added : ZSTD_getDecompressedSize() 2016-07-07 13:14:21 +02:00
Yann Collet f246cf5423 ZSTD_decompress_usingDDict() compatible with Legacy mode 2016-07-06 20:32:27 +02:00
Yann Collet 517e1ba623 fixed dictBuilder issue with HC levels. Reported by Bartosz Taudul. 2016-07-06 12:35:09 +02:00
Yann Collet fe07eaa972 simplified ZSTD_decodeSequence() 2016-07-06 02:25:44 +02:00
Yann Collet 9ca73364e6 updated spec 2016-07-05 10:53:38 +02:00
Yann Collet f9cac7a734 Added GNU separator `--`, to specify that all following arguments are necessarily file names (and not commands). Suggested by @chipturner (#230) 2016-07-04 18:18:24 +02:00
Yann Collet 23f05ccc6b updated specifications 2016-07-04 16:13:11 +02:00
Yann Collet 2fa9904844 update specification and comments 2016-07-01 20:55:28 +02:00
Yann Collet d4f4e58ee1 fixed ZSTD_decompressBlock() using multiple blocks 2016-06-27 01:31:35 +02:00
Yann Collet e4811ba761 Modified : ZSTD_createDDict() accepts dictionary < 8 bytes in pure content mode (reported by @chipturner) 2016-06-19 23:06:54 +02:00
Yann Collet 06d9a73b48 minor refactor, using `WILDCOPY_OVERLENGTH` macro instead of hard-coded 8 2016-06-19 14:27:21 +02:00
Yann Collet 4948f270b3 make room for reserved "information bit" in frame header 2016-06-16 15:38:51 +02:00
Yann Collet 80d033fb43 fixed ptr arithmetic warning 2016-06-16 01:41:50 +02:00
Yann Collet 736d419289 strengthened dict loading on decompression side 2016-06-16 01:05:04 +02:00
Yann Collet 8e36a9c169 decoder restores repOffsets from dictionary 2016-06-16 01:05:04 +02:00
Yann Collet d059092897 fixed conversion warnings 2016-06-14 15:34:24 +02:00
Yann Collet 4266c0a2fd adding inter-blocks rep-offsets 2016-06-14 01:49:25 +02:00
Yann Collet cd98f93cff Fixed decompression issue with invalid data 2016-06-11 23:26:22 +02:00
Yann Collet 37fece22e8 enable repeat-entropic-stats mode 2016-06-11 02:52:42 +02:00
Yann Collet d60a5bf900 Literal decompression builds Huffman tables within shared space (for later re-use) 2016-06-11 02:35:31 +02:00
Yann Collet 289bbd52e5 Updated huff0 2016-06-11 01:31:54 +02:00
Yann Collet 9dd12742f3 `litBlockType_t` is an `enum` 2016-06-10 00:12:26 +02:00
Yann Collet 662a541431 updated huff0 - now generates a common HUF_DTable type for all decoding tables 2016-06-08 11:11:02 +02:00
Yann Collet 302fb53a76 Removed `ZSTD_*_usingPrepared?Ctx()` declaration from public space 2016-06-07 12:16:49 +02:00
Yann Collet 81e13ef7cf first implementation of the new dictionary API (untested) 2016-06-07 00:51:51 +02:00
Yann Collet 9d504ae85b Added decoding of RLE blocks 2016-06-06 19:52:35 +02:00
Yann Collet 673f0d7cdc new frame format, allowing custom window size 2016-06-06 00:26:38 +02:00
Yann Collet d0e2cd15cb Merged `fse_static` into `fse.h` . Now requires `FSE_STATIC_LINKING_ONLY` macro. 2016-06-05 00:58:01 +02:00
Yann Collet 130fe11394 merged `huf_static.h` into `huf.h` . Requires `HUF_STATIC_LINKING_ONLY` macro. 2016-06-05 00:42:28 +02:00
Yann Collet 198d127b35 minor comment change (unfinished description of new header format) 2016-06-04 18:40:55 +02:00
Yann Collet f4f5affdf7 restore ZBUFF full-block-size, for better performance on small input 2016-06-03 23:09:28 +02:00
Yann Collet ab7b6f1ece Merge pull request #198 from inikep/dev070
Dev070
2016-06-03 21:37:49 +02:00
inikep 3640396b1a fixed: deallocation of structures in case of error in ZBUFF_createCCtx and ZBUFF_createDCtx 2016-06-03 16:36:50 +02:00
Yann Collet fe48775868 minor decoder code refactoring 2016-06-03 15:41:51 +02:00
inikep 3763c77f6b defaultCustomNULL replaced with defaultCustomMem 2016-06-03 13:28:20 +02:00
inikep 36fac00149 removed calloc calls from lib/ 2016-06-03 13:23:04 +02:00
inikep db2f540414 added defaultCustomNULL 2016-06-03 12:56:56 +02:00
inikep 2866951558 opaque parameter for custom memory allocation functions 2016-06-02 13:04:18 +02:00