facebook/zstd - zstd - Final Minetest

Author	SHA1	Message	Date
Yann Collet	14312d833e	zstdmt : fix : loading prefix from previous segments There used to be a (very small) chance that loading prefix from previous segment would be confused with a real zstd dictionary. For that to happen, the prefix needs to start with the same value as dictionary magic. That's 1 chance in 4 billion if all values have equal probability. But in fact, since some values are more common (0x00000000 for example) others are less common, and dictionary magic was selected to be one of them, so probabilities are likely even lower. Anyway, this risk is now down to zero by adding a new CCtx parameter : ZSTD_p_forceRawDict Current parameter policy : the parameter "sticks" to its CCtx, so any dictionary loading after ZSTD_p_forceRawDict is set will be loaded in "raw" ("content only") mode, even if CCtx is re-used multiple times with multiple different dictionaries. It's up to the user to reset this value differently if needed.	2017-02-23 23:42:12 -08:00
Przemyslaw Skibinski	d8114e5802	zstd_compress.c: fix memory leaks	2017-02-21 18:59:56 +01:00
Anders Oleson	517577bf53	spelling fixes in comments i.e. occurred labeled Huffman	2017-02-20 12:08:59 -08:00
Nick Terrell	ecf90ca24b	[zstdmt] Fix MSAN failure with ZSTD_p_forceWindow Reproduction steps: ``` make zstreamtest CC=clang CFLAGS="-O3 -g -fsanitize=memory -fsanitize-memory-track-origins" ./zstreamtest -vv -t4178 -i4178 -s4531 ``` How to get to the error in gdb (may be a more efficient way): * 2 breaks at zstd_compress.c:2418 -- in ZSTD_compressContinue_internal() * 2 breaks at zstd_compress.c:2276 -- in ZSTD_compressBlock_internal() * 1 break at zstd_compress.c:1547 Why the error occurred: When `zc->forceWindow == 1`, after calling `ZSTD_loadDictionaryContent()` we have `zc->loadedDictEnd == zc->nextToUpdate == 0`. But, we've really loaded up to `iend` into the dictionary. Then in `ZSTD_compressBlock_internal()` we see that `current > zc->nextToUpdate + 384`, so we load the last 192 bytes a second time. In this case the bytes we are loading are a block of all 0s, starting in the previous block. So when we are loading the last 192 bytes, we find a `match` in the future, 183 bytes beyond `ip`. Since the block is all 0s, the match extends to the end of the block. But in `ZSTD_count()` we only check that `pIn < pInLoopLimit`, but since `pMatch > pIn`, `pMatch` eventually points past the end of the buffer, causing the MSAN failure. The fix: The line changed sets `zc->nextToUpdate` to the end of the dictionary. This is the behavior that existed before `ZSTD_p_forceWindow` was introduced. This fixes the exposing test case. Since the code doesn't fail without `zc->forceWindow`, it makes sense that this works. I've run the command `./zstreamtest -T2mn` 64 times without failures. CI should also verify nothing obvious broke.	2017-02-13 19:11:22 -08:00
Sean Purcell	2db7249265	Make pledgedSrcSize meaning clear for other functions - Added tests - Moved new size functions to static link only	2017-02-09 11:49:58 -08:00
Sean Purcell	0f5c95af44	Disambiguate pledgedSrcSize == 0 - Modify ZSTD CLI to only set contentSizeFlag if it _knows_ the size - Change pzstd to stop setting contentSizeFlag without accurate pledgedSrcSize	2017-02-08 15:12:46 -08:00
Yann Collet	06e7697f96	added test of new parameter ZSTD_p_forceWindow	2017-01-25 16:39:03 -08:00
Yann Collet	bb0027405a	fixed zstdmt corruption issue when enabling overlapped sections see Asana board for detailed explanation on why and how to fix it	2017-01-25 16:25:38 -08:00
Yann Collet	c593348722	ZSTDMT_initCStream_usingDict() can outlive dict Like ZSTD_initCStream_usingDict(), ZSTDMT_initCStream_usingDict() now keeps a copy of dict internally. This way, dict can be released : it no longer has to outlive all future compression sessions.	2017-01-22 16:44:15 -08:00
Yann Collet	d7e3cb58c5	Resolved merge conflict dev+zstdmt	2017-01-20 16:44:50 -08:00
Yann Collet	b459aad5b4	renamed savedRep into repToConfirm	2017-01-19 17:33:37 -08:00
Yann Collet	32dfae6f98	fixed Multi-threaded compression MT compression generates a single frame. Multi-threading operates by breaking the frames into independent sections. But from a decoder perspective, there is no difference : it's just a sequence of blocks. Problem is, decoder preserves repCodes from previous block to start decoding next block. This is also valid between sections, since they are no different than changing block. Previous version would incorrectly initialize repcodes to their default value at the beginning of each section. When using them, there was a mismatch between encoder (default values) and decoder (values from previous block). This change ensures that repcodes won't be used at the beginning of a new section. It works by setting them to 0. This only works with regular (single segment) variants : extDict variants will fail ! Fortunately, sections beyond the 1st one belong to this category. To be checked : btopt strategy. This change was only validated from fast to btlazy2 strategies.	2017-01-19 10:32:55 -08:00
Sean Purcell	57d423c5df	Don't create dict in streaming apis if dictSize == 0	2017-01-17 14:31:35 -08:00
Gregory Szorc	7d6f478d15	Set dictionary ID in ZSTD_initCStream_usingCDict() When porting python-zstandard to use ZSTD_initCStream_usingCDict() so compression dictionaries could be reused, an automated test failed due to compressed content changing. I tracked this down to ZSTD_initCStream_usingCDict() not setting the dictID field of the ZSTD_CCtx attached to the ZSTD_CStream instance. I'm not 100% convinced this is the correct or full solution, as I'm still seeing one automated test failing with this change.	2017-01-14 17:44:54 -08:00
Yann Collet	b05c4828ea	zstdmt : correctly check for cctx and buffer allocation Result from getBuffer and getCCtx could be NULL when allocation fails. Now correctly checks : job creation stops and the last job reports an allocation error. releaseBuffer and releaseCCtx are now also compatible with NULL input. Identified a new potential issue : when an early job fails, later jobs are not collected for resource retrieval.	2017-01-12 02:01:28 +01:00
Yann Collet	5eb749e734	ZSTDMT_compress() creates a single frame The new strategy involves cutting frame at block level. The result is a single frame, preserving ZSTD_getDecompressedSize() As a consequence, bench can now make a full round-trip, since the result is compatible with ZSTD_decompress(). This strategy will not make it possible to decode the frame with multiple threads since the exact cut between independent blocks is not known. MT decoding needs further discussions.	2017-01-11 18:21:25 +01:00
Yann Collet	aca113f4f5	fixed ZSTD_sizeof_?Dict()	2016-12-23 22:25:03 +01:00
Yann Collet	4e5eea61a8	added ZSTD_createDDict_byReference()	2016-12-21 16:44:35 +01:00
Yann Collet	1f57c2ed32	added : ZSTD_createCDict_byReference()	2016-12-21 16:20:11 +01:00
Nick Terrell	8157a4c3cc	Fix dictionary loading bug causing an MSAN failure Offset rep codes must be in the range `[1, dictSize)`. Fix dictionary loading to reject `0` as an offset rep code.	2016-12-20 10:47:52 -08:00
Yann Collet	d564faa3c6	fix : ZSTD_initCStream_srcSize() correctly set srcSize in frame header	2016-12-18 21:39:15 +01:00
Yann Collet	e795c8a5f6	Added ZSTD_initCStream_srcSize(). Added relevant test cases in zstreamtest	2016-12-13 17:00:14 +01:00
Yann Collet	c3a5c4bef8	introduced cycleLog	2016-12-12 00:47:30 +01:00
Yann Collet	c261f71f6a	minor variation of rescale fix	2016-12-12 00:25:07 +01:00
Nick Terrell	3826207a70	Simplify segfault fix Take advantage of the fact that `chainLog <= windowLog`.	2016-12-10 18:46:55 -08:00
Nick Terrell	0012332ce0	Fix compression segfault When the overflow protection kicks in, it makes sure that ip - ctx->base isn't too large. However, it didn't ensure that saved offsets are still valid. This change ensures that any valid offsets (<= windowLog) are still representable after the update. The bug would show up on line 1056, when `offset_1 > current + 1`, which causes an underflow. This in turn, would cause a segfault on line 1063. The input must necessarily be longer than 1 GB for this issue to occur. Even then, it only occurs if one of the last 3 matches is larger than the chain size and block size.	2016-12-09 17:15:33 -08:00
Yann Collet	a0d742b1e4	introduced HUF_buildCTable_wksp(), to reduce stack memory usage	2016-12-01 17:47:30 -08:00
Yann Collet	643d9a234b	replaced usage of FSE_buildCTable by FSE_buildCTable_wksp, using less stack space in the process	2016-12-01 16:24:04 -08:00
Yann Collet	e928f7e16d	introduced ext_wksp variants of count to reduce stack memory usage	2016-12-01 16:13:35 -08:00
Yann Collet	d79a9a00d9	Introduced FSE_compress_wksp() and FSE_buildCTable_wksp() to reduce stack memory usage	2016-11-30 15:52:20 -08:00
Yann Collet	25f46dcc0f	minor const	2016-11-29 16:59:27 -08:00
Przemyslaw Skibinski	3d18088b38	updated windres	2016-11-17 18:04:41 +01:00
Yann Collet	407a11f63e	fixed Visual compatibility	2016-11-03 15:52:01 -07:00
Nick Terrell	d82efd8a70	ZSTD_compress_usingDict() when dict gets loaded Specify that when `dict == NULL || dictSize < 8` no dictionary gets loaded. Also add some periods.	2016-11-02 18:07:16 -07:00
Yann Collet	ee5b725823	ZSTD_initCStream() optimization : do not allocate a CDict when no dictionary used	2016-10-27 14:20:55 -07:00
Yann Collet	335ad5d4d4	added ZSTD_initDStream_usingDDict() . slightly optimized ZSTD_initDStream() when no dictionary . fixed ZSTD_sizeof_CStream() .	2016-10-25 17:47:02 -07:00
Yann Collet	9516234e67	first sketch for ZSTD_initCStream_usingCDict()	2016-10-25 16:19:52 -07:00
Yann Collet	62d9a7ddfd	Merge pull request #429 from inikep/btopt2 Btopt2	2016-10-25 14:48:43 -07:00
Przemyslaw Skibinski	5c5f01f3da	added ZSTD_btopt2 strategy	2016-10-25 12:25:07 +02:00
Nick Terrell	b2c39a22b0	Fix compiler narrowing warning	2016-10-24 14:50:13 -07:00
Nick Terrell	f698ad6deb	Merge remote-tracking branch 'upstream/dev' into fixes * upstream/dev: added doc\zstd_manual.html added contrib\gen_html zstd_compression_format.md moved to doc/ Fix small bug in ZSTD_execSequence() improved ZSTD_compressBlock_opt_extDict_generic protect ZSTD_decodeFrameHeader() from invalid usage, as suggested by @spaskob zstd_opt.h: small improvement in compression ratio improved dictionary segment merge use implicit rules to compile zstd_decompress.c detect early impossible decompression scenario in legacy decoder v0.5 no repeat mode in legacy v0.5 fixed invalid invocation of dictionary in legacy decoder v0.5 fix edge case fix command line interpretation fixed minor corner case zstd.h: added the Introduction section fixed clang 3.5 warnings zstd.h: updated comments	2016-10-24 13:10:13 -07:00
Nick Terrell	f9c9af3c2e	Reject dictionaries with incomplete entropy tables If a dictionary specifies that a symbol has probability zero in its `matchLength`, `literalLength`, or `offset` FSE table, but the symbol appears when compressing input, the compressor fails. Ensure that dictionaries support all `matchLength` and `literalLength` codes. They must also support all of the `offset` codes required to represent every possible offset that can appear in the first block.	2016-10-24 10:42:44 -07:00
Przemyslaw Skibinski	3ee94a7600	zstd_compression_format.md moved to doc/	2016-10-24 15:58:07 +02:00
Nick Terrell	bfd943ace5	Fix buffer overrun in ZSTD_loadDictEntropyStats() The table log set by `FSE_readNCount()` was not checked in `ZSTD_loadDictEntropyStats()`. This caused `FSE_buildCTable()` to stack/heap overflow in a few places. The benchmarks look good, there is no obvious compression performance regression: > ./zstds/zstd.opt.0 -i10 -b1 -e10 ~/bench/silesia.tar 1#silesia.tar : 211988480 -> 73656930 (2.878), 271.6 MB/s , 716.8 MB/s 2#silesia.tar : 211988480 -> 70162842 (3.021), 204.8 MB/s , 671.1 MB/s 3#silesia.tar : 211988480 -> 66997986 (3.164), 156.8 MB/s , 658.6 MB/s 4#silesia.tar : 211988480 -> 66002591 (3.212), 136.4 MB/s , 665.3 MB/s 5#silesia.tar : 211988480 -> 65008480 (3.261), 98.9 MB/s , 647.0 MB/s 6#silesia.tar : 211988480 -> 62979643 (3.366), 65.2 MB/s , 670.4 MB/s 7#silesia.tar : 211988480 -> 61974560 (3.421), 44.9 MB/s , 688.2 MB/s 8#silesia.tar : 211988480 -> 61028308 (3.474), 32.4 MB/s , 711.9 MB/s 9#silesia.tar : 211988480 -> 60416751 (3.509), 21.1 MB/s , 718.1 MB/s 10#silesia.tar : 211988480 -> 60174239 (3.523), 22.2 MB/s , 721.8 MB/s > ./compress_zstds/zstd.opt.1 -i10 -b1 -e10 ~/bench/silesia.tar 1#silesia.tar : 211988480 -> 73656930 (2.878), 273.8 MB/s , 722.0 MB/s 2#silesia.tar : 211988480 -> 70162842 (3.021), 203.2 MB/s , 666.6 MB/s 3#silesia.tar : 211988480 -> 66997986 (3.164), 157.4 MB/s , 666.5 MB/s 4#silesia.tar : 211988480 -> 66002591 (3.212), 132.1 MB/s , 661.9 MB/s 5#silesia.tar : 211988480 -> 65008480 (3.261), 96.8 MB/s , 641.6 MB/s 6#silesia.tar : 211988480 -> 62979643 (3.366), 63.1 MB/s , 677.0 MB/s 7#silesia.tar : 211988480 -> 61974560 (3.421), 44.3 MB/s , 678.2 MB/s 8#silesia.tar : 211988480 -> 61028308 (3.474), 33.1 MB/s , 708.9 MB/s 9#silesia.tar : 211988480 -> 60416751 (3.509), 21.5 MB/s , 710.1 MB/s 10#silesia.tar : 211988480 -> 60174239 (3.523), 21.9 MB/s , 723.9 MB/s	2016-10-17 16:55:52 -07:00
Yann Collet	2b361cf2f1	minor opt	2016-10-14 16:09:07 -07:00
Nick Terrell	3b9cdf9220	Fix ubsan failures (pass NULL to memcpy)	2016-10-12 20:54:42 -07:00
Yann Collet	cf409a7e2a	fixed : init*_advanced() followed by reset() with different pledgedSrcSiz	2016-09-26 16:41:05 +02:00
Yann Collet	97b378a6f8	Streaming : dictionary compression on multiple files / segments can correctly provide srcSize into header (when provided) using pledgedSrcSize.	2016-09-21 17:20:19 +02:00
Yann Collet	993060e0f2	cli : better adaptation to small files	2016-09-21 16:46:08 +02:00
Yann Collet	a6bdf55759	fixed memory leak	2016-09-15 17:02:06 +02:00

338 Commits