facebook/zstd - zstd - Final Minetest

Commit Graph

Author	SHA1	Message	Date
Nick Terrell	f15a17e19f	Use a single buffer in zstdmt Summary: Allocate a single input buffer large enough to house each job, as well as enough space for the IO thread to write 2 extra buffers. One goes in the `POOL` queue, and one to fill, and then block on a full `POOL` queue. Since we can't overlap with the prefix, we allocate space for 3 extra input buffers. Test Plan: * CI * With and without ASAN/UBSAN run zstdmt with different number of threads on two large binaries, and verify that their checksums match. * Test on the tip of the zstdmt ldm integration. Reviewers: cyan Differential Revision: https://phabricator.intern.facebook.com/D7284007 Tasks: T25664120	2018-03-15 16:21:33 -07:00
Yann Collet	192542b63c	Merge pull request #1047 from facebook/hufCompress removed huf_compress_impl.h	2018-03-15 14:14:03 -07:00
Nick Terrell	a271399c97	Expose reference external sequence API Summary: * Expose the reference external sequences API for zstdmt. Allows external sequences of any length, which get split when necessary. * Reset the LDM window when the context is reset. * Store the maximum number of LDM sequences. * Sequence generation now returns the number of last literals. * Fix sequence generation to not throw out the last literals when blocks of more than 1 MB are encountered. Test Plan: * CI * Test the zstdmt ldm integration stacked on top of this diff Reviewers: cyan Differential Revision: https://phabricator.intern.facebook.com/D7283968 Tasks: T25664120	2018-03-14 18:07:53 -07:00
Nick Terrell	1908c92c46	Merge remote-tracking branch 'upstream/dev' into extern-seq * upstream/dev: Fix overflow protection with wlog=31	2018-03-14 17:26:31 -07:00
Yann Collet	a909c293c6	Merge branch 'dev' into hufCompress	2018-03-14 16:11:25 -07:00
Nick Terrell	a9a6dcba63	Expose reference external sequence API * Expose the reference external sequences API for zstdmt. Allows external sequences of any length, which get split when necessary. * Reset the LDM window when the context is reset. * Store the maximum number of LDM sequences. * Sequence generation now returns the number of last literals. * Fix sequence generation to not throw out the last literals when blocks of more than 1 MB are encountered.	2018-03-14 12:29:31 -07:00
Nick Terrell	33fb966e56	Fix overflow protection with wlog=31 The overflow protection is broken when the window log is `> (3U << 29)`, so 31. It doesn't work when `current` isn't around `1U << windowLog` ahead of `lowLimit`, and the assertion `current > newCurrent` fails. This happens when the same context is used many times over, but with a large window log, like in zstdmt. Fix it by triggering correction based on `nextSrc - base` instead of `lowLimit`. The added test fails before the patch, and passes after.	2018-03-14 11:45:44 -07:00
Yann Collet	50f763ec44	fixed several comments as underlined by @terrelln	2018-03-13 14:23:14 -07:00
Yann Collet	a95a88af57	removed huf_compress_impl.h re-imported all functions inside huf_compress.c for easier source editing. Also updated a bunch of code comments for clarification.	2018-03-13 14:14:05 -07:00
Yann Collet	2291b85a1e	changed ZSTD_p_literalCompression into ZSTD_p_compressLiterals prefer verb+object construction	2018-03-12 11:44:10 -07:00
Yann Collet	6a9b41b731	create command --fast[=#] access negative compression levels from command line for both compression and benchmark modes. also : ensure proper propagation of parameters through ZSTD_compress_generic() interface. added relevant cli tests.	2018-03-11 20:01:23 -07:00
Yann Collet	a146ee04ae	added negative compression levels negative compression levels trade compression ratio for more compression speed. They turn off huffman compression of literals, and use row 0 as baseline with a stepSize = -cLevel. added associated test in fuzzer also added : new advanced parameter ZSTD_p_literalCompression	2018-03-11 05:21:53 -07:00
Yann Collet	facc09aa03	minor compression level adaptation level 12 compresses slightly more and faster due to better btlazy2 mode	2018-03-11 03:06:52 -07:00
Yann Collet	ccb7184a76	Merge pull request #1026 from terrelln/lrm-window LDM manages its own window round buffer	2018-02-27 17:09:10 -08:00
Nick Terrell	0a0e64c641	LDM manages its own window round buffer	2018-02-27 12:13:23 -08:00
Yann Collet	2c4d3f339a	Merge pull request #1025 from facebook/huf Huf	2018-02-27 09:57:01 -08:00
Yann Collet	33a3f18848	fixed wrong size test	2018-02-26 18:27:51 -08:00
Yann Collet	6cdf690441	minor cleaning of huff0 Update code documentation, and properly names a few "magic constants". Also, HUF_compress_internal() gets a cleaner way to determine size of tables inside workspace.	2018-02-26 14:52:23 -08:00
Nick Terrell	7e5e226cbf	Split the window state into substructure	2018-02-26 13:29:57 -08:00
Yann Collet	50bc2ce95e	Merge pull request #1021 from terrelln/lrm-split Split block compresser out of long range matcher	2018-02-23 17:36:51 -08:00
Yann Collet	653383f74a	minor nit from Mac XCode	2018-02-22 15:44:26 -08:00
Nick Terrell	7e2bf4ebad	Remove long range matcher immediate repcode check The compression ratio gets about 0.01% worse on the files I tested, but the code is much simpler.	2018-02-22 15:18:47 -08:00
Nick Terrell	af866b3a58	Split block compresser out of long range matcher * `ZSTD_ldm_generateSequences()` generates the LDM sequences and stores them in a table. It should work with any chunk size, but is currently only called one block at a time. * `ZSTD_ldm_blockCompress()` emits the pre-defined sequences, and instead of encoding the literals directly, it passes them to a secondary block compressor. The code to handle chunk sizes greater than the block size is currently commented out, since it is unused. The next PR will uncomment and exercise this code. * During optimal parsing, ensure LDM `minMatchLength` is at least `targetLength`. Also don't emit repcode matches in the LDM block compressor. Enabling the LDM with the optimal parser now actually improves the compression ratio. * The compression ratio is very similar to before. It is very slightly different, because the repcode handling is slightly different. If I remove immediate repcode checking in both branches the compressed size is exactly the same. * The speed looks to be the same or better than before. Up Next (in a separate PR) -------------------------- Allow sequence generation to happen prior to compression, and produce more than a block worth of sequences. Expose some API for zstdmt to consume. This will test out some currently untested code in `ZSTD_ldm_blockCompress()`.	2018-02-22 15:18:41 -08:00
Yann Collet	9c5a8040a9	fixed huf_compress workspace size	2018-02-21 11:34:49 -08:00
Yann Collet	010ba5f71f	Merge pull request #1017 from terrelln/c-bmi2 [compress] Support BMI2	2018-02-20 15:34:59 -08:00
Nick Terrell	6e128d3534	[BMI2] Add comments to the bmi2 variable in the contexts	2018-02-20 14:12:11 -08:00
Nick Terrell	b58f01537e	[compress] Support BMI2	2018-02-14 19:20:32 -08:00
Yann Collet	5cb1144872	fixed --single-thread was incorrectly set to -T0 (use as many cores as possible) previously	2018-02-13 14:56:35 -08:00
Yann Collet	5f7495371e	Merge branch 'dev' into fasterDec	2018-02-10 14:24:44 -08:00
Yann Collet	9945e60ac4	Merge branch 'dev' into flexibleLevel	2018-02-10 11:54:49 -08:00
Yann Collet	c72091556b	fixed minor nit as per @terrelln's comments	2018-02-09 09:46:08 -08:00
Yann Collet	95424409ea	addBits and baseline into FSE decoding table note : unfinished - need new default tables - need modify long mode	2018-02-09 04:25:15 -08:00
Yann Collet	de68c2ff10	Merged ZSTD_preserveUnsortedMark() into ZSTD_reduceIndex() as it's faster, due to one memory scan instead of two (confirmed by microbenchmark). Note : as ZSTD_reduceIndex() is rarely invoked, it does not translate into a visible gain. Consider it an exercise in auto-vectorization and micro-benchmarking.	2018-02-07 14:22:35 -08:00
Yann Collet	0170cf9a7a	minor : modified ZSTD_preserveUnsortedMark() to be more vectorization friendly	2018-02-05 11:46:02 -08:00
Yann Collet	5188749e1c	ensure compression parameters are updated when only compression level is changed	2018-02-02 16:31:20 -08:00
Yann Collet	4b525af53a	zstdmt: applies new parameters on the fly when invoked from ZSTD_compress_generic()	2018-02-02 15:58:13 -08:00
Yann Collet	90eca318a7	fileio: create dedicated function to generate zstd frames like other formats	2018-02-02 14:24:56 -08:00
Yann Collet	209df52ba2	Changed nbThreads for nbWorkers This makes it easier to explain that nbWorkers=0 --> single-threaded mode, while nbWorkers=1 --> asynchronous mode (one more thread on top of the "main" caller thread). No need for an additional asynchronous mode flag. nbWorkers>=2 works the same as nbThreads>=2 previously.	2018-02-01 19:29:30 -08:00
Yann Collet	60fa90b6c0	zstdmt: added ability to change compression parameters during compression	2018-02-01 16:13:31 -08:00
Nick Terrell	48acaddff9	Test for incorrect pledgeSrcSize earlier	2018-02-01 12:04:05 -08:00
Yann Collet	727bb7f090	Merge pull request #1008 from terrelln/hlog3 Fix hashLog3 size when copying cdict tables	2018-01-31 12:49:07 -08:00
Nick Terrell	ab3346af07	Fix hashLog3 size when copying cdict tables	2018-01-31 11:12:17 -08:00
Yann Collet	823a28a1f4	Merge pull request #1000 from facebook/progressiveFlush Progressive flush	2018-01-30 22:49:47 -08:00
Yann Collet	2cb0740b6b	zstdmt: changed naming convention to avoid confusion with blocks. also: - jobs are cut into chunks of 512KB now, to reduce nb of mutex calls. - fix function declaration ZSTD_getBlockSizeMax() - fix outdated comment	2018-01-30 14:43:36 -08:00
Yann Collet	ba0cd8cf78	fixed minor conversion warning for C++ compilation mode	2018-01-26 18:18:42 -08:00
Yann Collet	caf9e96dc3	job mutex creation is checked	2018-01-26 18:09:25 -08:00
Yann Collet	9c40ae7ff1	zstdmt: there is now one mutex/cond per job	2018-01-26 17:55:08 -08:00
Yann Collet	77e36273de	zstdmt: minor code refactor for clarity	2018-01-26 17:08:58 -08:00
Yann Collet	27c5853c42	zstdmt: job table correctly cleaned after synchronous ZSTDMT_compress()	2018-01-26 14:35:54 -08:00
Yann Collet	0d426f6b83	zstdmt : refactor a few member names for clarity	2018-01-26 13:00:14 -08:00
Yann Collet	79b6e28b0a	zstdmt : flush() only lock to read shared job members Other job members are accessed directly. This avoids a full job copy, which would access everything, including a few members that are supposed to be used by worker only, uselessly requiring additional locks to avoid race conditions.	2018-01-26 12:15:43 -08:00
Yann Collet	d2b62b6fa5	minor : ZSTDMT_writeLastEmptyBlock() is a void function because it cannot fail	2018-01-26 11:06:34 -08:00
Yann Collet	fca13c6855	zstdmt : fixed memory leak writeLastEmptyBlock() must release srcBuffer as mtctx assumes it's done by job worker. minor : changed 2 job member names (src->srcBuffer, srcStart->prefixStart) for clarity	2018-01-26 10:44:09 -08:00
Yann Collet	8e128eaf05	zstdmt : refactor job members grouped by sharing properties	2018-01-26 10:20:38 -08:00
Yann Collet	777d3c1559	fixed minor declaration-after-statement warning	2018-01-25 17:45:18 -08:00
Yann Collet	a1d4041e69	zstdmt: removed job->jobCompleted replaced by equivalent signal job->consumer == job->srcSize. created additional functions ZSTD_writeLastEmptyBlock() and ZSTDMT_writeLastEmptyBlock() required when it's necessary to finish a frame with a last empty job, to create an "end of frame" marker. It avoids creating a job with srcSize==0.	2018-01-25 17:35:49 -08:00
Yann Collet	1272d8e760	zstdmt: renamed mutex and cond to underline they are context-global	2018-01-25 14:52:34 -08:00
Yann Collet	5f349b129c	zstdmt : correctly set end of frame	2018-01-23 15:52:40 -08:00
Yann Collet	c1cc57f270	zstdmt : fix end condition (ZSTD_e_end) When ZSTD_e_end directive is provided, the question is not only "are internal buffers completely flushed", it is also "is current frame completed". In some rare cases, it was possible for internal buffers to be completely flushed, triggering a @return == 0, but frame was not completed as it needed a last null-size block to mark the end, resulting in an unfinished frame.	2018-01-23 15:19:11 -08:00
Yann Collet	de5e38a7a6	zstdmt: fixed minor race condition no real consequence, but pollutes tsan tests : job->dstBuff is being modified inside worker, while main thread might read it accidentally because it copies whole job. But since it doesn't use dstBuff, there is no real consequence. Other potential solution : only copy useful data, instead of whole job	2018-01-23 14:03:07 -08:00
Yann Collet	ebd955e26a	zstdmt : fixed ending frame with 0-size block	2018-01-23 13:12:40 -08:00
Yann Collet	6711396d97	zstreamtest : fixed test 32 : multi-thread compression using ZSTD_compress_generic(,,ZSTD_e_end) Since it already provides ZSTD_e_end as directive, it should not be followed by ZSTDMT_endStream().	2018-01-19 22:20:53 -08:00
Yann Collet	a7ef3a219c	zstdmt : fixed last job size	2018-01-19 18:19:09 -08:00
Yann Collet	3ad7d4951c	zstdmt : finally vanquished an elusive and rare race condition	2018-01-19 17:35:08 -08:00
Yann Collet	940634a610	zstdmt : simplify job creation job will not be created when not enough room within job Table	2018-01-19 13:25:06 -08:00
Yann Collet	dc69623453	zstdmt: fixed corruption issue in ZSTDMT_endStream() when invoked directly.	2018-01-19 12:41:56 -08:00
Yann Collet	70f81d6030	zstdmt uses POOL_tryAdd() to call a new worker so that it's no longer a blocking call. This makes it possible to stream out data gradually, while waiting for a worker to become available.	2018-01-19 10:01:40 -08:00
Yann Collet	d19dc1903c	Merge pull request #995 from facebook/progressiveMT Progressive mt	2018-01-18 17:59:49 -08:00
Yann Collet	6f7280fb33	fixed frame checksum issue and race conditions	2018-01-18 16:20:26 -08:00
Yann Collet	4f43ef731d	Merge branch 'dev' into constCDict	2018-01-18 13:36:43 -08:00
Yann Collet	ef97d5a287	Merge branch 'progressiveMT' into progressiveFlush	2018-01-18 13:35:24 -08:00
Yann Collet	b6ab232f2d	Merge branch 'dev' into progressiveMT	2018-01-18 13:34:56 -08:00
Nick Terrell	9d96761520	Set repcodes for empty ZSTD_CDict When the dictionary is <= 8 bytes, no data is loaded from the dictionary. In this case the repcodes weren't set, because they were inserted after the size check. Fix this problem in general by first setting the cdict state to a clean state of an empty dictionary, then filling the state from there.	2018-01-18 13:28:30 -08:00
Yann Collet	c7190c69cc	fixes for @terrelln comments	2018-01-18 11:15:23 -08:00
Yann Collet	1b5d80d633	zstdmt: added ability to flush current job before it's completed however, zstdmt may still wait on next available worker, so it's not smooth yet.	2018-01-18 11:03:27 -08:00
Yann Collet	aa79c18e3f	fixed a few access contention passes thread sanitizer test	2018-01-17 17:18:19 -08:00
Yann Collet	394eec697b	Introduce ZSTD_getFrameProgression() Produces 3 statistics for ongoing frame compression : - ingested - consumed (effectively compressed) - produced Ingested can be larger than consumed due to buffering effect. For the time being, this patch mostly fixes the % ratio issue, since it computes consumed / produced, instead of ingested / produced. That being said, update is not "smooth", because on a slow enough setting, fileio spends most of its time waiting for a worker to complete its job. This could be improved thanks to more granular flushing i.e. start flushing before ongoing job is fully completed.	2018-01-17 16:39:02 -08:00
Yann Collet	f3b8f90b6d	changed initStatic?Dict() return type to const ZSTD_?Dict* ZSTD_create?Dict() is required to produce a ?Dict* return type because `free()` does not accept a `const type` argument. If it wasn't for this restriction, I would have preferred to create a `const ?Dict` object to emphasize the fact that, once created, a dictionary never changes (hence can be shared concurrently until the end of its lifetime). There is no such limitation with initStatic?Dict() : as stated in the doc, there is no corresponding free() function, since `workspace` is provided, hence allocated, externally, it can only be free() externally. Which means, ZSTD_initStatic?Dict() can return a `const ZSTD_?Dict*` pointer. Tested with `make all`, to catch initStatic's users, which, incidentally, also updated zstd.h documentation.	2018-01-17 14:08:48 -08:00
Yann Collet	b86865323a	Merge branch 'dev' into progressiveMT fixed minor conflict on cdict	2018-01-17 13:51:03 -08:00
Yann Collet	d14cc881b0	zstdmt : fixed very large window sizes would create too large buffers, since default job size == window size * 4. This would crash on 32-bit systems. Also : jobSize being a 32-bit unsigned, it cannot be >= 4 GB, so the formula was failing for large window sizes >= 1 GB. Fixed now : max job Size is 2 GB, whatever the window size.	2018-01-17 12:39:58 -08:00
Yann Collet	58dd7de640	zstdmt: fixed an endless loop on allocation failure this happened on 32-bits build when requiring a too large input buffer, typically on wlog=29, creating jobs of 2 GB size. also : zstd32 now compiles with multithread support enabled by default (can be disabled with HAVE_THREAD=0)	2018-01-17 12:10:15 -08:00
Nick Terrell	16bd0fd4df	Reduce size of ZSTD_CDict Shaves 492,076 B off of the `ZSTD_CDict`. The size of a `ZSTD_CDict` created from a 112,640 B dictionary is: \| Level \| Before (B) \| After (B) \| \|-------\|------------\|-----------\| \| 1 \| 648,448 \| 156,412 \| \| 3 \| 1,140,008 \| 647,932 \|	2018-01-17 11:50:49 -08:00
Yann Collet	cb57c107ff	zstdmt: minor variable renaming, for clarity	2018-01-17 11:39:07 -08:00
Yann Collet	1dba98d563	introduced parameter ZSTD_p_nonBlockingMode This new parameter makes it possible to call streaming ZSTDMT with a single thread set which is non blocking. It makes it possible for the main thread to do other tasks in parallel while the worker thread does compression. Typically, for zstd cli, it means it can do I/O stuff. Applied within fileio.c, this patch provides non-negligible gains during compression. Tested on my laptop, with enwik9 (1000000000 bytes) : time zstd -f enwik9 With traditional single-thread blocking mode : real 0m9.557s user 0m8.861s sys 0m0.538s With new single-worker non blocking mode : real 0m7.938s user 0m8.049s sys 0m0.514s => 20% faster	2018-01-16 16:15:47 -08:00
Yann Collet	6025465e42	ZSTDMT : minor CCtx memory optimization can be useful when a compression job only has small amount of data to compress.	2018-01-16 15:34:41 -08:00
Yann Collet	2e23333094	ZSTDMT can now work in non-blocking mode with 1 thread it still falls back to single-thread blocking invocation when input is small (<1job) or when invoking ZSTDMT_compress(), which is blocking. Also : fixed a bug in new block-granular compression routine.	2018-01-16 15:28:43 -08:00
Yann Collet	8e83c5c910	Merge branch 'dev' into progressiveMT	2018-01-16 12:54:33 -08:00
Nick Terrell	aae267a2e1	Reorganize block state	2018-01-16 11:17:50 -08:00
Nick Terrell	887cd4e35e	Split ZSTD_CCtx into smaller sub-structures	2018-01-16 11:17:50 -08:00
Yann Collet	9477f6529d	Merge pull request #984 from terrelln/dict-load Load more dictionary positions into table if empty	2018-01-13 13:20:42 -08:00
Yann Collet	58ecf13e02	zstdmt : can compress at block granularity offering perspective of more accurate progression report.	2018-01-13 13:18:57 -08:00
Nick Terrell	9a211d1f05	Load more dictionary positions into table if empty If the hash table is empty load positions into the hash table that we would otherwise skip. \| Level \| Data Set \| Improvement \| \|-------\|--------------\|-------------\| \| 1 \| github \| 0.44% \| \| 1 \| hg-changelog \| 0.13% \| \| 1 \| hg-commands \| 1.28% \| \| 1 \| hg-manifest \| 0.70% \| \| 3 \| github \| 0.74% \| \| 3 \| hg-changelog \| 0.87% \| \| 3 \| hg-commands \| 1.74% \| \| 3 \| hg-manifest \| 0.23% \|	2018-01-12 16:17:22 -08:00
Yann Collet	863b2f8db4	Merge pull request #983 from terrelln/dict-wlog Increase windowLog from CDict based on the srcSize when known	2018-01-12 07:47:43 -08:00
Nick Terrell	b610b777d3	Increase windowLog from CDict based on the srcSize when known	2018-01-11 16:23:21 -08:00
Yann Collet	cacf47cbee	Merge branch 'dev' into dubtlazy and fixed conflicts	2018-01-11 13:25:08 -08:00
Yann Collet	b9a14900ff	changed function name to ZSTD_DUBT_findBestMatch()	2018-01-11 12:38:31 -08:00
Yann Collet	e8093dde09	fixed #304 Pathological samples may result in literal section being incompressible. This case is now detected, and literal distribution is replaced by one that can be written into the dictionary.	2018-01-11 11:16:32 -08:00
Yann Collet	218e9fe0fc	added a test case for dictBuilder failure cyclic data set makes the entropy stage fail now, onto a fix for #304 ...	2018-01-11 09:42:38 -08:00
Yann Collet	3ea156368c	API doc : grouped ZSTD_initStatic*() together within "memory management" category.	2018-01-10 08:49:50 -08:00
Yann Collet	b17fb488b0	fixed msan test a pointer calculation was wrong in a corner case	2018-01-06 20:50:36 +01:00
Yann Collet	a927fae2a1	fixed ZSTD_reduceIndex() following suggestions from @terrelln. Also added some comments to present logic behind ZSTD_preserveUnsortedMark().	2018-01-06 12:31:26 +01:00
Yann Collet	00db4dbbb3	fixed minor argument property for Visual	2017-12-30 15:42:28 +01:00
Yann Collet	f597f55675	improved btlazy2 : list of unsorted candidates can reach extDict It used to stop on reaching extDict, for simplification. As a consequence, there was a small loss of performance each time the round buffer would restart from beginning. It's not a large difference though, just several hundreds of bytes on silesia. This patch fixes it.	2017-12-30 15:12:59 +01:00
Yann Collet	a68b76afef	updated compression level table for btlazy2 now selected for levels 13, 14 and 15. Also : dropped the requirement for monotonic memory budget increase of compression levels, which was required for ZSTD_estimateCCtxSize() in order to ensure that a memory budget for level L is large enough for any level <= L. This condition is now ensured at run time inside ZSTD_estimateCCtxSize().	2017-12-30 11:40:35 +01:00
Yann Collet	eb52e2f45e	simplify ZSTD_preserveUnsortedMark() implementation since no compiler attempts to auto-vectorize it.	2017-12-30 11:13:52 +01:00
Yann Collet	d228b6b0d0	btlazy2 : optimization for dictionary compression we want the dictionary table to be fully sorted, not just lazily filled. Dictionary loading is a bit more intensive, but it saves cpu cycles for match search during compression.	2017-12-29 19:14:18 +01:00
Yann Collet	02f64ef955	btlazy2: fixed interaction between unsortedMark and reduceTable	2017-12-29 19:08:51 +01:00
Yann Collet	64482c2c97	fixed bug in dubt the chain of unsorted candidates could grow beyond lowLimit.	2017-12-29 17:04:37 +01:00
Yann Collet	f36da5b4d9	minor speed optimization : index overflow prevention new code supposed to be easier to auto-vectorize	2017-12-29 14:40:33 +01:00
Yann Collet	5235d8d6ba	first implementation of delayed update for btlazy2 This is a pretty nice speed win. The new strategy consists in stacking new candidates as if it was a hash chain. Then, only if there is a need to actually consult the chain, they are batch-updated, before starting the match search itself. This is supposed to be beneficial when skipping positions, which happens a lot when using lazy strategy. The baseline performance for btlazy2 on my laptop is : 15#calgary.tar : 3265536 -> 955985 (3.416), 7.06 MB/s , 618.0 MB/s 15#enwik7 : 10000000 -> 3067341 (3.260), 4.65 MB/s , 521.2 MB/s 15#silesia.tar : 211984896 -> 58095131 (3.649), 6.20 MB/s , 682.4 MB/s (only level 15 remains for btlazy2, as this strategy is squeezed between lazy2 and btopt) After this patch, and keeping all parameters identical, speed is increased by a pretty good margin (+30-50%), but compression ratio suffers a bit : 15#calgary.tar : 3265536 -> 958060 (3.408), 9.12 MB/s , 621.1 MB/s 15#enwik7 : 10000000 -> 3078318 (3.249), 6.37 MB/s , 525.1 MB/s 15#silesia.tar : 211984896 -> 58444111 (3.627), 9.89 MB/s , 680.4 MB/s That's because I kept `1<<searchLog` as a maximum number of candidates to update. But for a hash chain, this represents the total number of candidates in the chain, while for the binary, it represents the maximum depth of searches. Keep in mind that a lot of candidates won't even be visited in the btree, since they are filtered out by the binary sort. As a consequence, in the new implementation, the effective depth of the binary tree is substantially shorter. To compensate, it's enough to increase `searchLog` value. 
Here is the result after adding just +1 to searchLog (level 15 setting in this patch): 15#calgary.tar : 3265536 -> 956311 (3.415), 8.32 MB/s , 611.4 MB/s 15#enwik7 : 10000000 -> 3067655 (3.260), 5.43 MB/s , 535.5 MB/s 15#silesia.tar : 211984896 -> 58113144 (3.648), 8.35 MB/s , 679.3 MB/s aka, almost the same compression ratio as before, but with a noticeable speed increase (+20-30%). This modification makes btlazy2 more competitive. A new round of paramgrill will be necessary to determine which levels are impacted and could adopt the new strategy.	2017-12-28 16:58:57 +01:00
Yann Collet	473362e922	Merge pull request #958 from facebook/continueCCtx fix a subtle issue in continue mode	2017-12-20 00:12:50 +01:00
Yann Collet	cafedcbbe4	ZSTD_resetCCtx_internal: fixed order of arguments params1 was swapped with params2. This used to be a non-issue when testing for strict equality, but now that some tests look for "sufficient size" `<=`, order matters.	2017-12-19 21:49:04 +01:00
Yann Collet	9096088f45	changed variable name for clarity, suggested by @terrelln	2017-12-19 21:20:46 +01:00
Yann Collet	f299fa39ac	fix a subtle issue in continue mode The deep fuzzer tests caught a subtle bug that was probably there for a long time. The impact of the bug is not a crash, or any other clear error signal, rather, it reduces performance, by cutting data into smaller blocks. Eventually, the following test would fail because it produces too many 1-byte blocks, requiring more space than buffer can provide : `./zstreamtest_asan --mt -s3514 -t1678312 -i1678314` The root scenario is as follows : - Create context, initialize it using explicit parameters or a `cdict` to pin them down, set `pledgedSrcSize=1` - The compression parameters will not be adapted, but `windowSize` and `blockSize` will be automatically set to `1`. `windowSize` and `blockSize` are dynamic values, set within `ZSTD_resetCCtx_internal()`. The automatic adaptation makes it possible to generate smaller contexts for smaller input sizes. - Complete compression - New compression with same context, using same parameters, but `pledgedSrcSize=ZSTD_CONTENTSIZE_UNKNOWN` triggers "continue mode" - Continue mode doesn't modify blockSize, because it used to depend on `windowLog` only, but in fact, it also depends on `pledgedSrcSize`. - The "old" blocksize (1) is still there, next compression will use this value to cut input into blocks, resulting in more blocks and worse performance than necessary. Given the scenario, and its possible variants, I'm surprised it did not show up before. But I suspect it did show up, it's just that it never triggered an error, because "worse performance" is not a trigger. The above test is a special corner case, where performance is so impacted that it reaches an error case. The fix works, but I'm not completely pleased. I think the current code relies too much on implied relations between variables. This will likely break again in the future when some related part of the code changes. 
Unfortunately, no time to make larger changes if we want to keep the release target for zstd v1.3.3. So a longer term fix will have to be considered after the release. To do : create a reliable test case which triggers this scenario for CI tests.	2017-12-19 09:43:03 +01:00
Yann Collet	5c2f2ebfdb	zstdmt via compress_generic: reduce opportunity to free/create mtctx `zstreamtest --newapi` (and `--opaqueapi`) create and destroy way too many threads resulting in failure of tsan tests, and potentially connected to the qemu flaky tests. This is because, at each test, the nb of threads can be changed (random). The `--no-big-tests` directive reduces this choice to 1/2 threads, in order to limit memory usage, especially for qemu and 32-bits builds. Unfortunately, swapping between 1 and 2 threads is enough to constantly create/destroy new mtctx. This patch takes advantage of the following property : via compress_generic, no internal mtctx is needed for nbThreads < 2. As a consequence, when nbThreads == 2, the currently active mtctx is necessarily good. This dramatically reduces the nb of thread creations when invoking `zstreamtest --newapi --no-big-tests` (only when parent cctx itself is created, which is randomized to 1/256 tests). Expected outcome : - at a minimum : tsan tests shall now work continuously without exploding the thread counter - at best : flaky qemu tests on `zstreamtest --newapi --no-big-tests` may stop being flaky, due to less stress from constant thread creation/destruction Real world impact : minimal, I don't expect users to constantly change `nbThreads` between each invocation. If `nbThreads` remains stable, existing implementation re-uses existing mtctx. Also : `zstreamtest --newapi` but without `--no-big-tests` doesn't benefit as much, since this test can select a random `nbThreads` value between 1 and 4. The current patch only reduces opportunity to free/create mtctx (for example : 2->1->2 doesn't need a new mtctx) but doesn't completely eliminate it, since `nbThreads` can still change between 2/3/4. A more complete solution could be to only use 2 out of 4 allocated threads, thus keeping the pool at a constant size. This would require a larger change to `POOL_*` api though.	2017-12-16 12:48:13 -08:00
Yann Collet	3cbfac1cdb	updated levels 15-20 taking advantage of `btopt` improved speed to tune parameters. Levels 16-19 are stronger than previous release, making the graph more favorable. In theory, I should also update small-size tables, but I got lazy on that one ...	2017-12-14 23:29:00 -08:00
Yann Collet	8c41a9cb1e	Merge pull request #951 from facebook/lastBlock saves 3-bytes on small input with streaming API	2017-12-14 15:39:50 -08:00
Yann Collet	a0ac8c895c	Merge pull request #950 from facebook/srcSizeAdaptation fix adaptation on srcSize	2017-12-14 14:48:31 -08:00
Yann Collet	281f06e01f	saves 3-bytes on small input with streaming API zstd streaming API was adding a null-block at end of frame for small input. Reason is : on small input, a single block is enough. ZSTD_CStream would size its input buffer to expect a single block of this size, automatically triggering a flush on reaching this size. Unfortunately, that last byte was generally received before the "end" directive (at least in `fileio`). The later "end" directive would force the creation of a 3-bytes last block to indicate end of frame. The solution is to not flush automatically, which is btw the expected behavior. It happens in this case because blocksize is defined with exactly the same size as input. Just adding one-byte is enough to stop triggering the automatic flush. I initially looked at another solution, solving the problem directly in the compression context. But it felt awkward. Now, the underlying compression API `ZSTD_compressContinue()` would take the decision to close a frame on reaching its expected end (`pledgedSrcSize`). This feels awkward, a responsibility over-reach, beyond the definition of this API. ZSTD_compressContinue() is clearly documented as a guaranteed flush, with ZSTD_compressEnd() generating a guaranteed end. I faced a similar issue when trying to port a similar mechanism at the higher streaming layer. Having ZSTD_CStream end a frame automatically on reaching `pledgedSrcSize` can surprise the caller, since it did not explicitly request an end of frame. The only sensible action remaining after that is to end the frame with no additional input. This adds additional logic in the ZSTD_CStream state to check this condition. Plus some potential confusion on the meaning of ZSTD_endStream() with no additional input (ending confirmation ? new 0-size frame ?) In the end, just enlarging input buffer by 1 byte feels the least intrusive change. 
It's also a contract remaining inside the streaming layer, so the logic is contained in this part of the code. The patch also introduces a new test checking that size of small frame is as expected, without additional 3-bytes null block.	2017-12-14 11:47:02 -08:00
Yann Collet	c005df136f	Merge pull request #947 from facebook/fix944 Fix #944	2017-12-14 10:01:52 -08:00
Yann Collet	2e97a6d464	fixed minor declaration-after-statement warning	2017-12-13 18:50:05 -08:00
Yann Collet	5432ef6921	fixes adaptation on srcSize This patch restores capability for each file to receive adapted compression parameters depending on its size. The bug breaking this feature was relatively silly : setting a parameter with a value "0" is supposed to be a no-op. Unfortunately, it would pin down compression parameters as if they were manually set, preventing later automatic adaptation. Unfortunately, I'm currently short of a test case that could check this situation and trigger an error. Compression parameters selection between tableID 0,1,2,3 is largely internal, leaving no trace to outside world, not even in frame header.	2017-12-13 17:45:26 -08:00
Yann Collet	d23eb9a098	zstreamtest : added missing CHECK_Z()	2017-12-13 15:35:49 -08:00
Nick Terrell	22727a7467	Fix cdict compressor repcodes	2017-12-13 11:31:20 -08:00
Yann Collet	e28305fcca	fix #944 : ZSTDMT with large files and dictionary now works correctly windowLog is now enforced from provided compression parameters, instead of being copied blindly from `cdict` where it could be smaller. also : - fix a minor bug in zstreamtest --mt : advanced parameters must be set before init - changed advanced parameter name to ZSTDMT_jobSize	2017-12-12 18:04:58 -08:00
Yann Collet	03832b7aa5	re-added test case messing with revert ... :(	2017-12-12 14:01:54 -08:00
Yann Collet	8a104fda05	Revert "Created a test case which reliably reproduces bug #944" This reverts commit `5098d1fbe2`.	2017-12-12 12:51:49 -08:00
Yann Collet	5098d1fbe2	Created a test case which reliably reproduces bug #944 in zstreamtest.	2017-12-12 12:48:31 -08:00
Yann Collet	ac8e022806	Merge pull request #943 from facebook/fix942 Fix #942	2017-12-08 13:53:08 -05:00
Yann Collet	dfc697e967	comment clarification	2017-12-08 12:16:49 -05:00
Yann Collet	c029ee1f0b	ZSTD_initCStream_srcSize() considers "0" to mean "unknown" to not break existing programs relying on this behavior. Might be changed to mean "empty" in the future.	2017-12-07 17:13:10 -05:00
Yann Collet	3aa2b27a89	fix #942 : streaming interface does not compress after ZSTD_initCStream() While the final result is still, technically, a frame, the resulting frame expands initial data instead of compressing it. This is because the streaming API creates a tiny 1-byte buffer for input, because it believes input is empty (0-bytes), because in the past, 0 used to mean "unknown" instead. This patch fixes the issue. Todo : add a test which traps the issue.	2017-12-07 02:52:50 -05:00
Yann Collet	c173dbd6e7	no longer supported starting C++17	2017-12-04 18:00:53 -08:00
Yann Collet	7e05ef851a	Merge branch 'dev' into qemu32panic	2017-12-03 11:14:36 -08:00
Yann Collet	5e1f34b7e4	setParameter : no side-effect on setting a compression parameter last such side-effect was modifying cctx->loadedDictEnd on setting forceWindow. It is now a useless operation, so it's removed. No side-effect left when setting a compression parameter.	2017-12-01 21:17:09 -08:00
Yann Collet	78290874a5	fixed Visual warning on minor interface discrepancy	2017-11-29 17:01:14 -08:00
Yann Collet	d3c59edac9	removed long-range-mode tests from `zstreamtest --no-big-tests`	2017-11-29 16:42:20 -08:00
Yann Collet	998a93b784	simplified ZSTD_CCtx_setParametersUsingCCtxParams() Any ZSTD_CCtx_setParameter() shall just write the requested parameter, without further action. Any action shall be taken at parameter application only (during init). It makes it possible to just copy CCtxParams from external container to internal state, and get rid of the more complex code which was trying to compensate for missing actions.	2017-11-29 16:13:05 -08:00
Yann Collet	f98ee994c4	zstd_opt: added comments, as requested by @terrelln	2017-11-29 15:19:00 -08:00
Yann Collet	bc42bc3b1d	removed one invocation of SET_PRICE() macro	2017-11-28 16:08:56 -08:00
Yann Collet	0a0a212934	zstd_opt: changed cost formula There was a flaw in the formula which compared literal cost with match cost : at a given position, a non-null literal suite is going to be part of next sequence, while if position ends a previous match, to immediately start another match, next sequence will have a litlength of zero. A litlength of zero has a non-null cost. It follows that literals cost should be compared to match cost + litlength==0. Not doing so gave a structural advantage to matches, which would be selected more often. I believe that's what led to the creation of the strange heuristic which added a complex cost to matches. The heuristic was actually compensating. It was probably created through multiple trials, settling for best outcome on a given scenario (I suspect silesia.tar). The problem with this heuristic is that it's hard to understand, and unfortunately, any future change in the parser would impact the way it should be calculated and its effects. The "proper" formula makes it possible to remove this heuristic. Now, the problem is : in a head to head comparison, it's sometimes better, sometimes worse. Note that all differences are small (< 0.01 ratio). In general, the newer formula is better for smaller files (for example, calgary.tar and enwik7). I suspect that's because starting statistics are pretty poor (another area of improvement). However, for silesia.tar specifically, it's worse at level 22 (while being better at level 17, so even compression level has an impact ...). It's a pity that zstd -22 gets worse on silesia.tar. That being said, I like that the new code gets rid of strange variables, which were introducing complexity for any future evolution (faster variants being in mind). Therefore, in spite of this detrimental side effect, I tend to be in favor of it.	2017-11-28 14:07:03 -08:00
Yann Collet	b71405dc51	removed a bunch of code related to cached literal price optState was used both to evaluate price and to cache cost of previously calculated literals. This created a strong dependency, forcing parser to request cost in a strict order. This limitation forbids future parsers with skipping capabilities. After this patch, caching literals price still exists, but is now explicit, in a stack structure.	2017-11-28 12:32:24 -08:00
Yann Collet	03f30d9dcb	separate rawLiterals, fullLiterals and match costs removed one SET_PRICE() macro invocation	2017-11-28 12:14:46 -08:00
Yann Collet	eee87cd6f2	btopt: minor refactor : removed one SET_PRICE() macro invocation direct assignment makes operation cleaner. Also allows some (very minor) optimization (non-measurable)	2017-11-27 17:18:57 -08:00
Yann Collet	e9d1987fd7	btopt: minor speed optimization matchPrice is always right at beginning	2017-11-27 17:01:51 -08:00
Yann Collet	f8d5c478af	fixed comment, reported by @gyscos	2017-11-21 10:36:14 -08:00
Yann Collet	4154aec679	fixed comment, as suggested by @terrelln	2017-11-21 10:26:17 -08:00
Yann Collet	899f2a29f6	strategy ZSTD_btopt pinned to (0) variant (faster one)	2017-11-20 11:53:20 -08:00
Yann Collet	3f457264d1	slightly improved compression speed	2017-11-19 14:40:21 -08:00
Yann Collet	42c1e64270	slightly improved ratio at -22 merging of repcode search into btsearch introduced a small compression ratio regression at max level : 1.3.2 : 52728769 after repMerge patch : 52760789 (+32020) A few minor changes have produced this difference. They can be hard to spot. This patch buys back about half of the difference, by no longer inserting position at hc3 when a long match is found there. It feels strangely counter-intuitive, but works : after this patch : 52742555 (-18234)	2017-11-19 14:00:55 -08:00
Yann Collet	99435dbbab	minor : search early-out on sufficient_len for hc3 and rep very very small speed and ratio increases	2017-11-19 12:58:04 -08:00
Yann Collet	d100670045	btopt0 : a bit faster and weaker	2017-11-19 10:38:02 -08:00
Yann Collet	e6da37c430	created (hidden) new strategy btopt0 about ~+10% faster but losing ~0.01 compression ratio (note : amplitude vary a lot depending on files, but direction remains the same)	2017-11-19 10:21:21 -08:00
Yann Collet	e717a5b0dd	zstd_opt: minor speed optimization Calculate reference log2sums only once per series of sequences (as opposed to once per sequence) Also: improved code comments	2017-11-18 16:24:02 -08:00
Yann Collet	a4a20a4b2f	fix un-initialized memory warning harmless, but cleaner	2017-11-17 15:51:52 -08:00
Yann Collet	23767e950a	fix one UB pointer arithmetic in encoder Instead of calculating distance between 2 memory objects, which is UB, we extract the offset from object 1, and transfer it into object 2.	2017-11-17 13:24:51 -08:00
Yann Collet	11e58d9ba4	fixed minor warning warning: void function returning a value (even if the return value is void)	2017-11-16 15:21:30 -08:00
Yann Collet	15768cabb5	fixed some complex scenarios Fixed : multithreading to compress some small data with dictionary Fixed : ZSTD_initCStream_usingCDict() Improved streaming memory usage when pledgedSrcSize is known.	2017-11-16 15:18:18 -08:00
Yann Collet	05dffe43a7	Fixed Btree update ZSTD_updateTree() expected to be followed by a Bt match finder, which would update zc->nextToUpdate. With the new optimal match finder, it's not necessarily the case : a match might be found during repcode or hash3, and stops there because it reaches sufficient_len, without even entering the binary tree. Previous policy was to nonetheless update zc->nextToUpdate, but the current position would not be inserted, creating "holes" in the btree, aka positions that will no longer be searched. Now, when current position is not inserted, zc->nextToUpdate is not updated, expecting ZSTD_updateTree() to fill the tree later on. Solution selected is that ZSTD_updateTree() takes care of properly setting zc->nextToUpdate, so that it no longer depends on a future function to do this job. It took time to get there, as the issue started with a memory sanitizer error. The pb would have been easier to spot with a proper `assert()`. So this patch adds a few of them. Additionally, I discovered that `make test` does not enable `assert()` during CLI tests. This patch enables them. Unfortunately, these `assert()` triggered other (unrelated) bugs during CLI tests, mostly within zstdmt. So this patch also fixes them. - Changed packed structure for gcc memory access : memory sanitizer would complain that a read "might" reach out-of-bound position on the ground that the `union` is larger than the type accessed. Now, to avoid this issue, each type is independent. - ZSTD_CCtxParams_setParameter() : @return provides the value of parameter, clamped/fixed appropriately. - ZSTDMT : changed constant name to ZSTDMT_JOBSIZE_MIN - ZSTDMT : multithreading is automatically disabled when srcSize <= ZSTDMT_JOBSIZE_MIN, since only one thread will be used in this case (saves memory and runtime). - ZSTDMT : nbThreads is automatically clamped on setting the value.	2017-11-16 12:18:56 -08:00
Yann Collet	dfc14579f5	removed wrong assertion	2017-11-15 15:35:56 -08:00
Yann Collet	c55e35b2fc	removed a few specialized traces	2017-11-15 15:04:53 -08:00
Yann Collet	61c2d70c86	shortened repcode match finder implementation	2017-11-15 14:37:40 -08:00
Yann Collet	d7e9805028	fixed corruption issue	2017-11-15 13:44:24 -08:00
Yann Collet	046ea53bef	still fighting data corruption due to messed up tree. Seems to happen when reaching end of buffer.	2017-11-15 11:29:24 -08:00
Yann Collet	4202b2e8a6	merged rep search into btMatchSearch but there is a tree corruption somewhere ... bug hunt ongoing	2017-11-14 20:38:52 -08:00
Yann Collet	9a11f70dc3	merged repcode search into BT match search this version has same speed as branch `opt` which is itself 5-10% slower than branch `dev` (no identified reason) It does not compress exactly the same as `opt` or `dev`, maybe because it doesn't stop search after repcodes, leading to sometimes better compression, sometimes worse (by a small margin). warning : _extDict path does not work for the time being This means that benchmark module works, but file module will fail with large files (and high compression level). Objective is to fuse _extDict path into current one, in order to have a single parser to maintain.	2017-11-13 02:23:48 -08:00
Yann Collet	eb47705b18	reduced scope of multiple variables renamed some variables for better understanding	2017-11-10 08:31:12 -08:00
Yann Collet	100d8ad6be	lib/compress: created ZSTD_LLcode() and ZSTD_MLcode() transform length into code. Since transformation is needed in several places throughout the code, better write the logic in one place.	2017-11-08 12:43:05 -08:00
Yann Collet	5aa0352742	zstd_opt: simplified ZSTD_getPrice() and ZSTD_updatePrice() interface ZSTD_getPrice() and ZSTD_updatePrice() accept normal matchlength as argument instead of matchlength-MINMATCH, which makes them easier / more logical to use and read. Conversion is simply done internally.	2017-11-08 12:23:27 -08:00
Yann Collet	bf730e2044	zstd_opt: refactor code for improved readability renamed variables to be more meaningful reduced scope of multiple variables removed some useless var attribution	2017-11-08 12:07:39 -08:00
Yann Collet	4191efa993	zstd_opt: ensure sufficient_len < ZSTD_OPT_NUM to simplify some tests	2017-11-08 11:24:00 -08:00
Yann Collet	ee441d5d2b	renamed zstd_compress.h into zstd_compress_internal.h to emphasize the fact that all definitions it contains must remain private, across lib/compress modules.	2017-11-07 16:15:23 -08:00
Yann Collet	8b6aecf2cb	moved a few structures from `zstd_internal.h` to `zstd_compress.h` which is a more precise scope	2017-11-07 16:03:14 -08:00
Yann Collet	150354c5fe	minor refactor added some traces and assert related to hunting a potential ubsan error in 32-bits mode (it ends up being a compiler-side issue : https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82802). Modified one pointer arithmetic expression for a more conformant way.	2017-11-01 16:57:48 -07:00
Yann Collet	428e8b3bf4	fix : ZSTD_compress_generic(,,,ZSTD_e_end) automatically sets pledgedSrcSize as per documentation, on ZSTD_setPledgedSrcSize() : > If all data is provided and consumed in a single round, > this value (pledgedSrcSize) is overriden by srcSize instead. This wasn't applied before compression level is transformed into compression parameters. As a consequence, small input missed compression parameters adaptation. It seems to work fine now : compression was compared with ZSTD_compress_advanced(), results were the same.	2017-11-01 13:15:23 -07:00
Nick Terrell	86b8134cad	[libzstd] Fix parameter selection for empty input ZSTD_compress() and friends would treat an empty input as an unknown size when selecting parameters. Thus, they would drastically overallocate the context. Tell ZSTD_getParams() that the source size is 1 when it is empty.	2017-10-25 17:24:15 -07:00
Yann Collet	1ff8a8c109	Merge pull request #891 from facebook/contentSize Content size	2017-10-17 17:24:51 -07:00
Yann Collet	32c9f715ae	fixed : Visual build compressing stdin with multi-threading enabled fails It was multiple reasons stacked : - Visual use a different code path, because ZSTD_NEWAPI is not defined - fileio.c sends `0` as `pledgedSrcSize` to mean `ZSTD_CONTENTSIZE_UNKNOWN` (fixed) - ZSTDMT_resetCCtx() interpreted `0` as "empty" instead of "unknown" (fixed)	2017-10-17 14:07:43 -07:00
Yann Collet	13bfe885aa	edited ZSTD_initCStream_advanced() comment	2017-10-16 14:06:22 -07:00
Nick Terrell	7f961ba6cd	Don't allow default tables to repeat It isn't useful in any case to repeat default tables. Saves a few bytes on Silesia, since we don't trigger the dictionary heuristic. Before: 211988480 => 73651998 bytes After: 211988480 => 73651721 bytes	2017-10-16 11:37:56 -07:00
Yann Collet	fc8d293460	dictionary compression use correct file size estimation when determining compression parameters to compress one file only. For multiple files, it still "bets" that files are going to be small. There was also a bug recently added in ZSTD_CCtx_loadDictionary_advanced() making it incapable to use pledgedSrcSize to determine compression parameters.	2017-10-14 01:21:43 -07:00
Yann Collet	beb9b4b398	fixed ZSTDMT_initCStream() when contentSizeFlag==1 by default and a wrong test in zstreamtest --mt	2017-10-13 19:09:30 -07:00
Yann Collet	213ef3b510	fixed ZSTD_initCStream_advanced() behavior, which depends on contentSizeFlag, and a stream fuzzer test, which was incorrect (relied on 0 being unconditionally transformed into `ZSTD_CONTENTSIZE_UNKNOWN`)	2017-10-13 19:01:58 -07:00
Yann Collet	3c1e3f8ec9	contentSizeFlag enabled by default would also fail for streaming and MT operations fixed	2017-10-13 18:32:06 -07:00
Yann Collet	fb44516641	ensure fParams.contentSizeFlag starts at 1 such default was failing for ZSTD_compressBegin/ZSTD_compressContinue fixed too	2017-10-13 17:39:13 -07:00
Yann Collet	dd18d73e7e	fileio: content size is enabled by default	2017-10-13 16:32:18 -07:00
Nick Terrell	ced6e6189c	Add DEBUGLOG() that prints FSE encoding types	2017-10-13 14:55:23 -07:00
Nick Terrell	24ac2dbd2a	Fix invalid use of dictionary offcode table Fixes #888.	2017-10-13 12:47:03 -07:00
Yann Collet	a9e5705077	minor code formatting added a trace during sequence encoding	2017-10-13 02:36:16 -07:00
Nick Terrell	a86a7097ec	Ensure dictionary Huff table can encode any symbol * Ensure that the dictionary Huffman CTable has maxSymbolValue 255. * Fix a stack buffer overflow during compression dictionary loading.	2017-10-03 13:22:13 -07:00
Yann Collet	67478f4cb0	fixed minor conversion warnings for printf in debug mode	2017-10-02 17:28:57 -07:00
Yann Collet	004fd34fd9	Merge pull request #876 from facebook/srcSize CLI Fix : srcSize written in frame headers when compressing multiple files	2017-10-02 15:02:05 -07:00
Nick Terrell	86e83e926f	[libzstd] Set CLEVEL_CUSTOM correctly In `ZSTD_compressBegin_advanced()`, `ZSTD_parameters` are used to set the compression parameters, but the level didn't get set to `CLEVEL_CUSTOM`, so `ZSTD_compressBlock()` used the wrong parameters when checking the source size.	2017-10-02 13:43:30 -07:00
Yann Collet	6e930c13d1	Merge branch 'dev' into compressBound	2017-10-01 11:24:02 -07:00
Yann Collet	dc404119e5	ZSTD_adjustCParams_internal : minor optimization	2017-09-30 15:02:40 -07:00
Nick Terrell	c5d6dde502	Don't `size -= 1` in ZSTD_adjustCParams() The window size could end up too small if the source size is 2^n + 1. Credit to OSS-Fuzz	2017-09-30 14:20:06 -07:00
Yann Collet	5b10345b26	added ZSTD_COMPRESSBOUND() as a macro ZSTD_compressBound() works fine, but is only useful for dynamic allocation. For static allocation, only a macro can provide the amount during compilation time.	2017-09-29 23:17:41 -07:00
Yann Collet	8afb151c9b	cli: fixed wrong initialization in MT mode It's not good to mix old and new API ZSTD_resetCStream() doesn't just set pledgedSrcSize : it also sets the CCtx for a single thread compression. Problem is, when 2+ threads are defined in cctx->requestedParams, ZSTD_compress_generic() will want to start MT compression, since initialization is supposed to have already happened (thanks to ZSTD_resetCStream()) except that the underlying ZSTDMT_CCtx* object is not created, resulting in a segfault. This is an invalid construction (correct one is to use ZSTD_CCtx_setPledgedSrcSize()). I haven't found a nice way to mitigate this impact if someone makes the same mistake. At some point, removing the old API to keep only the new API within fileio.c will limit these risks.	2017-09-29 22:14:37 -07:00
Yann Collet	fbd5ab7027	minor fix : no longer use fake srcSize during resource creation srcSize is read and provided at each file, not at resource creation. This used to be useful with older API, because it could not re-adapt parameters between sessions. At some point, it will be better to remove the old code, and only keep the new_api. It works fine by now.	2017-09-29 19:40:27 -07:00
Yann Collet	db1668a43b	fix : srcSize written in frame header when multiple files compressed This information used to be disabled when nbFiles>1. It was badly initialized later in the code, resulting in an error.	2017-09-29 18:05:18 -07:00
Yann Collet	7c9669f272	Merge pull request #873 from facebook/shorterTests Leaner tests	2017-09-29 17:26:46 -07:00
Yann Collet	1416bc0f07	erase existence of a buffer when it's sent out of the pool In some complex scenario, the buffer would be freed because it's too large, another buffer would be allocated, but fail, trigger an error, and the general buffer pool would then be freed, where the definition of the already freed buffer would be found (beyond total index, but still), and freed again, resulting in double-free error.	2017-09-29 16:27:47 -07:00
Yann Collet	e963800e27	zstdmt : fixed : buffer dst0 wasn't properly set to null after usage now it's possible to unconditionally invoke ZSTD_releaseAllJobRessources() whether previous compression was completed correctly or not.	2017-09-28 23:01:31 -07:00
Yann Collet	754ae5cc0b	removed ZSTDMT_waitForAllJobsCompleted() from ZSTDMT_freeCCtx() as per @terrelln comment	2017-09-28 20:45:31 -07:00
Yann Collet	86b4fe5b45	adjustCParams : restored previous behavior unknown srcSize presumed small if there is a dictionary (dictSize>0) and presumed large otherwise.	2017-09-28 18:14:28 -07:00
Yann Collet	b93598d6a4	zstdmt : reduced maximum nb of threads to avoid memory address space issues on 32-bits systems (see https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=876416#17)	2017-09-28 13:49:12 -07:00
Yann Collet	e4ec427720	Merge branch 'dev' into shorterTests fixed conflicts	2017-09-28 12:19:28 -07:00
Yann Collet	8074261d00	zstdmt : move on when not enough memory for a new input buffer just continue operations without input forward progress, instead of an error that stops current compression session.	2017-09-28 11:46:19 -07:00
Yann Collet	2cd15dd9a4	fixed minor Visual conversion warning	2017-09-28 02:33:41 -07:00
Yann Collet	377abcc02c	zstdmt : better behavior when freeing a context right after a memory allocation error wait for all jobs to be completed, so that freeing can happen safely	2017-09-28 02:23:44 -07:00
Yann Collet	d6770f80af	minor : rewrite unit tests using CHECK_Z macro	2017-09-28 02:14:48 -07:00
Yann Collet	9b5b47ac93	ensure adjustCParams adjust hLog and cLog even without srcSize It would previously exit when srcSize is unknown. But in the case of custom parameters, hLog and cLog can still be too large in comparison with windowLog. Reduces maximum memory allocated during zstreamtest --newapi	2017-09-28 01:25:40 -07:00
Yann Collet	54a827fff0	Merge branch 'dev' into newFormats Fixed conflicts in zstdmt_compress.c	2017-09-27 16:39:40 -07:00
Yann Collet	e45a2aea9b	Merge pull request #869 from terrelln/dev [libzstd] pthread function prefixed with ZSTD_	2017-09-27 16:35:08 -07:00
Nick Terrell	b555b7ef41	[libzstd][opt] Simplify repcode logic	2017-09-27 15:30:12 -07:00
Yann Collet	c994932788	fixed ZSTD_format_e value validation	2017-09-27 12:22:22 -07:00
Nick Terrell	6c41adfb28	[libzstd] pthread function prefixed with ZSTD_ * `sed -i 's/pthread_/ZSTD_pthread_/g' lib/{,common,compress,decompress,dictBuilder}/.[hc]` Fix up `lib/common/threading.[hc]` * `sed -i s/PTHREAD_MUTEX_LOCK/ZSTD_PTHREAD_MUTEX_LOCK/g lib/compress/zstdmt_compress.c`	2017-09-27 11:48:48 -07:00
Yann Collet	ecf1778e23	updated ZSTD_format_e value validation also updated manual	2017-09-27 11:19:21 -07:00
Yann Collet	4791561c4a	silence minor gcc warning -Wempty-body also silence fuzz test artefacts	2017-09-26 17:57:38 -07:00
Yann Collet	9f0b8dfbe9	Merge branch 'dev' into newFormats	2017-09-26 14:22:39 -07:00
Nick Terrell	c233bdbaee	Increase maximum window size * Maximum window size in 32-bit mode is 1GB, since allocations for 2GB fail on my Mac. * Maximum window size in 64-bit mode is 2GB, since that is the largest power of 2 that works with the overflow prevention. * Allow `--long=windowLog` to set the window log, along with `--zstd=wlog=#`. These options also set the window size during decompression, but don't override `--memory=#` if it is set. * Present a helpful error message when the window size is too large during decompression. * The long range matcher defaults to a hash log 7 less than the window log, which keeps it at 20 for window log 27. * Keep the default long range matcher window size and the default maximum window size at 27 for the API and CLI. * Add tests that use the maximum window size and hash size for compression and decompression.	2017-09-26 14:00:01 -07:00
Yann Collet	586df82a78	Merge pull request #862 from terrelln/static [zstd] Backport kernel patch from @ColinIanKing	2017-09-25 17:02:40 -07:00
Yann Collet	5d8fdd1641	Merge pull request #855 from terrelln/maxoff [libzstd] Increase MaxOff	2017-09-25 16:34:29 -07:00
Nick Terrell	76cb38d085	[zstd] Backport kernel patch from @ColinIanKing * Make the U32 table in `FSE_normalizeCount()` static. * Patch from https://lkml.kernel.org/r/20170922145946.14316-1-colin.king@canonical.com. * Clang makes non-static tables static anyways. gcc however, does [weird things](https://godbolt.org/g/fvTcED). * Benchmarks showed no difference in speed.	2017-09-25 16:18:23 -07:00
Yann Collet	6ee05a02b8	added ZSTD_decompress_generic() same as ZSTD_decompressStream(), just for a similar feeling as the compression side, which uses ZSTD_compress_generic()	2017-09-25 15:41:48 -07:00
Yann Collet	62568c9a42	added capability to generate magic-less frames decoder not implemented yet	2017-09-25 14:26:26 -07:00
Nick Terrell	bbe77212ef	[libzstd] Increase MaxOff	2017-09-25 13:36:18 -07:00
Yann Collet	96f0cde31a	minor function rename ZSTD_estimateCStreamSize_advanced_usingCParams -> ZSTD_estimateCStreamSize_usingCParams _usingX is clear. _advanced feels redundant	2017-09-24 16:47:02 -07:00
Yann Collet	7c3dea42ce	added prototypes for advanced parameters for decompression API required to decode custom formats	2017-09-24 15:57:29 -07:00
Nick Terrell	d6abb28951	Prepare for ZSTD_WINDOWLOG_MAX == 31	2017-09-21 17:18:41 -07:00
Yann Collet	da74aabc00	Merge pull request #850 from terrelln/fse-optimal [fse] Fix FSE_optimalTableLog() for srcSize==1	2017-09-19 14:59:21 -07:00
Nick Terrell	6c9ed76676	[ldm] Fix corner case where minMatch < 8 There is a potential read buffer overflow when minMatch < 8. fix-fuzz-failure	2017-09-19 13:49:37 -07:00
Yann Collet	7d1ff3817b	fix ZSTD_sizeof_CCtx() / ZSTD_sizeof_CStream() previous result was over-estimated by counting streaming buffers twice	2017-09-18 14:47:34 -07:00
Nick Terrell	cae3e3c652	[fse] Fix FSE_optimalTableLog() for srcSize==1	2017-09-18 14:11:18 -07:00
Yann Collet	539b91ee9b	minor : added assert in bt	2017-09-16 23:41:58 -07:00
Yann Collet	335780c427	fixed too strong alignment assert in ZSTD_initStaticCCtx() 64-bits fields are only 32-bits aligned on 32-bits CPU	2017-09-13 16:35:29 -07:00
Yann Collet	f1571dad8f	Merge pull request #838 from stellamplau/ldm-mergeDev Add long distance matcher	2017-09-13 13:24:08 -07:00
Yann Collet	3306bcb0e6	fix #820 : GCC v3.x 32-bits doesn't define 64-bits intrinsic resulting in undefined symbol error. Push the requirement to GCC 4 for now. Another solution, proposed by @NWilson, is to use __LONG_MAX__ instead. __LONG_MAX__ is a GCC-specific constant, which value is supposed to depend on underlying target hardware (32/64 bits) Might be better, but seems also more complex, hence more prone to side effects. Keeping the simple solution for now (just rely on __GNUC__)	2017-09-11 15:17:31 -07:00
Stella Lau	eb3327c10a	Merge branch 'dev' of https://github.com/facebook/zstd into ldm-mergeDev	2017-09-11 15:00:01 -07:00
Stella Lau	f902bf9676	Merge branch 'ldm-integrate' into ldm-mergeDev	2017-09-11 14:55:29 -07:00
Yann Collet	f325ee4e84	fixed pass-through warning	2017-09-11 14:37:03 -07:00
Stella Lau	0d1b54db61	Explicitly cast raw numerals when left-shifting	2017-09-11 14:28:18 -07:00
Yann Collet	0d6ecc72a3	makes it possible to compile libzstd in single-thread mode without zstdmt_compress.c (#819 )	2017-09-11 14:09:34 -07:00
Yann Collet	3128e03be6	updated license header to clarify dual-license meaning as "or"	2017-09-08 00:09:23 -07:00
Stella Lau	360428c5d9	Move ldm functions to their own file	2017-09-06 18:09:26 -07:00
Stella Lau	2b99d696de	Remove debug code	2017-09-06 15:57:26 -07:00
Stella Lau	eeff55dfa8	Merge remote-tracking branch 'upstream/dev' into ldm-mergeDev	2017-09-06 15:56:32 -07:00
Yann Collet	ad0046244f	Merge pull request #831 from terrelln/split-compress Split parsers out of zstd_compress.c	2017-09-06 10:01:27 -07:00
Stella Lau	9e4060200b	Add tests and fix pointer alignment	2017-09-06 09:14:05 -07:00
Stella Lau	c706de5395	Rename and add short ldm parameters in cli	2017-09-05 21:11:18 -07:00
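
The window-size issue described in commit c5d6dde502 (source size of 2^n + 1) can be illustrated with a small sketch. This is not the actual `ZSTD_adjustCParams()` code: the helper names `highbit32`, `window_log_for` and `window_log_double_decrement` are hypothetical, and the buggy variant only assumes that the size was decremented once more than the formula expects.

```c
#include <stdint.h>

/* Position of the highest set bit of v (returns 0 for v == 0 or v == 1).
 * Hypothetical helper, written as a plain loop for clarity. */
static unsigned highbit32(uint32_t v)
{
    unsigned r = 0;
    while (v >>= 1) r++;
    return r;
}

/* Smallest log such that (1 << log) >= size, valid for size >= 2. */
static unsigned window_log_for(uint32_t size)
{
    return highbit32(size - 1) + 1;
}

/* Assumed buggy variant: the size is decremented once before the same
 * formula is applied, so size = 2^n + 1 yields a log of n, and the
 * resulting window (2^n bytes) is one byte too small. */
static unsigned window_log_double_decrement(uint32_t size)
{
    size -= 1;
    return highbit32(size - 1) + 1;
}
```

With a size of 1025 (2^10 + 1), `window_log_for` gives 11, a 2048-byte window that covers the input, while `window_log_double_decrement` gives 10, a 1024-byte window that does not.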


1017 Commits (a55ffbb31b8ea43cb043232a6d96d4506528ee6e)