Commit Graph

230 Commits (1be8660eca9045db4498702784dae6032ea9f943)

Author SHA1 Message Date
Yann Collet a99c4a3621 Merge branch 'dev' into advancedDecompress 2018-03-21 06:08:28 -07:00
Yann Collet 87b0cf05bd
Merge pull request #1057 from facebook/lrmSettings
LRM parameters
2018-03-21 05:59:39 -07:00
Yann Collet 6873fec658 changed dictMore for dictContentType
which seems clearer to describe what the variable/argument is about.
2018-03-20 15:13:14 -07:00
Nick Terrell d19f803a3b Fix window size for 1 worker + flushing 2018-03-19 18:56:39 -07:00
Nick Terrell 24d9edbdd8 Set ldmParams to 0 when disabled 2018-03-19 18:23:54 -07:00
Nick Terrell 4b92574feb Fix corner cases exposed by zstreamtest 2018-03-19 17:54:04 -07:00
Nick Terrell 94c77710a9 Integrate ldm with zstdmt
Integrate ldm into zstdmt by running it in serial and in order in the first
step of each job, in the same place as the hash gets updated. The input
buffer is sized to fit the whole LDM window and 2 full buffers of slack.
Input buffers cannot be reused until the LDM step is done with them.
After the LDM step is finished, the jobs don't actually have access to the
full window, only the overlap.

Tested on a few different multi-GB files with and without sanitizers,
and with different numbers of threads.
2018-03-19 16:29:03 -07:00
Nick Terrell aa4dbd09a1
Pull job/overlap log logic into common function (#1055)
Prepares for LDM integration by separating the job size and overlap logic
into helper functions.
2018-03-19 15:56:36 -07:00
Yann Collet 6f4d0778a5 make it possible to express compression parameters in any order 2018-03-19 14:41:23 -07:00
Nick Terrell 2253d01b27 Move XXH64_update() into worker threads
* Computes the XXH hash in the worker threads.
* Workers get a sequence number and wait until ther number shows up. On
  error, ensures that its sequence is finished, so future threads don't
  get blocked.
* Sets up for ldm integration, which will go in the same spot.
2018-03-19 11:08:27 -07:00
Nick Terrell f15a17e19f Use a single buffer in zstdmt
Summary:
Allocate a single input buffer large enough to house each job, as well as
enough space for the IO thread to write 2 extra buffers. One goes in the
`POOL` queue, and one to fill, and then block on a full `POOL` queue.
Since we can't overlap with the prefix, we allocate space for 3 extra
input buffers.

Test Plan:
* CI
* With and without ASAN/UBSAN run zstdmt with different number of threads
  on two large binaries, and verify that their checksums match.
* Test on the tip of the zstdmt ldm integration.

Reviewers: cyan

Differential Revision: https://phabricator.intern.facebook.com/D7284007

Tasks: T25664120
2018-03-15 16:21:33 -07:00
Yann Collet 50f763ec44 fixed several comments are underlined by @terrelln 2018-03-13 14:23:14 -07:00
Yann Collet 6a9b41b731 create command --fast[=#]
access negative compression levels from command line
for both compression and benchmark modes.

also : ensure proper propagation of parameters
through ZSTD_compress_generic() interface.

added relevant cli tests.
2018-03-11 20:01:23 -07:00
Yann Collet 5cb1144872 fixed --single-thread
was incorrectly set to -T0 (use as many cores as possible) previously
2018-02-13 14:56:35 -08:00
Yann Collet c72091556b fixed minor nit as per @terrelln's comments 2018-02-09 09:46:08 -08:00
Yann Collet 5188749e1c ensure compression parameters are updated when only compression level is changed 2018-02-02 16:31:20 -08:00
Yann Collet 4b525af53a zstdmt: applies new parameters on the fly
when invoked from ZSTD_compress_generic()
2018-02-02 15:58:13 -08:00
Yann Collet 90eca318a7 fileio: create dedicated function to generate zstd frames
like other formats
2018-02-02 14:24:56 -08:00
Yann Collet 209df52ba2 Changed nbThreads for nbWorkers
This makes it easier to explain that nbWorkers=0 --> single-threaded mode,
while nbWorkers=1 --> asynchronous mode (one mode thread on top of the "main" caller thread).
No need for an additional asynchronous mode flag.
nbWorkers>=2 works the same as nbThreads>=2 previously.
2018-02-01 19:29:30 -08:00
Yann Collet 60fa90b6c0 zstdmt: added ability to change compression parameters during compression 2018-02-01 16:13:31 -08:00
Yann Collet 2cb0740b6b zstdmt: changed naming convention
to avoid confusion with blocks.

also:
- jobs are cut into chunks of 512KB now, to reduce nb of mutex calls.
- fix function declaration ZSTD_getBlockSizeMax()
- fix outdated comment
2018-01-30 14:43:36 -08:00
Yann Collet ba0cd8cf78 fixed minor conversion warning for C++ compilation mode 2018-01-26 18:18:42 -08:00
Yann Collet caf9e96dc3 job mutex creation is checked 2018-01-26 18:09:25 -08:00
Yann Collet 9c40ae7ff1 zstdmt: there is now one mutex/cond per job 2018-01-26 17:55:08 -08:00
Yann Collet 77e36273de zstdmt: minor code refactor for clarity 2018-01-26 17:08:58 -08:00
Yann Collet 27c5853c42 zstdmt: job table correctly cleaned after synchronous ZSTDMT_compress() 2018-01-26 14:35:54 -08:00
Yann Collet 0d426f6b83 zstdmt : refactor a few member names
for clarity
2018-01-26 13:00:14 -08:00
Yann Collet 79b6e28b0a zstdmt : flush() only lock to read shared job members
Other job members are accessed directly.
This avoids a full job copy, which would access everything,
including a few members that are supposed to be used by worker only,
uselessly requiring additional locks to avoid race conditions.
2018-01-26 12:15:43 -08:00
Yann Collet d2b62b6fa5 minor : ZSTDMT_writeLastEmptyBlock() is a void function
because it cannot fail
2018-01-26 11:06:34 -08:00
Yann Collet fca13c6855 zstdmt : fixed memory leak
writeLastEmptyBlock() must release srcBuffer
as mtctx assumes it's done by job worker.

minor : changed 2 job member names (src->srcBuffer, srcStart->prefixStart) for clarity
2018-01-26 10:44:09 -08:00
Yann Collet 8e128eaf05 zstdmt : refactor job members
grouped by sharing properties
2018-01-26 10:20:38 -08:00
Yann Collet a1d4041e69 zstdmt: removed job->jobCompleted
replaced by equivalent signal job->consumer == job->srcSize.

created additional functions
ZSTD_writeLastEmptyBlock()
and
ZSTDMT_writeLastEmptyBlock()
required when it's necessary to finish a frame with a last empty job, to create an "end of frame" marker.

It avoids creating a job with srcSize==0.
2018-01-25 17:35:49 -08:00
Yann Collet 1272d8e760 zstdmt:: renamed mutex and cond to underline they are context-global 2018-01-25 14:52:34 -08:00
Yann Collet 5f349b129c zstdmt : correctly set end of frame 2018-01-23 15:52:40 -08:00
Yann Collet c1cc57f270 zstdmt : fix end condition (ZSTD_e_end)
When ZSTD_e_end directive is provided,
the question is not only "are internal buffers completely flushed",
it is also "is current frame completed".

In some rare cases,
it was possible for internal buffers to be completely flushed,
triggering a @return == 0,
but frame was not completed as it needed a last null-size block to mark the end,
resulting in an unfinished frame.
2018-01-23 15:19:11 -08:00
Yann Collet de5e38a7a6 zstdmt: fixed minor race condition
no real consequence, but pollute tsan tests :

job->dstBuff is being modified inside worker,
while main thread might read it accidentally
because it copies whole job.
But since it doesn't used dstBuff, there is no real consequence.

Other potential solution : only copy useful data, instead of whole job
2018-01-23 14:03:07 -08:00
Yann Collet ebd955e26a zstdmt : fixed ending frame with 0-size block 2018-01-23 13:12:40 -08:00
Yann Collet a7ef3a219c zstdmt : fixed last job size 2018-01-19 18:19:09 -08:00
Yann Collet 3ad7d4951c zstdmt : finally vanquished an elusive and rare race condition 2018-01-19 17:35:08 -08:00
Yann Collet 940634a610 zstdmt : simplify job creation
job will not be created when not enough room within job Table
2018-01-19 13:25:06 -08:00
Yann Collet dc69623453 zstdmt: fixed corruption issue in ZSTDMT_endStream()
when invoked directly.
2018-01-19 12:41:56 -08:00
Yann Collet 70f81d6030 zstdmt uses POOL_tryAdd() to call a new worker
so that it's no longer a blocking call.
This makes it possible to stream out data gradually,
while waiting for a worker to become available.
2018-01-19 10:01:40 -08:00
Yann Collet 6f7280fb33 fixed frame checksum issue
and race conditions
2018-01-18 16:20:26 -08:00
Yann Collet ef97d5a287 Merge branch 'progressiveMT' into progressiveFlush 2018-01-18 13:35:24 -08:00
Yann Collet c7190c69cc fixes for @terrelln comments 2018-01-18 11:15:23 -08:00
Yann Collet 1b5d80d633 zstdmt: added ability to flush current job before it's completed
however, zstdmt may still wait on next available worker,
so it's not smooth yet.
2018-01-18 11:03:27 -08:00
Yann Collet aa79c18e3f fixed a few access contention
passes thread sanitizer test
2018-01-17 17:18:19 -08:00
Yann Collet 394eec697b Introduce ZSTD_getFrameProgression()
Produces 3 statistics for ongoing frame compression :
- ingested
- consumed (effectively compressed)
- produced

Ingested can be larger than consumed due to buffering effect.

For the time being, this patch mostly fixes the % ratio issue,
since it computes consumed / produced,
instead of ingested / produced.

That being said, update is not "smooth",
because on a slow enough setting,
fileio spends most of its time waiting for a worker to complete its job.

This could be improved thanks to more granular flushing
i.e. start flushing before ongoing job is fully completed.
2018-01-17 16:39:02 -08:00
Yann Collet d14cc881b0 zstdmt : fixed very large window sizes
would create too large buffers,
since default job size == window size * 4.

This would crash on 32-bit systems.

Also : jobSize being a 32-bit unsigned, it cannot be >= 4 GB,
so the formula was failing for large window sizes >= 1 GB.
Fixed now : max job Size is 2 GB, whatever the window size.
2018-01-17 12:39:58 -08:00
Yann Collet 58dd7de640 zstdmt: fixed an endless loop on allocation failure
this happened on 32-bits build when requiring a too large input buffer,
typically on wlog=29, creating jobs of 2 GB size.

also : zstd32 now compiles with multithread support enabled by default
(can be disabled with HAVE_THREAD=0)
2018-01-17 12:10:15 -08:00