Commit Graph

4958 Commits (eba13dba2e2bac09bef231164b3cc08ffda999d7)

Author SHA1 Message Date
Yann Collet 940634a610 zstdmt : simplify job creation
job will not be created when not enough room within job Table
2018-01-19 13:25:06 -08:00
Yann Collet dc69623453 zstdmt: fixed corruption issue in ZSTDMT_endStream()
when invoked directly.
2018-01-19 12:41:56 -08:00
Yann Collet cb5eba8e20 add `zcat` symlink support, suggested by @wtarreau
added some test
also updated relevant doc

+ fixed a mistake in `lz4` symlink support :
  lz4 utility doesn't remove source files by default (like zstd, but unlike gzip).
  The symlink must behave the same.
2018-01-19 11:26:35 -08:00
Yann Collet 70f81d6030 zstdmt uses POOL_tryAdd() to call a new worker
so that it's no longer a blocking call.
This makes it possible to stream out data gradually,
while waiting for a worker to become available.
2018-01-19 10:01:40 -08:00
Yann Collet d19dc1903c
Merge pull request #995 from facebook/progressiveMT
Progressive mt
2018-01-18 17:59:49 -08:00
Yann Collet 31f45c98f8
Merge pull request #994 from facebook/constCDict
changed initStatic?Dict() return type to const ZSTD_?Dict*
2018-01-18 17:57:53 -08:00
Yann Collet 6f7280fb33 fixed frame checksum issue
and race conditions
2018-01-18 16:20:26 -08:00
Yann Collet 997e4d0ccd added POOL_tryAdd() 2018-01-18 14:39:51 -08:00
Yann Collet 4f43ef731d Merge branch 'dev' into constCDict 2018-01-18 13:36:43 -08:00
Yann Collet ef97d5a287 Merge branch 'progressiveMT' into progressiveFlush 2018-01-18 13:35:24 -08:00
Yann Collet b6ab232f2d Merge branch 'dev' into progressiveMT 2018-01-18 13:34:56 -08:00
Yann Collet 6f3b54dc8b
Merge pull request #997 from terrelln/empty-dict
Set repcodes for empty ZSTD_CDict
2018-01-18 13:34:31 -08:00
Nick Terrell 9d96761520 Set repcodes for empty ZSTD_CDict
When the dictionary is <= 8 bytes, no data is loaded from the dictionary.
In this case the repcodes weren't set, because they were inserted after the
size check. Fix this problem in general by first setting the cdict state to
a clean state of an empty dictionary, then filling the state from there.
2018-01-18 13:28:30 -08:00
Yann Collet 4d08ba8b77 fileio: READY_FOR_UPDATE() is now a function-like macro
as suggested by @terrelln
2018-01-18 11:27:13 -08:00
Yann Collet c7190c69cc fixes for @terrelln comments 2018-01-18 11:15:23 -08:00
Yann Collet 1b5d80d633 zstdmt: added ability to flush current job before it's completed
however, zstdmt may still wait on next available worker,
so it's not smooth yet.
2018-01-18 11:03:27 -08:00
Yann Collet aa79c18e3f fixed a few access contention
passes thread sanitizer test
2018-01-17 17:18:19 -08:00
Yann Collet 394eec697b Introduce ZSTD_getFrameProgression()
Produces 3 statistics for ongoing frame compression :
- ingested
- consumed (effectively compressed)
- produced

Ingested can be larger than consumed due to buffering effect.

For the time being, this patch mostly fixes the % ratio issue,
since it computes consumed / produced,
instead of ingested / produced.

That being said, update is not "smooth",
because on a slow enough setting,
fileio spends most of its time waiting for a worker to complete its job.

This could be improved thanks to more granular flushing
i.e. start flushing before ongoing job is fully completed.
2018-01-17 16:39:02 -08:00
Yann Collet 592ce5a042
Merge pull request #991 from facebook/progressiveMT
Non-blocking compression
2018-01-17 14:35:23 -08:00
Yann Collet f3b8f90b6d changed initStatic?Dict() return type to const ZSTD_?Dict*
ZSTD_create?Dict() is required to produce a ?Dict* return type
because `free()` does not accept a `const type*` argument.
If it wasn't for this restriction, I would have preferred to create a `const ?Dict*` object
to emphasize the fact that, once created, a dictionary never changes
(hence can be shared concurrently until the end of its lifetime).

There is no such limitation with initStatic?Dict() :
as stated in the doc, there is no corresponding free() function,
since `workspace` is provided, hence allocated, externally,
it can only be free() externally.

Which means, ZSTD_initStatic?Dict() can return a `const ZSTD_?Dict*` pointer.

Tested with `make all`, to catch initStatic's users,
which, incidentally, also updated zstd.h documentation.
2018-01-17 14:08:48 -08:00
Yann Collet 5cde87ca4a
Merge pull request #993 from krab/dev-cmake-windows-mt
CMake: fixed multithreading build on Windows
2018-01-17 13:52:56 -08:00
Yann Collet b86865323a Merge branch 'dev' into progressiveMT
fixed minor conflict on cdict
2018-01-17 13:51:03 -08:00
Yann Collet fbb703ea5e
Merge pull request #992 from terrelln/small-cdict
Reduce size of ZSTD_CDict
2018-01-17 13:47:42 -08:00
Yann Collet d14cc881b0 zstdmt : fixed very large window sizes
would create too large buffers,
since default job size == window size * 4.

This would crash on 32-bit systems.

Also : jobSize being a 32-bit unsigned, it cannot be >= 4 GB,
so the formula was failing for large window sizes >= 1 GB.
Fixed now : max job Size is 2 GB, whatever the window size.
2018-01-17 12:39:58 -08:00
Yann Collet 58dd7de640 zstdmt: fixed an endless loop on allocation failure
this happened on 32-bits build when requiring a too large input buffer,
typically on wlog=29, creating jobs of 2 GB size.

also : zstd32 now compiles with multithread support enabled by default
(can be disabled with HAVE_THREAD=0)
2018-01-17 12:10:15 -08:00
Nick Terrell 16bd0fd4df Reduce size of ZSTD_CDict
Shaves 492,076 B off of the `ZSTD_CDict`.
The size of a `ZSTD_CDict` created from a 112,640 B dictionary is:

| Level | Before (B) | After (B) |
|-------|------------|-----------|
| 1     | 648,448    | 156,412   |
| 3     | 1,140,008  | 647,932   |
2018-01-17 11:50:49 -08:00
Yann Collet cb57c107ff zstdmt: minor variable renaming, for clarity 2018-01-17 11:39:07 -08:00
Alexey Ivanov 22303da601 CMake: fixed multithreading build on Windows
`ZSTD_MULTITHREAD_SUPPORT` option fixed for Windows.

Signed-off-by: Alexey Ivanov <alexey.ivanes@gmail.com>
2018-01-17 10:27:52 +03:00
Yann Collet 3e1e57db27 fix fileio progression status update
The compression % is no longer correct,
since it's no longer possible to make direct correlation
between nb bytes read and nb bytes written
due to large internal buffer inside CCtx
(exacerbated with --long).

The current "fix" is to no longer display the %.

A more complex solution will have to count exactly how much data has been consumed and compressed internally, within CCtx buffers.
2018-01-16 17:35:00 -08:00
Yann Collet 10c213761a cli: fix for no-MT mode
when cli is compiled without MT support,
invoking ZSTD_p_nonBlockingMode result in an error code.

This patch only sets ZSTD_p_nonBlockingMode when ZSTD_MULTITHREAD is set, meaning there is MT support.

The error code could also be intentionnally ignored (there is no side effect).
2018-01-16 17:28:11 -08:00
Yann Collet 1dba98d563 introduced parameter ZSTD_p_nonBlockingMode
This new parameter makes it possible to call
streaming ZSTDMT with a single thread set
which is non blocking.

It makes it possible for the main thread to do other tasks in parallel
while the worker thread does compression.
Typically, for zstd cli, it means it can do I/O stuff.

Applied within fileio.c, this patch provides non-negligible gains during compression.

Tested on my laptop, with enwik9 (1000000000 bytes) : time zstd -f enwik9

With traditional single-thread blocking mode :
real    0m9.557s
user    0m8.861s
sys     0m0.538s

With new single-worker non blocking mode :
real    0m7.938s
user    0m8.049s
sys     0m0.514s

=> 20% faster
2018-01-16 16:15:47 -08:00
Yann Collet 6025465e42 ZSTDMT : minor CCtx memory optimization
can be useful when a compression job only has small amount of data to compress.
2018-01-16 15:34:41 -08:00
Yann Collet 2e23333094 ZSTDMT can now work in non-blocking mode with 1 thread
it still fallbacks to single-thread blocking invocation
when input is small (<1job)
or when invoking ZSTDMT_compress(), which is blocking.

Also : fixed a bug in new block-granular compression routine.
2018-01-16 15:28:43 -08:00
Yann Collet 8e83c5c910 Merge branch 'dev' into progressiveMT 2018-01-16 12:54:33 -08:00
Yann Collet f0d703243f
Merge pull request #956 from terrelln/mf-struct
Split ZSTD_CCtx into smaller sub-structures
2018-01-16 12:37:44 -08:00
Yann Collet 0bc7a75468
Merge pull request #987 from facebook/checkTag
Check tag
2018-01-16 11:52:32 -08:00
Yann Collet f4e58455f6 ensure MOREFLAGS are not lost in root->tests Makefile invocation 2018-01-16 11:50:16 -08:00
Yann Collet 19cb37f837 travis ci : added gcc-7 test
also added `-Werror` to sanitizer tests
2018-01-16 11:40:42 -08:00
Nick Terrell aae267a2e1 Reorganize block state 2018-01-16 11:17:50 -08:00
Nick Terrell 887cd4e35e Split ZSTD_CCtx into smaller sub-structures 2018-01-16 11:17:50 -08:00
Yann Collet a187486e3c
Merge pull request #990 from krab/dev-cmake-libdir
CMake: use GNUInstallDirs for library install dir
2018-01-15 18:55:21 -08:00
Alexey Ivanov 403e2db139 CMake: use GNUInstallDirs for library install dir
Libraries now will be installed in the correct directory on x86_64 linux systems,
and can be changed with `-DCMAKE_INSTALL_LIBDIR=<dirname>` option.
2018-01-15 22:48:46 +03:00
Yann Collet 815eddeda4 added tag-triggered test
ensure tag and libzstd version are compatible
2018-01-14 17:06:21 -08:00
Yann Collet f6e17b8c7a added tests/checkTag
compared provided tag with current libzstd version
2018-01-14 17:03:45 -08:00
Yann Collet 4792ac6689 make -C tests legacy : minor flag alteration
ZSTD_LEGACY_SUPPORT is a macro constant,
so it should be part of CPPFLAGS, instead of CFLAGS
2018-01-14 14:09:17 -08:00
Yann Collet 99a0a8bbd2 tests/Makefile : fixed target allnothread
ensures MT is disabled
2018-01-13 22:00:05 -08:00
Yann Collet f8a05932b9 added `make list` capability to `tests/Makefile` 2018-01-13 21:54:21 -08:00
Yann Collet 9477f6529d
Merge pull request #984 from terrelln/dict-load
Load more dictionary positions into table if empty
2018-01-13 13:20:42 -08:00
Yann Collet 58ecf13e02 zstdmt : can compress at block granularity
offering perspective of more accurate progression report.
2018-01-13 13:18:57 -08:00
Nick Terrell 9a211d1f05 Load more dictionary positions into table if empty
If the hash table is empty load positions into the hash table
that we would otherwise skip.

| Level | Data Set     | Improvement |
|-------|--------------|-------------|
| 1     | github       | 0.44%       |
| 1     | hg-changelog | 0.13%       |
| 1     | hg-commands  | 1.28%       |
| 1     | hg-manifest  | 0.70%       |
| 3     | github       | 0.74%       |
| 3     | hg-changelog | 0.87%       |
| 3     | hg-commands  | 1.74%       |
| 3     | hg-manifest  | 0.23%       |
2018-01-12 16:17:22 -08:00