1511 Commits (f9802d80a0d53797f19152334893ec63ade0089c)

Author SHA1 Message Date
Nick Terrell 58476bcf7f Don't shrink window log in ZSTD_getCParams()
Treat ZSTD_getCParams() and ZSTD_adjustCParams() in the same way
we treat streaming compression. Choose parameters based on the
dictionary size + source size, and assume the source size is small
if unknown. But, don't shrink the window log down in
ZSTD_adjustCParams_internal().
2021-01-04 15:54:09 -08:00
Nick Terrell 9d31c704d5 Don't shrink window log when streaming with a dictionary
Fixes #2442.

1. When creating a dictionary keep the same behavior as before.
   Assume the source size is 513 bytes when adjusting parameters.
2. When calling ZSTD_getCParams() or ZSTD_adjustCParams() keep
   the same behavior as before.
3. When attaching a dictionary keep the same behavior of ignoring
   the dictionary size. When streaming this will select the
   largest parameters and not adjust them down. But, the CDict
   will use the correctly sized parameters, which seems like the
   right tradeoff.
4. When not attaching a dictionary (either forced not to, or
   using a prefix dictionary) we select parameters based on the
   dictionary size + source size, and assume the source size is
   small, which is the same behavior as before. But, now we don't
   adjust the window log (and hash and chain log) down when the
   source size is unknown.

When the source size is unknown all cdicts should attach, except
when the user disables attaching, or `forceWindow` is used. This
means that when streaming with a CDict we end up in the good case
where we get small CDict parameters, and large source parameters.

TODO: Add a streaming + dictionary regression test case.
2021-01-04 15:54:09 -08:00
Nick Terrell a98a6e2091 [test][regression] Add no source size with dictionary test
* Add a test that runs without a pledgedSrcSize and with a dictionary.
* Add github.tar data which uses the github dictionary while compressing
  github.tar, instead of each file individually.
2021-01-04 15:54:09 -08:00
Nick Terrell 66e811d782 [license] Update year to 2021 2021-01-04 17:53:52 -05:00
Yann Collet ff2f888d56 fixed one more minor cast issue
can't use address calculation with `void*`
2020-12-29 11:44:37 -08:00
Yann Collet 7f8be046b9 fixed minor warnings introduced in #2439 2020-12-28 14:07:31 -08:00
Yann Collet cfff4c1cd5
Merge pull request #2439 from senhuang42/skippable_frame_api
Generate skippable frame API
2020-12-28 11:22:07 -08:00
senhuang42 5c41490bfe Use pre-defined constants 2020-12-21 11:52:05 -05:00
senhuang42 339d8ba103 Add unit test 2020-12-21 11:33:27 -05:00
Yann Collet 9648bf027b try to keep libzstd.a "as is" once created
to be compatible with scenarios such as
`make -j allmost`
2020-12-20 17:10:57 -08:00
Yann Collet 3536e9d5ff removing tests using too much resources for 32-bit address space 2020-12-17 15:44:54 -08:00
Yann Collet 0b39531d75 moving all references to `release` branch
was previously `master`
2020-12-16 23:00:35 -08:00
Nick Terrell 0be843b200 [tests] Fix playTests.sh with spaces in path 2020-12-10 11:03:47 -08:00
senhuang42 b9ab6bc061 Fix various conversion warnings 2020-12-08 10:07:28 -05:00
Yann Collet 69a04ccf68
Merge pull request #2413 from senhuang42/paramgrill_windows
Paramgrill for windows
2020-12-04 21:38:39 -08:00
Yann Collet b86e3c9304
Merge pull request #2415 from facebook/fix_aliasing
fix gcc-10 strict aliasing warnings
2020-12-04 21:30:57 -08:00
Yann Collet 5c0a3489a5 fix aliasing warning in decodecorpus 2020-12-04 19:21:40 -08:00
Nick Terrell c238db046f
Merge pull request #2414 from terrelln/mt-progress
[lib] Ensure that multithreaded compression always makes some progress
2020-12-04 16:30:08 -08:00
Nick Terrell 4c58cb8383 [lib] Ensure that multithreaded compression always makes some progress 2020-12-03 20:25:14 -08:00
senhuang42 260b85acf5 Fix MSVC 2019 warnings 2020-12-03 10:36:45 -05:00
Yann Collet 5de5c1d759 fixed fuzzer multithreading tests 2020-12-02 10:34:12 -08:00
Yann Collet db21d383b5 fixed fuzzer32 to support multithreading tests
though it still fails on test33:
`test 33: superblock uncompressible data, too many nocompress superblocks`
2020-12-02 09:13:55 -08:00
Yann Collet f69d8c027d removed fullbench-lib from tests/all
this build works fine on all my systems,
but seems to fail on CI environment.
Unclear why there is a difference.
This build test is not relevant anyway.
2020-12-02 00:21:29 -08:00
Yann Collet 9f8b180d5d fixed API documentation 2020-12-02 00:15:07 -08:00
Yann Collet f8d0b46a9f streamline fuzzer
from fuzzer32
2020-12-01 23:44:16 -08:00
Yann Collet 37165f66b7 better usage of default build rules 2020-12-01 23:36:05 -08:00
Yann Collet 343a75d2ef simplified test makefile
removed gzstd target:
relevant tests are unused and broken anyway
2020-12-01 22:33:45 -08:00
senhuang42 4c5f337248 Use cctx's minMatch instead of global MINMATCH, make fuzzer use validation 2020-11-30 15:41:20 -05:00
Yann Collet 4b5d7e9ddb fix lz4 test messed by console detection 2020-11-30 06:47:16 -08:00
senhuang42 23554ff25f Force CCtx minmatch to be same as generated minmatch 2020-11-23 13:29:20 -05:00
senhuang42 c502cd33e5 Fix generating 1 too few characters in random string generator 2020-11-20 16:58:25 -05:00
senhuang42 5b0c8f0a7c Add appropriate bound to matchlengths, and reduce srcSize max 2020-11-20 16:58:25 -05:00
senhuang42 a73a07b189 Add a bound for matchlength dependent on window size 2020-11-20 16:58:25 -05:00
senhuang42 5c68c5e31e Variety of minor fixups, reduce allocation, make deterministic 2020-11-20 16:58:25 -05:00
senhuang42 59c021f501 Add built binary to .gitignore 2020-11-20 16:58:25 -05:00
senhuang42 26bc0bfdf6 Add new fuzzer to build targets 2020-11-20 16:58:25 -05:00
senhuang42 ed575963c5 Implement new fuzzer for sequence compression 2020-11-20 16:58:25 -05:00
senhuang42 7742f076b4 Add experimental param for sequence validation 2020-11-20 11:57:41 -05:00
senhuang42 05c0229668 Clean up visual conversion warnings 2020-11-18 15:36:29 -05:00
senhuang42 d6d7ba2a1f Modification to offset validation to include entire sequence 2020-11-17 10:13:22 -05:00
senhuang42 55b90ef010 Fix unit tests to agree with new changes 2020-11-16 11:36:37 -05:00
senhuang42 3d26615c84 Adjust unit tests to agree with new sequence generation API 2020-11-16 10:49:17 -05:00
senhuang42 2db8441245 Add RLE support 2020-11-16 10:49:17 -05:00
senhuang42 2bbdddf24e Add test case to roundtrip using ZSTD_getSequences() and ZSTD_compressSequences() 2020-11-16 10:49:16 -05:00
senhuang42 9d936d61d2 Reduce number of memcpy() calls 2020-11-13 19:43:30 -05:00
senhuang42 1a8af0de73 Improve unit test 2020-11-12 11:09:09 -05:00
sen f62edf0fe9
Merge pull request #2381 from senhuang42/expand_sequence_extraction_api
Add enum to define ZSTD_Sequence type and update sequence extraction API
2020-11-06 13:00:31 -05:00
senhuang42 7d1dea070c Update unit tests 2020-11-06 11:10:37 -05:00
senhuang42 51abd58208 Rename getSequences() to generateSequences() 2020-11-06 10:53:22 -05:00
Luke Pitt eac309c71b Add ZSTD_getDictID_fromCDict function to experimental section 2020-11-04 11:37:37 +00:00