1307 Commits

Author SHA1 Message Date
Sen Huang
e398744a35 Add ZSTD_defaultCLevel() function to public API 2021-03-25 08:04:00 -07:00
Nick Terrell
f8ac0ea7ef
Merge pull request #2539 from terrelln/linux-kernel-fixes
Fixes for the next linux kernel patch version
2021-03-24 10:34:29 -07:00
sen
bf542c8a8d
Merge pull request #2447 from senhuang42/block_splitter_v2
Recursive block splitting
2021-03-24 12:27:22 -04:00
Sen Huang
5b566ebe08 Rename *compressSequences*() functions for clarity 2021-03-24 08:21:29 -07:00
Sen Huang
0ef1f935b7 Add a fallback in case the total blocksize of split blocks exceeds raw block size 2021-03-24 08:21:29 -07:00
Sen Huang
c90e81a692 Enable block splitter by default when applicable 2021-03-24 08:21:29 -07:00
Sen Huang
e34332834a Clean up various functions, add debuglogging for estimate vs. actual sizes 2021-03-24 08:21:29 -07:00
Sen Huang
41c3eae6d9 Fix various fuzzer failures: repcode history, superblocks 2021-03-24 08:21:29 -07:00
senhuang42
0633bf17c3 Change 1.3.4 bugfix to be cross-compatible with superblocks and normal compression 2021-03-24 08:21:29 -07:00
senhuang42
eb1ee8686d Refactor buildSequencesStatistics() to avoid pointer increment for superblocks 2021-03-24 08:21:29 -07:00
senhuang42
e2bb215117 Add unit tests and fuzzer param 2021-03-24 08:21:09 -07:00
senhuang42
de52de1347 Add recursive block split algorithm 2021-03-24 08:21:09 -07:00
senhuang42
f06f6626ed Update function names for consistency 2021-03-24 08:20:54 -07:00
senhuang42
c56d6e49e8 Add block splitter to experimental params 2021-03-24 08:20:54 -07:00
senhuang42
2949a95224 Refactor block compression logic into single function 2021-03-24 08:20:54 -07:00
senhuang42
c05c090cc2 Centralize entropy statistics calculations to zstd_compress.c 2021-03-24 08:20:29 -07:00
sen
c48889f097
Merge pull request #2538 from senhuang42/monotonicity_test
Add memory monotonicity test over srcSize
2021-03-22 16:54:34 -04:00
Sen Huang
dff4a0e867 Make ZSTD_estimateCCtxSize_internal() loop through all srcSize parameter sets as well 2021-03-21 16:15:31 -07:00
Sen Huang
77ae664ba6 Fix ZSTD_dedicatedDictSearch_isSupported() requirements 2021-03-16 17:36:05 -07:00
senhuang42
386111adec Add a nbSeq argument to compressSequences()
Refactor ZSTD_compressBlock_internal() to do the block header write within and add nbSeq argument to compressSequences()
2021-03-16 14:04:22 -07:00
senhuang42
98764493cf Move block header write into compressBlock_internal() 2021-03-16 14:04:22 -07:00
Nick Terrell
cd1551d261 [lib][tracing] Add ZSTD_NO_TRACE macro
When defined, it disables tracing, and avoids including the header.
2021-03-16 11:47:27 -07:00
Nick Terrell
7736549bea [bug-fix] Make simple single-pass functions ignore advanced parameters
The simple compression functions are intended to ignore the advanced
parameters, but they were accidentally using them. All the
`ZSTD_parameters` were set correctly, but any extra parameters were
used as-is. E.g. `ZSTD_c_format`.

This PR makes all the simple single-pass functions listed below ignore
the advanced parameters, as intended.

* `ZSTD_compressCCtx()`
* `ZSTD_compress_usingDict()`
* `ZSTD_compress_usingCDict()`
* `ZSTD_compress_advanced()`
* `ZSTD_compress_usingCDict_advanced()`

It also adds a test case that ensures that each of these functions
ignore the advanced parameters.
2021-02-12 19:11:23 -08:00
Nick Terrell
c62eb05964 [lib] Set appliedParams.compressionLevel correctly
Forward the correct compressionLevel to the appliedParams in all cases.
It was already correct for the advanced API, so only the old single-pass
functions needed to be fixed.

This compression level is unused by the library, but is set so that the
tracing framework can consume it.
2021-02-12 15:00:14 -08:00
Nick Terrell
f520f6dfbe [trace] Minor fixes found during integration
* Mark `ZSTD_CCtx_getParameter()` as const
* Add `extern "C"` guards to `zstd_trace.h`
2021-02-11 16:20:04 -08:00
Yann Collet
8884cb887d
Merge pull request #2483 from mpu/ldmgear
New algorithms for the long distance matcher
2021-02-11 08:38:23 -08:00
Quentin Carbonneaux
552efcac2d relocate large arrays from the stack to ldmState_t 2021-02-10 16:16:54 +01:00
Nick Terrell
e59c9459a5 [trace] Keep track of a uint64_t tracing context
The most common information that you want to track between begin() and
end() is the timestamp of the begin function, so you can measure the
duration of the (de)compression call. Allow the tracing library to put
this information inside the `ZSTD_TraceCtx`, so it doesn't need to keep
a global map in this case. If a single uint64_t is not enough, the
tracing library can return a unique identifier (like the context
pointer) instead, and use it as a key in a map.

This keeps the simple case simple.
2021-02-09 11:37:05 -08:00
Nick Terrell
54a4998a80 Add basic tracing functionality 2021-02-05 16:28:52 -08:00
Quentin Carbonneaux
1e65711ca5 a couple performance improvement changes for ldm 2021-01-20 00:54:20 -08:00
Nick Terrell
58476bcf7f Don't shrink window log in ZSTD_getCParams()
Treat ZSTD_getCParams() and ZSTD_adjustCParams() in the same way
we treat streaming compression. Choose parameters based on the
dictionary size + source size, and assume the source size is small
if unkown. But, don't shrink the window log down in
ZSTD_adjustCParams_internal().
2021-01-04 15:54:09 -08:00
Nick Terrell
9d31c704d5 Don't shrink window log when streaming with a dictionary
Fixes #2442.

1. When creating a dictionary keep the same behavior as before.
   Assume the source size is 513 bytes when adjusting parameters.
2. When calling ZSTD_getCParams() or ZSTD_adjustCParams() keep
   the same behavior as before.
3. When attaching a dictionary keep the same behavior of ignoring
   the dictionary size. When streaming this will select the
   largest parameters and not adjust them down. But, the CDict
   will use the correctly sized parameters, which seems like the
   right tradeoff.
4. When not attaching a dictionary (either forced not to, or
   using a prefix dictionary) we select parameters based on the
   dictionary size + source size, and assume the source size is
   small, which is the same behavior as before. But, now we don't
   adjust the window log (and hash and chain log) down when the
   source size is unknown.

When the source size is unknown all cdicts should attach, except
when the user disables attaching, or `forceWindow` is used. This
means that when streaming with a CDict we end up in the good case
where we get small CDict parameters, and large source parameters.

TODO: Add a streaming + dictionary regression test case.
2021-01-04 15:54:09 -08:00
Nick Terrell
66e811d782 [license] Update year to 2021 2021-01-04 17:53:52 -05:00
senhuang42
5c41490bfe Use pre-defined constants 2020-12-21 11:52:05 -05:00
senhuang42
7e11bd012b Implement skippable frame function 2020-12-21 11:13:22 -05:00
Yann Collet
0b39531d75 moving all references to release branch
was previously `master`
2020-12-16 23:00:35 -08:00
W. Felix Handte
9dab03db90 Create Enum to Represent Static/Dynamic Allocation Distinction in cwksp 2020-12-09 14:57:37 -05:00
W. Felix Handte
db9e73cb07 Don't ASAN-Poison Statically-Allocated Workspaces
Addresses #2286.
2020-12-09 13:00:47 -05:00
Nick Terrell
c238db046f
Merge pull request #2414 from terrelln/mt-progress
[lib] Ensure that multithreaded compression always makes some progress
2020-12-04 16:30:08 -08:00
Nick Terrell
4c58cb8383 [lib] Ensure that multithreaded compression always makes some progress 2020-12-03 20:25:14 -08:00
Nick Terrell
6672689e7e
Merge pull request #2406 from terrelln/linux-wrapper-api
[linux] Add the linux wrapper API
2020-12-02 16:49:03 -08:00
Nick Terrell
894ae36675
Merge pull request #2390 from animalize/clamp_level
Clamp compression level
2020-12-02 14:35:58 -08:00
senhuang42
2cbd038528 Move max nb seq check to per-block 2020-12-02 12:11:32 -05:00
Nick Terrell
3cda5fae77 [minor][lib] Remove double semicolon 2020-12-02 01:08:08 -08:00
senhuang42
3efe9c902b Add sequence nb validation to compressSequences(), adjust minMatch comparisons 2020-12-01 10:54:45 -05:00
senhuang42
4c5f337248 Use cctx's minMatch instead of global MINMATCH, make fuzzer use validation 2020-11-30 15:41:20 -05:00
sen
c5fbd55dac
Merge pull request #2387 from senhuang42/compress_sequence_API
[RFC] New sequence compression API
2020-11-20 16:54:20 -05:00
senhuang42
7742f076b4 Add experimental param for sequence validation 2020-11-20 11:57:41 -05:00
senhuang42
0e32928b7d Remove unnecessary repcode backup, apply style choices, use function pointer 2020-11-20 11:02:19 -05:00
sen
e924a0fa51
Explicit cast for visual warnings
Github has automatic commits now! Cool

Co-authored-by: Nick Terrell <nickrterrell@gmail.com>
2020-11-19 17:32:40 -05:00