1251 Commits

Author SHA1 Message Date
Yann Collet
70d89e5a12 minor rebalancing of level 13
This new setup is slighly better on `silesia.tar` :
Ratio : 3.649 -> 3.655
Speed : 11.9 MB/s -> 12.2 MB/s
At the cost of more memory : 24 MB -> 32 MB
The new memory budget is a reasonable interpolation between neighboring levels 12 and 14:
level 12 : 24 MB
level 13 : 32 MB (increased from 24 MB)
level 14 : 48 MB
Window size remains unaffected (4 MB)
2021-09-01 13:05:10 -07:00
Yann Collet
2de42174bb make ZSTD_HASHLOG3_MAX private
This is an implementation detail,
it doesn't belong to public space (zstd.h).
2021-08-20 09:52:42 -07:00
senhuang42
e411040ea1 Add 64 row entry support for lazy 2021-08-04 16:19:12 -04:00
senhuang42
31820e032c Rebalance clevels for lazy 2021-08-04 16:18:52 -04:00
Sen Huang
5ec7897a26 Fix static analyzer warnings 2021-07-29 09:11:12 -07:00
Binh Vo
dc5b693f1e Proactively skip huffman compression based on sampling where non-compressibility is suspected 2021-06-30 11:02:47 -04:00
sen
45d707e908
Merge pull request #2715 from senhuang42/sequence_api_3
[RFC] Add internal API for converting ZSTD_Sequence into seqStore
2021-06-24 13:02:11 -04:00
senhuang42
76466dfadf Add simple API for converting ZSTD_Sequence into seqStore 2021-06-23 12:10:48 -04:00
Nick Terrell
05b6773fbc [fix] Add missing bounds checks during compression
* The block splitter missed a bounds check, so when the buffer is too small it
  passes an erroneously large size to `ZSTD_entropyCompressSeqStore()`, which
  can then write the compressed data past the end of the buffer. This is a new
  regression in v1.5.0 when the block splitter is enabled. It is either enabled
  explicitly, or implicitly when using the optimal parser and `ZSTD_compress2()`
  or `ZSTD_compressStream*()`.
* `HUF_writeCTable_wksp()` omits a bounds check when calling
  `HUF_compressWeights()`. If it is called with `dstCapacity == 0` it will pass
  an erroneously large size to `HUF_compressWeights()`, which can then write
  past the end of the buffer. This bug has been present for ages. However, I
  believe that zstd cannot trigger the bug, because it never calls
  `HUF_compress*()` with `dstCapacity == 0` because of [this check][1].

Credit to: Oss-Fuzz

[1]: 89127e5ee2/lib/compress/zstd_compress_literals.c (L100)
2021-06-14 11:35:33 -07:00
sen
d5f3568c4b
Merge pull request #2697 from senhuang42/entropy_repeat_fix
[bug] Fix entropy repeat mode bug
2021-06-10 16:39:17 +03:00
aqrit
dd4f6aa9e6
Flatten ZSTD_row_getMatchMask (#2681)
* Flatten ZSTD_row_getMatchMask

* Remove the SIMD abstraction layer.
* Add big endian support.
* Align `hashTags` within `tagRow` to a 16-byte boundary. 
* Switch SSE2 to use aligned reads.
* Optimize scalar path using SWAR.
* Optimize neon path for `n == 32`
* Work around minor clang issue for NEON (https://bugs.llvm.org/show_bug.cgi?id=49577)

* replace memcpy with MEM_readST

* silence alignment warnings

* fix neon casts

* Update zstd_lazy.c

* unify simd preprocessor detection (#3)

* remove duplicate asserts

* tweak rotates

* improve endian detection

* add cast

there is a fun little catch-22 with gcc: result from pmovmskb has to be cast to uint32_t to avoid a zero-extension
but must be uint16_t to get gcc to generate a rotate instruction..

* more casts

* fix casts

better work-around for the (bogus) warning: unary minus on unsigned
2021-06-09 08:50:25 +03:00
Sen Huang
923e5ad3f5 Fix entropy repeat mode bug 2021-06-07 00:32:03 -07:00
senhuang42
939276cd0c Add ldm and block splitter auto-enable to old api 2021-05-24 13:09:32 -04:00
Yann Collet
02ece5d59f
Merge pull request #2653 from TrianglesPCT/dev
Enable SSE2 compression path to work on MSVC
2021-05-17 11:20:50 -07:00
TrianglesPCT
77d54eb3b3
Add files via upload 2021-05-14 16:40:32 -06:00
TrianglesPCT
52f44bb365
Add files via upload
msvc
2021-05-14 16:33:07 -06:00
Nick Terrell
03c4111299 [lib] Fix dictionary invalidation logic
Call `ZSTD_enforceMaxDist()` before each block with the beginning of the
block. This ensures that `lowLimit` is updated to `dictLimit` whenever
the ext-dict is out of range, so we can use prefix mode for speed.

This can cause non-determinism because prefix mode and ext-dict mode
match finders can return different results. It can also hurt speed
because ext-dict match finders are slower.

The scenario is:
1. Compress large data with a dictionary.
2. The dictionary goes out of bounds, so we invalidate it.
3. However, we still have `lowLimit < dictLimit`, since it is
   never updated.
4. We will call the ext-dict match finder instead of the prefix one.
2021-05-13 17:05:59 -07:00
sen
9e94b7cac5
Assert no divison by 0, correct superblocks 0 sequences case (#2592) 2021-05-07 13:26:56 -04:00
sen
698f261b35
[1.5.0] Deprecate some functions (#2582)
* Add deprecated macro to zstd.h, mark certain functions as deprecated

* Remove ZSTD_compress.c dependencies on deprecated functions
2021-05-06 17:59:32 -04:00
Nick Terrell
207e33bb61
Merge pull request #2616 from terrelln/deterministic-dict
[lib] Add ZSTD_c_deterministicRefPrefix
2021-05-06 11:09:22 -07:00
Nick Terrell
172b4b6ac4 [lib] Add ZSTD_c_deterministicRefPrefix
This flag forces zstd to always load the prefix in ext-dict mode, even
if it happens to be contiguous, to force determinism. It also applies to
dictionaries that are re-processed.

A determinism test case is also added, which fails without
`ZSTD_c_deterministicRefPrefix` and passes with it set.

Question: Should this be the default behavior? It isn't in this PR.
2021-05-05 18:49:56 -07:00
Nick Terrell
eb7e74ccb7 [tests] Set DEBUGLEVEL=2 by default
This allows us to quickly check for compile errors in debug log
messages, which are compiled out when `DEBUGLEVEL < 2`.
2021-05-05 13:29:06 -07:00
Nick Terrell
c2183d7cdf [lib] Move some ZSTD_CCtx_params off the stack
* Take `params` by const reference in `ZSTD_resetCCtx_internal()`.
* Add `simpleApiParams` to the CCtx and use them in the simple API
  functions, instead of creating those parameters on the stack.

I think this is a good direction to move in, because we shouldn't need
to worry about adding parameters to `ZSTD_CCtx_params`, since it should
always be on the heap (unless they become absoultely gigantic).

Some `ZSTD_CCtx_params` are still on the stack in the CDict functions,
but I've left them for now, because it was a little more complex, and we
don't use those functions in stack-constrained currently.
2021-05-05 13:25:16 -07:00
Nick Terrell
0b88c2582c [test] Add large dict/data --patch-from test
Dictionary size must be > `ZSTD_CHUNKSIZE_MAX`.
2021-05-04 17:31:32 -07:00
Nick Terrell
94db4398a0 [lib] Always load the dictionary in one go
Dictionaries larger than `ZSTD_CHUNKSIZE_MAX` used to have to be loaded
in multiple segments. Instead, when we detect large dictionaries, ensure
that we reset the context's indicies. Then, for dictionaries larger than
`ZSTD_CURRENT_MAX - 1`, only load the suffix of the dictionary. Finally,
enable DDS for large dictionaries, since we no longer load in multiple
segments.

This simplifes the dictionary loading code, and reduces opportunities
for non-determinism to slip in.
2021-05-04 16:45:25 -07:00
Nick Terrell
1ffa80a09e [easy] Rewrite rowHashLog computation
`ZSTD_highbit32(1u << x) == x` when it isn't undefined behavior.
2021-05-04 11:43:20 -07:00
Nick Terrell
34aff7ea06 Bug fix & run overflow correction much more frequently in tests
* Fix overflow correction when `windowLog < cycleLog`. Previously, we
  got the correction wrong in this case, and our chain tables and binary
  trees would be corrupted. Now, we work as long as `maxDist` is a power
  of two, by adding `MAX(maxDist, cycleSize)` to our indices.
* When `ZSTD_WINDOW_OVERFLOW_CORRECT_FREQUENTLY` is defined to non-zero
  run overflow correction as frequently as allowed without impacting
  compression ratio.
* Enable `ZSTD_WINDOW_OVERFLOW_CORRECT_FREQUENTLY` in `fuzzer` and
  `zstreamtest` as well as all the OSS-Fuzz fuzzers. This has a 5-10%
  speed penalty at most, which seems reasonable.
2021-05-03 15:21:47 -07:00
senhuang42
61fe571af6 Fix chaintable check to include rowhash in ZSTD_reduceIndex() 2021-04-30 19:52:04 -04:00
Nick Terrell
6cee3c2c4f [trace] Remove default definitions of weak symbols
Instead of providing a default no-op implementation, check the symbols
for `NULL` before accessing them. Providing a default implementation
doesn't reliably work with dynamic linking. Depending on link order the
default implementations may not be overridden. By skipping the default
implementation, all link order issues are resolved. If the symbols
aren't provided the weak function will be `NULL`.
2021-04-26 16:05:39 -07:00
felixhandte
efa6dfa729 Apply DDS adjustments to avoid assert failures 2021-04-23 16:41:00 -04:00
sen
12c045f74d
Merge pull request #2574 from senhuang42/repcode_mismatch_detector_fix
Correct the block splitter mismatched repcodes detection.
2021-04-12 23:27:43 -04:00
Sen Huang
550f76f131 Correct the detection of mismatched repcodes 2021-04-09 09:08:51 -07:00
Nick Terrell
4694423c4f Add and integrate lazy row hash strategy 2021-04-07 09:53:34 -07:00
sen
f71aabb5b5
Move clevel override to after initLocalDict() (#2571) 2021-04-06 21:05:37 -04:00
sen
f1e8b565c2
Maintain two repcode histories for block splitting, replace invalid repcodes (#2569) 2021-04-06 17:25:55 -04:00
sen
e38124555e
Fix dictionary force reloading clevel selection (#2570)
* Move cdict clevel override to before localdict init

* Update results.csv after dict load changes
2021-04-06 15:35:09 -04:00
sen
980f3bbf83
[cwksp] Align all allocated "tables" and "aligneds" to 64 bytes (#2546)
* Perform 64-byte alignment of wksp tables and aligneds internally

* Clean up cwskp_finalize() function to only do two allocs

* Refactor aligned/buffer reservation code, remove ASAN req for alignment reservations

* Change from allocating 128 bytes always to allocating only buffer space as needed for tables/aligned

* Back out aligned/table reservation order restriction

* Add stricter bounds for new/resized wksps, fix comment in zstd_cwksp.h
2021-04-01 20:07:19 -04:00
sen
255925c231
Fix repcode-related OSS-fuzz issues in block splitter (#2560)
* Do not emit last partitions of blocks as RLE/uncompressed

* Fix repcode updates within block splitter

* Add a entropytables confirm function, redo ZSTD_confirmRepcodesAndEntropyTables() for better function signature

* Add a repcode updater to block splitter, no longer need to force emit compressed blocks
2021-03-31 15:14:59 -04:00
Nick Terrell
a494308ae9 [copyright][license] Switch to yearless copyright and some cleanup in the linux-kernel files
* Switch to yearless copyright per FB policy
* Fix up SPDX-License-Identifier lines in `contrib/linux-kernel` sources
* Add zstd copyright/license header to the `contrib/linux-kernel` sources
* Update the `tests/test-license.py` to check for yearless copyright
* Improvements to `tests/test-license.py`
* Check `contrib/linux-kernel` in `tests/test-license.py`
2021-03-30 10:30:43 -07:00
sen
84ccb81e7c
Merge pull request #2561 from senhuang42/longlength_enum
Add enum for representing long length ID
2021-03-26 15:55:12 -04:00
Sen Huang
b1a43455f8 Add enum for representing long length ID 2021-03-26 10:41:09 -07:00
sen
4fe2e7ae14
Merge pull request #2558 from senhuang42/msan_block_splitter_fix
Fix block splitter minor MSAN warning.
2021-03-25 13:51:43 -04:00
sen
b0407b9f0e
Merge pull request #2555 from senhuang42/default_clevel_func
Add ZSTD_defaultCLevel() function to public API
2021-03-25 13:07:28 -04:00
Sen Huang
2a907bf4aa Move lastCountSize into a returned struct, fix MSAN error 2021-03-25 09:11:15 -07:00
Sen Huang
e398744a35 Add ZSTD_defaultCLevel() function to public API 2021-03-25 08:04:00 -07:00
Nick Terrell
f8ac0ea7ef
Merge pull request #2539 from terrelln/linux-kernel-fixes
Fixes for the next linux kernel patch version
2021-03-24 10:34:29 -07:00
sen
bf542c8a8d
Merge pull request #2447 from senhuang42/block_splitter_v2
Recursive block splitting
2021-03-24 12:27:22 -04:00
Sen Huang
5b566ebe08 Rename *compressSequences*() functions for clarity 2021-03-24 08:21:29 -07:00
Sen Huang
0ef1f935b7 Add a fallback in case the total blocksize of split blocks exceeds raw block size 2021-03-24 08:21:29 -07:00
Sen Huang
c90e81a692 Enable block splitter by default when applicable 2021-03-24 08:21:29 -07:00