8868 Commits (6fad35c6a13919db9ab3b4f768f624220da50426)

Author SHA1 Message Date
Yann Collet 8f86c29c06 allow jobSize to be as low as 512 KB
The previous lower limit was 1 MB.

Note: by default, the lowest job size is 2 MB, achieved at level 1.
Even lower job sizes can be achieved by manipulating this value directly,
or manually modifying window sizes to lower amounts.

Updated unit test to ensure that this new limit works fine
(test would fail with previous 1 MB limit).
2021-05-04 11:02:55 -07:00
Nick Terrell 32823bc150 [LDM] Speed optimization on repetitive data
LDM does especially poorly on repetitive data when that data's hash happens
to have `(hash & stopMask) == 0`, either because `stopMask == 0` or by
random chance. Optimize this case by skipping over repetitive patterns.
The detection is very simplistic, but should catch most of the offending
cases.

```
# Before this change:
head -c 1G /dev/zero | perf stat -- ./zstd -1 -o /dev/null -v --zstd=ldmHashRateLog=1 --long
      21.187881087 seconds time elapsed

# After this change:
head -c 1G /dev/zero | perf stat -- ./zstd -1 -o /dev/null -v --zstd=ldmHashRateLog=1 --long
       1.149707921 seconds time elapsed
```
2021-05-04 10:57:42 -07:00
W. Felix Handte ee122baacf Detect Presence of `md5` on Darwin
This fixes #2568.
2021-05-04 12:33:19 -04:00
Yann Collet 8aafbd3604 Documented minimum version numbers
Any stable API entry point introduced after v1.0
should be documented with its minimum version number.

This PR fulfills this requirement,
mostly updating entry points added since v1.4.0
and those newly introduced for the future v1.5.0.
2021-05-04 09:05:22 -07:00
Nick Terrell 6f40571ae2
Merge pull request #2606 from terrelln/test-memory
[tests] Reduce memory usage of MT CLI tests
2021-05-03 21:16:28 -07:00
Nick Terrell 0b370e9da8
Merge pull request #2603 from terrelln/reduce-indices-fuzzer
Bug fix & run overflow correction much more frequently in tests
2021-05-03 19:24:55 -07:00
Nick Terrell 2e4fca38d8 [tests] Reduce memory usage of MT CLI tests
Switch from `-T0` to the default `-T1` which significantly reduces
memory usage for level 19 when there are many cores. This fixes
32-bit issues of running out of address space.

Fixes #2603.
2021-05-03 16:29:11 -07:00
Nick Terrell 34aff7ea06 Bug fix & run overflow correction much more frequently in tests
* Fix overflow correction when `windowLog < cycleLog`. Previously, we
  got the correction wrong in this case, and our chain tables and binary
  trees would be corrupted. Now, we work as long as `maxDist` is a power
  of two, by adding `MAX(maxDist, cycleSize)` to our indices.
* When `ZSTD_WINDOW_OVERFLOW_CORRECT_FREQUENTLY` is defined to non-zero,
  run overflow correction as frequently as allowed without impacting
  compression ratio.
* Enable `ZSTD_WINDOW_OVERFLOW_CORRECT_FREQUENTLY` in `fuzzer` and
  `zstreamtest` as well as all the OSS-Fuzz fuzzers. This has a 5-10%
  speed penalty at most, which seems reasonable.
2021-05-03 15:21:47 -07:00
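The index arithmetic described in commit 34aff7ea06 can be illustrated with a small sketch. This is not the zstd source: `sketch_overflow_correction` is a hypothetical helper, and it assumes only what the message states, namely that `maxDist` is a power of two and that `MAX(maxDist, cycleSize)` is added to the position within the cycle. Because both values are powers of two, the larger one is a multiple of `cycleSize`, so the corrected index keeps its position within the cycle even when `windowLog < cycleLog`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of zstd.
 * Returns the amount to subtract from every stored index so that the
 * current index `curr` becomes small again while keeping its position
 * within the cycle of size (1 << cycleLog). */
uint32_t sketch_overflow_correction(uint32_t curr, unsigned cycleLog, uint32_t maxDist)
{
    assert(cycleLog < 31);
    /* The scheme relies on maxDist being a power of two. */
    assert(maxDist != 0 && (maxDist & (maxDist - 1)) == 0);

    uint32_t const cycleSize = (uint32_t)1 << cycleLog;
    uint32_t const cycleMask = cycleSize - 1;
    /* MAX(maxDist, cycleSize): a multiple of cycleSize in both cases. */
    uint32_t const offset = maxDist > cycleSize ? maxDist : cycleSize;
    uint32_t const newCurr = (curr & cycleMask) + offset;

    /* Only meaningful once indices have grown past the reduced value. */
    assert(curr >= newCurr);
    return curr - newCurr;
}
```

With `curr = 5000000`, `cycleLog = 10` and `maxDist = 256`, the reduced index is `832 + 1024 = 1856`, which shares its low 10 bits with the original index and still leaves room for a full window behind it.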
sen cc31bb8b66
Merge pull request #2598 from senhuang42/reduce_index_rowhash_fix
Fix chaintable check to include rowhash in ZSTD_reduceIndex()
2021-05-03 17:34:39 -04:00
sen 4c5cc345fb
Merge pull request #2581 from senhuang42/lcm_stable
[1.5.0] Promote ZSTD_c_literalCompressionMode to stable params
2021-05-03 11:59:19 -04:00
sen cdc979ddb3
Merge pull request #2580 from senhuang42/defaultclevel_to_stable
[1.5.0] Promote ZSTD_defaultCLevel() into stable API
2021-05-03 11:59:05 -04:00
Nick Hutchinson 0dabbd4ed7 Add clang-cl build jobs to appveyor.yml 2021-05-02 15:01:53 +01:00
senhuang42 61fe571af6 Fix chaintable check to include rowhash in ZSTD_reduceIndex() 2021-04-30 19:52:04 -04:00
Nick Terrell 09149beaf8 [1.5.0] Move `zstd_errors.h` and `zdict.h` to `lib/` root
`zstd_errors.h` and `zdict.h` are public headers, so they deserve to be
in the root `lib/` directory with `zstd.h`, not mixed in with our private
headers.
2021-04-30 15:13:54 -07:00
Nick Terrell 0e2345b859
Merge pull request #2593 from terrelln/linux-comments
[linux-kernel] Replace kernel-style comments
2021-04-29 17:15:40 -07:00
Nick Terrell fbb9006e18 [linux-kernel] Replace kernel-style comments
Replace kernel-style comments with regular comments.

E.g.

```
/** Before */

/* After */

/**
 * Before
 */

/*
 * After
 */

/***********************************
 * Before
 ***********************************/

/* *********************************
 * After
 ***********************************/
```
2021-04-29 15:50:23 -07:00
Nick Terrell 333dd60bff
Merge pull request #2589 from terrelln/tracing
[trace] Remove default definitions of weak symbols
2021-04-27 10:39:12 -07:00
sen f8afc66573
Merge pull request #2588 from senhuang42/update_regressiontest
[regressiontest] Update results.csv
2021-04-27 12:40:07 -04:00
Nick Terrell 6cee3c2c4f [trace] Remove default definitions of weak symbols
Instead of providing a default no-op implementation, check the symbols
for `NULL` before accessing them. Providing a default implementation
doesn't reliably work with dynamic linking. Depending on link order the
default implementations may not be overridden. By skipping the default
implementation, all link order issues are resolved. If the symbols
aren't provided the weak function will be `NULL`.
2021-04-26 16:05:39 -07:00
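The pattern described in commit 6cee3c2c4f can be sketched as follows. The names `demo_trace_event` and `demo_emit_trace` are hypothetical and do not come from zstd, and the sketch assumes GCC or Clang on an ELF platform such as Linux, where an undefined weak symbol resolves to `NULL` at link time.

```c
#include <stddef.h>

/* Hypothetical hook: declared weak, with no default definition anywhere.
 * If no other object file or library defines it, its address is NULL. */
__attribute__((weak)) void demo_trace_event(int value);

/* Calls the hook only when some definition was linked in.
 * Returns 1 if the hook was called, 0 if it was absent. */
int demo_emit_trace(int value)
{
    if (demo_trace_event != NULL) {
        demo_trace_event(value);
        return 1;
    }
    return 0;
}
```

Linked on its own, `demo_emit_trace(42)` returns 0 because nothing defines the hook. Linking in any object that defines `demo_trace_event` makes the same call return 1, and there is no competing default definition for the link order to choose between.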
senhuang42 33abda4400 Update results.csv 2021-04-26 15:55:23 -04:00
sen 3e2fbfd056
Merge pull request #2579 from senhuang42/getcdictID_to_stable
[1.5.0] Promote ZSTD_getDictID_fromCDict() into stable API
2021-04-26 09:55:43 -04:00
Sen Huang 3c595a4a79 Add literalCompressionMode to stable cParams 2021-04-26 09:55:06 -04:00
sen 14c11c7459
Merge pull request #2586 from senhuang42/dds_fuzz
Add DDS to oss fuzzer
2021-04-24 12:39:16 -04:00
felixhandte efa6dfa729 Apply DDS adjustments to avoid assert failures 2021-04-23 16:41:00 -04:00
senhuang42 f80dec66b0 Add DDS to oss fuzzer 2021-04-22 18:21:43 -04:00
sen 3eb3845898
Merge pull request #2583 from senhuang42/remove_zbuff
[1.5.0] Remove ZBUFF
2021-04-20 17:05:32 -04:00
senhuang42 a423305e7b Remove ZBUFF tests 2021-04-19 17:27:05 -04:00
senhuang42 3b98987496 Remove building of ZBUFF/deprecated folder by default 2021-04-19 17:12:00 -04:00
Sen Huang c5869677d9 Moved ZSTD_defaultCLevel() into stable API 2021-04-16 10:15:40 -07:00
Sen Huang 9c1ca3c00b Moved ZSTD_getDictID_fromCDict() into stable API 2021-04-16 10:14:29 -07:00
sen 12c045f74d
Merge pull request #2574 from senhuang42/repcode_mismatch_detector_fix
Correct the block splitter mismatched repcodes detection.
2021-04-12 23:27:43 -04:00
sen ebd41ebe56
Merge pull request #2572 from senhuang42/row_hash_hashcache_one_off_error_fix
Adjust nb elements to prefetch in ZSTD_row_fillHashCache()
2021-04-12 15:38:53 -04:00
Sen Huang 8844f93957 Adjust nb elements to prefetch in ZSTD_row_fillHashCache() 2021-04-12 14:24:58 -04:00
Sen Huang 550f76f131 Correct the detection of mismatched repcodes 2021-04-09 09:08:51 -07:00
sen 56421f34e4
Merge pull request #2494 from senhuang42/row_hash2
SIMD Row Based Matchfinder 🚀
2021-04-08 10:55:14 -04:00
Sen Huang 4d63d6e8aa Update results.csv, add Row hash to regression test 2021-04-07 10:31:41 -07:00
Nick Terrell 4694423c4f Add and integrate lazy row hash strategy 2021-04-07 09:53:34 -07:00
sen f71aabb5b5
Move clevel override to after initLocalDict() (#2571) 2021-04-06 21:05:37 -04:00
sen f1e8b565c2
Maintain two repcode histories for block splitting, replace invalid repcodes (#2569) 2021-04-06 17:25:55 -04:00
sen e38124555e
Fix dictionary force reloading clevel selection (#2570)
* Move cdict clevel override to before localdict init

* Update results.csv after dict load changes
2021-04-06 15:35:09 -04:00
Nick Terrell 8383fc828d
Merge pull request #2541 from ihsinme/patch-1
simple fix for using bit operator.
2021-04-02 13:01:09 -07:00
Niclas Rosenvik e7647180cd Stop complaining about hash tool not found
If build_dir is set, the zstd build complains that md5sum is not found.
Fix this by checking whether build_dir is set before looking for and using the hash tool,
just like in lib/Makefile.
2021-04-02 13:00:19 -07:00
sen 980f3bbf83
[cwksp] Align all allocated "tables" and "aligneds" to 64 bytes (#2546)
* Perform 64-byte alignment of wksp tables and aligneds internally

* Clean up cwksp_finalize() function to only do two allocs

* Refactor aligned/buffer reservation code, remove ASAN req for alignment reservations

* Change from allocating 128 bytes always to allocating only buffer space as needed for tables/aligned

* Back out aligned/table reservation order restriction

* Add stricter bounds for new/resized wksps, fix comment in zstd_cwksp.h
2021-04-01 20:07:19 -04:00
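The 64-byte alignment mentioned in commit 980f3bbf83 comes down to standard power-of-two rounding. The sketch below is generic and is not the workspace code from `zstd_cwksp.h`; `demo_align_up` and `demo_bytes_to_align` are hypothetical names. The second helper shows how to reserve only the padding an address actually needs, instead of a fixed worst-case amount.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rounds `size` up to the next multiple of `align` (a power of two). */
size_t demo_align_up(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/* Number of bytes to skip so that `addr` becomes a multiple of `align`.
 * Returns 0 when `addr` is already aligned. */
size_t demo_bytes_to_align(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    uintptr_t const mask = (uintptr_t)align - 1;
    return (size_t)(((uintptr_t)align - (addr & mask)) & mask);
}
```

For example, `demo_align_up(65, 64)` yields 128, and an address ending in `0x01` needs 63 bytes of padding to reach the next 64-byte boundary.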
sen 255925c231
Fix repcode-related OSS-fuzz issues in block splitter (#2560)
* Do not emit last partitions of blocks as RLE/uncompressed

* Fix repcode updates within block splitter

* Add an entropy tables confirm function, redo ZSTD_confirmRepcodesAndEntropyTables() for better function signature

* Add a repcode updater to block splitter, no longer need to force emit compressed blocks
2021-03-31 15:14:59 -04:00
Nick Terrell d334ad2ff4 [contrib][linux-kernel] Add zstd_min_clevel() and zstd_max_clevel() 2021-03-30 10:30:59 -07:00
Nick Terrell a494308ae9 [copyright][license] Switch to yearless copyright and some cleanup in the linux-kernel files
* Switch to yearless copyright per FB policy
* Fix up SPDX-License-Identifier lines in `contrib/linux-kernel` sources
* Add zstd copyright/license header to the `contrib/linux-kernel` sources
* Update the `tests/test-license.py` to check for yearless copyright
* Improvements to `tests/test-license.py`
* Check `contrib/linux-kernel` in `tests/test-license.py`
2021-03-30 10:30:43 -07:00
sen 84ccb81e7c
Merge pull request #2561 from senhuang42/longlength_enum
Add enum for representing long length ID
2021-03-26 15:55:12 -04:00
Sen Huang b1a43455f8 Add enum for representing long length ID 2021-03-26 10:41:09 -07:00
sen ab216bc2c5
Merge pull request #2559 from senhuang42/add_dict_regression_tests_backup
Add different dict modes to compression ratio regression test, update results.csv
2021-03-25 19:26:06 -04:00
Sen Huang bbbd578f45 Update results.csv 2021-03-25 11:16:37 -07:00