8929 Commits

Author SHA1 Message Date
Nick Terrell
c2555f8c6f [lib] Fix fuzzer timeouts by backing off overflow correction
Linearly back off the frequency of overflow correction based on the
number of times the `ZSTD_window_t` has been overflow corrected. This
will still allow the fuzzer to quickly find overflow correction bugs,
while also keeping good speed for larger inputs.

Additionally, the `nbOverflowCorrections` variable can be useful for
debugging coredumps, since we can inspect the `ZSTD_CCtx` to see if
overflow correction has happened yet.

I've verified this fixes the timeouts in OSS-Fuzz (176 seconds -> 6
seconds). I've also verified that fuzzers and `fuzzer` and `zstreamtest`
still catch the row-hash overflow correction bug.
2021-05-06 22:03:41 -07:00
Yann Collet
17b9e43c7d do not install g++ 2021-05-06 21:53:30 -07:00
Yann Collet
ee425faaa7 Merge branch 'dev' into d_prefetch_refactor 2021-05-06 19:49:26 -07:00
Yann Collet
0d05846952
Merge pull request #2626 from facebook/codingStyle1
added a paragraph on coding style
2021-05-06 19:46:05 -07:00
Nick Terrell
f36fbddbfa
Merge pull request #2625 from terrelln/ubsan-failure
[lib] Fix UBSAN warning in ZSTD_decompressSequences()
2021-05-06 19:22:25 -07:00
Yann Collet
f44c720fa8 added a paragraph on coding style 2021-05-06 18:40:25 -07:00
Yann Collet
bd547232bc switch to clang 2021-05-06 16:07:44 -07:00
Yann Collet
ee65162655 Merge branch 'dev' into fasterCygwin 2021-05-06 16:06:00 -07:00
Nick Terrell
b052b583e5 [lib] Fix UBSAN warning in ZSTD_decompressSequences() 2021-05-06 15:31:30 -07:00
sen
698f261b35
[1.5.0] Deprecate some functions (#2582)
* Add deprecated macro to zstd.h, mark certain functions as deprecated

* Remove ZSTD_compress.c dependencies on deprecated functions
2021-05-06 17:59:32 -04:00
Nick Terrell
e7e4b74b2b
Merge pull request #2600 from nickhutchinson/clang-cl
Fix for excessive compiler warnings when building with clang-cl
2021-05-06 12:43:15 -07:00
Nick Terrell
2b82948e58
Merge pull request #2622 from terrelln/zdict-api
[zdict] Add a FAQ to the top of zdict.h
2021-05-06 12:42:56 -07:00
Nick Terrell
1874f0844d [zdict] Add a FAQ to the top of zdict.h
The FAQ covers the questions asked in Issue #2566. It first covers why
you would want to use a dictionary, then what a dictionary is, and
finally it tells you how to train a dictionary, and clarifies some of
the parameters.

There is definitely more that could be said about some of the advanced
trainers, but this should be a good start.
2021-05-06 12:48:19 -07:00
sen
6030cdfede
Add --progress flag (#2595) 2021-05-06 14:50:28 -04:00
Yann Collet
2f7bbd6539
Merge pull request #2620 from facebook/winFilelist
fix --filelist compatibility with Windows cr+lf line ending
2021-05-06 11:35:16 -07:00
Felix Handte
909925785a
Merge pull request #2618 from felixhandte/single-file-build-mv
Move Single-File Build Script from `contrib/` to `build/`
2021-05-06 14:09:42 -04:00
Nick Terrell
207e33bb61
Merge pull request #2616 from terrelln/deterministic-dict
[lib] Add ZSTD_c_deterministicRefPrefix
2021-05-06 11:09:22 -07:00
Nick Hutchinson
2d34062836 CMake: fix excessive build warnings when building with clang-cl 2021-05-06 18:46:37 +01:00
Nick Terrell
fc8330b885
Merge pull request #2621 from terrelln/regression-test
[test][regression] Update results.csv
2021-05-06 10:32:07 -07:00
Yann Collet
2e76bd7d10 attempt to make Appveyor's Cygwin test faster
Cygwin is the longest Appveyor test
Appveyor is typically the CI which finish last
2021-05-06 08:47:45 -07:00
sen
d6be7659b0
Add seekable roundtrip fuzzer (#2617) 2021-05-06 10:08:21 -04:00
Yann Collet
26c7b0038e
Merge pull request #2619 from facebook/winbench
improved benchmark experience on Windows
2021-05-05 20:34:31 -07:00
Nick Terrell
d2925de98a
Merge pull request #2615 from terrelln/stack-space
[lib] Move some ZSTD_CCtx_params off the stack
2021-05-05 19:43:39 -07:00
Nick Terrell
ce615d7fba [test][regression] Update results.csv
The LDM change in PR #2602 changed the algorithm slightly.
The compressed size is generally positive, and when it is worse,
it is only a few bytes.
2021-05-05 19:00:36 -07:00
Nick Terrell
172b4b6ac4 [lib] Add ZSTD_c_deterministicRefPrefix
This flag forces zstd to always load the prefix in ext-dict mode, even
if it happens to be contiguous, to force determinism. It also applies to
dictionaries that are re-processed.

A determinism test case is also added, which fails without
`ZSTD_c_deterministicRefPrefix` and passes with it set.

Question: Should this be the default behavior? It isn't in this PR.
2021-05-05 18:49:56 -07:00
Yann Collet
df05b2ba7c fix --filelist compatibility with Windows cr+lf line ending 2021-05-05 18:01:55 -07:00
Yann Collet
fed8589430
Merge pull request #2614 from facebook/dlong8
faster speed for decompressSequencesLong
2021-05-05 16:55:40 -07:00
Yann Collet
9750f3c87b improved benchmark experience on Windows
benchmark results are not progressively displayed on Windows terminal.
For long benchmark sessions, nothing is displayed,
until the end, where everything is flushed.

Force display to be flushed after each update.
Updates happen roughtly every second, or even less,
so it's not a substantial workload.
2021-05-05 16:52:21 -07:00
W. Felix Handte
4ba49af665 Rewrite References to Location 2021-05-05 18:03:48 -04:00
Felix Handte
b062d97520
Merge pull request #2525 from felixhandte/fix-file-permissions-again
Improve Setting Permissions of Created Files
2021-05-05 17:59:13 -04:00
Nick Terrell
eb7e74ccb7 [tests] Set DEBUGLEVEL=2 by default
This allows us to quickly check for compile errors in debug log
messages, which are compiled out when `DEBUGLEVEL < 2`.
2021-05-05 13:29:06 -07:00
Nick Terrell
c2183d7cdf [lib] Move some ZSTD_CCtx_params off the stack
* Take `params` by const reference in `ZSTD_resetCCtx_internal()`.
* Add `simpleApiParams` to the CCtx and use them in the simple API
  functions, instead of creating those parameters on the stack.

I think this is a good direction to move in, because we shouldn't need
to worry about adding parameters to `ZSTD_CCtx_params`, since it should
always be on the heap (unless they become absoultely gigantic).

Some `ZSTD_CCtx_params` are still on the stack in the CDict functions,
but I've left them for now, because it was a little more complex, and we
don't use those functions in stack-constrained currently.
2021-05-05 13:25:16 -07:00
W. Felix Handte
1d65917323 Move Single-File Build Script from contrib/ to build/ 2021-05-05 16:07:51 -04:00
W. Felix Handte
4f9c6fdb7f Attempt to Fix Windows Build Error 2021-05-05 13:13:56 -04:00
W. Felix Handte
da61918c75 Also Pass Mode Bits in on Windows
I think in some unix emulation environments on Windows, (cygwin?) mode bits
are somehow respected. So we might as well pass them in. Can't hurt.
2021-05-05 13:10:34 -04:00
W. Felix Handte
bea1b2ba70 rm -f in playTests.sh 2021-05-05 13:10:34 -04:00
W. Felix Handte
45c4918ccf Fix Build for Windows 2021-05-05 13:10:34 -04:00
W. Felix Handte
018ed6552a Attempt to Fix stat Format for BSDs 2021-05-05 13:10:34 -04:00
W. Felix Handte
1fb10ba831 Don't Block Removing File on Being Able to Read It
`open()`'s mode bits are only applied to files that are created by the call.
If the output file already exists, but is not readable, the `fopen()` would
fail, preventing us from removing it, which would mean that the file would
not end up with the correct permission bits.

It's not clear to me why the `fopen()` is there at all. `UTIL_isRegularFile()`
should be sufficient, AFAICT.
2021-05-05 13:10:34 -04:00
W. Felix Handte
b87f97b3ea Create Files with Desired Permissions; Avoid chmod(); Remove UTIL_chmod() 2021-05-05 13:10:34 -04:00
W. Felix Handte
4e10ff15f5 Add Tests Checking File Permissions of Created Files 2021-05-05 13:10:34 -04:00
Felix Handte
2d10544b84
Merge pull request #2613 from felixhandte/allow-block-device
Allow Reading from Block Devices with `--force`
2021-05-05 13:06:32 -04:00
Yann Collet
7ef6d7b36c deeper prefetching pipeline for decompressSequencesLong
pipeline increased from 4 to 8 slots.
This change substantially improves decompression speed when there are long distance offsets.
example with enwik9 compressed at level 22 :
gcc-9 : 947 -> 1039 MB/s
clang-10: 884 -> 946 MB/s

I also checked the "cold dictionary" scenario,
and found a smaller benefit, around ~2%
(measurements are more noisy for this scenario).
2021-05-05 10:04:03 -07:00
Yann Collet
8cde167a27 Merge branch 'dev' into d_prefetch_refactor 2021-05-05 09:13:38 -07:00
Yann Collet
455fd1a067 updated documentation regarding minimum job size 2021-05-05 09:03:11 -07:00
Azat Khuzhin
53a60e98de
seekable decompression fixes (#2594)
* seekable_format: fix from-file reading (not in-memory)

It tries to check the buffer boundary, but there is no buffer for
from-file reading.

* seekable_decompression: break when ZSTD_seekable_decompress() returns zero

* seekable_decompression_mem: break when ZSTD_seekable_decompress() returns zero

* seekable_format: cap the offset+len up to the last dOffset

This will allow to read the whole file w/o gotting corruption error if
the offset is more then the data left in file, i.e.:

    $ ./seekable_compression seekable_compression.c 8192 | head
    $ zstd -cdq seekable_compression.c.zst | wc -c
    4737

Before this patch:

    $ ./seekable_decompression seekable_compression.c.zst 0 10000000 | wc -c
    ZSTD_seekable_decompress() error : Corrupted block detected
    0

After:

    $ ./seekable_decompression seekable_compression.c.zst 0 10000000 | wc -c
    4737
2021-05-05 10:05:41 -04:00
Yann Collet
c077f257b4
Merge pull request #2611 from facebook/smallerJobs
allow jobSize to be as low as 512 KB
2021-05-05 00:03:29 -07:00
Nick Terrell
8389a5122b
Merge pull request #2602 from terrelln/ldm-opt
[LDM] Speed optimization on repetitive data
2021-05-04 23:13:09 -07:00
Nick Terrell
d40f55cd95
Merge pull request #2610 from senhuang42/lazy_underflow_fix
Fix bad integer wraparound in repcode index for fast, dfast, lazy
2021-05-04 23:10:23 -07:00
Nick Terrell
10e5513113
Merge pull request #2607 from terrelln/deterministic-dict
[lib] Always load the dictionary in one go
2021-05-04 22:48:48 -07:00