8771 Commits (0d058469526a07febed7026c90f5212f601bb963)

Author SHA1 Message Date
Yann Collet 0d05846952
Merge pull request #2626 from facebook/codingStyle1
added a paragraph on coding style
2021-05-06 19:46:05 -07:00
Nick Terrell f36fbddbfa
Merge pull request #2625 from terrelln/ubsan-failure
[lib] Fix UBSAN warning in ZSTD_decompressSequences()
2021-05-06 19:22:25 -07:00
Yann Collet f44c720fa8 added a paragraph on coding style 2021-05-06 18:40:25 -07:00
Nick Terrell b052b583e5 [lib] Fix UBSAN warning in ZSTD_decompressSequences() 2021-05-06 15:31:30 -07:00
sen 698f261b35
[1.5.0] Deprecate some functions (#2582)
* Add deprecated macro to zstd.h, mark certain functions as deprecated

* Remove ZSTD_compress.c dependencies on deprecated functions
2021-05-06 17:59:32 -04:00
Nick Terrell e7e4b74b2b
Merge pull request #2600 from nickhutchinson/clang-cl
Fix for excessive compiler warnings when building with clang-cl
2021-05-06 12:43:15 -07:00
Nick Terrell 2b82948e58
Merge pull request #2622 from terrelln/zdict-api
[zdict] Add a FAQ to the top of zdict.h
2021-05-06 12:42:56 -07:00
Nick Terrell 1874f0844d [zdict] Add a FAQ to the top of zdict.h
The FAQ covers the questions asked in Issue #2566. It first covers why
you would want to use a dictionary, then what a dictionary is, and
finally it tells you how to train a dictionary, and clarifies some of
the parameters.

There is definitely more that could be said about some of the advanced
trainers, but this should be a good start.
2021-05-06 12:48:19 -07:00
sen 6030cdfede
Add --progress flag (#2595) 2021-05-06 14:50:28 -04:00
Yann Collet 2f7bbd6539
Merge pull request #2620 from facebook/winFilelist
fix --filelist compatibility with Windows cr+lf line ending
2021-05-06 11:35:16 -07:00
Felix Handte 909925785a
Merge pull request #2618 from felixhandte/single-file-build-mv
Move Single-File Build Script from `contrib/` to `build/`
2021-05-06 14:09:42 -04:00
Nick Terrell 207e33bb61
Merge pull request #2616 from terrelln/deterministic-dict
[lib] Add ZSTD_c_deterministicRefPrefix
2021-05-06 11:09:22 -07:00
Nick Hutchinson 2d34062836 CMake: fix excessive build warnings when building with clang-cl 2021-05-06 18:46:37 +01:00
Nick Terrell fc8330b885
Merge pull request #2621 from terrelln/regression-test
[test][regression] Update results.csv
2021-05-06 10:32:07 -07:00
sen d6be7659b0
Add seekable roundtrip fuzzer (#2617) 2021-05-06 10:08:21 -04:00
Yann Collet 26c7b0038e
Merge pull request #2619 from facebook/winbench
improved benchmark experience on Windows
2021-05-05 20:34:31 -07:00
Nick Terrell d2925de98a
Merge pull request #2615 from terrelln/stack-space
[lib] Move some ZSTD_CCtx_params off the stack
2021-05-05 19:43:39 -07:00
Nick Terrell ce615d7fba [test][regression] Update results.csv
The LDM change in PR #2602 changed the algorithm slightly.
The effect on compressed size is generally an improvement, and when it is
worse, it is worse by only a few bytes.
2021-05-05 19:00:36 -07:00
Nick Terrell 172b4b6ac4 [lib] Add ZSTD_c_deterministicRefPrefix
This flag forces zstd to always load the prefix in ext-dict mode, even
if it happens to be contiguous, to force determinism. It also applies to
dictionaries that are re-processed.

A determinism test case is also added, which fails without
`ZSTD_c_deterministicRefPrefix` and passes with it set.

Question: Should this be the default behavior? It isn't in this PR.
2021-05-05 18:49:56 -07:00
Yann Collet df05b2ba7c fix --filelist compatibility with Windows cr+lf line ending 2021-05-05 18:01:55 -07:00
Yann Collet fed8589430
Merge pull request #2614 from facebook/dlong8
faster speed for decompressSequencesLong
2021-05-05 16:55:40 -07:00
Yann Collet 9750f3c87b improved benchmark experience on Windows
Benchmark results are not displayed progressively in the Windows terminal.
For long benchmark sessions, nothing is displayed
until the end, when everything is flushed.

Force the display to be flushed after each update.
Updates happen roughly every second, or even less,
so it's not a substantial workload.
2021-05-05 16:52:21 -07:00
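The flush-after-each-update idea from commit 9750f3c87b can be illustrated with a minimal sketch. This is not the code from the commit: `display_progress()` is a hypothetical helper, and the only assumption is that the output stream may be fully buffered, so text stays in the buffer until `fflush()` is called.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not zstd's actual code): print one progress update
 * and flush it immediately so the terminal shows it right away.
 * Returns the number of characters written, or -1 on error. */
int display_progress(FILE* out, int done, int total)
{
    int written;
    assert(out != NULL);
    /* '\r' moves back to the start of the line, so each update
     * overwrites the previous one. */
    written = fprintf(out, "\r%3d / %3d", done, total);
    if (written < 0) return -1;
    /* Without this flush, a fully buffered stream may hold the text
     * until the buffer fills or the program exits. */
    if (fflush(out) != 0) return -1;
    return written;
}
```

Calling `display_progress(stderr, i, n)` about once per second costs one small write plus one flush per update, which matches the commit's point that the extra workload is not substantial.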
W. Felix Handte 4ba49af665 Rewrite References to Location 2021-05-05 18:03:48 -04:00
Felix Handte b062d97520
Merge pull request #2525 from felixhandte/fix-file-permissions-again
Improve Setting Permissions of Created Files
2021-05-05 17:59:13 -04:00
Nick Terrell eb7e74ccb7 [tests] Set `DEBUGLEVEL=2` by default
This allows us to quickly check for compile errors in debug log
messages, which are compiled out when `DEBUGLEVEL < 2`.
2021-05-05 13:29:06 -07:00
Nick Terrell c2183d7cdf [lib] Move some ZSTD_CCtx_params off the stack
* Take `params` by const reference in `ZSTD_resetCCtx_internal()`.
* Add `simpleApiParams` to the CCtx and use them in the simple API
  functions, instead of creating those parameters on the stack.

I think this is a good direction to move in, because we shouldn't need
to worry about adding parameters to `ZSTD_CCtx_params`, since it should
always be on the heap (unless they become absolutely gigantic).

Some `ZSTD_CCtx_params` are still on the stack in the CDict functions,
but I've left them for now, because it was a little more complex, and we
don't currently use those functions in stack-constrained environments.
2021-05-05 13:25:16 -07:00
W. Felix Handte 1d65917323 Move Single-File Build Script from `contrib/` to `build/` 2021-05-05 16:07:51 -04:00
W. Felix Handte 4f9c6fdb7f Attempt to Fix Windows Build Error 2021-05-05 13:13:56 -04:00
W. Felix Handte da61918c75 Also Pass Mode Bits in on Windows
I think in some Unix emulation environments on Windows (Cygwin?), mode bits
are somehow respected. So we might as well pass them in. Can't hurt.
2021-05-05 13:10:34 -04:00
W. Felix Handte bea1b2ba70 `rm -f` in playTests.sh 2021-05-05 13:10:34 -04:00
W. Felix Handte 45c4918ccf Fix Build for Windows 2021-05-05 13:10:34 -04:00
W. Felix Handte 018ed6552a Attempt to Fix `stat` Format for BSDs 2021-05-05 13:10:34 -04:00
W. Felix Handte 1fb10ba831 Don't Block Removing File on Being Able to Read It
`open()`'s mode bits are only applied to files that are created by the call.
If the output file already exists, but is not readable, the `fopen()` would
fail, preventing us from removing it, which would mean that the file would
not end up with the correct permission bits.

It's not clear to me why the `fopen()` is there at all. `UTIL_isRegularFile()`
should be sufficient, AFAICT.
2021-05-05 13:10:34 -04:00
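The behavior described in commit 1fb10ba831 (mode bits passed to `open()` only take effect when the call creates the file) can be sketched as follows. This is an illustration for POSIX systems, not the code from the commit; `create_with_mode()` and `permission_bits()` are hypothetical helpers. The sketch removes any existing file without opening it first, then creates it with `O_EXCL` so the requested mode is applied.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: remove any existing file at `path` (no read access
 * to the file is needed for this), then create it so that `mode` applies.
 * Returns a file descriptor, or -1 on error. */
int create_with_mode(const char* path, mode_t mode)
{
    assert(path != NULL);
    /* Ignore the result: the file may simply not exist yet. */
    (void)unlink(path);
    /* O_EXCL makes open() fail if the file already exists, so on success
     * this call created the file and `mode` (minus the umask) was used. */
    return open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
}

/* Hypothetical helper: return the permission bits of `path`, or -1. */
int permission_bits(const char* path)
{
    struct stat st;
    if (stat(path, &st) != 0) return -1;
    return (int)(st.st_mode & 0777);
}
```

If the `unlink()` step were skipped and the file were opened with plain `O_CREAT`, an existing file would keep whatever permission bits it already had, which is the situation the commit message describes.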
W. Felix Handte b87f97b3ea Create Files with Desired Permissions; Avoid chmod(); Remove UTIL_chmod() 2021-05-05 13:10:34 -04:00
W. Felix Handte 4e10ff15f5 Add Tests Checking File Permissions of Created Files 2021-05-05 13:10:34 -04:00
Felix Handte 2d10544b84
Merge pull request #2613 from felixhandte/allow-block-device
Allow Reading from Block Devices with `--force`
2021-05-05 13:06:32 -04:00
Yann Collet 7ef6d7b36c deeper prefetching pipeline for decompressSequencesLong
pipeline increased from 4 to 8 slots.
This change substantially improves decompression speed when there are long distance offsets.
example with enwik9 compressed at level 22 :
gcc-9 : 947 -> 1039 MB/s
clang-10: 884 -> 946 MB/s

I also checked the "cold dictionary" scenario,
and found a smaller benefit, around 2%
(measurements are noisier for this scenario).
2021-05-05 10:04:03 -07:00
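The general technique behind commit 7ef6d7b36c, issuing prefetches a fixed number of slots ahead of the item being processed, can be sketched on a toy problem. This is not zstd's decoder: `gather_sum()`, `PIPELINE_SLOTS`, and the `PREFETCH` macro are hypothetical names, and the sketch assumes a GCC- or Clang-compatible compiler for `__builtin_prefetch` (elsewhere the prefetch becomes a no-op).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Look-ahead depth; 8 mirrors the slot count named in the commit,
 * but the right value for other workloads is an assumption. */
#define PIPELINE_SLOTS 8

#if defined(__GNUC__) || defined(__clang__)
#  define PREFETCH(p) __builtin_prefetch((p))
#else
#  define PREFETCH(p) ((void)(p))   /* no-op fallback */
#endif

/* Sum table[idx[i]] for i in [0, n), requesting each table entry
 * PIPELINE_SLOTS iterations before it is actually read. */
uint64_t gather_sum(const uint32_t* table, const size_t* idx, size_t n)
{
    uint64_t sum = 0;
    size_t i;
    size_t const warm = (n < PIPELINE_SLOTS) ? n : PIPELINE_SLOTS;
    assert(n == 0 || (table != NULL && idx != NULL));
    /* Fill the pipeline: prefetch the first few targets up front. */
    for (i = 0; i < warm; i++) PREFETCH(&table[idx[i]]);
    for (i = 0; i < n; i++) {
        /* Keep the pipeline full by prefetching one more target. */
        if (i + PIPELINE_SLOTS < n) PREFETCH(&table[idx[i + PIPELINE_SLOTS]]);
        sum += table[idx[i]];
    }
    return sum;
}
```

A deeper pipeline gives each memory request more time to complete before its data is needed, which helps most when the accessed locations are far apart, as with long-distance offsets.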
Yann Collet 455fd1a067 updated documentation regarding minimum job size 2021-05-05 09:03:11 -07:00
Azat Khuzhin 53a60e98de
seekable decompression fixes (#2594)
* seekable_format: fix from-file reading (not in-memory)

It tries to check the buffer boundary, but there is no buffer for
from-file reading.

* seekable_decompression: break when ZSTD_seekable_decompress() returns zero

* seekable_decompression_mem: break when ZSTD_seekable_decompress() returns zero

* seekable_format: cap the offset+len up to the last dOffset

This allows reading the whole file without getting a corruption error if
the requested range extends beyond the data left in the file, e.g.:

    $ ./seekable_compression seekable_compression.c 8192 | head
    $ zstd -cdq seekable_compression.c.zst | wc -c
    4737

Before this patch:

    $ ./seekable_decompression seekable_compression.c.zst 0 10000000 | wc -c
    ZSTD_seekable_decompress() error : Corrupted block detected
    0

After:

    $ ./seekable_decompression seekable_compression.c.zst 0 10000000 | wc -c
    4737
2021-05-05 10:05:41 -04:00
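The capping of offset+len described in commit 53a60e98de can be sketched as a small clamp. This is an illustration only, not the seekable format's code: `clamp_read_len()` is a hypothetical helper, and `total` stands in for the last decompressed offset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return how many bytes can actually be read when
 * `len` bytes are requested at `offset` from `total` bytes of data. */
size_t clamp_read_len(unsigned long long offset, size_t len,
                      unsigned long long total)
{
    unsigned long long remaining;
    size_t result;
    /* Nothing is left to read at or past the end of the data. */
    if (offset >= total) return 0;
    remaining = total - offset;
    /* Cap the request so that offset + result never exceeds total. */
    result = (len > remaining) ? (size_t)remaining : len;
    assert(result <= len);
    return result;
}
```

With the numbers from the commit message, `clamp_read_len(0, 10000000, 4737)` yields 4737, so an oversized request returns the whole file instead of an error.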
Yann Collet c077f257b4
Merge pull request #2611 from facebook/smallerJobs
allow jobSize to be as low as 512 KB
2021-05-05 00:03:29 -07:00
Nick Terrell 8389a5122b
Merge pull request #2602 from terrelln/ldm-opt
[LDM] Speed optimization on repetitive data
2021-05-04 23:13:09 -07:00
Nick Terrell d40f55cd95
Merge pull request #2610 from senhuang42/lazy_underflow_fix
Fix bad integer wraparound in repcode index for fast, dfast, lazy
2021-05-04 23:10:23 -07:00
Nick Terrell 10e5513113
Merge pull request #2607 from terrelln/deterministic-dict
[lib] Always load the dictionary in one go
2021-05-04 22:48:48 -07:00
Nick Terrell 0b88c2582c [test] Add large dict/data --patch-from test
Dictionary size must be > `ZSTD_CHUNKSIZE_MAX`.
2021-05-04 17:31:32 -07:00
Sen Huang e6c8a5dd40 Fix incorrect usages of repIndex across all strategies 2021-05-04 19:50:55 -04:00
Nick Terrell 94db4398a0 [lib] Always load the dictionary in one go
Dictionaries larger than `ZSTD_CHUNKSIZE_MAX` used to have to be loaded
in multiple segments. Instead, when we detect large dictionaries, ensure
that we reset the context's indices. Then, for dictionaries larger than
`ZSTD_CURRENT_MAX - 1`, only load the suffix of the dictionary. Finally,
enable DDS for large dictionaries, since we no longer load in multiple
segments.

This simplifies the dictionary loading code, and reduces opportunities
for non-determinism to slip in.
2021-05-04 16:45:25 -07:00
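The "only load the suffix of the dictionary" step from commit 94db4398a0 can be sketched as below. This is not the library's implementation: `dict_view`, `dict_suffix()`, and `maxLoad` are hypothetical names, with `maxLoad` standing in for the limit the commit expresses as `ZSTD_CURRENT_MAX - 1`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view of the part of a dictionary that gets loaded. */
typedef struct {
    const unsigned char* start;
    size_t size;
} dict_view;

/* Hypothetical helper: if the dictionary is larger than `maxLoad`,
 * keep only its last `maxLoad` bytes; otherwise keep all of it. */
dict_view dict_suffix(const void* dict, size_t dictSize, size_t maxLoad)
{
    dict_view v;
    const unsigned char* const p = (const unsigned char*)dict;
    size_t const skip = (dictSize > maxLoad) ? dictSize - maxLoad : 0;
    assert(dict != NULL);
    v.start = p + skip;          /* skip the oldest bytes */
    v.size  = dictSize - skip;   /* at most maxLoad bytes remain */
    return v;
}
```

Keeping the suffix rather than the prefix retains the bytes closest to the data that follows, and the selection is a single computation instead of a multi-segment load.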
Yann Collet 1026b9fa10 fix rsyncable mode 2021-05-04 15:59:27 -07:00
W. Felix Handte e58e9c7928 Add Test Case (Behind Flag); Run in GitHub Action 2021-05-04 18:43:39 -04:00
Nick Terrell 8a8899fc08
Merge pull request #2612 from terrelln/minor-fix
[easy] Rewrite rowHashLog computation
2021-05-04 15:02:00 -07:00
W. Felix Handte 33f3e293e8 Allow Reading from Block Devices with `--force` 2021-05-04 16:25:26 -04:00