Since we're now hashing the position ahead even if we find a long match and
don't search that next position, we can write it back into the hashtable even
in long matches. This seems to cost us no speed, and improves compression
ratio slightly!
Aside from maybe a latency win in the loop, this means that when we find a
short match, we've already done the hash we need to check the next long match.
* Limit training samples size to 2GB
* simplified DISPLAYLEVEL() macro to use global vqriable instead of local.
* refactored training samples loading
* fixed compiler warning
* addressed comments from the pull request
* addressed @terrelln comments
* missed some fixes
* fixed type mismatch
* Fixed bug passing estimated number of samples rather insted of the loaded number of samples.
Changed unit conversion not to use bit-shifts.
* fixed a declaration after code
* fixed type conversion compile errors
* fixed more type castting
* fixed more type mismatching
* changed sizes type to size_t
* move type casting
* more type cast fixes
PR #2784 introduced a bug in the decompressor that caused some valid
inputs to fail to decompress. The bitstream isn't reloaded after the 4X*
loop if the number of elements remaining is small enough, causing us to
read more bits than are available in the bitcontainer.
This was caught by the MSAN fuzzer in OSS-Fuzz because the assembly
implementation isn't used in the MSAN build.
Credit to OSS-Fuzz.
Multiple ZSTD_createDCtx* functions call other (public)
ZSTD_createDCtx* functions, this makes it harder for humans
and compilers to throw out code that is not used.
This farms out the logic into a static function, if a program
only uses a single ZSTD_createDCtx variant, all others can be easily
dropped and the remaining implementation can be specialized.
Commit d7ef97a013b5
("[build] Fix oss-fuzz build with the dataflow sanitizer") broke
build inside Linux-kernel after 'import', as it no longer can
conditionally remove ZSTD_MEMORY_SANITIZER definition from
the #if DEF_A || DEF_B block. This emits -Wundef warning which
can be treated as error.
Split this preprocessor condition into two separate conditions
to fix this.
Fixes: d7ef97a013b5 ("[build] Fix oss-fuzz build with the dataflow sanitizer")
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Switch to a macro `ZSTD_FALLTHROUGH;` instead of a comment. On supported
compilers this uses an attribute, otherwise it becomes a comment.
This is necessary to be compatible with clang's `-Wfall-through`, and
gcc's `-Wfall-through=2` which don't support comments. Without this the
linux build emits a bunch of warnings.
Also add a test to CI to ensure that we don't regress.
turns out, it's possible to constify MatchState* parameter
in some parts of the binary tree algorithm,
making it a pure read-only parameter,
as opposed to a mutable state.
This is supposed to be helpful for both maintenance and the compiler.
* Extract out common portion of `lib/Makefile` into `lib/libzstd.mk`.
Most relevantly, the way we find library files.
* Use `lib/libzstd.mk` in the other Makefiles instead of repeating the
same code.
* Add a test `tests/test-variants.sh` that checks that the builds of
`make -C programs allVariants` are correct, and run it in Actions.
* Adds support for ASM files in the CMake build.
The Meson build is not updated because it lists every file in zstd,
and supports ASM off the bat, so the Huffman ASM commit will just add
the ASM file to the list.
The Visual Studios build is not updated because I'm not adding ASM
support to Visual Studios yet.
Test failures showed up on the daily cron job. They didn't show up
in CI because the condition is somewhat rare, and didn't trigger
during the CI tests.
This PR fixes up the logic in `findSynchronizationPoint()` to correctly
handle the edge case. It also un-comments an assert that helps catch the
issue, and verify that rsyncable mode is calculating the correct hash.
After the fix, the test that failed passes:
```
./zstreamtest --newapi -t1 --no-big-tests -s9680
```