As a library, the default shouldn't be to write anything on console.
`cover` and `fastcover` have a `g_displayLevel` variable to control this behavior.
It's now set to 0 (no display) by default.
Setting notification to a higher level should be an explicit operation by a console application.
This new setup is slighly better on `silesia.tar` :
Ratio : 3.649 -> 3.655
Speed : 11.9 MB/s -> 12.2 MB/s
At the cost of more memory : 24 MB -> 32 MB
The new memory budget is a reasonable interpolation between neighboring levels 12 and 14:
level 12 : 24 MB
level 13 : 32 MB (increased from 24 MB)
level 14 : 48 MB
Window size remains unaffected (4 MB)
This removes the old `ZSTD_compressBlock_fast_generic()` and renames the new
`ZSTD_compressBlock_fast_generic_pipelined()` to replace it. This is
functionally a no-op.
Unrolling the loop to handle 2 positions in each iteration allows us to reduce
the frequency of some operations that don't need to happen at every position.
One such operation is the step calculation, which is a very rough heuristic
anyways. It's fine if we do this a position later. The other operation is the
repcode check. But since the repcode check already tries expanding back one
position, we're really not missing much of importance by only trying it every
other position.
This commit also slightly reorders some operations.
Amusingly, it seems to be a non-trivial performance hit to add in final
searches or even hash table insertions during cleanup. So let's not. It seems
to not make any meaningful difference in compression ratio.
* Add a Huffman round trip fuzzer
* Fix two minor bugs in Huffman that aren't exposed in zstd
- Incorrect weight comparison (weights are allowed to be equal to
table log).
- HUF_compress1X_usingCTable_internal() can return compressed
size >= source size, so the assert that `cSize <= 65535` isn't
correct, and it needs to be checked instead.
This PR fixes an incorrect comparison in figuring out `minChain` in
`ZSTD_dedicatedDictSearch_lazy_loadDictionary()`. This incorrect comparison
had been masked by the fact that `idx` was always 1, until @terrelln changed
that in #2726.
Credit-to: OSS-Fuzz
The DUBT can be non-deterministic if an index is equal to
`ZSTD_DUBT_UNSORTED_MARK`. Ensure that never happens by starting the
indices at 2.
This bug was found by the OSS-Fuzz determinism fuzzer. With this change
the fuzzer test passes. And I've confirmed that this is the root cause,
not just hiding the problem.
Aside: This took me a long time to figure out, because I thought I had
tried this first thing. But, apparantly I messed it up, because when I
was going through it again with @felixhandte, I was pointing out that it
wasn't the case, but it turns out it was.
Credit to: OSS-Fuzz
* The block splitter missed a bounds check, so when the buffer is too small it
passes an erroneously large size to `ZSTD_entropyCompressSeqStore()`, which
can then write the compressed data past the end of the buffer. This is a new
regression in v1.5.0 when the block splitter is enabled. It is either enabled
explicitly, or implicitly when using the optimal parser and `ZSTD_compress2()`
or `ZSTD_compressStream*()`.
* `HUF_writeCTable_wksp()` omits a bounds check when calling
`HUF_compressWeights()`. If it is called with `dstCapacity == 0` it will pass
an erroneously large size to `HUF_compressWeights()`, which can then write
past the end of the buffer. This bug has been present for ages. However, I
believe that zstd cannot trigger the bug, because it never calls
`HUF_compress*()` with `dstCapacity == 0` because of [this check][1].
Credit to: Oss-Fuzz
[1]: 89127e5ee2/lib/compress/zstd_compress_literals.c (L100)
* Flatten ZSTD_row_getMatchMask
* Remove the SIMD abstraction layer.
* Add big endian support.
* Align `hashTags` within `tagRow` to a 16-byte boundary.
* Switch SSE2 to use aligned reads.
* Optimize scalar path using SWAR.
* Optimize neon path for `n == 32`
* Work around minor clang issue for NEON (https://bugs.llvm.org/show_bug.cgi?id=49577)
* replace memcpy with MEM_readST
* silence alignment warnings
* fix neon casts
* Update zstd_lazy.c
* unify simd preprocessor detection (#3)
* remove duplicate asserts
* tweak rotates
* improve endian detection
* add cast
there is a fun little catch-22 with gcc: result from pmovmskb has to be cast to uint32_t to avoid a zero-extension
but must be uint16_t to get gcc to generate a rotate instruction..
* more casts
* fix casts
better work-around for the (bogus) warning: unary minus on unsigned