facebook/zstd - zstd - Final Minetest

Author	SHA1	Message	Date
Yann Collet	cc7d23bcec	Merge pull request #2965 from facebook/offbase Converge sumtype (offset | repcode) numeric representation towards offBase	2022-01-24 15:47:42 -08:00
Yann Collet	ca0135c2fd	new Formulation presumes faster	2022-01-07 14:37:53 -08:00
Yann Collet	9e1b4828e5	enforce a minimum price of 1 bit per literal in the optimal parser	2022-01-07 13:53:48 -08:00
Nick Terrell	4d8a2132d0	[opt] Fix oss-fuzz bug in optimal parser oss-fuzz uncovered a scenario where we're evaluating the cost of litLength = 131072, which can't be represented in the zstd format, so we accessed 1 beyond LL_bits. Fix the issue by making it cost 1 bit more than litLength = 131071. There are still follow ups: 1. This happened because literals_cost[0] = 0, so the optimal parser chose 36 literals over a match. Should we bound literals_cost[literal] > 0, unless the block truly only has one literal value? 2. When no matches are found, the cost model isn't updated. In this case no matches were found for an entire block. So the literals cost model wasn't updated at all. That made the optimal parser think literals_cost[0] = 0, where it is actually quite high, since the block was entirely random noise. Credit to OSS-Fuzz.	2022-01-06 16:10:18 -08:00
Yann Collet	7a18d709ae	updated all names to offBase convention	2021-12-29 17:30:43 -08:00
Yann Collet	e909fa627f	abstracted storeSeq() sumtype numeric representation from zstd_opt.c	2021-12-28 12:14:33 -08:00
Yann Collet	6fa640ef70	separate newRep() from updateRep() the new contracts seem to make more sense: updateRep() updates an array of repeat offsets _in place_, while newRep() generates a new structure with the updated repeat-offset array. Most callers are actually expecting the in-place variant, and a limited sub-section, mainly in `zstd_opt.c`, prefers `newRep()`.	2021-12-28 11:52:33 -08:00
Yann Collet	b7630a474b	abstracted usage of offBase sumtype within zstd_lazy.c	2021-12-28 10:59:47 -08:00
Yann Collet	435f5a2e6d	fixed regression test assert optLdm->offset might be == 0 in invalid case. Only use STORE_OFFSET() after validating it's a correct case.	2021-12-28 09:55:31 -08:00
Yann Collet	1aed962216	introduce macros STORE_OFFSET() and STORE_REPCODE() this is meant to abstract the sumtype representation required to transfer `offcode` to `ZSTD_storeSeq()`. Unfortunately, the sumtype numeric representation is currently a leaky abstraction that has permeated many other parts of the code, especially within `zstd_lazy.c` and also within `zstd_opt.c` and `zstd_compress.c`. While this PR does a good job of transferring a large number of call sites to the new macros, there are still a few sites where this transformation is more complex, or where the numeric representation itself is used "as is". One of the problematic areas is the decision to use the numeric format of the sumtype within the match finders of `zstd_lazy`. This commit doesn't change the behavior: it only introduces and employs the macros, and the resulting code remains identical. Ultimately, if the numeric representation of the sumtype can be completely abstracted and no other part of the code depends on it, it will be possible to move it towards something slightly more efficient.	2021-12-23 22:03:30 -08:00
Yann Collet	b77fcac61f	change ZSTD_storeSeq() interface to accept matchLength instead of mlBase. This removes the need to do `- MINMATCH` at every call site. The new interface contract is checked with an `assert()`.	2021-12-23 12:03:33 -08:00
Nick Terrell	e5bfaeede7	Improve zstd_opt build speed and size Use the same trick as we did for zstd_lazy in PR #2828: * Create one search function specialization for each (dictMode, mls). * Select the search function pointer at the top of the match finder. Additionally, we no longer inline `ZSTD_compressBlock_opt_generic` into every function, since `dictMode` is no longer used as a template. Create two specializations, for opt levels 0 and 2, and call one of the two specializations. Lastly, remove the hack that disabled inlining for zstd_opt for the Linux Kernel, as we've gotten most of the benefit already. Compilation time sees a ~4x reduction: \| Compiler \| Flags \| Dev Time (s) \| PR Time (s) \| Delta \| \|----------\|----------------------------------\|--------------\|-------------\|-------\| \| gcc \| -O3 \| 10.1 \| 2.3 \| -77% \| \| gcc \| -O3 -fsanitize=address,undefined \| 61.1 \| 10.2 \| -83% \| \| clang \| -O3 \| 9.0 \| 2.1 \| -76% \| \| clang \| -O3 -fsanitize=address,undefined \| 33.5 \| 5.1 \| -84% \| Build size is reduced by 150KB - 200KB: \| Compiler \| Dev libzstd.a Size (B) \| PR libzstd.a Size (B) \| Delta \| \|----------\|------------------------\|-----------------------\|-------\| \| gcc \| 1327476 \| 1177108 \| -11% \| \| clang \| 1378324 \| 1167780 \| -15% \| There is a <2% speed loss in all cases: \| Compiler \| Level \| Dev Speed (MB/s) \| PR Speed (MB/s) \| Delta \| \|----------\|-------\|------------------\|-----------------\|--------\| \| gcc \| 16 \| 4.78 \| 4.72 \| -1.25% \| \| gcc \| 17 \| 3.49 \| 3.46 \| -0.85% \| \| gcc \| 18 \| 2.92 \| 2.86 \| -2.04% \| \| gcc \| 19 \| 2.61 \| 2.61 \| 0.00% \| \| clang \| 16 \| 4.69 \| 4.80 \| 2.34% \| \| clang \| 17 \| 3.53 \| 3.49 \| -1.13% \| \| clang \| 18 \| 2.86 \| 2.85 \| -0.34% \| \| clang \| 19 \| 2.61 \| 2.61 \| 0.00% \| Fixes Issue #2862.	2021-12-02 14:19:41 -08:00
Nick Terrell	19eb459da3	[linux-kernel] Don't inline function in zstd_opt.c The optimal parser is unlikely to be used in the linux kernel in practice. There is no reason these functions should be force inlined, since we aren't gaining anything, and are losing build size. \| Compiler \| Before (Bytes) \| After (Bytes) \| Delta (Bytes) \| \|----------\|----------------\|---------------\|---------------\| \| gcc-11 \| 1142090 \| 952754 \| -189336 \| \| clang-12 \| 1228402 \| 976290 \| -252112 \| This is a temporary solution pending the resolution of PR #2862 in the `dev` branch.	2021-11-15 20:37:30 -08:00
Nick Terrell	c6c482fe07	[binary-tree] Fix underflow of nbCompares Fix underflow of `nbCompares` by switching to an `int` and comparing `nbCompares > 0`. This is a minimal fix, because I don't want to change the logic. These loops seem to be doing `nbCompares + 1` comparisons. The bug was reported by Dan Carpenter and found by Smatch static checker. https://lore.kernel.org/all/20211008063704.GA5370@kili/	2021-10-08 13:22:55 -07:00
Yann Collet	fa2a4d77c7	constify MatchState* parameter when possible turns out, it's possible to constify MatchState* parameter in some parts of the binary tree algorithm, making it a pure read-only parameter, as opposed to a mutable state. This is supposed to be helpful for both maintenance and the compiler.	2021-09-23 08:27:44 -07:00
senhuang42	b5c35d7ea3	Use new paramSwitch enum for LCM, row matchfinder, and block splitter	2021-09-21 14:22:02 -04:00
sen	9d2a45a705	Merge pull request #2778 from senhuang42/opt_inlining_revert Revert opt outlining change	2021-09-15 14:22:10 -04:00
Sen Huang	bd84e4a9d3	Revert opt outlining change	2021-09-15 09:08:41 -07:00
Yann Collet	b7f46ebc23	use ZSTD_memcpy() for better portability notably within kernel space	2021-09-08 14:45:53 -07:00
Yann Collet	7fce9a41b5	change update rate to 12/11/11/11 better for large files, and sources with relatively "stable" entropy, like silesia.tar. slightly worse for files with rapidly changing entropy, like Calgary.tar. Updated small files tests in fuzzer	2021-09-08 14:05:57 -07:00
Yann Collet	ef78611c26	change update rate to 11/10/10/10 better for larger blocks, very small inefficiency on small block.	2021-09-08 08:58:28 -07:00
Yann Collet	42a3ed752a	removed frequency booster for stat initialization of btultra2 used to be necessary to counter-balance the fixed-weight frequency update which has been recently changed for an adaptive rate (targeting stable starting frequency stats).	2021-09-08 07:56:43 -07:00
Yann Collet	08ceda3dfc	new statistics update policy small general compression ratio improvement for btopt+ strategies.	2021-09-04 00:52:44 -07:00
Yann Collet	23a9368c45	new starting offcode table for zstd_opt	2021-09-03 17:41:42 -07:00
Yann Collet	27a8bbe265	new initializer for ll price	2021-09-03 16:07:31 -07:00
Yann Collet	f0fc8cb3e1	Disable console notification by default within the library As a library, the default shouldn't be to write anything on console. `cover` and `fastcover` have a `g_displayLevel` variable to control this behavior. It's now set to 0 (no display) by default. Setting notification to a higher level should be an explicit operation by a console application.	2021-09-03 13:44:07 -07:00
Yann Collet	eab692211e	removed pretty-print of sizes in benchmark This is less appropriate for this mode : benchmark is about accuracy, it's important to read the exact values.	2021-09-03 12:51:02 -07:00
Sen Huang	d88c1d95ce	Remove inlining for opt	2021-09-01 16:58:57 -04:00
Nick Terrell	46f2710562	[HUF] Improve Huffman encoding speed Improve Huffman encoding speed by 20% for gcc and 10% for clang. \| Compiler \| Benchmark \| Config \| Dataset \| Ratio \| Speed MB/s (dev) \| Speed MB/s (huf-cspeed) \| Speed MB/s (huf-cspeed - dev) \| \|----------\|-------------------\|---------\|-------------\|-------\|------------------\|-------------------------\|-------------------------------\| \| gcc \| compress \| level_1 \| enwik7 \| 2.43 \| 253.70 \| 258.72 \| 2.0% \| \| gcc \| compress \| level_1 \| silesia \| 2.88 \| 341.90 \| 348.15 \| 1.8% \| \| gcc \| compress_literals \| level_1 \| enwik7 \| 1.49 \| 761.83 \| 912.76 \| 19.8% \| \| gcc \| compress_literals \| level_1 \| silesia \| 1.28 \| 754.83 \| 902.37 \| 19.5% \| \| gcc \| compress_literals \| level_7 \| enwik7 \| 1.29 \| 502.81 \| 552.79 \| 9.9% \| \| gcc \| compress_literals \| level_7 \| silesia \| 1.11 \| 675.97 \| 776.44 \| 14.9% \| \| clang \| compress \| level_1 \| enwik7 \| 2.43 \| 277.54 \| 280.98 \| 1.2% \| \| clang \| compress \| level_1 \| silesia \| 2.88 \| 369.98 \| 375.46 \| 1.5% \| \| clang \| compress_literals \| level_1 \| enwik7 \| 1.49 \| 828.83 \| 918.41 \| 10.8% \| \| clang \| compress_literals \| level_1 \| silesia \| 1.28 \| 815.81 \| 905.41 \| 11.0% \| \| clang \| compress_literals \| level_7 \| enwik7 \| 1.29 \| 533.13 \| 553.30 \| 3.8% \| \| clang \| compress_literals \| level_7 \| silesia \| 1.11 \| 714.52 \| 775.38 \| 8.5% \|	2021-07-27 15:10:35 -07:00
Nick Terrell	91c9a247b6	[lib] Fix determinism bug in the optimal parser `ZSTD_insertBt1()` has a speed optimization that skips the prefix of very long matches. `40def70387/lib/compress/zstd_opt.c (L476)` This optimization is based on the length of the longest match found. However, when indices are reset, we only ensure that we can reference the whole window starting from `ip`. If the previous block ended with a long match then `nextToUpdate` could be much less than `ip`. It might be far enough back that `nextToUpdate < maxDist`, so it doesn't have a full window of data to reference. This can cause non-determinism bugs, because we may find a match that is beyond `ip - maxDist`, and may sometimes be un-referencable, and that match triggers the speed optimization. The fix is to base the `windowLow` off of the `target` of `ZSTD_updateTree_internal()`, because anything below that value will be obsolete by the time `ZSTD_updateTree_internal()` completes.	2021-05-13 17:05:59 -07:00
Nick Terrell	a494308ae9	[copyright][license] Switch to yearless copyright and some cleanup in the linux-kernel files * Switch to yearless copyright per FB policy * Fix up SPDX-License-Identifier lines in `contrib/linux-kernel` sources * Add zstd copyright/license header to the `contrib/linux-kernel` sources * Update the `tests/test-license.py` to check for yearless copyright * Improvements to `tests/test-license.py` * Check `contrib/linux-kernel` in `tests/test-license.py`	2021-03-30 10:30:43 -07:00
Nick Terrell	66e811d782	[license] Update year to 2021	2021-01-04 17:53:52 -05:00
senhuang42	d6911b86be	Require LDM matches to be strictly greater in length	2020-10-09 12:56:18 -04:00
senhuang42	b9c8033cde	Define kNullRawSeqStore for every file	2020-10-07 19:02:41 -04:00
senhuang42	a6165c1b28	Change matchState_t::ldmSeqStore to pointer	2020-10-07 14:13:57 -04:00
senhuang42	abce708a56	Move posInSequence correction to correct location	2020-10-07 13:56:25 -04:00
senhuang42	0c515590d8	Replace offCode of largest match if ldm's offCode is superior	2020-10-07 13:56:25 -04:00
senhuang42	0fac8e07e1	Refactor usage of ms->ldmSeqStore so that it is not modified during compressBlock(), and simplify skipRawSeqStoreBytes	2020-10-07 13:56:25 -04:00
senhuang42	a5500cf2af	Refactor separate ldm variables all into one struct	2020-10-07 13:56:25 -04:00
senhuang42	0325d878f2	Remove bubbling down matches with longer offCode and same matchLen	2020-10-07 13:56:25 -04:00
senhuang42	ddf8a3f1b9	Enable inclusion of mid-flight LDMs in opt parser	2020-10-07 13:56:25 -04:00
senhuang42	88f72ed942	Correct incorrect offcode calculation	2020-10-07 13:56:25 -04:00
senhuang42	d8b43a4202	Add explicit conversion of size_t to U32	2020-10-07 13:56:25 -04:00
senhuang42	b8bfc4e63d	Add cSize regression test to fuzzer.c	2020-10-07 13:56:25 -04:00
senhuang42	c87d2e5866	Prefix new static ldm helpers with ZSTD_opt	2020-10-07 13:56:25 -04:00
senhuang42	429dec4f42	Add DEBUGLOG() calls in ldm helpers	2020-10-07 13:56:25 -04:00
senhuang42	10647924f1	Make function descriptions more accurate	2020-10-07 13:56:25 -04:00
senhuang42	37617e23d7	Correct matchLength calculation and remove unnecessary functions	2020-10-07 13:56:25 -04:00
senhuang42	7dee62c287	Reset ldmSeqStore after initStats_ultra() pass for btultra2	2020-10-07 13:56:25 -04:00
senhuang42	0718aa70df	Refactor existing functions to use posInSequence	2020-10-07 13:56:25 -04:00
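
The interface change described in commit b77fcac61f (`ZSTD_storeSeq()` accepting `matchLength` instead of `mlBase`) can be sketched roughly as follows. The `demo_seq` type, `demo_storeSeq()` function and `DEMO_MINMATCH` constant are hypothetical stand-ins, not zstd's actual definitions, and the value 3 is assumed for illustration. Only the idea of doing the `- MINMATCH` subtraction once inside the callee, guarded by an `assert()`, comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MINMATCH 3   /* minimum match length, assumed for this sketch */

/* Hypothetical sequence record: keeps the match length minus the minimum. */
typedef struct {
    size_t litLength;
    size_t mlBase;
} demo_seq;

/* Callers pass the real match length; the subtraction happens once, here,
 * instead of at every call site. */
static void demo_storeSeq(demo_seq *seq, size_t litLength, size_t matchLength)
{
    assert(matchLength >= DEMO_MINMATCH);   /* interface contract check */
    seq->litLength = litLength;
    seq->mlBase = matchLength - DEMO_MINMATCH;
}
```

With this shape, a call such as `demo_storeSeq(&s, 5, 7)` records `mlBase == 4` without the caller ever writing the subtraction.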

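The two contracts described in commit 6fa640ef70 (`updateRep()` working in place, `newRep()` returning a fresh structure) might look like the sketch below. This is a simplified illustration under stated assumptions: the names `demo_repcodes`, `demo_updateRep()` and `demo_newRep()` are hypothetical, and only the case of a plain new offset is handled. Repcode matches, which the real functions must also deal with, are left out.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical container for three repeat offsets. */
typedef struct { unsigned rep[3]; } demo_repcodes;

/* In-place variant: pushes a new offset to the front of the caller's array
 * and drops the oldest one. */
static void demo_updateRep(unsigned rep[3], unsigned offset)
{
    rep[2] = rep[1];
    rep[1] = rep[0];
    rep[0] = offset;
}

/* Value-returning variant: leaves the input untouched and hands back an
 * updated copy. */
static demo_repcodes demo_newRep(const unsigned rep[3], unsigned offset)
{
    demo_repcodes out;
    memcpy(out.rep, rep, sizeof(out.rep));
    demo_updateRep(out.rep, offset);
    return out;
}
```

The copy-returning form suits code that evaluates several candidate paths from the same starting history, while the in-place form avoids a copy when only one history is tracked.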

209 Commits
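
The kind of counter underflow mentioned in commit c6c482fe07 (`nbCompares`) can be reproduced in isolation. The sketch below is not zstd code: the function names and loop shapes are hypothetical, chosen only to show why an unsigned budget that is post-decremented in a loop condition wraps around on exit, and why a signed counter tested with `> 0` does not.

```c
#include <assert.h>
#include <limits.h>

/* Unsigned budget: the final, failing test still decrements,
 * so the counter wraps from 0 to UINT_MAX. */
static unsigned demo_drain_unsigned(unsigned budget)
{
    while (budget--) {
        /* one comparison would happen here */
    }
    return budget;
}

/* Signed budget tested with "> 0": the counter ends at -1, so any later
 * "budget > 0" check correctly reports that nothing is left. */
static int demo_drain_signed(int budget)
{
    while (budget-- > 0) {
        /* one comparison would happen here */
    }
    return budget;
}
```

If the leftover unsigned value were reused as the budget of a second loop, that loop would see a huge budget instead of an exhausted one, whereas the signed leftover fails a `> 0` check immediately.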