facebook/zstd - zstd - Final Minetest

Author	SHA1	Message	Date
Yann Collet	03903f5701	fixed minor compression difference in btlazy2 subtle dependency on sumtype numeric representation	2021-12-29 18:51:03 -08:00
Yann Collet	7a18d709ae	updated all names to offBase convention	2021-12-29 17:30:43 -08:00
Yann Collet	8da414231d	found a few more places which were dependent on seqStore offcode sumtype numeric representation	2021-12-28 17:03:24 -08:00
Yann Collet	de9f52e945	regroup all mentions of ZSTD_REP_MOVE within zstd_compress_internal.h	2021-12-28 13:47:57 -08:00
Yann Collet	a34ccad9a6	fixed minor conversion warnings	2021-12-28 13:21:22 -08:00
Yann Collet	92a08eec72	abstracted storeSeq() sumtype numeric representation from zstd_lazy.c	2021-12-28 12:23:39 -08:00
Yann Collet	321583ccf5	fixed minor typecast warnings	2021-12-28 11:38:21 -08:00
Yann Collet	b7630a474b	abstracted usage of offBase sumtype within zstd_lazy.c	2021-12-28 10:59:47 -08:00
Yann Collet	1aed962216	introduce macros STORE_OFFSET() and STORE_REPCODE() these are meant to abstract the sumtype representation required to transfer `offcode` to `ZSTD_storeSeq()`. Unfortunately, the sumtype numeric representation is currently a leaky abstraction that has permeated many other parts of the code, especially within `zstd_lazy.c` and also within `zstd_opt.c` and `zstd_compress.c`. While this PR does a good job of transferring a large number of call sites to the new macros, there are still a few sites where this transformation is more complex, or where the numeric representation itself is used "as is". One of the problematic areas is the decision to use the numeric format of the sumtype within the match finders of `zstd_lazy`. This commit doesn't change the behavior: it only introduces and employs the macros, and the resulting code remains identical. Ultimately, if the numeric representation of the sumtype can be completely abstracted and no other part of the code depends on it, it will be possible to move it towards something slightly more efficient.	2021-12-23 22:03:30 -08:00
Yann Collet	b77fcac61f	change ZSTD_storeSeq() interface to accept matchLength instead of mlBase. This removes the need to do `- MINMATCH` at every call site. The new interface contract is checked with an `assert()`.	2021-12-23 12:03:33 -08:00
Yann Collet	c8d6067615	fixed incorrect rowlog initialization the variable has only very limited usage, being only used once at the beginning of the block for prefetching only, hence the error had no impact on compression ratio.	2021-12-15 14:37:05 -08:00
Yann Collet	05430b25a8	roll SSE implementation of row_lazy match finder mostly for maintenance convenience. Performance wise, there is very little change, slightly faster for slog 3 & 4, neutral or very slightly negative for slot 5 & 6.	2021-12-14 10:44:23 -08:00
Nick Terrell	91f5891dd0	[CircleCI] Fix short-tests-0 short-tests-0 were silently failing. I think because of the && make clean construction. Switch to ; instead. Also fix all the test failures that were exposed. `make all` is failing on CircleCI because it is missing Docker. Move that test to GitHub actions, and switch the pedantic CircleCI test to `make allmost`.	2021-12-01 17:43:46 -08:00
Yann Collet	aba88fa996	Merge pull request #2829 from facebook/ZSTD_DECODER_INTERNAL_BUFFER minor : change build macro to ZSTD_DECODER_INTERNAL_BUFFER	2021-10-26 10:48:16 -07:00
Yann Collet	518f06b281	added minimum for decoder buffer also : introduced macro BOUNDED()	2021-10-26 08:21:31 -07:00
Nick Terrell	13cad3abb1	[lazy] Speed up compilation times Speed up compilation times by moving each specialized search function into its own function. This is faster because compilers can handle many smaller functions much faster than one gigantic function. The previous approach generated one giant function with `switch` statements and inlining to select the implementation. \| Compiler \| Flags \| Dev Time (s) \| PR Time (s) \| Delta \| \|----------\|-------------------------------------\|--------------\|-------------\|-------\| \| gcc \| -O3 \| 16.5 \| 5.6 \| -66% \| \| gcc \| -O3 -g -fsanitize=address,undefined \| 158.9 \| 38.2 \| -75% \| \| clang \| -O3 \| 36.5 \| 5.5 \| -85% \| \| clang \| -O3 -g -fsanitize=address,undefined \| 27.8 \| 17.5 \| -37% \| This also reduces the binary size because the search functions are no longer inlined into the main body. \| Compiler \| Dev libzstd.a Size (B) \| PR libzstd.a Size (B) \| Delta \| \|----------\|------------------------\|-----------------------\|-------\| \| gcc \| 1563868 \| 1308844 \| -16% \| \| clang \| 1924372 \| 1376020 \| -28% \| Finally, the performance is not impacted significantly by this change, in fact we generally see a small speed boost. \| Compiler \| Level \| Dev Speed (MB/s) \| PR Speed (MB/s) \| Delta \| \|----------\|-------\|------------------\|-----------------\|-------\| \| gcc \| 5 \| 110.6 \| 110.0 \| -0.5% \| \| gcc \| 7 \| 70.4 \| 72.2 \| +2.5% \| \| gcc \| 9 \| 53.2 \| 53.5 \| +0.5% \| \| gcc \| 13 \| 12.7 \| 12.9 \| +1.5% \| \| clang \| 5 \| 113.9 \| 110.4 \| -3.0% \| \| clang \| 7 \| 67.7 \| 70.6 \| +4.2% \| \| clang \| 9 \| 51.9 \| 52.2 \| +0.5% \| \| clang \| 13 \| 12.4 \| 13.3 \| +7.2% \| The compression strategy is unmodified in this PR, so the compressed size should be exactly the same. I may have a follow up PR to slightly improve the compression ratio, if it doesn't cost too much speed.	2021-10-22 13:45:26 -07:00
Yann Collet	9d62957b31	Merge pull request #2800 from animalize/fix_c89 Fix a C89 error in msvc	2021-10-18 14:32:04 -07:00
Nick Terrell	b77d95b053	Merge pull request #2820 from terrelln/nb-compares [binary-tree] Fix underflow of nbCompares	2021-10-11 09:59:57 -07:00
Nick Terrell	c6c482fe07	[binary-tree] Fix underflow of nbCompares Fix underflow of `nbCompares` by switching to an `int` and comparing `nbCompares > 0`. This is a minimal fix, because I don't want to change the logic. These loops seem to be doing `nbCompares + 1` comparisons. The bug was reported by Dan Carpenter and found by Smatch static checker. https://lore.kernel.org/all/20211008063704.GA5370@kili/	2021-10-08 13:22:55 -07:00
Nick Terrell	399644b1f1	[nit] Fix buggy indentation The bug was reported by Dan Carpenter and found by Smatch static checker. https://lore.kernel.org/all/20211008063704.GA5370@kili/	2021-10-08 11:13:11 -07:00
Sen Huang	4b7f45cb04	Pull hot loop into its own function	2021-09-28 08:19:44 -07:00
Sen Huang	ccdcbf4621	Try beginning and end of match	2021-09-28 08:19:44 -07:00
Sen Huang	b8fd6bf30c	Skip most long matches in lazy hash table update	2021-09-28 08:19:39 -07:00
Ma Lin	ae986fcdb8	Use __assume(0) for unreachable code path in msvc msvc will optimize away the condition check.	2021-09-27 19:23:57 +08:00
Ma Lin	e5ba858270	Don't initialize the first parameter of _BitScanForward* functions Like the document example, no need to initialize `r` to 0. https://docs.microsoft.com/en-us/cpp/intrinsics/bitscanforward-bitscanforward64	2021-09-25 16:36:53 +08:00
Ma Lin	cc22042da0	Fix a C89 error in msvc Variables (r) must be declared at the beginning of a code block. This causes msvc2012 to fail to compile 64-bit build.	2021-09-25 16:32:06 +08:00
Yann Collet	fa2a4d77c7	constify MatchState* parameter when possible turns out, it's possible to constify MatchState* parameter in some parts of the binary tree algorithm, making it a pure read-only parameter, as opposed to a mutable state. This is supposed to be helpful for both maintenance and the compiler.	2021-09-23 08:27:44 -07:00
Sen Huang	539b3aab9b	Optimize 32-bit VecMask_next()	2021-08-04 17:14:58 -04:00
senhuang42	e411040ea1	Add 64 row entry support for lazy	2021-08-04 16:19:12 -04:00
sen	5c46f62006	Merge pull request #2677 from senhuang42/ci_overhaul_2 [CI][2/2] Migrate CI tests which (currently) fail	2021-08-02 09:55:49 -04:00
Sen Huang	5ec7897a26	Fix static analyzer warnings	2021-07-29 09:11:12 -07:00
W. Felix Handte	da58821ff2	Fix DDSS Load This PR fixes an incorrect comparison in figuring out `minChain` in `ZSTD_dedicatedDictSearch_lazy_loadDictionary()`. This incorrect comparison had been masked by the fact that `idx` was always 1, until @terrelln changed that in #2726. Credit-to: OSS-Fuzz	2021-07-27 11:49:44 -04:00
aqrit	dd4f6aa9e6	Flatten ZSTD_row_getMatchMask (#2681) * Flatten ZSTD_row_getMatchMask * Remove the SIMD abstraction layer. * Add big endian support. * Align `hashTags` within `tagRow` to a 16-byte boundary. * Switch SSE2 to use aligned reads. * Optimize scalar path using SWAR. * Optimize neon path for `n == 32` * Work around minor clang issue for NEON (https://bugs.llvm.org/show_bug.cgi?id=49577) * replace memcpy with MEM_readST * silence alignment warnings * fix neon casts * Update zstd_lazy.c * unify simd preprocessor detection (#3) * remove duplicate asserts * tweak rotates * improve endian detection * add cast there is a fun little catch-22 with gcc: result from pmovmskb has to be cast to uint32_t to avoid a zero-extension but must be uint16_t to get gcc to generate a rotate instruction. * more casts * fix casts better work-around for the (bogus) warning: unary minus on unsigned	2021-06-09 08:50:25 +03:00
Felix Handte	8a3bdfaa7b	Merge pull request #2654 from wolfpld/dev Initialize "potentially uninitialized" pointers.	2021-06-07 13:04:19 -04:00
Yann Collet	02ece5d59f	Merge pull request #2653 from TrianglesPCT/dev Enable SSE2 compression path to work on MSVC	2021-05-17 11:20:50 -07:00
Dan Nelson	54f78e3df8	ZSTD_VecMask_next: fix incorrect variable name in fallback code path	2021-05-15 10:20:37 -05:00
TrianglesPCT	bee0ef5647	Update zstd_lazy.c It put the changes back when I tried to make a separate pull request, i don't understand githubs interface at all.	2021-05-14 19:23:13 -06:00
TrianglesPCT	d688ab1e0c	Add files via upload AVX2	2021-05-14 19:18:12 -06:00
TrianglesPCT	bb1cdd8c63	Update zstd_lazy.c add space	2021-05-14 19:11:28 -06:00
TrianglesPCT	a62856bf65	Update zstd_lazy.c Remove the AVX2 part	2021-05-14 19:10:24 -06:00
TrianglesPCT	8f7ea1afeb	Update zstd_lazy.c Switch to other comment style	2021-05-14 19:02:34 -06:00
TrianglesPCT	0e071214b5	Update zstd_lazy.c switch to unaligned load as I don't know if buffer will always be aligned to 32 bytes, and compilers aside from MSVC might actually use aligned loads	2021-05-14 17:03:30 -06:00
TrianglesPCT	69ac124b12	Update zstd_lazy.c	2021-05-14 16:53:19 -06:00
TrianglesPCT	0b9f4bb0ff	Update zstd_lazy.c use 8bit	2021-05-14 16:47:24 -06:00
Bartosz Taudul	7012c6e7a4	Initialize "potentially uninitialized" pointers.	2021-05-15 00:40:49 +02:00
TrianglesPCT	77d54eb3b3	Add files via upload	2021-05-14 16:40:32 -06:00
TrianglesPCT	25bda9053a	Add files via upload msvc suport avx2 path	2021-05-14 16:32:04 -06:00
Nick Terrell	10b35b312b	[lib] Fix off-by-one error in repcode checks The repcode checks disallowed repcodes that are equal to `windowLow`. This is slightly inefficient, but isn't a problem on its own. Together with the next commit, it causes non-determinism.	2021-05-13 17:05:59 -07:00
Sen Huang	e6c8a5dd40	Fix incorrect usages of repIndex across all strategies	2021-05-04 19:50:55 -04:00
felixhandte	efa6dfa729	Apply DDS adjustments to avoid assert failures	2021-04-23 16:41:00 -04:00
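
The `nbCompares` underflow described in commit c6c482fe07 can be illustrated with a small standalone sketch. This is not the zstd source: the function names and the `candidates` counter are hypothetical stand-ins for a search loop, and only the counter type and the loop condition mirror what the commit message describes (switching to an `int` and comparing `nbCompares > 0`).

```c
#include <limits.h>

/* Hypothetical stand-in for a bounded search loop; NOT the zstd code.
 * With an unsigned budget, the post-decrement in the loop condition
 * still runs on the final (failing) test, so a budget that reaches 0
 * wraps around to UINT_MAX. */
static unsigned budget_left_unsigned(unsigned nbCompares, unsigned candidates)
{
    while (nbCompares-- && candidates) {
        candidates--;               /* pretend to examine one candidate */
    }
    return nbCompares;              /* may have wrapped to UINT_MAX */
}

/* Same loop with a signed budget and an explicit `> 0` test.
 * When the budget is exhausted the counter ends at -1, so a later
 * `nbCompares > 0` check sees no remaining budget. */
static int budget_left_signed(int nbCompares, unsigned candidates)
{
    while (nbCompares-- > 0 && candidates) {
        candidates--;               /* pretend to examine one candidate */
    }
    return nbCompares;
}
```

In this sketch, a budget of 2 with plenty of candidates leaves the unsigned counter at `UINT_MAX` and the signed counter at `-1`. If the candidates run out first, both variants return the same remaining budget.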

161 Commits