facebook/zstd - zstd - Final Minetest

Author	SHA1	Message	Date
Yann Collet	23a9368c45	new starting offcode table for zstd_opt	2021-09-03 17:41:42 -07:00
Yann Collet	27a8bbe265	new initializer for ll price	2021-09-03 16:07:31 -07:00
Yann Collet	f0fc8cb3e1	Disable console notification by default within the library As a library, the default shouldn't be to write anything on console. `cover` and `fastcover` have a `g_displayLevel` variable to control this behavior. It's now set to 0 (no display) by default. Setting notification to a higher level should be an explicit operation by a console application.	2021-09-03 13:44:07 -07:00
Yann Collet	eab692211e	removed pretty-print of sizes in benchmark This is less appropriate for this mode : benchmark is about accuracy, it's important to read the exact values.	2021-09-03 12:51:02 -07:00
sen	71076b7a01	Merge pull request #2763 from senhuang42/opt_compiletime Improve compile speed and binary size in `opt`	2021-09-02 11:59:02 -04:00
Yann Collet	a8cf85ad0a	Merge pull request #2762 from facebook/level13 minor rebalancing of level 13	2021-09-01 20:32:53 -07:00
Sen Huang	d88c1d95ce	Remove inlining for opt	2021-09-01 16:58:57 -04:00
Yann Collet	70d89e5a12	minor rebalancing of level 13 This new setup is slightly better on `silesia.tar` : Ratio : 3.649 -> 3.655 Speed : 11.9 MB/s -> 12.2 MB/s At the cost of more memory : 24 MB -> 32 MB The new memory budget is a reasonable interpolation between neighboring levels 12 and 14: level 12 : 24 MB level 13 : 32 MB (increased from 24 MB) level 14 : 48 MB Window size remains unaffected (4 MB)	2021-09-01 13:05:10 -07:00
senhuang42	414e24becf	Add 8 bytes to FSE workspace	2021-09-01 15:56:33 -04:00
W. Felix Handte	d6fd7761c9	Fix VS Build: Explicitly Cast to Narrow Ints	2021-09-01 14:15:04 -04:00
W. Felix Handte	15e67bfa7e	Deduplicate Implementations This removes the old `ZSTD_compressBlock_fast_generic()` and renames the new `ZSTD_compressBlock_fast_generic_pipelined()` to replace it. This is functionally a no-op.	2021-09-01 14:15:04 -04:00
W. Felix Handte	64054dec44	Tweak Step	2021-09-01 14:15:04 -04:00
W. Felix Handte	24fcccd05c	Unroll Loop Core; Reduce Frequency of Repcode Check & Step Calc (+>1% Speed) Unrolling the loop to handle 2 positions in each iteration allows us to reduce the frequency of some operations that don't need to happen at every position. One such operation is the step calculation, which is a very rough heuristic anyways. It's fine if we do this a position later. The other operation is the repcode check. But since the repcode check already tries expanding back one position, we're really not missing much of importance by only trying it every other position. This commit also slightly reorders some operations.	2021-09-01 14:15:04 -04:00
W. Felix Handte	57a100f6dc	Add `ip1 + 128` Prefetch; Tiny Cleanup	2021-09-01 14:15:04 -04:00
W. Felix Handte	991d660ea9	Nit: Only Store 2 Hash Variables	2021-09-01 14:15:04 -04:00
W. Felix Handte	8706bc115a	Nit: Dedup idx0 and idx1	2021-09-01 14:15:04 -04:00
W. Felix Handte	7c24c3e6ce	Give Up on Searching End of Block Amusingly, it seems to be a non-trivial performance hit to add in final searches or even hash table insertions during cleanup. So let's not. It seems to not make any meaningful difference in compression ratio.	2021-09-01 14:15:03 -04:00
W. Felix Handte	35932ab2f1	Prefetch Input in Incompressible Sections (+0.25% Speed)	2021-09-01 14:15:03 -04:00
W. Felix Handte	b092dd75b7	Shrink Pipeline from 4 Positions to 3	2021-09-01 14:15:03 -04:00
W. Felix Handte	387840af79	Re-Order Operations for Slightly Better Performance	2021-09-01 14:15:03 -04:00
W. Felix Handte	bc768bccc0	Track Step Size Statefully, Rather than Recalculating Every Time	2021-09-01 14:15:03 -04:00
W. Felix Handte	80bc12b33a	Initial Pipelined Implementation for ZSTD_fast	2021-09-01 14:15:03 -04:00
Yann Collet	74b4171fb8	fix alignment condition in FSE_buildCTable 2-bytes alignment is enough for 16-bit fields	2021-08-29 19:05:04 -07:00
Yann Collet	18a20b3ad7	Merge pull request #2752 from facebook/hashLog3max make ZSTD_HASHLOG3_MAX private	2021-08-20 12:51:17 -07:00
Yann Collet	2de42174bb	make ZSTD_HASHLOG3_MAX private This is an implementation detail, it doesn't belong to public space (zstd.h).	2021-08-20 09:52:42 -07:00
sen	ae998544de	Merge pull request #2750 from senhuang42/sb_compress Improve branch misses on FSE symbol spreading	2021-08-20 12:47:24 -04:00
senhuang42	da095ed899	Improve branch misses on FSE symbol spreading	2021-08-18 10:22:22 -07:00
Sen Huang	539b3aab9b	Optimize 32-bit VecMask_next()	2021-08-04 17:14:58 -04:00
senhuang42	e411040ea1	Add 64 row entry support for lazy	2021-08-04 16:19:12 -04:00
senhuang42	31820e032c	Rebalance clevels for lazy	2021-08-04 16:18:52 -04:00
senhuang42	aa1957477b	Improve Huffman sorting algorithm	2021-08-04 12:43:34 -04:00
Nick Terrell	6ee70bae46	Merge pull request #2733 from terrelln/huf-cspeed [HUF] Improve Huffman encoding speed	2021-08-03 12:59:54 -04:00
Nick Terrell	d8a0797268	[fuzz] Add Huffman round trip fuzzer * Add a Huffman round trip fuzzer * Fix two minor bugs in Huffman that aren't exposed in zstd - Incorrect weight comparison (weights are allowed to be equal to table log). - HUF_compress1X_usingCTable_internal() can return compressed size >= source size, so the assert that `cSize <= 65535` isn't correct, and it needs to be checked instead.	2021-08-03 08:10:06 -07:00
sen	5c46f62006	Merge pull request #2677 from senhuang42/ci_overhaul_2 [CI][2/2] Migrate CI tests which (currently) fail	2021-08-02 09:55:49 -04:00
Sen Huang	5ec7897a26	Fix static analyzer warnings	2021-07-29 09:11:12 -07:00
Nick Terrell	46f2710562	[HUF] Improve Huffman encoding speed Improve Huffman encoding speed by 20% for gcc and 10% for clang. \| Compiler \| Benchmark \| Config \| Dataset \| Ratio \| Speed MB/s (dev) \| Speed MB/s (huf-cspeed) \| Speed MB/s (huf-cspeed - dev) \| \|----------\|-------------------\|---------\|-------------\|-------\|------------------\|-------------------------\|-------------------------------\| \| gcc \| compress \| level_1 \| enwik7 \| 2.43 \| 253.70 \| 258.72 \| 2.0% \| \| gcc \| compress \| level_1 \| silesia \| 2.88 \| 341.90 \| 348.15 \| 1.8% \| \| gcc \| compress_literals \| level_1 \| enwik7 \| 1.49 \| 761.83 \| 912.76 \| 19.8% \| \| gcc \| compress_literals \| level_1 \| silesia \| 1.28 \| 754.83 \| 902.37 \| 19.5% \| \| gcc \| compress_literals \| level_7 \| enwik7 \| 1.29 \| 502.81 \| 552.79 \| 9.9% \| \| gcc \| compress_literals \| level_7 \| silesia \| 1.11 \| 675.97 \| 776.44 \| 14.9% \| \| clang \| compress \| level_1 \| enwik7 \| 2.43 \| 277.54 \| 280.98 \| 1.2% \| \| clang \| compress \| level_1 \| silesia \| 2.88 \| 369.98 \| 375.46 \| 1.5% \| \| clang \| compress_literals \| level_1 \| enwik7 \| 1.49 \| 828.83 \| 918.41 \| 10.8% \| \| clang \| compress_literals \| level_1 \| silesia \| 1.28 \| 815.81 \| 905.41 \| 11.0% \| \| clang \| compress_literals \| level_7 \| enwik7 \| 1.29 \| 533.13 \| 553.30 \| 3.8% \| \| clang \| compress_literals \| level_7 \| silesia \| 1.11 \| 714.52 \| 775.38 \| 8.5% \|	2021-07-27 15:10:35 -07:00
W. Felix Handte	da58821ff2	Fix DDSS Load This PR fixes an incorrect comparison in figuring out `minChain` in `ZSTD_dedicatedDictSearch_lazy_loadDictionary()`. This incorrect comparison had been masked by the fact that `idx` was always 1, until @terrelln changed that in #2726. Credit-to: OSS-Fuzz	2021-07-27 11:49:44 -04:00
Nick Terrell	ba044bd6f1	[bug-fix] Fix a determinism bug with the DUBT The DUBT can be non-deterministic if an index is equal to `ZSTD_DUBT_UNSORTED_MARK`. Ensure that never happens by starting the indices at 2. This bug was found by the OSS-Fuzz determinism fuzzer. With this change the fuzzer test passes. And I've confirmed that this is the root cause, not just hiding the problem. Aside: This took me a long time to figure out, because I thought I had tried this first thing. But, apparently I messed it up, because when I was going through it again with @felixhandte, I was pointing out that it wasn't the case, but it turns out it was. Credit to: OSS-Fuzz	2021-07-15 13:02:49 -07:00
binhdvo	b3e372c171	Merge pull request #2717 from binhdvo/bootcamp Proactively skip huffman compression based on sampling where non-comp…	2021-07-01 10:39:58 -04:00
Binh Vo	dc5b693f1e	Proactively skip huffman compression based on sampling where non-compressibility is suspected	2021-06-30 11:02:47 -04:00
Nick Terrell	609be382ac	Merge pull request #2719 from danlark1/danlark_iwyu Include what you use in zstd_ldm_geartab	2021-06-29 16:53:10 -07:00
Danila Kutenin	e855b78be6	Include what you use in zstd_ldm_geartab	2021-06-29 17:57:53 +01:00
sen	45d707e908	Merge pull request #2715 from senhuang42/sequence_api_3 [RFC] Add internal API for converting ZSTD_Sequence into seqStore	2021-06-24 13:02:11 -04:00
senhuang42	76466dfadf	Add simple API for converting ZSTD_Sequence into seqStore	2021-06-23 12:10:48 -04:00
Nick Terrell	05b6773fbc	[fix] Add missing bounds checks during compression * The block splitter missed a bounds check, so when the buffer is too small it passes an erroneously large size to `ZSTD_entropyCompressSeqStore()`, which can then write the compressed data past the end of the buffer. This is a new regression in v1.5.0 when the block splitter is enabled. It is either enabled explicitly, or implicitly when using the optimal parser and `ZSTD_compress2()` or `ZSTD_compressStream()`. `HUF_writeCTable_wksp()` omits a bounds check when calling `HUF_compressWeights()`. If it is called with `dstCapacity == 0` it will pass an erroneously large size to `HUF_compressWeights()`, which can then write past the end of the buffer. This bug has been present for ages. However, I believe that zstd cannot trigger the bug, because it never calls `HUF_compress*()` with `dstCapacity == 0` because of [this check][1]. Credit to: Oss-Fuzz [1]: `89127e5ee2/lib/compress/zstd_compress_literals.c (L100)`	2021-06-14 11:35:33 -07:00
sen	d5f3568c4b	Merge pull request #2697 from senhuang42/entropy_repeat_fix [bug] Fix entropy repeat mode bug	2021-06-10 16:39:17 +03:00
aqrit	dd4f6aa9e6	Flatten ZSTD_row_getMatchMask (#2681) * Flatten ZSTD_row_getMatchMask * Remove the SIMD abstraction layer. * Add big endian support. * Align `hashTags` within `tagRow` to a 16-byte boundary. * Switch SSE2 to use aligned reads. * Optimize scalar path using SWAR. * Optimize neon path for `n == 32` * Work around minor clang issue for NEON (https://bugs.llvm.org/show_bug.cgi?id=49577) * replace memcpy with MEM_readST * silence alignment warnings * fix neon casts * Update zstd_lazy.c * unify simd preprocessor detection (#3) * remove duplicate asserts * tweak rotates * improve endian detection * add cast there is a fun little catch-22 with gcc: result from pmovmskb has to be cast to uint32_t to avoid a zero-extension but must be uint16_t to get gcc to generate a rotate instruction.. * more casts * fix casts better work-around for the (bogus) warning: unary minus on unsigned	2021-06-09 08:50:25 +03:00
Felix Handte	8a3bdfaa7b	Merge pull request #2654 from wolfpld/dev Initialize "potentially uninitialized" pointers.	2021-06-07 13:04:19 -04:00
Sen Huang	923e5ad3f5	Fix entropy repeat mode bug	2021-06-07 00:32:03 -07:00
senhuang42	939276cd0c	Add ldm and block splitter auto-enable to old api	2021-05-24 13:09:32 -04:00

2050 Commits