We found that movemask is not used properly or consumes too much CPU.
This effort helps to optimize the movemask emulation on ARM.
For levels 8-9 we saw 3-5% improvements. For level 10 we saw a 1.5%
improvement.
The key idea is not to produce a pure movemask but to work with groups of bits.
For rowEntries == 16 and 32 we use groups of size 4 and 2 respectively,
which means each comparison bit is duplicated within its group.
We then AND with a mask so that only one bit remains set per group,
so that iteration with the lowest-set-bit trick `a &= (a - 1)` still works.
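As a sketch (not the actual zstd code; the helper and callback names are illustrative), a grouped mask with 4-bit groups can be walked like this:

```c
#include <stdint.h>

/* Walk a 64-bit match mask where each 4-bit group holds at most one set bit,
 * using the usual clear-lowest-set-bit loop. __builtin_ctzll is a GCC/Clang
 * builtin returning the index of the lowest set bit. */
static void forEachMatch(uint64_t matches, void (*visit)(unsigned entryIdx))
{
    while (matches) {
        unsigned const bitPos   = (unsigned)__builtin_ctzll(matches);
        unsigned const entryIdx = bitPos / 4;   /* 4 bits per row entry */
        visit(entryIdx);
        matches &= matches - 1;                 /* clear the lowest set bit */
    }
}
```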
Also, aarch64 has no rotate instruction for 16-bit values, only for 32 and
64 bits, which is why we see larger improvements for levels 8-9.
The vshrn_n_u16 instruction is used to achieve this: it shifts every u16 lane
right by 4 and narrows it to the lower 8 bits. See the picture below. The same
trick is used in
[Folly](c570259008/folly/container/detail/F14Table.h (L446)).
It takes 2 cycles according to the Neoverse-N{1,2} optimization guides.
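As an illustration of the 16-entry case, here is a minimal sketch of the emulation (close in spirit to the change, but the function name and the absence of the head rotation are simplifications):

```c
#include <arm_neon.h>
#include <stdint.h>

/* Compare 16 one-byte tags against a target tag, then use vshrn_n_u16 to
 * squeeze the sixteen 0xFF/0x00 comparison bytes into a 64-bit mask where
 * each entry owns a group of 4 bits. ANDing with 0x8888... keeps a single
 * bit per group, so `a &= (a - 1)` iteration still works. */
static uint64_t matchMask16(const uint8_t* tags, uint8_t tag)
{
    const uint8x16_t chunk     = vld1q_u8(tags);
    const uint16x8_t equalMask = vreinterpretq_u16_u8(vceqq_u8(chunk, vdupq_n_u8(tag)));
    const uint8x8_t  shrunk    = vshrn_n_u16(equalMask, 4);  /* shift each u16 by 4, keep low 8 bits */
    const uint64_t   matches   = vget_lane_u64(vreinterpret_u64_u8(shrunk), 0);
    return matches & 0x8888888888888888ULL;                  /* one bit per group of 4 */
}
```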
The 64-bit movemask is already well optimized. We have ongoing experiments,
but were not able to validate that other implementations are reliably faster.
This commit avoids checking whether a hashtable write is safe in two of the
three match-found paths in `ZSTD_compressBlock_fast_noDict_generic`. This
produces a ~0.5% speed-up in compression.
A comment in the code describes why we can skip this check in those two
paths (the repcode check and the first match check in the unrolled loop).
A downside is that at the new position where we make this check, we have not
yet computed `mLength`. We therefore have to avoid writing *possibly* dangerous
positions, whereas the old check only avoided writing *actually* dangerous
positions. This leads to a minuscule loss in ratio (remember that this scenario
can only be triggered at very negative levels or under incompressibility
acceleration).
credit to oss-fuzz
This issue could happen when using the new Sequence Compression API in
Explicit Delimiter Mode with a dstCapacity that is too small.
In that case, there was one place where the buffer size wasn't checked.
discovered by oss-fuzz
It's a bug in the test itself:
ZSTD_compressBound() is an upper bound of the compressed size
only for data compressed "normally".
But in situations where many flushes are forcefully introduced,
this creates many more blocks,
each of which has a potential to increase the size by 3 bytes.
In extreme cases (lots of small incompressible blocks), the expansion can go beyond ZSTD_compressBound().
A similar situation arises when using the CompressSequences() API
with Explicit Block Delimiters,
where each explicit block acts like a deliberate flush.
When employed by a fuzzer, it's possible to generate scenarios like the one described above,
with tons of incompressible blocks of small sizes,
thus going beyond ZSTD_compressBound().
fix: when using Explicit Block Delimiters, use a larger bound to account for this scenario.
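To make the arithmetic concrete, here is a hedged sketch of such a looser bound (the helper name is hypothetical; only the 3-bytes-per-block margin comes from the scenario above):

```c
#include <stddef.h>
#include <zstd.h>

/* Illustrative only: when every explicit block may end up stored raw,
 * each block can add up to 3 bytes of block header on top of its content,
 * so add 3 bytes per block to the regular single-shot bound. */
static size_t boundWithExplicitDelimiters(size_t srcSize, size_t nbBlocks)
{
    return ZSTD_compressBound(srcSize) + nbBlocks * 3;
}
```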
credit to oss-fuzz
In rare circumstances, the block splitter might cut a block at the exact beginning of a repcode.
In that case, since litLength == 0, if the repcode expected 1+ literals in front, its meaning changes.
This scenario is checked in ZSTD_seqStore_resolveOffCodes(),
and the repcode is transformed into a raw offset when its new meaning is incorrect.
In more complex scenarios, the previous block might be emitted as uncompressed after all,
thus modifying the expected repcode history.
In the case discovered by oss-fuzz, the first block is emitted as uncompressed,
so the repcode history remains at default values: 1,4,8.
But since the starting repcode is repcode3, and the literal length is 0,
its meaning is repcode1 - 1.
Since repcode1 == 1, this results in an offset value of 0, which is invalid.
So that's what the `assert()` was verifying: the result of the repcode translation should be a valid offset.
But actually, it doesn't matter, because this result is then compared to reality,
and since it's an invalid offset, it will necessarily be discarded if incorrect,
and the repcode will be replaced by a raw offset.
So the `assert()` is not useful.
Furthermore, it's incorrect, because it assumes this situation cannot happen, but it does, as described in the scenario above.
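For reference, here is a sketch of the repcode convention that produces this situation (an illustrative function, not the actual ZSTD_seqStore_resolveOffCodes() code):

```c
#include <stdint.h>

/* rep[] holds the 3 most recent offsets; the default history is {1, 4, 8}.
 * repIdx is the repcode value (1..3) carried by the sequence.
 * When litLength == 0, the indices shift, and repcode 3 means rep[0] - 1,
 * which with the default history evaluates to 0: an invalid offset that
 * must be replaced by a raw offset. */
static uint32_t resolveRepcode(uint32_t repIdx, uint32_t litLength, const uint32_t rep[3])
{
    if (litLength == 0) {
        if (repIdx == 3) return rep[0] - 1;  /* default history: 1 - 1 = 0 (invalid) */
        return rep[repIdx];                  /* 1 -> rep[1], 2 -> rep[2] */
    }
    return rep[repIdx - 1];                  /* 1..3 -> rep[0..2] */
}
```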
Introduces a new constant which properly represents the maximum bit size of compressed literals (11) as defined in the specification.
It is to be preferred over HUF_TABLELOG_DEFAULT, which represents the same value, but only by accident.
Name selected to keep the same convention as existing width definitions,
MLFSELog, LLFSELog and OffFSELog.
In rare cases, the default Huffman depth selector is a bit too harsh,
requiring brutal adaptations of the tree,
resulting in some loss of compression ratio.
This new heuristic avoids the worst cases, favoring compression ratio.
As an example, compression of a specific distribution of 771 literals
is now improved to 441 bytes, from 601 bytes before.