facebook/zstd - zstd - Final Minetest

Commit Graph

Author	SHA1	Message	Date
Yann Collet	58ecf13e02	zstdmt : can compress at block granularity offering perspective of more accurate progression report.	2018-01-13 13:18:57 -08:00
Yann Collet	1edf33764e	Merge pull request #974 from terrelln/dstfile [fileio] Improve safety of output file modifications	2018-01-10 19:02:48 +01:00
Yann Collet	752880ffed	Merge pull request #963 from facebook/benchfix fix: bench can accept hlog custom parameter	2018-01-06 06:57:02 +01:00
Nick Terrell	ed9611dc62	[fileio] Don't call FIO_remove() on stdout or /dev/null	2018-01-05 11:50:24 -08:00
Nick Terrell	282ad05e0a	[fileio] Use FIO_remove() everywhere for safety	2018-01-05 11:44:45 -08:00
Nick Terrell	fd63140e1c	[util] Refuse to set file stat on non-regular file	2018-01-05 11:44:25 -08:00
Pádraig Brady	e0596715dc	zstd: fix crash when not overwriting existing files This fixes the following crash: $ touch exists $ programs/zstd -r examples/ -o exists zstd: exists already exists; not overwritten Segmentation fault (core dumped) * programs/fileio.c (FIO_compressMultipleFilenames): Handle the case where we're not overwriting the destination. Reported at https://bugzilla.redhat.com/1530049	2018-01-02 15:24:09 +00:00
Yann Collet	c707c6e9f2	fix: bench can accept hlog custom parameter was ignored during initialization	2017-12-27 13:32:05 +01:00
Yann Collet	cc9e026866	Merge pull request #952 from terrelln/merge-end [fileio] Merge end loop for small optimization	2017-12-15 10:27:53 -08:00
Yann Collet	2cff66b62f	version bump to v1.3.3	2017-12-14 16:11:20 -08:00
Nick Terrell	f48d34edba	[fileio] Merge end loop for small optimization	2017-12-14 15:52:24 -08:00
Yann Collet	a0ac8c895c	Merge pull request #950 from facebook/srcSizeAdaptation fix adaptation on srcSize	2017-12-14 14:48:31 -08:00
Yann Collet	2e97a6d464	fixed minor declaration-after-statement warning	2017-12-13 18:50:05 -08:00
Yann Collet	5432ef6921	fixes adaptation on srcSize This patch restores capability for each file to receive adapted compression parameters depending on its size. The bug breaking this feature was relatively silly : setting a parameter with a value "0" is supposed to be a no-op. Unfortunately, it would pin down compression parameters as if they were manually set, preventing later automatic adaptation. Unfortunately, I'm currently short of a test case that could check this situation and trigger an error. Compression parameters selection between tableID 0,1,2,3 is largely internal, leaving no trace to outside world, not even in frame header.	2017-12-13 17:45:26 -08:00
Nick Terrell	4680e85bdf	Allow -o with multiple files	2017-12-13 17:44:34 -08:00
Yann Collet	4d0dfafa7b	Merge pull request #949 from terrelln/rrm [fileio] Refuse to remove non-regular file	2017-12-13 17:36:39 -08:00
Nick Terrell	82bc8fe0cc	[fileio] Refuse to remove non-regular file	2017-12-13 13:38:26 -08:00
Nick Terrell	b5e7f6c0f3	[fileio] Fix window size MB calculation Test command: `head -c 10000 /dev/zero | ./zstd -c --zstd=wlog=12 | ./zstd -M2048 -t`	2017-12-13 10:57:01 -08:00
Yann Collet	31293330d0	It's still necessary to check PLATFORM_POSIX_VERSION for clock_gettime() glibc/uclibc is not enough	2017-12-04 16:31:59 -08:00
Yann Collet	0097469238	removed a few redundant #include	2017-12-04 16:02:42 -08:00
Yann Collet	e46194bbf9	fix #911 : changed detection macro for clock_gettime() The new macro might be a bit too restrictive. Systems which do not support new test will simply default to <time.h>'s `clock_t clock()`, suffering lesser benchmark accuracy. Should it matter, the detection macro will have to be upgraded.	2017-12-04 15:57:01 -08:00
Yann Collet	55faa5492d	fileio: fixed LZ4F invocation from assert()	2017-12-04 11:26:59 -08:00
Yann Collet	af2fbbcb0d	Merge pull request #939 from facebook/shorterCircleCI Faster CircleCI tests	2017-12-04 11:22:30 -08:00
Yann Collet	71f012e5bf	zstdcli: fixed minor warning when bench module not enabled one variable defined but not used	2017-12-01 17:42:46 -08:00
Yann Collet	a1b24e6262	Merge pull request #938 from terrelln/time Use util.h for timing	2017-12-01 16:40:38 -08:00
Nick Terrell	dab8cfa3c7	Combine definitions of SEC_TO_MICRO	2017-11-30 19:40:53 -08:00
Nick Terrell	9a2f6f477b	Use util.h for timing	2017-11-30 14:57:25 -08:00
Yann Collet	2f22a6ec50	Merge branch 'dev' into opt3	2017-11-28 15:03:58 -08:00
Yann Collet	0a0a212934	zstd_opt: changed cost formula There was a flaw in the formula which compared literal cost with match cost : at a given position, a non-null literal suite is going to be part of next sequence, while if position ends a previous match, to immediately start another match, next sequence will have a litlength of zero. A litlength of zero has a non-null cost. It follows that literals cost should be compared to match cost + litlength==0. Not doing so gave a structural advantage to matches, which would be selected more often. I believe that's what led to the creation of the strange heuristic which added a complex cost to matches. The heuristic was actually compensating. It was probably created through multiple trials, settling for best outcome on a given scenario (I suspect silesia.tar). The problem with this heuristic is that it's hard to understand, and unfortunately, any future change in the parser would impact the way it should be calculated and its effects. The "proper" formula makes it possible to remove this heuristic. Now, the problem is : in a head to head comparison, it's sometimes better, sometimes worse. Note that all differences are small (< 0.01 ratio). In general, the newer formula is better for smaller files (for example, calgary.tar and enwik7). I suspect that's because starting statistics are pretty poor (another area of improvement). However, for silesia.tar specifically, it's worse at level 22 (while being better at level 17, so even compression level has an impact ...). It's a pity that zstd -22 gets worse on silesia.tar. That being said, I like that the new code gets rid of strange variables, which were introducing complexity for any future evolution (faster variants being in mind). Therefore, in spite of this detrimental side effect, I tend to be in favor of it.	2017-11-28 14:07:03 -08:00
W. Felix Handte	baff9dd15e	Fix LZ4 Compression Buffer Overflow Fixes issue where, when `zstd --format=lz4` is fed an input larger than 128KB, the read overruns the input buffer. This changes Zstd to use LZ4 with chained 64KB blocks. This is technically a breaking change in that some third party LZ4 implementations may not support linked blocks. However, progress should not be allowed to be stopped by such petty concerns as backwards compatibility!	2017-11-28 12:07:26 -05:00
Yann Collet	743b23878e	install: changed variable MANDIR into MAN1DIR MANDIR still exists, and is now the parent of MAN1DIR	2017-11-27 13:47:35 -08:00
Yann Collet	2fd765498a	updated man page following patch #931 by @scottchiefbaker	2017-11-24 17:20:54 -08:00
Yann Collet	c857ee850a	minor update	2017-11-24 16:44:28 -08:00
Scott Baker	31a191b178	Include information about the benchmark output/methodology Addresses #930	2017-11-22 20:34:25 -08:00
Yann Collet	daebc7fe26	bench: slightly adjusted display format adapt accuracy depending on value. makes it possible to have higher accuracy for small value, notably small compression speed. This capability is expected to be useful while modifying optimal parser.	2017-11-18 15:54:32 -08:00
Nick Terrell	a6052af0e8	[zstd] Fix rare bug with signal handler	2017-11-17 16:38:56 -08:00
Yann Collet	5b957ba899	minor interface adjustments	2017-11-17 01:21:40 -08:00
Yann Collet	d898fb7ba6	bench: added cli command `-S` to benchmark multiple files separately Currently, all files are joined by default, they are compressed separately but benchmarked together, providing a single final result. Benchmarking files separately makes it possible to accurately measure difference for each file. This is expected to be useful while tuning optimal parser.	2017-11-17 00:22:55 -08:00
Yann Collet	8accfa7fcc	bench: realTime is a global parameter like most parameters not directly related to compression	2017-11-17 00:02:37 -08:00
Yann Collet	9a11f70dc3	merged repcode search into BT match search this version has same speed as branch `opt` which is itself 5-10% slower than branch `dev` (no identified reason) It does not compress exactly the same as `opt` or `dev`, maybe because it doesn't stop search after repcodes, leading to sometimes better compression, sometimes worse (by a small margin). warning : _extDict path does not work for the time being This means that benchmark module works, but file module will fail with large files (and high compression level). Objective is to fuse _extDict path into current one, in order to have a single parser to maintain.	2017-11-13 02:23:48 -08:00
Yann Collet	6f1dfa8adf	removed line with `//` comment this is for a different topic (better parameter adaptation for small files + dictionary and/or custom parameters)	2017-11-01 17:01:45 -07:00
Yann Collet	428e8b3bf4	fix : ZSTD_compress_generic(,,,ZSTD_e_end) automatically sets pledgedSrcSize as per documentation, on ZSTD_setPledgedSrcSize() : > If all data is provided and consumed in a single round, > this value (pledgedSrcSize) is overridden by srcSize instead. This wasn't applied before compression level is transformed into compression parameters. As a consequence, small input missed compression parameters adaptation. It seems to work fine now : compression was compared with ZSTD_compress_advanced(), results were the same.	2017-11-01 13:15:23 -07:00
Nick Terrell	b495140f67	Update BUCK files * Correct XXH namespace (Fixes #901) * Multithreading always enabled * GZIP/LZ4/LZMA always enabled * Legacy support always fully enabled	2017-10-25 12:47:57 -07:00
Yann Collet	91535d71ec	fixed missing zstdmt_compress.h dependency we lose a warning message : when a job size is chosen < minimum job size for multithreading, it is automatically resized to minimum size. If this information is really useful, it should be present in zstd.h now.	2017-10-19 12:09:34 -07:00
Yann Collet	eac42534fe	bench: fixed Visual warning regarding struct initialization also : removed dependency on zstdmt_compress.h removed several unused macros fileio : small code refactoring to reduce some variable scope	2017-10-19 11:56:14 -07:00
Yann Collet	d3b9547aa4	IO and bench : ZSTD_NEWAPI is the only remaining code path removed the other 2 code paths (single thread, and ZSTDMT ones) keeping only the new advanced API, for easier code coverage. It shall also fix identified issue with Visual Studio which doesn't have ZSTD_NEWAPI defined.	2017-10-18 17:01:53 -07:00
Yann Collet	300e1df0a3	fixed wrong test to display compression status	2017-10-18 11:41:52 -07:00
Yann Collet	18b795374a	UTIL_getFileSize() returns UTIL_FILESIZE_UNKNOWN on failure UTIL_getFileSize() used to return zero on failure. This made it impossible to distinguish a failure from a genuine empty file. Both cases were coalesced. Adding UTIL_FILESIZE_UNKNOWN constant has many consequences on user code, since in many places, the `0` was assumed to mean "error". This is no longer the case, and the error code must be actively checked.	2017-10-17 16:14:25 -07:00
Yann Collet	32c9f715ae	fixed : Visual build compressing stdin with multi-threading enabled fails Multiple reasons were stacked : - Visual uses a different code path, because ZSTD_NEWAPI is not defined - fileio.c sends `0` as `pledgedSrcSize` to mean `ZSTD_CONTENTSIZE_UNKNOWN` (fixed) - ZSTDMT_resetCCtx() interpreted `0` as "empty" instead of "unknown" (fixed)	2017-10-17 14:07:43 -07:00
Yann Collet	fc8d293460	dictionary compression use correct file size estimation when determining compression parameters to compress one file only. For multiple files, it still "bets" that files are going to be small. There was also a bug recently added in ZSTD_CCtx_loadDictionary_advanced() making it incapable to use pledgedSrcSize to determine compression parameters.	2017-10-14 01:21:43 -07:00

1153 Commits (6025465e42dfce6d3312b14a80e7614bc52b362c)