Commit Graph

2829 Commits (06d5e5ff5a9d7176aed075be3a2cb6f5cece845e)

Author SHA1 Message Date
Yann Collet 9af909bf35
Merge pull request #1624 from facebook/smallwlog
Improves compression ratio for small windowLog
2019-06-14 17:28:21 -07:00
Nick Terrell cdb9481e38 [libzstd] Optimize ZSTD_insertBt1() for repetitive data
We would only skip at most 192 bytes at a time before this diff.
This was added to optimize long matches and skip the middle of the
match. However, it doesn't handle the case of repetitive data.

This patch keeps the optimization, but also handles repetitive data
by taking the max of the two return values.

```
> for n in $(seq 9); do echo strategy=$n; dd status=none if=/dev/zero bs=1024k count=1000 | command time -f %U ./zstd --zstd=strategy=$n >/dev/null; done
strategy=1
0.27
strategy=2
0.23
strategy=3
0.27
strategy=4
0.43
strategy=5
0.56
strategy=6
0.43
strategy=7
0.34
strategy=8
0.34
strategy=9
0.35
```

At level 19 with multithreading the compressed size of `silesia.tar` regresses 300 bytes, and `enwik8` regresses 100 bytes.
In single threaded mode `enwik8` is also within 100 bytes, and I didn't test `silesia.tar`.

Fixes Issue #1634.
2019-06-05 20:34:00 -07:00
Yann Collet b3af1873a0 better title formatting for html documentation
must pay attention to /** and /*! patterns.
2019-06-04 10:35:40 -07:00
Yann Collet b5c98fbfd0 Added comments on I/O buffer sizes for streaming
It seems this is still a confusing topic,
as in https://github.com/klauspost/compress/issues/109 .
2019-06-04 10:26:16 -07:00
Yann Collet 80d6ccea79 removed UINT32_MAX
apparently not guaranteed on all platforms,
replaced by UINT_MAX.
2019-05-31 17:27:07 -07:00
Yann Collet fce4df3ab7 fixed wrong assert in double_fast 2019-05-31 17:06:28 -07:00
Yann Collet a968099038 minor code cleaning for new index invalidation strategy 2019-05-31 16:52:37 -07:00
Yann Collet d605f482c7 make double_fast compatible with new index invalidation strategy 2019-05-31 16:50:04 -07:00
Yann Collet a30febaeeb Made fast strategy compatible with new offset validation strategy
fast mode does the same thing as before :
it pre-emptively invalidates any index that could lead to offset > maxDistance.
It's supposed to help speed.

But this logic is performed inside zstd_fast,
so that other strategies can select a different behavior.
2019-05-31 16:34:55 -07:00
Yann Collet 58adb1059f extended exact window size to greedy/lazy modes 2019-05-31 16:08:48 -07:00
Yann Collet bc601bdc6d first implementation of small window size for btopt
noticeably improves compression ratio
when window size is small (< 18).

enwik7	level 19

windowLog	`dev`	`smallwlog`	improvement
23	3.577	3.577	0.02%
22	3.536	3.538	0.06%
21	3.462	3.467	0.14%
20	3.364	3.377	0.39%
19	3.244	3.272	0.86%
18	3.110	3.166	1.80%
17	2.843	3.057	7.53%
16	2.724	2.943	8.04%
15	2.594	2.822	8.79%
14	2.456	2.686	9.36%
13	2.312	2.523	9.13%
12	2.162	2.361	9.20%
11	2.003	2.182	8.94%
2019-05-31 15:55:12 -07:00
Yann Collet b13a9207f9
Merge pull request #1623 from facebook/fullbench
fullbench minor improvements
2019-05-31 14:40:19 -07:00
Yann Collet ed38b645db fullbench: pass proper parameters in scenario 43 2019-05-29 15:26:06 -07:00
Yann Collet 9719fd616c removed nextToUpdate3 from ZSTD_window
it's now a local variable of ZSTD_compressBlock_opt()
2019-05-28 16:18:12 -07:00
Yann Collet 33dabc8c80 get bt matches : made it a bit clearer which parameters are input and output 2019-05-28 16:11:32 -07:00
Yann Collet 327cf6fac1 nextToUpdate3 does not need to be maintained outside of zstd_opt.c
It's re-synchronized with nextToUpdate at beginning of each block.
It only needs to be tracked from within zstd_opt block parser.

Made the logic clear, so that no code tried to maintain this variable.

An even better solution would be to make nextToUpdate3
an internal variable of ZSTD_compressBlock_opt_generic().
That would make it possible to remove it from ZSTD_matchState_t,
thus restricting its visibility to only where it's actually useful.

This would require deeper changes though,
since the matchState is the natural structure to transport parameters into and inside the parser.
2019-05-28 15:26:52 -07:00
Yann Collet 6453f8158f complementary code comments
on variables used / impacted during maxDist check
2019-05-28 14:12:16 -07:00
Yann Collet 4baecdf72a added comments to better understand enforceMaxDist() 2019-05-28 13:15:48 -07:00
Tyler-Tran cb47871a0a [dictBuilder] Be more specific than ERROR(generic) (#1616)
* Specify errors at a finer granularity than `ERROR(generic)`.
* Add tests for bad parameters in the dictionary builder.
2019-05-22 18:57:50 -07:00
Nick Terrell 5f228f8db2 [libzstd] Add a ZSTD_STATIC_ASSERT for BIT_DStream_status 2019-04-23 14:22:16 -07:00
Nick Terrell a892e25374 [libzstd] Error if all sequence bits aren't consumed 2019-04-23 14:07:36 -07:00
Nick Terrell 0fd322f812 [legacy] Fix ZSTDv0*_decodeSequence()
* Version <= 0.5 could read beyond the end of `dumps`, which points into
  the input buffer.
* Check the validity of `dumps` before using it, if it is out of bounds
  return garbage values. There is no return code for this function.
* Introduce `MEM_readLE24()` for simplicity, since I don't want to trust
  that there is an extra byte after `dumps`.
2019-04-19 11:34:52 -07:00
Nick Terrell 2536771134 [legacy] Fix Huffman jump table reads in v01 and v05 2019-04-18 16:20:42 -07:00
Nick Terrell 579f3d7794 [legacy] Fix bug in ZSTD_decodeSeqHeaders() 2019-04-18 13:41:10 -07:00
Nick Terrell ac098c7f5f [legacy] Fix a bug in ZSTDv06_findFrameSizeInfoLegacy() 2019-04-18 13:33:26 -07:00
Nick Terrell ee130a9889 [libzstd] Check the size in readSkippableFrameSize() 2019-04-17 11:41:55 -07:00
Nick Terrell 5922f4e2ae [legacy] Return the right error code 2019-04-17 11:34:52 -07:00
Nick Terrell 450feb0f95 [libzstd] Fix ZSTD_decompressBound() on bad skippable frames
The function didn't verify that the skippable frame size is correct.
2019-04-17 11:29:42 -07:00
Nick Terrell a17fe4c9e5 [visual] Fix unreachable code warning 2019-04-16 11:32:35 -07:00
Nick Terrell de0499f7fa [libzstd] Require ZSTD_MULTITHREAD to create a ZSTDMT_CCtx
ZSTDMT was broken when compiled without ZSTD_MULTITHREAD defined,
because `ZSTD_CCtx_setParameter(cctx, ZSTD_c_nbWorkers, nbWorkerss)`
failed. It was detected by the MSVC test which runs the fuzzer with
multithreading disabled.

This is a very niche use case of a deprecated API, because the API is
inefficient and synchronous, since `threading.h` will be synchronous.
Users almost certainly don't want this, and anyone who tested their code
should realize that it is broken. Therefore, I think it is safe to
require `ZSTD_MULTITHREAD` to be defined to use ZSTDMT.
2019-04-15 23:04:46 -07:00
Josh Soref a880ca239b Spelling (#1582)
* spelling: accidentally

* spelling: across

* spelling: additionally

* spelling: addresses

* spelling: appropriate

* spelling: assumed

* spelling: available

* spelling: builder

* spelling: capacity

* spelling: compiler

* spelling: compressibility

* spelling: compressor

* spelling: compression

* spelling: contract

* spelling: convenience

* spelling: decompress

* spelling: description

* spelling: deflate

* spelling: deterministically

* spelling: dictionary

* spelling: display

* spelling: eliminate

* spelling: preemptively

* spelling: exclude

* spelling: failure

* spelling: independence

* spelling: independent

* spelling: intentionally

* spelling: matching

* spelling: maximum

* spelling: meaning

* spelling: mishandled

* spelling: memory

* spelling: occasionally

* spelling: occurrence

* spelling: official

* spelling: offsets

* spelling: original

* spelling: output

* spelling: overflow

* spelling: overridden

* spelling: parameter

* spelling: performance

* spelling: probability

* spelling: receives

* spelling: redundant

* spelling: recompression

* spelling: resources

* spelling: sanity

* spelling: segment

* spelling: series

* spelling: specified

* spelling: specify

* spelling: subtracted

* spelling: successful

* spelling: return

* spelling: translation

* spelling: update

* spelling: unrelated

* spelling: useless

* spelling: variables

* spelling: variety

* spelling: verbatim

* spelling: verification

* spelling: visited

* spelling: warming

* spelling: workers

* spelling: with
2019-04-12 11:18:11 -07:00
Nick Terrell aafe97b67d [libzstd] Switch dictUses to an enum 2019-04-10 16:50:35 -07:00
Nick Terrell 50b9c41196 [libzstd] Fix decompression dictionary bugs and clean up initialization
Bugs:

* `ZSTD_DCtx_refPrefix()` didn't clear the dictionary after the first
  use. Fix and add a test case.
* `ZSTD_DCtx_reset()` always cleared the dictionary. Fix and add a test
  case.
* After calling `ZSTD_resetDStream()` you could no longer load a
  dictionary, since the stage was set to `zdss_loadHeader`. Fix and add
  a test case.

Cleanup:

* Make `ZSTD_initDStream*()` and `ZSTD_resetDStream()` wrap the new
 advanced API, and add test cases.
* Document the equivalent of these functions in the advanced API and
  document the unstable functions as deprecated.
2019-04-10 12:59:02 -07:00
Nick Terrell 824aaa695f [libzstd] Fix ZSTD_decompressDCtx() with a dictionary
* `ZSTD_decompressDCtx()` did not use the dictionary loaded by
  `ZSTD_DCtx_loadDictionary()`.
* Add a unit test.
* A stacked diff uses `ZSTD_decompressDCtx()` in the
  `dictionary_round_trip` and `dictionary_decompress` fuzzers.
2019-04-09 17:59:27 -07:00
Nick Terrell 48a6427d22 [libzstd] Fix ZSTD_compress2() for multithreaded compression
`ZSTD_compress2()` wouldn't wait for multithreaded compression to
finish. We didn't find this because ZSTDMT will block when it can
compress all in one go, but it can't do that if it doesn't have enough
output space, or if `ZSTD_c_rsyncable` is enabled.

Since we will already sometimes block when using `ZSTD_e_end`, I've
changed `ZSTD_e_end` and `ZSTD_e_flush` to guarantee maximum forward
progress. This simplifies the API, and helps users avoid the easy bug
that was made in `ZSTD_compress2()`

* Found by the libfuzzer fuzzers.
* Added a test case that catches the problem.
* I will make the fuzzers sometimes allocate less than
  `ZSTD_compressBound()` output space.
2019-04-09 16:24:17 -07:00
Nick Terrell e649fad7aa [dictBuilder] Fix displayLevel for corpus warning
Pass the displaylevel into the corpus warning, because it is used in
fast cover and cover, so it needs to respect the local level.
2019-04-08 20:00:18 -07:00
Nick Terrell bfcd5b81d7 [libzstd] Don't check the dictID in fuzzing mode
When `FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION` is defined don't check
the dictID. This check makes the fuzzers job harder, and it is at the
very beginning.
2019-04-08 19:57:41 -07:00
Nick Terrell 947548c24f Remove double the from README 2019-04-08 16:50:18 -07:00
Nick Terrell 641e594309 [libzstd] Remove ZSTDMT from the shared object
* Remove ZSTDMT from the shared object by default.
* Provide a macro `ZSTD_LEGACY_MULTITHREADED_API` to override it.
* Document it in `lib/README.md`.
2019-04-07 18:47:52 -07:00
Nick Terrell 1dfe37fea9 [libzstd] Stabilize ZSTD_getDictID_*() functions 2019-04-05 18:59:30 -07:00
Nick Terrell ce388fe4d2 [libzstd] Fix return value docs for ZSTD_compressStream2() 2019-04-05 17:44:07 -07:00
Nick Terrell 7231ea72a8 [libzstd] Reword the streaming docs for the new API 2019-04-03 19:21:05 -07:00
Nick Terrell cf7d601bf5 Move the dictionary API and mark the legacy API
* Move the dictionary API below the streaming API
* Mark the legacy streaming API as redundant
2019-04-03 19:16:40 -07:00
Nick Terrell d7d89513d6 Stabilize advance API
This commit moves the candidate advanced API to the stable section.
It makes some minor whitespace changes, but it doesn't change any
of the wording of the documentation.

I'll put up a separate PR that tweaks some of the documentation
once this lands, so that it is easier to review.

NOTE: Even though these functions are now in stable, they aren't
stable until the next release (in under 1 month). It is possible
that they change until then.
2019-04-03 18:43:20 -07:00
Nick Terrell 0827edeace [libzstd] Bump the library version to 1.4.0
Bumps the library version to 1.4.0 in preparation to stabilize the
advanced API.
2019-04-03 18:43:20 -07:00
Nick Terrell 72a3fbc0e4
Merge pull request #1562 from terrelln/2fast
[libzstd] Speed up single segment zstd_fast by 5%
2019-04-03 18:08:15 -07:00
Nick Terrell 00679da22b [libzstd] Setting ZSTD_d_maxWindowLog to 0 means default 2019-04-02 19:20:52 -07:00
Nick Terrell 95624b77e4 [libzstd] Speed up single segment zstd_fast by 5%
This PR is based on top of PR #1563.

The optimization is to process two input pointers per loop.
It is based on ideas from [igzip] level 1, and talking to @gbtucker.

| Platform                | Silesia     | Enwik8 |
|-------------------------|-------------|--------|
| OSX clang-10            | +5.3%       | +5.4%  |
| i9 5 GHz gcc-8          | +6.6%       | +6.6%  |
| i9 5 GHz clang-7        | +8.0%       | +8.0%  |
| Skylake 2.4 GHz gcc-4.8 | +6.3%       | +7.9%  |
| Skylake 2.4 GHz clang-7 | +6.2%       | +7.5%  |

Testing on all Silesia files on my Intel i9-9900k with gcc-8

| Silesia File | Ratio Change | Speed Change |
|--------------|--------------|--------------|
| silesia.tar  | +0.17%       | +6.6%        |
| dickens      | +0.25%       | +7.0%        |
| mozilla      | +0.02%       | +6.8%        |
| mr           | -0.30%       | +10.9%       |
| nci          | +1.28%       | +4.5%        |
| ooffice      | -0.35%       | +10.7%       |
| osdb         | +0.75%       | +9.8%        |
| reymont      | +0.65%       | +4.6%        |
| samba        | +0.70%       | +5.9%        |
| sao          | -0.01%       | +14.0%       |
| webster      | +0.30%       | +5.5%        |
| xml          | +0.92%       | +5.3%        |
| x-ray        | -0.00%       | +1.4%        |

Same tests on Calgary. For brevity, I've only included files
where compression ratio regressed or was much better.

| Calgary File | Ratio Change | Speed Change |
|--------------|--------------|--------------|
| calgary.tar  | +0.30%       | +7.1%        |
| geo          | -0.14%       | +25.0%       |
| obj1         | -0.46%       | +15.2%       |
| obj2         | -0.18%       | +6.0%        |
| pic          | +1.80%       | +9.3%        |
| trans        | -0.35%       | +5.5%        |

We gain 0.1% of compression ratio on Silesia.
We gain 0.3% of compression ratio on enwik8.
I also tested on the GitHub and hg-commands datasets without a dictionary,
and we gain a small amount of compression ratio on each, as well as speed.

I tested the negative compression levels on Silesia on my
Intel i9-9900k with gcc-8:

| Level | Ratio Change | Speed Change |
|-------|--------------|--------------|
| -1    | +0.13%       | +6.4%        |
| -2    | +4.6%        | -1.5%        |
| -3    | +7.5%        | -4.8%        |
| -4    | +8.5%        | -6.9%        |
| -5    | +9.1%        | -9.1%        |

Roughly, the negative levels now scale half as quickly. E.g. the new
level 16 is roughly equivalent to the old level 8, but a bit quicker
and smaller.  If you don't think this is the right trade off, we can
change it to multiply the step size by 2, instead of adding 1. I think
this makes sense, because it gives a bit slower ratio decay.

[igzip]: https://github.com/01org/isa-l/tree/master/igzip
2019-04-02 19:02:50 -07:00
Nick Terrell 56682a7709 Fix ZSTD_estimateCStreamSize_usingCCtxParams()
It wasn't using the ZSTD_CCtx_params correctly. It must actualize
the compression parameters by calling ZSTD_getCParamsFromCCtxParams()
to get the real window log.

Tested by updating the streaming memory usage example in the next
commit. The CHECK() failed before this patch, and passes after.

I also added a unit test to zstreamtest.c that failed before this
patch, and passes after.
2019-04-01 18:02:52 -07:00
Nick Terrell 425ce5547c
Merge pull request #1563 from terrelln/dms-sep
[libzstd] Split out zstd_fast dict match state function
2019-03-29 16:19:21 -06:00