Commit Graph

4887 Commits (31b54b6eea9051cb44f239ecd9d52b453df8880a)

Author SHA1 Message Date
Yann Collet 31b54b6eea updated ZSTD_initStaticDDict() prototype
can also specify dictContentType.
2018-03-20 14:52:02 -07:00
Yann Collet 353117c5d7 implemented ZSTD_DCtx_loadDictionary*()
this required updating ZSTD_createDDict_advanced()
to accept a dictContentType parameter (raw, full, auto).
2018-03-20 13:40:29 -07:00
Yann Collet 9e6ba88363 moved snap to /contrib 2018-03-19 16:15:06 -07:00
Yann Collet 5ad6cc4780
Merge pull request #1015 from elopio/snapcraft
Add the packaging metadata to build the zstd snap
2018-03-19 16:14:42 -07:00
Yann Collet a7b532a347 added docker readme 2018-03-19 16:13:12 -07:00
Yann Collet b06db3b3c5
Merge pull request #1052 from Varunram/dockerfile
Add Dockerfile
2018-03-19 16:07:41 -07:00
Nick Terrell aa4dbd09a1
Pull job/overlap log logic into common function (#1055)
Prepares for LDM integration by separating the job size and overlap logic
into helper functions.
2018-03-19 15:56:36 -07:00
Yann Collet dc6a5471eb
Merge pull request #1056 from terrelln/xxh-parallel
Move XXH64_update() into worker threads
2018-03-19 14:13:25 -07:00
Yann Collet 9f6b7b6ef4
Merge pull request #1049 from facebook/contrib
`/contrib` projects as part of CI tests
2018-03-19 12:04:27 -07:00
Nick Terrell 2253d01b27 Move XXH64_update() into worker threads
* Computes the XXH hash in the worker threads.
* Workers get a sequence number and wait until ther number shows up. On
  error, ensures that its sequence is finished, so future threads don't
  get blocked.
* Sets up for ldm integration, which will go in the same spot.
2018-03-19 11:08:27 -07:00
Yann Collet f2562c02e4
Merge pull request #1053 from Varunram/cmakelists
Add missing checks to CMakeLists
2018-03-18 12:00:24 -07:00
Yann Collet d932fb621f
Merge pull request #1045 from terrelln/mt-single
Use a single buffer in zstdmt
2018-03-18 11:37:35 -07:00
Varunram 90c598f089 Add missing checks to CMakeLists;closes #1023 2018-03-18 15:48:58 +05:30
Varunram 7616200eaf Add Dockerfile
Dockerfile initially proposed by @gyscos at #880
2018-03-18 14:53:48 +05:30
Yann Collet ec0959e701
Merge branch 'dev' into mt-single 2018-03-18 01:06:31 -07:00
Yann Collet d796a78c92
Merge pull request #1050 from terrelln/restore-loadedDictEnd
Restore setting loadedDictEnd
2018-03-18 01:03:26 -07:00
Nick Terrell 4af1fafeb8 Restore setting loadedDictEnd
Setting `loadedDictEnd` was accidently removed from `ZSTD_loadDictionaryContent()`,
which means that dictionary compression will only be able to reference the parts of
the dictionary within the window. The spec allows us to reference the entire
dictionary so long as even one byte is in the window.

`ZSTD_enforceMaxDist()` incorrectly always allowed offsets up to `loadedDictEnd`
beyond the window, even once the dictionary was out of range.

When overflow protection kicked in, the check `current > loadedDictEnd + maxDist`
is incorrect if `loadedDictEnd` isn't reset back to zero. `current` could be reset
below the value, which would incorrectly allow references beyond the window. This
bug is present in `master`, but is very hard to trigger, since it requires both
dictionaries and data which triggers overflow correction.
2018-03-16 14:54:06 -07:00
Yann Collet 5373e44ba7 fixed contrib/adaptive-compression 2018-03-15 17:10:15 -07:00
Yann Collet 97816400ca added /contrib projects to `make all` 2018-03-15 16:40:14 -07:00
Yann Collet 355cb645bf fixed seekable format example 2018-03-15 16:29:28 -07:00
Yann Collet 38cbcb5f1a removed LRM exploratory experiment 2018-03-15 16:26:08 -07:00
Nick Terrell f15a17e19f Use a single buffer in zstdmt
Summary:
Allocate a single input buffer large enough to house each job, as well as
enough space for the IO thread to write 2 extra buffers. One goes in the
`POOL` queue, and one to fill, and then block on a full `POOL` queue.
Since we can't overlap with the prefix, we allocate space for 3 extra
input buffers.

Test Plan:
* CI
* With and without ASAN/UBSAN run zstdmt with different number of threads
  on two large binaries, and verify that their checksums match.
* Test on the tip of the zstdmt ldm integration.

Reviewers: cyan

Differential Revision: https://phabricator.intern.facebook.com/D7284007

Tasks: T25664120
2018-03-15 16:21:33 -07:00
Yann Collet 192542b63c
Merge pull request #1047 from facebook/hufCompress
removed huf_compress_impl.h
2018-03-15 14:14:03 -07:00
Yann Collet 55bb8bb85f
Merge pull request #1037 from terrelln/extern-seq
Expose reference external sequence API
2018-03-15 11:56:43 -07:00
Nick Terrell a271399c97 Expose reference external sequence API
Summary:
* Expose the reference external sequences API for zstdmt.
  Allows external sequences of any length, which get split when necessary.
* Reset the LDM window when the context is reset.
* Store the maximum number of LDM sequences.
* Sequence generation now returns the number of last literals.
* Fix sequence generation to not throw out the last literals when blocks of
  more than 1 MB are encountered.

Expose reference external sequence API

* Expose the reference external sequences API for zstdmt.
* Allows external sequences of any length, which get split when necessary.
* Reset the LDM window when the context is reset.
* Store the maximum number of LDM sequences.
* Sequence generation now returns the number of last literals.
* Fix sequence generation to not throw out the last literals when blocks of
  more than 1 MB are encountered.

Test Plan:
* CI
* Test the zstdmt ldm integration stacked on top of this diff

Reviewers: cyan

Differential Revision: https://phabricator.intern.facebook.com/D7283968

Tasks: T25664120
2018-03-14 18:07:53 -07:00
Nick Terrell 1908c92c46 Merge remote-tracking branch 'upstream/dev' into extern-seq
* upstream/dev:
  Fix overflow protection with wlog=31
2018-03-14 17:26:31 -07:00
Yann Collet 49ce7a0edf
Merge pull request #1035 from terrelln/31-overflow
Fix overflow protection with wlog=31
2018-03-14 17:20:28 -07:00
Yann Collet a909c293c6 Merge branch 'dev' into hufCompress 2018-03-14 16:11:25 -07:00
Nick Terrell a9a6dcba63 Expose reference external sequence API
* Expose the reference external sequences API for zstdmt.
  Allows external sequences of any length, which get split when necessary.
* Reset the LDM window when the context is reset.
* Store the maximum number of LDM sequences.
* Sequence generation now returns the number of last literals.
* Fix sequence generation to not throw out the last literals when blocks of
  more than 1 MB are encountered.
2018-03-14 12:29:31 -07:00
Nick Terrell 33fb966e56 Fix overflow protection with wlog=31
The overflow protection is broken when the window log is `> (3U << 29)`, so 31.
It doesn't work when `current` isn't around `1U << windowLog` ahead of `lowLimit`,
and the the assertion `current > newCurrent` fails. This happens when the same
context is used many times over, but with a large window log, like in zstdmt.

Fix it by triggering correction based on `nextSrc - base` instead of `lowLimit`.

The added test fails before the patch, and passes after.
2018-03-14 11:45:44 -07:00
Yann Collet 4c5cbac179
Merge pull request #1041 from facebook/fasterFast
Negative compression levels
2018-03-13 21:32:46 -07:00
Yann Collet 50f763ec44 fixed several comments are underlined by @terrelln 2018-03-13 14:23:14 -07:00
Yann Collet a95a88af57 removed huf_compress_impl.h
re-imported all functions inside huf_compress.c
for easier source editing.

Also updated a bunch of code comments
for clarification.
2018-03-13 14:14:05 -07:00
Yann Collet bd7bb94361
Merge pull request #1044 from baldurk/remove-utf8-characters
Remove non-ASCII characters in header file comments
2018-03-13 13:22:07 -07:00
Baldur Karlsson 430a2fec19 Remove non-ASCII characters in header file comments
* Replaced a non-breaking space and an en dash with a plain space and
  a hyphen.
* This means the files are simple ASCII and less likely to run into
  codepage issues.
2018-03-13 20:05:53 +00:00
Yann Collet 6e388eeac2
Merge pull request #1042 from JesseTG/jtg/t0-printout
Made -H's printout specify the semantics of -T0
2018-03-12 18:28:34 -07:00
Yann Collet 530eeb41a7
Merge pull request #1039 from facebook/zstd_decompress
Removed zstd_decompress_impl.h
2018-03-12 18:21:46 -07:00
Jesse Talavera-Greenberg 2f70fbf2a3
Made -H's printout specify the semantics of -T0 2018-03-12 20:43:32 -04:00
Yann Collet 2291b85a1e changed ZSTD_p_literalCompression into ZSTD_p_compressLiterals
prefer verb+object construction
2018-03-12 11:44:10 -07:00
Yann Collet a57d43d4d4 updated documentation of targetLength 2018-03-12 11:35:01 -07:00
Yann Collet f24566b597 minor bench improvements
- do not test level 0, as it is converted into level 3,
  which feels strange when compressing multiple levels
- Use direct synchronous mode when a single worker is requested.
2018-03-12 04:02:57 -07:00
Yann Collet 6a9b41b731 create command --fast[=#]
access negative compression levels from command line
for both compression and benchmark modes.

also : ensure proper propagation of parameters
through ZSTD_compress_generic() interface.

added relevant cli tests.
2018-03-11 20:01:23 -07:00
Yann Collet a146ee04ae added negative compression levels
negative compression level trade compression ratio for more compression speed.
They turn off huffman compression of literals,
and use row 0 as baseline with a stepSize = -cLevel.

added associated test in fuzzer

also added : new advanced parameter ZSTD_p_literalCompression
2018-03-11 05:21:53 -07:00
Yann Collet facc09aa03 minor compression level adaptation
level 12 compresses slightly more and faster
due to better btlazy2 mode
2018-03-11 03:06:52 -07:00
Yann Collet e0ba7ccdec
Merge pull request #1038 from HaydnTrigg/dev
Visual Studio 2017 build scripts
2018-03-10 10:38:30 -08:00
Haydn Trigg c6351021e4 Visual Studio 2017 build scripts 2018-03-11 00:15:31 +10:30
Yann Collet fe321f9e2a re-integrate ZSTD_decompressSequencesLong() into zstd_decompress.c
removed zstd_decompress_impl.h
2018-03-09 19:48:06 -08:00
Yann Collet 89a2ebb971 incorporated ZSTD_decompressSequences() into zstd_decompress() 2018-03-09 19:35:57 -08:00
Yann Collet cdb1f1433e incorporated ZSTD_initFseState() inside zstd_decompress.c 2018-03-09 18:16:10 -08:00
Yann Collet a166eae1ba incorporate ZSTD_decodeSequenceLong() within zstd_decompress.c 2018-03-09 18:11:14 -08:00