4958 Commits (eba13dba2e2bac09bef231164b3cc08ffda999d7)

Author SHA1 Message Date
Yann Collet 33a3f18848 fixed wrong size test 2018-02-26 18:27:51 -08:00
Yann Collet d18d43aaf9
Merge pull request #1024 from terrelln/window-split
Split the window state into substructure
2018-02-26 17:18:33 -08:00
Yann Collet 89741653ab added error code workSpace_tooSmall 2018-02-26 15:11:50 -08:00
Yann Collet 6cdf690441 minor cleaning of huff0
Updates code documentation and properly names a few "magic constants".
Also, HUF_compress_internal() gets a cleaner way
to determine size of tables inside workspace.
2018-02-26 14:52:23 -08:00
Nick Terrell 6b88d592fd Reduce ZSTD_CHAINLOG_MAX to 29 in 32-bit mode 2018-02-26 13:30:24 -08:00
Nick Terrell 7e5e226cbf Split the window state into substructure 2018-02-26 13:29:57 -08:00
Yann Collet 50bc2ce95e
Merge pull request #1021 from terrelln/lrm-split
Split block compressor out of long range matcher
2018-02-23 17:36:51 -08:00
Yann Collet 653383f74a minor nit from Mac XCode 2018-02-22 15:44:26 -08:00
Nick Terrell 7e2bf4ebad Remove long range matcher immediate repcode check
The compression ratio gets about 0.01% worse on the files I tested, but the
code is much simpler.
2018-02-22 15:18:47 -08:00
Nick Terrell af866b3a58 Split block compressor out of long range matcher
* `ZSTD_ldm_generateSequences()` generates the LDM sequences and
  stores them in a table. It should work with any chunk size, but
  is currently only called one block at a time.
* `ZSTD_ldm_blockCompress()` emits the pre-defined sequences, and
  instead of encoding the literals directly, it passes them to a
  secondary block compressor. The code to handle chunk sizes greater
  than the block size is currently commented out, since it is unused.
  The next PR will uncomment and exercise this code.
* During optimal parsing, ensure LDM `minMatchLength` is at least
  `targetLength`. Also don't emit repcode matches in the LDM block
  compressor. Enabling the LDM with the optimal parser now actually improves
  the compression ratio.
* The compression ratio is very similar to before. It is very slightly
  different, because the repcode handling is slightly different. If I remove
  immediate repcode checking in both branches the compressed size is exactly
  the same.
* The speed looks to be the same or better than before.

Up Next (in a separate PR)
--------------------------

Allow sequence generation to happen prior to compression, and produce more
than a block worth of sequences. Expose some API for zstdmt to consume.
This will test out some currently untested code in
`ZSTD_ldm_blockCompress()`.
2018-02-22 15:18:41 -08:00
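The two-phase split described in the commit above can be illustrated with a toy sketch. It is not the zstd implementation: `Seq`, `LiteralSink`, `generate_sequences()`, `block_compress()` and `count_literals()` are hypothetical names, the match finder is a naive quadratic search rather than the LDM hash table, and the "secondary block compressor" is reduced to a callback that receives the literal runs.

```c
#include <stddef.h>

/* Hypothetical sequence record: a run of literals, then a match
 * copied from `offset` bytes back. */
typedef struct { size_t litLength; size_t matchLength; size_t offset; } Seq;

/* Hypothetical hook standing in for the secondary block compressor. */
typedef void (*LiteralSink)(void* ctx, const char* literals, size_t length);

/* Phase 1: find repeats of at least minMatch bytes (minMatch >= 1) with a
 * naive search and store them in `table`. Returns the number of sequences. */
static size_t generate_sequences(const char* src, size_t srcSize, size_t minMatch,
                                 Seq* table, size_t capacity)
{
    size_t nbSeq = 0, anchor = 0, pos = 0;
    while (pos + minMatch <= srcSize && nbSeq < capacity) {
        size_t bestLen = 0, bestOffset = 0, cand;
        for (cand = 0; cand < pos; cand++) {
            size_t len = 0;
            while (pos + len < srcSize && src[cand + len] == src[pos + len]) len++;
            if (len > bestLen) { bestLen = len; bestOffset = pos - cand; }
        }
        if (bestLen >= minMatch) {
            table[nbSeq].litLength = pos - anchor;
            table[nbSeq].matchLength = bestLen;
            table[nbSeq].offset = bestOffset;
            nbSeq++;
            pos += bestLen;
            anchor = pos;
        } else {
            pos++;
        }
    }
    return nbSeq;
}

/* Phase 2: walk the pre-defined sequences and hand each literal run to the
 * sink instead of encoding it directly. Returns the number of matched bytes. */
static size_t block_compress(const char* src, size_t srcSize,
                             const Seq* table, size_t nbSeq,
                             LiteralSink sink, void* ctx)
{
    size_t pos = 0, matched = 0, i;
    for (i = 0; i < nbSeq; i++) {
        sink(ctx, src + pos, table[i].litLength);
        pos += table[i].litLength + table[i].matchLength;
        matched += table[i].matchLength;
    }
    sink(ctx, src + pos, srcSize - pos);   /* trailing literals */
    return matched;
}

/* Example sink: only counts the literal bytes it receives. */
static void count_literals(void* ctx, const char* literals, size_t length)
{
    (void)literals;
    *(size_t*)ctx += length;
}
```

Because phase 1 only fills a table, it can in principle run ahead of phase 2 or over a different chunk size, which is the direction the "Up Next" note describes.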
Yann Collet 4fb071ec3c
Merge pull request #1022 from facebook/bmi2IntoC
Implemented BMI2 functions directly within huf_decompress.c
2018-02-22 14:30:43 -08:00
Yann Collet 0fd4df6ed3 Implemented BMI2 functions directly within huf_decompress.c
This makes it easier to edit for maintenance and evolution
(I plan to experiment with modifications to the Huffman decompression functions).

The methodology followed seems broadly applicable to other BMI2 modules.

Performance was tracked rigorously at each step:
there is no noticeable loss (nor gain) in performance compared to the `#include` version.

Note however that 4X decoder variants tend to be extremely sensitive to code alignment.
This source code resulted in pretty good performance for gcc 7.2 and 7.3,
but future changes (even in other parts of the code) might trigger the issue again.
2018-02-22 10:51:47 -08:00
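One common way to keep BMI2 and non-BMI2 variants in a single source file, instead of including the same file twice, is a force-inlined body with two thin wrappers and a runtime flag. The sketch below shows that general pattern under stated assumptions: it relies on GCC/Clang extensions (`__attribute__((target("bmi2")))`, `__builtin_cpu_supports`), and every name in it (`sum_low_bits*`, `cpu_has_bmi2`, the macros) is hypothetical rather than taken from `huf_decompress.c`.

```c
#include <stddef.h>
#include <stdint.h>

/* Enable the dual-variant path only where the compiler extensions exist. */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#  define DYNAMIC_BMI2 1
#  define TARGET_BMI2 __attribute__((target("bmi2")))
#else
#  define DYNAMIC_BMI2 0
#endif

#if defined(__GNUC__)
#  define FORCE_INLINE static inline __attribute__((always_inline))
#else
#  define FORCE_INLINE static inline
#endif

/* Shared body: inlined into each wrapper, so it is compiled once per target. */
FORCE_INLINE uint64_t sum_low_bits_body(const uint64_t* v, size_t count, unsigned nbBits)
{
    uint64_t const mask = (nbBits >= 64) ? ~(uint64_t)0
                                         : (((uint64_t)1 << nbBits) - 1);
    uint64_t sum = 0;
    size_t i;
    for (i = 0; i < count; i++) sum += v[i] & mask;
    return sum;
}

/* Variant compiled with the default instruction set. */
static uint64_t sum_low_bits_default(const uint64_t* v, size_t count, unsigned nbBits)
{
    return sum_low_bits_body(v, count, nbBits);
}

#if DYNAMIC_BMI2
/* Same body, but the compiler is allowed to emit BMI2 instructions here. */
static TARGET_BMI2 uint64_t sum_low_bits_bmi2(const uint64_t* v, size_t count, unsigned nbBits)
{
    return sum_low_bits_body(v, count, nbBits);
}
#endif

/* Dispatcher: `bmi2` is a flag the caller detected once and stored. */
uint64_t sum_low_bits(const uint64_t* v, size_t count, unsigned nbBits, int bmi2)
{
#if DYNAMIC_BMI2
    if (bmi2) return sum_low_bits_bmi2(v, count, nbBits);
#endif
    (void)bmi2;
    return sum_low_bits_default(v, count, nbBits);
}

/* Runtime detection; reports 0 when the dual-variant path is compiled out. */
int cpu_has_bmi2(void)
{
#if DYNAMIC_BMI2
    return __builtin_cpu_supports("bmi2") != 0;
#else
    return 0;
#endif
}
```

Both wrappers must return identical results; only the generated instructions differ. Code alignment of the hot loops, which the commit message flags as a sensitivity, is not controlled by this pattern.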
Yann Collet 4d6632c8f3
Merge pull request #1020 from facebook/betterBench
updated fullbench measurement methodology
2018-02-21 14:51:39 -08:00
Yann Collet 6e481504ee fullbench includes assert.h
as it is missing for Windows
2018-02-21 11:42:23 -08:00
Yann Collet 9c5a8040a9 fixed huf_compress workspace size 2018-02-21 11:34:49 -08:00
Yann Collet 364ce19463 update fullbench measurement methodology
to use fewer calls to time(), like bench.c.

also upgraded accuracy to nanosecond.
2018-02-21 09:43:32 -08:00
Yann Collet 993ffffba3
Merge pull request #1019 from facebook/betterBench
improve benchmark measurement for small inputs
2018-02-21 05:47:08 -08:00
Yann Collet 25d00d10fc fixed minor conversion warning 2018-02-20 16:52:28 -08:00
Yann Collet 010ba5f71f
Merge pull request #1017 from terrelln/c-bmi2
[compress] Support BMI2
2018-02-20 15:34:59 -08:00
Yann Collet 3538a535bf use TIMELOOP_NANOSEC
as suggested by @terrelln
2018-02-20 15:33:56 -08:00
Yann Collet d3364aa39e improve benchmark measurement for small inputs
by invoking time() once per batch, instead of once per compression / decompression.
Batch is dynamically resized so that each round lasts approximately 1 second.

Also : increases time accuracy to nanosecond
2018-02-20 14:58:40 -08:00
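The batching idea in the commit above can be sketched as follows. This is an illustration only, not the code in `bench.c`: `BenchFn`, `now_ns()`, `bench_round()`, `next_batch_size()` and `count_call()` are hypothetical names, the clock is C11 `timespec_get()`, and the cap on the batch size is an arbitrary assumption.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical signature for the operation under test. */
typedef void (*BenchFn)(void* payload);

/* Current time in nanoseconds, using the C11 clock interface. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* Run one batch: the clock is read twice per batch, not twice per call.
 * Returns the average cost of one call in nanoseconds (nbLoops must be > 0). */
static double bench_round(BenchFn fn, void* payload, unsigned nbLoops, uint64_t* elapsedNs)
{
    uint64_t const start = now_ns();
    unsigned i;
    for (i = 0; i < nbLoops; i++) fn(payload);
    *elapsedNs = now_ns() - start;
    return (double)*elapsedNs / nbLoops;
}

/* Resize the batch so that the next round lasts about targetNs. */
static unsigned next_batch_size(unsigned nbLoops, uint64_t elapsedNs, uint64_t targetNs)
{
    uint64_t next;
    if (elapsedNs == 0) next = (uint64_t)nbLoops * 10;   /* too fast to measure: grow */
    else next = ((uint64_t)nbLoops * targetNs) / elapsedNs;
    if (next < 1) next = 1;
    if (next > 1000000000u) next = 1000000000u;          /* arbitrary safety cap */
    return (unsigned)next;
}

/* Example workload: counts how many times it was invoked. */
static void count_call(void* payload)
{
    *(unsigned*)payload += 1;
}
```

A driver would alternate `bench_round()` and `next_batch_size()` with a target of roughly one second, keeping the best per-call time seen across rounds.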
Nick Terrell 6e128d3534 [BMI2] Add comments to the bmi2 variable in the contexts 2018-02-20 14:12:11 -08:00
Yann Collet 70163bf0d3 added clarification comments in zstd_errors.h
answering some points in #1018
2018-02-20 12:54:49 -08:00
Yann Collet 7117ea8bec
Merge pull request #1011 from terrelln/bmi2
[decompress] Support BMI2
2018-02-15 11:40:34 -08:00
Nick Terrell b58f01537e [compress] Support BMI2 2018-02-14 19:20:32 -08:00
Nick Terrell 4319132312 [decompress] Support BMI2 2018-02-13 17:00:15 -08:00
Leo Arias 5ec97fff17 Add the packaging metadata to build the zstd snap 2018-02-14 00:37:58 +00:00
Yann Collet 5cb1144872 fixed --single-thread
was incorrectly set to -T0 (use as many cores as possible) previously
2018-02-13 14:56:35 -08:00
Yann Collet 9716250197
Merge pull request #1014 from facebook/fasterDec
Faster decoding speed
2018-02-13 12:05:54 -08:00
Yann Collet 9b184359e2 prettify last unit test output 2018-02-13 10:09:01 -08:00
Yann Collet 2524cbd847 added code comment on how to generate default tables
as suggested by @terrelln
2018-02-13 10:02:25 -08:00
Yann Collet 71c07966bb added SEQSYMBOL_TABLE_SIZE()
as suggested by @terrelln's comment
2018-02-12 16:52:15 -08:00
Yann Collet 821efa466e fixed logo path 2018-02-10 21:05:48 -08:00
Yann Collet 5f7495371e Merge branch 'dev' into fasterDec 2018-02-10 14:24:44 -08:00
Yann Collet 992c2370f6
Merge pull request #1010 from facebook/flexibleLevel
Updatable compression parameters
2018-02-10 14:19:54 -08:00
Yann Collet 9945e60ac4 Merge branch 'dev' into flexibleLevel 2018-02-10 11:54:49 -08:00
Yann Collet 04a3f85ce7 fixed gcc warning on a switch code path 2018-02-09 16:16:27 -08:00
Yann Collet 4e3db17cab
Merge pull request #1013 from facebook/fasterDec32
Disable Long Offset mode in 32-bits
2018-02-09 16:13:55 -08:00
Yann Collet 75689838e4 specify new command --single-thread 2018-02-09 15:55:41 -08:00
Yann Collet af48f0b62b fix : offset table pointer when using default table 2018-02-09 15:15:46 -08:00
Yann Collet 426944c3e3 fixed strict aliasing issue
tuned threshold
2018-02-09 13:24:11 -08:00
Yann Collet 64ee732694 decide long-offset mode based on offcode statistics
threshold vaguely estimated
2018-02-09 12:33:28 -08:00
Yann Collet c72091556b fixed minor nit as per @terrelln's comments 2018-02-09 09:46:08 -08:00
Yann Collet 4beaeaace5 Merge branch 'dev' into flexibleLevel 2018-02-09 09:15:05 -08:00
Yann Collet 6bfe50ad48 re-enabled ZSTD_decompressSequencesLong() 2018-02-09 09:14:25 -08:00
Yann Collet 1850597eaa pre-calculated default decoding tables 2018-02-09 06:01:02 -08:00
Yann Collet ab75df21ed fixed mono-symbol distribution 2018-02-09 05:12:13 -08:00
Yann Collet 421a2716d8 fixed default fse distributions
but would be better to pre-calculate tables, for speed
2018-02-09 04:50:58 -08:00
Yann Collet 95424409ea addBits and baseline into FSE decoding table
note : unfinished
- need new default tables
- need to modify long mode
2018-02-09 04:25:15 -08:00
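Storing the baseline and the number of additional bits in each decoding-table cell means the decoder no longer needs a second, symbol-indexed lookup to turn a code into a value. The sketch below shows that idea only; the `SeqSymbol` layout, `fill_table()` and `decode_value()` are hypothetical, the baseline and bit-count arrays passed in are made up by the caller, and the state-transition fields are left at zero because the FSE state machine is not modelled here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table cell: state-transition data plus the per-symbol
 * baseline and extra-bit count, copied in at table build time. */
typedef struct {
    uint16_t nextState;         /* not modelled in this sketch */
    uint8_t  nbAdditionalBits;  /* extra bits to read for this symbol */
    uint8_t  nbBits;            /* not modelled in this sketch */
    uint32_t baseValue;         /* baseline added to the extra bits */
} SeqSymbol;

/* Copy baseline and extra-bit count into every cell, based on the symbol
 * assigned to that cell. */
static void fill_table(SeqSymbol* table, const uint8_t* cellSymbol, size_t tableSize,
                       const uint32_t* baseline, const uint8_t* addBits)
{
    size_t i;
    for (i = 0; i < tableSize; i++) {
        uint8_t const symbol = cellSymbol[i];
        table[i].nextState = 0;
        table[i].nbBits = 0;
        table[i].nbAdditionalBits = addBits[symbol];
        table[i].baseValue = baseline[symbol];
    }
}

/* Decode a value from a single cell: baseline plus the masked extra bits. */
static uint32_t decode_value(const SeqSymbol* cell, uint32_t extraBits)
{
    uint32_t const mask = (cell->nbAdditionalBits >= 32)
                        ? 0xFFFFFFFFu
                        : ((1u << cell->nbAdditionalBits) - 1);
    return cell->baseValue + (extraBits & mask);
}
```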
Yann Collet cc61a3694a Merge branch 'dev' into fasterDec 2018-02-09 02:41:02 -08:00