Commit Graph

319 Commits (c2007388a556613f2e3ce1f810f5ff2cee4dfb9c)

Author SHA1 Message Date
Yann Collet 7cf78f1be7 Protects ZSTD_compressBegin_usingCDict() vs NULL cdict dereference
Will issue an error (GENERIC) if cdict==NULL
2017-04-04 12:38:14 -07:00
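
After this guard, a NULL cdict yields a checkable error code rather than a crash. A minimal caller-side sketch (assuming the buffer-less advanced API gated behind ZSTD_STATIC_LINKING_ONLY; this is not the patch itself):

```c
#define ZSTD_STATIC_LINKING_ONLY   /* ZSTD_compressBegin_usingCDict() is an advanced function */
#include <stdio.h>
#include <zstd.h>

int main(void)
{
    ZSTD_CCtx* const cctx = ZSTD_createCCtx();
    size_t const r = ZSTD_compressBegin_usingCDict(cctx, NULL);  /* NULL cdict is now rejected */
    if (ZSTD_isError(r))
        printf("rejected as expected: %s\n", ZSTD_getErrorName(r));
    ZSTD_freeCCtx(cctx);
    return 0;
}
```
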
Nick Terrell 26b046a7c4 Remove unnecessary dictID store 2017-04-03 21:46:28 -07:00
Nick Terrell 39a6cc5172 Make ZSTD_compress_usingCDict() respect contentSizeFlag 2017-04-03 21:09:55 -07:00
Nick Terrell 62ecad3819 Fix ZSTD_initCStream_usingCDict() to use dictionary 2017-04-03 21:05:59 -07:00
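
These three fixes all touch the CDict entry points; the streaming one is typically wired up as in the sketch below (hedged: the helper name and the simplistic error signalling are mine, and ZSTD_initCStream_usingCDict() sat behind ZSTD_STATIC_LINKING_ONLY at the time):

```c
#define ZSTD_STATIC_LINKING_ONLY   /* ZSTD_initCStream_usingCDict() was advanced API at the time */
#include <stddef.h>
#include <zstd.h>

/* Create a digested dictionary once, then start a stream that actually uses it
 * (which is what the fix above guarantees). The caller frees both objects later. */
static size_t start_stream_with_dict(ZSTD_CStream* zcs, ZSTD_CDict** cdictPtr,
                                     const void* dictBuf, size_t dictSize, int level)
{
    *cdictPtr = ZSTD_createCDict(dictBuf, dictSize, level);
    if (*cdictPtr == NULL) return (size_t)-1;   /* simplistic error signalling for this sketch */
    return ZSTD_initCStream_usingCDict(zcs, *cdictPtr);
}
```
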
Yann Collet 30c7698970 optimize ZSTDMT_compress() memory usage
no longer allocates temporary buffers
when there is enough room in dstBuffer to compress directly into it
(the previous method only did so for the 1st chunk).

Also : fix ZSTD_compressBound() for small srcSize
2017-03-31 18:27:03 -07:00
Yann Collet 3f75d52527 Changed ZSTD_compressBound()
required so that, if Total = A+B,
compressBound(Total) <= compressBound(A) + compressBound(B),
under the condition of a minimum size for A and B

Will help ZSTDMT_compress() memory allocation
2017-03-31 17:11:38 -07:00
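
That sub-additivity property can be sanity-checked directly against the public API; a minimal sketch, with arbitrary chunk sizes standing in for ZSTDMT segments:

```c
#include <assert.h>
#include <stdio.h>
#include <zstd.h>   /* ZSTD_compressBound() */

int main(void)
{
    /* arbitrary illustrative sizes; ZSTDMT uses its own segmenting rules */
    size_t const A = 512 * 1024;
    size_t const B = 768 * 1024;

    size_t const boundSplit = ZSTD_compressBound(A) + ZSTD_compressBound(B);
    size_t const boundTotal = ZSTD_compressBound(A + B);

    printf("bound(A)+bound(B) = %zu, bound(A+B) = %zu\n", boundSplit, boundTotal);
    assert(boundTotal <= boundSplit);   /* the property this commit establishes */
    return 0;
}
```
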
Yann Collet eea7858e2b fixed minor warnings in debug code 2017-03-30 16:47:19 -07:00
Yann Collet 34cc487d05 overlap at full windowSize for max compression level
as it provides the maximum compression ratio
2017-03-30 16:23:22 -07:00
Yann Collet 458e955c23 improved ZSTDMT_compress()
Uses a few more threads by default.
Uses overlap segments to boost the compression ratio (like the streaming variant)
2017-03-30 15:51:58 -07:00
Yann Collet 6476c51b86 Merge pull request #637 from facebook/zstdmt
Zstdmt
2017-03-30 14:18:37 -07:00
Nick Terrell 5152fb2cb2 Convert all tabs to spaces 2017-03-29 18:51:58 -07:00
Yann Collet ca5a8bbe36 re-added patch ... 2017-03-29 17:15:27 -07:00
Yann Collet 2e2e78de47 removed unnecessary restriction on minmatchLength
it's now transparently translated to the nearest supported value
(7->6) (3->4)
2017-03-29 16:02:47 -07:00
Yann Collet 933ce4a1dd fix : minmatch 7 conversion
minmatch 7 is now converted to minmatch 6 for strategies which do not support 7
It used to be folded into "default", which applied minmatch 4
2017-03-29 14:35:38 -07:00
Yann Collet 2238870eb6 Merge pull request #625 from facebook/loadCDict
limited CDict acceptance criteria to be the same as DDict
2017-03-24 16:06:20 -07:00
Yann Collet 16a0b10781 fixed ZSTD_loadZstdDictionary()
forgot to add the dictionary content
(tests were not failing, just compressing less).

Also : added size protections when adding dict content
since hc/bt table filling would fail if size < 8
2017-03-24 12:46:46 -07:00
Yann Collet 23776ce290 fixed ERROR_GENERIC on dstSize_tooSmall
required by users who depend on this error code to size the destination buffer
2017-03-23 17:59:50 -07:00
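
The distinction matters because callers can branch on the specific code to grow the destination buffer; a minimal sketch, assuming zstd_errors.h is available for ZSTD_getErrorCode():

```c
#include <stdio.h>
#include <zstd.h>
#include <zstd_errors.h>   /* ZSTD_getErrorCode(), ZSTD_error_dstSize_tooSmall */

int main(void)
{
    const char src[] = "some input that will certainly not fit into an 8-byte destination";
    char dst[8];   /* deliberately too small */
    size_t const r = ZSTD_compress(dst, sizeof(dst), src, sizeof(src), 1);
    if (ZSTD_isError(r) && ZSTD_getErrorCode(r) == ZSTD_error_dstSize_tooSmall)
        printf("dst too small: retry with ZSTD_compressBound(%zu) = %zu bytes\n",
               sizeof(src), ZSTD_compressBound(sizeof(src)));
    return 0;
}
```
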
Yann Collet bea78e8fc2 limited CDict acceptance criteria to be the same as DDict 2017-03-23 15:46:06 -07:00
Nick Terrell eaf69b07f0 Zero pointers after freeing 2017-03-21 13:20:59 -07:00
Yann Collet a41a4ed39a Merge pull request #594 from terrelln/bugs
Small fixes
2017-03-08 14:56:07 -08:00
Nick Terrell e06c303475 Fix ZSTD_sizeof_CStream() 2017-03-08 13:45:10 -08:00
Sean Purcell 881abe44f1 Reduce point at which we reduce offsets to protect against UB 2017-03-07 16:58:08 -08:00
Sean Purcell 3437bf2feb Add build targets to the Makefile, and update CircleCI tests 2017-03-06 15:05:02 -08:00
Nick Terrell 54c4babd8f Always check Huffman tables for ZSTD_lazy+
The compressor always reuses the existing Huffman table if the literals
size is at most 1 KiB. If the compression strategy is `ZSTD_lazy` or
stronger always check to see if reusing the previous table or creating
a new table is better.

This doesn't yet take decompression speed into account. I don't want to add any
heuristics there until I have real data to work with, to ensure that the
heuristic works for at least one use case, preferably more.
2017-03-03 16:49:38 -08:00
Yann Collet f44b55c18d Merge pull request #584 from terrelln/huff-repeat
Allow compressor to repeat Huffman tables
2017-03-02 17:20:11 -08:00
Nick Terrell d051cd5b43 Use workspace for count and CTable 2017-03-02 16:38:07 -08:00
Sean Purcell 553f67e0c1 Remove 'generic' inline strategy
Seems to avoid performance loss for compression.
The same strategy was tested on the decompression side, but did not appear
to improve speed.
2017-03-02 15:18:13 -08:00
Sean Purcell 3d95925a59 Merge remote-tracking branch 'origin/dev' into m32 2017-03-02 15:17:56 -08:00
Nick Terrell a419777eb1 Allow compressor to repeat Huffman tables
* Compressor saves most recently used Huffman table and reuses it
  if it produces better results.
* I attempted to preserve CPU usage profile.
  I intentionally left all of the existing heuristics in place.
  There is only a speed difference on the second block and later.
  When compressing large enough blocks (say >= 4 KiB) there is
  no significant difference in compression speed.
  Dictionary compression of one block is the same speed for blocks
  with literals <= 1 KiB, and after that the difference is not
  very significant.
* In the synthetic data, with blocks 10 KB or smaller, most blocks
  can't use repeated tables because the previous block did not
  contain a symbol that the current block contains.
  Once blocks are about 12 KB or more, most previous blocks have
  valid Huffman tables for the current block, and the compression
  ratio and decompression speed jumped.
* In silesia, blocks as small as 4 KB can frequently reuse the previous
  Huffman table (valid ~85% of the time), but doing so isn't as profitable,
  so the previous Huffman table only gets used about 3% of the time.
* Microbenchmarks show that `HUF_validateCTable()` takes ~55 ns
  and `HUF_estimateCompressedSize()` takes ~35 ns.
  They are decently well optimized, the first versions took 90 ns
  and 120 ns respectively. `HUF_validateCTable()` could be twice as
  fast, if we cast the `HUF_CElt*` to a `U32*` and compare to 0.
  However, `U32` has an alignment of 4 instead of 2, so I think that
  might be undefined behavior.
* I've run `zstreamtest` compiled normally, with UASAN and with MSAN
  for 4 hours each.

The worst case for the speed difference is a bunch of small blocks
in the same frame. I modified `bench.c` to compress the input in a
single frame but with blocks of the given block size, set by `-B`.
Benchmarks on level 1:

|  Program  | Block size |   Corpus  | Ratio | Compression MB/s | Decompression MB/s |
|-----------|------------|-----------|-------|------------------|--------------------|
| zstd.base |        256 | synthetic | 2.364 |            110.0 |              297.0 |
|      zstd |        256 | synthetic | 2.367 |            108.9 |              297.0 |
| zstd.base |        256 | silesia   | 2.204 |             93.8 |              415.7 |
|      zstd |        256 | silesia   | 2.204 |             93.4 |              415.7 |
| zstd.base |        512 | synthetic | 2.594 |            144.2 |              420.0 |
|      zstd |        512 | synthetic | 2.599 |            141.5 |              425.7 |
| zstd.base |        512 | silesia   | 2.358 |            118.4 |              432.6 |
|      zstd |        512 | silesia   | 2.358 |            119.8 |              432.6 |
| zstd.base |       1024 | synthetic | 2.790 |            192.3 |              594.1 |
|      zstd |       1024 | synthetic | 2.794 |            192.3 |              600.0 |
| zstd.base |       1024 | silesia   | 2.524 |            148.2 |              464.2 |
|      zstd |       1024 | silesia   | 2.525 |            148.2 |              467.6 |
| zstd.base |       4096 | synthetic | 3.023 |            300.0 |             1000.0 |
|      zstd |       4096 | synthetic | 3.024 |            300.0 |             1010.1 |
| zstd.base |       4096 | silesia   | 2.779 |            223.1 |              623.5 |
|      zstd |       4096 | silesia   | 2.779 |            223.1 |              636.0 |
| zstd.base |      16384 | synthetic | 3.131 |            350.0 |             1150.1 |
|      zstd |      16384 | synthetic | 3.152 |            350.0 |             1630.3 |
| zstd.base |      16384 | silesia   | 2.871 |            296.5 |              883.3 |
|      zstd |      16384 | silesia   | 2.872 |            294.4 |              898.3 |
2017-03-02 13:27:52 -08:00
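
Taken together with the earlier "Always check Huffman tables for ZSTD_lazy+" commit, the reuse decision reduces to the self-contained sketch below. Every name in it is illustrative; the real implementation lives in zstd_compress.c and relies on HUF_validateCTable() and HUF_estimateCompressedSize():

```c
#include <stddef.h>

/* Sketch of the table-reuse decision described above. The validity check would
 * come from HUF_validateCTable(), and the two size estimates from
 * HUF_estimateCompressedSize() vs. the cost of building and serializing a new
 * table. None of the names below are the internal zstd API. */
static int shouldRepeatHufTable(int prevTableValid,              /* previous table covers all current symbols */
                                size_t estSizeWithPrevTable,     /* literals only: no table header needed */
                                size_t estSizeWithNewTable,      /* literals + serialized new table */
                                int strategyIsLazyOrStronger,
                                size_t litSize)
{
    if (!prevTableValid) return 0;                /* previous block lacked a symbol this block needs */
    if (!strategyIsLazyOrStronger)
        return litSize <= 1024;                   /* fast strategies keep the simple <=1 KiB heuristic */
    /* ZSTD_lazy and stronger: explicitly compare the two estimates */
    return estSizeWithPrevTable <= estSizeWithNewTable;
}
```
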
Sean Purcell d44703d145 Offsets >= 32MB in 32-bits mode 2017-03-01 16:27:56 -08:00
Yann Collet 4bcc69b761 solves warnings when compiling with global XXH_STATIC_LINKING_ONLY
The XXH_STATIC_LINKING_ONLY protection macro is intended to be defined just before the include.
The main idea is to keep this setting local :
the user module shall explicitly understand and accept the static linking restriction,
a restriction which is no longer explicit when the macro is defined at project level.
A global definition also triggers redefinition warnings for user modules which do define the macro locally.

This new version compiles lib and cli without warning when the macro is set globally.
That's not a scenario to be recommended, since it trades a local effect for a global one,
but it was easy enough to provide from zstd's side.
2017-03-01 11:33:25 -08:00
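
For reference, the local usage pattern the commit argues for looks like this (a minimal sketch; the user module and hash helper are hypothetical):

```c
/* some_user_module.c (hypothetical) -- opts into xxhash's static-linking-only API
 * locally, right before the include, instead of defining the macro project-wide. */
#define XXH_STATIC_LINKING_ONLY
#include <stddef.h>
#include "xxhash.h"

unsigned long long hash64(const void* data, size_t size)
{
    /* XXH64() is part of the stable API; the macro above merely exposes the
     * additional static-linking-only declarations to this translation unit. */
    return XXH64(data, size, 0);
}
```
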
Yann Collet dccd6b6f65 cli : fix : --rm is silent when input is stdin
previously, the app would produce an error message and stop.
2017-02-27 15:57:50 -08:00
Yann Collet 14312d833e zstdmt : fix : loading prefix from previous segments
There used to be a (very small) chance that
loading the prefix from the previous segment
would be confused with a real zstd dictionary.
For that to happen, the prefix needs to start
with the same value as the dictionary magic.
That's 1 chance in 4 billion if all values have equal probability.
But in fact, since some values are more common (0x00000000 for example)
and others less common, and the dictionary magic was selected to be among the less common ones,
the probability is likely even lower.

Anyway, this risk is now down to zero
by adding a new CCtx parameter : ZSTD_p_forceRawDict

Current parameter policy : the parameter "sticks" to its CCtx,
so any dictionary loaded after ZSTD_p_forceRawDict is set
will be loaded in "raw" ("content only") mode,
even if the CCtx is re-used multiple times with multiple different dictionaries.
It's up to the user to reset this value if needed.
2017-02-23 23:42:12 -08:00
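
Under the assumption that the parameter is set through the era's static-only ZSTD_setCCtxParameter() (the setter and its exact signature are not stated in the commit), usage would look roughly like:

```c
#define ZSTD_STATIC_LINKING_ONLY   /* ZSTD_setCCtxParameter() assumed to be static-only here */
#include <stddef.h>
#include <zstd.h>

/* Force every subsequently loaded dictionary to be treated as raw content,
 * even if it happens to start with the zstd dictionary magic number. */
static size_t compress_with_raw_prefix(ZSTD_CCtx* cctx,
                                       void* dst, size_t dstCapacity,
                                       const void* src, size_t srcSize,
                                       const void* prefix, size_t prefixSize)
{
    size_t const r = ZSTD_setCCtxParameter(cctx, ZSTD_p_forceRawDict, 1);
    if (ZSTD_isError(r)) return r;
    /* the "dictionary" here is just prior content, never parsed as a real dictionary */
    return ZSTD_compress_usingDict(cctx, dst, dstCapacity, src, srcSize,
                                   prefix, prefixSize, 3 /* compression level */);
}
```
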
Yann Collet 831b4890ce minor tests/Makefile refactoring
and update of zstd_manual.html
2017-02-23 23:09:10 -08:00
Sean Purcell 83038d236a Fix bug in FSE distribution normalization 2017-02-22 13:52:48 -08:00
Przemyslaw Skibinski d8114e5802 zstd_compress.c: fix memory leaks 2017-02-21 18:59:56 +01:00
Anders Oleson 517577bf53 spelling fixes in comments
e.g. occurred, labeled, Huffman
2017-02-20 12:08:59 -08:00
Yann Collet 2252d29a5a Merge branch 'dev' of github.com:facebook/zstd into dev 2017-02-15 12:00:50 -08:00
Yann Collet 4596037042 updated fse version
features minor refactoring (removing FSE_abs())
also : fix a few minor issues recently introduced in examples
2017-02-15 12:00:03 -08:00
Nick Terrell ecf90ca24b [zstdmt] Fix MSAN failure with ZSTD_p_forceWindow
Reproduction steps:

```
make zstreamtest CC=clang CFLAGS="-O3 -g -fsanitize=memory -fsanitize-memory-track-origins"
./zstreamtest -vv -t4178 -i4178 -s4531
```

How to get to the error in gdb (there may be a more efficient way):

* 2 breaks at zstd_compress.c:2418  -- in ZSTD_compressContinue_internal()
* 2 breaks at zstd_compress.c:2276  -- in ZSTD_compressBlock_internal()
* 1 break at zstd_compress.c:1547

Why the error occurred:

When `zc->forceWindow == 1`, after calling `ZSTD_loadDictionaryContent()` we
have `zc->loadedDictEnd == zc->nextToUpdate == 0`. But, we've really loaded up
to `iend` into the dictionary. Then in `ZSTD_compressBlock_internal()` we see
that `current > zc->nextToUpdate + 384`, so we load the last 192 bytes a second
time. In this case the bytes we are loading are a block of all 0s, starting in
the previous block. So when we are loading the last 192 bytes, we find a `match`
in the future, 183 bytes beyond `ip`. Since the block is all 0s, the match
extends to the end of the block. But in `ZSTD_count()` we only check that
`pIn < pInLoopLimit`, but since `pMatch > pIn`, `pMatch` eventually points past
the end of the buffer, causing the MSAN failure.

The fix:

The changed line sets `zc->nextToUpdate` to the end of the dictionary.
This is the behavior that existed before `ZSTD_p_forceWindow` was introduced.
This fixes the exposing test case. Since the code doesn't fail without
`zc->forceWindow`, it makes sense that this works. I've run the command
`./zstreamtest -T2mn` 64 times without failures. CI should also verify nothing
obvious broke.
2017-02-13 19:11:22 -08:00
Sean Purcell 2db7249265 Make pledgedSrcSize meaning clear for other functions
- Added tests
- Moved new size functions to static link only
2017-02-09 11:49:58 -08:00
Sean Purcell 0f5c95af44 Disambiguate pledgedSrcSize == 0
- Modify ZSTD CLI to only set contentSizeFlag if it _knows_ the size
- Change pzstd to stop setting contentSizeFlag without accurate pledgedSrcSize
2017-02-08 15:12:46 -08:00
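
In streaming terms, the rule both pledgedSrcSize commits converge on is: only pledge a size you actually know. A minimal sketch under that assumption (the helper name is mine; ZSTD_initCStream_srcSize() was an advanced, static-only function at the time):

```c
#define ZSTD_STATIC_LINKING_ONLY   /* ZSTD_initCStream_srcSize() was an advanced function at the time */
#include <zstd.h>

/* Initialize a CStream, pledging the source size only when it is genuinely known
 * (e.g. a regular file), so the frame's content-size field is never inaccurate. */
static size_t init_cstream(ZSTD_CStream* zcs, int level,
                           int srcSizeKnown, unsigned long long srcSize)
{
    if (srcSizeKnown)
        return ZSTD_initCStream_srcSize(zcs, level, srcSize);
    /* stdin, pipes, etc.: pledge nothing, so no content-size flag is written */
    return ZSTD_initCStream(zcs, level);
}
```
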
Yann Collet 48bed91606 Merge pull request #527 from facebook/zstdmt
zstdmt refinements
2017-01-31 16:36:46 -08:00
Yann Collet b2e1b3d670 fixed overlapLog==0 => no overlap 2017-01-30 14:54:46 -08:00
Yann Collet 3672d06d06 zstdmt : section size is set to be a minimum of overlapSize
the minimum size condition is applied transparently (no warning, no error),
like the previous minimum section size condition (1 KB), which still applies.
2017-01-30 13:35:45 -08:00
Yann Collet 88df1aed61 changed advanced parameter overlapLog
Follows a positive logic (increasing value => increasing overlap)
which is easier to use
2017-01-30 11:00:00 -08:00
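
A hedged sketch of how an application would use the new positive-logic parameter; ZSTDMT_setMTCtxParameter() and ZSTDMT_p_overlapSectionLog are assumptions based on the in-tree zstdmt_compress.h of this period, not confirmed by the commit text:

```c
/* Sketch only: the setter and parameter names below are assumed from the
 * in-tree zstdmt_compress.h of this period; check the header you build against. */
#include "zstdmt_compress.h"

static ZSTDMT_CCtx* create_mt_cctx_with_overlap(unsigned nbThreads, unsigned overlapLog)
{
    ZSTDMT_CCtx* const mtctx = ZSTDMT_createCCtx(nbThreads);
    if (mtctx != NULL)
        /* positive logic after this change: a larger overlapLog requests more overlap
         * between sections (up to full-windowSize overlap at the maximum setting) */
        ZSTDMT_setMTCtxParameter(mtctx, ZSTDMT_p_overlapSectionLog, overlapLog);
    return mtctx;
}
```
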
Nick Terrell b42dd27ef5 Add include guards and extern C 2017-01-27 16:00:19 -08:00
Yann Collet f6d4a786fc reduced zstdmt latency when using small custom section sizes with high compression levels
The previous version required a fairly large initial amount of input data
before starting to create compression jobs.
This new version starts the process much sooner.
2017-01-27 15:55:30 -08:00
Yann Collet 717c65d690 Merge pull request #519 from inikep/dev11
Dev11
2017-01-26 14:23:44 -08:00
Yann Collet 8dafb1acf5 CLI : automatically set overlap size to max (windowSize) for max compression level 2017-01-25 17:01:13 -08:00