589 Commits

Author SHA1 Message Date
Yann Collet
439e58d060 improved gcc-9 and gcc-10 decoding speed
the new alignment setting is better for gcc-9 and gcc-10
by about ~+5%.

Unfortunately, it's worse for essentially all other compilers.

Make the new alignment setting conditional to gcc-9+.
2021-05-08 00:01:01 -07:00
Yann Collet
6755baf940 update decoder hot loop alignment
This seems to bring an additional ~+1.2% decompression speed
on average across 10 compilers x 6 scenarios.
2021-05-07 15:18:16 -07:00
Yann Collet
1db5947591 improve decompression speed of long variant by ~+5%
changed strategy,
now unconditionally prefetch the first 2 cache lines,
instead of cache lines corresponding to the first and last bytes of the match.

This better corresponds to cpu expectation,
which should auto-prefetch following cachelines on detecting the sequential nature of the read.

This is globally positive, by +5%,
though exact gains depend on compiler (from -2% to +15%).
The only negative counter-example is gcc-9.
2021-05-07 11:26:14 -07:00
Yann Collet
ee425faaa7 Merge branch 'dev' into d_prefetch_refactor 2021-05-06 19:49:26 -07:00
Nick Terrell
b052b583e5 [lib] Fix UBSAN warning in ZSTD_decompressSequences() 2021-05-06 15:31:30 -07:00
Yann Collet
7ef6d7b36c deeper prefetching pipeline for decompressSequencesLong
pipeline increased from 4 to 8 slots.
This change substantially improves decompression speed when there are long distance offsets.
example with enwik9 compressed at level 22 :
gcc-9 : 947 -> 1039 MB/s
clang-10: 884 -> 946 MB/s

I also checked the "cold dictionary" scenario,
and found a smaller benefit, around ~2%
(measurements are more noisy for this scenario).
2021-05-05 10:04:03 -07:00
Yann Collet
8cde167a27 Merge branch 'dev' into d_prefetch_refactor 2021-05-05 09:13:38 -07:00
Nick Terrell
6cee3c2c4f [trace] Remove default definitions of weak symbols
Instead of providing a default no-op implementation, check the symbols
for `NULL` before accessing them. Providing a default implementation
doesn't reliably work with dynamic linking. Depending on link order the
default implementations may not be overridden. By skipping the default
implementation, all link order issues are resolved. If the symbols
aren't provided the weak function will be `NULL`.
2021-04-26 16:05:39 -07:00
Nick Terrell
a494308ae9 [copyright][license] Switch to yearless copyright and some cleanup in the linux-kernel files
* Switch to yearless copyright per FB policy
* Fix up SPDX-License-Identifier lines in `contrib/linux-kernel` sources
* Add zstd copyright/license header to the `contrib/linux-kernel` sources
* Update the `tests/test-license.py` to check for yearless copyright
* Improvements to `tests/test-license.py`
* Check `contrib/linux-kernel` in `tests/test-license.py`
2021-03-30 10:30:43 -07:00
Nick Terrell
f8ac0ea7ef
Merge pull request #2539 from terrelln/linux-kernel-fixes
Fixes for the next linux kernel patch version
2021-03-24 10:34:29 -07:00
Yann Collet
f5434663ea Refactor prefetching for the decoding loop
Following #2545,
I noticed that one field in `seq_t` is optional,
and only used in combination with prefetching.
(This may have contributed to static analyzer failure to detect correct initialization).

I then wondered if it would be possible to rewrite the code
so that this optional part is handled directly by the prefetching code
rather than delegated as an option into `ZSTD_decodeSequence()`.

This resulted into this refactoring exercise
where the prefetching responsibility is better isolated into its own function
and `ZSTD_decodeSequence()` is streamlined to contain strictly Sequence decoding operations.
Incidently, due to better code locality,
it reduces the need to send information around,
leading to simplified interface, and smaller state structures.
2021-03-19 15:48:17 -07:00
Nick Terrell
756bd59322 [huf][fse] Clean up workspaces
* Move `counting` to a struct in `FSE_decompress_wksp_body()`
* Fix error code in `FSE_decompress_wksp_body()`
* Rename a variable in `HUF_ReadDTableX2_Workspace`
2021-03-17 16:50:37 -07:00
Nick Terrell
cd1551d261 [lib][tracing] Add ZSTD_NO_TRACE macro
When defined, it disables tracing, and avoids including the header.
2021-03-16 11:47:27 -07:00
Nick Terrell
0f18059a4e [huf] Reduce stack usage of HUF_readDTableX2 by ~460 bytes
* Use `HUF_readStats_wksp()`
* Use workspace in `HUF_fillDTableX2*()`
* Clean up workspace usage to use a workspace struct
2021-03-05 12:39:46 -08:00
Nick Terrell
e59c9459a5 [trace] Keep track of a uint64_t tracing context
The most common information that you want to track between begin() and
end() is the timestamp of the begin function, so you can measure the
duration of the (de)compression call. Allow the tracing library to put
this information inside the `ZSTD_TraceCtx`, so it doesn't need to keep
a global map in this case. If a single uint64_t is not enough, the
tracing library can return a unique identifier (like the context
pointer) instead, and use it as a key in a map.

This keeps the simple case simple.
2021-02-09 11:37:05 -08:00
Nick Terrell
54a4998a80 Add basic tracing functionality 2021-02-05 16:28:52 -08:00
Nick Terrell
f9b1e711ba [zstd] Fix NULL pointer addition in ZSTD_checkContinuity()
Don't start a new section when `dstSize == 0` to avoid NULL
pointer addition.
2021-02-05 12:18:06 -08:00
Yann Collet
b9748757b0 fixed minor cast warning 2021-02-05 09:55:54 -08:00
senhuang42
9ae0dd9336 Fix Visual and staticanalyze warnings 2021-01-07 17:58:37 -05:00
senhuang42
c2c9b8a7ec Address comments, clean up interface/internals 2021-01-07 12:29:12 -05:00
senhuang42
22b7bff2bc Add unit test, improve documentation 2021-01-07 12:29:12 -05:00
senhuang42
ea52fc3606 Use XXHash for hash function, create a sensible public interface 2021-01-07 12:29:12 -05:00
senhuang42
7c1a79f232 Add debuglog statements 2021-01-07 12:29:11 -05:00
senhuang42
d1a6a9d285 Reference requested dict ID at decompression time 2021-01-07 12:29:11 -05:00
senhuang42
5a6d3eef2b Allocate memory for DDict hash set when parameter is set 2021-01-07 12:29:11 -05:00
senhuang42
fd5b608f1c Add parameter to control multiple DDicts 2021-01-07 12:29:11 -05:00
senhuang42
f933668d3f Implement hashset for dictIDs 2021-01-07 12:29:11 -05:00
Nick Terrell
66e811d782 [license] Update year to 2021 2021-01-04 17:53:52 -05:00
Yann Collet
0b39531d75 moving all references to release branch
was previously `master`
2020-12-16 23:00:35 -08:00
Yann Collet
95e74616d5 fix multiple minor conversion warnings
unrelated to #2386, just cleaning up while I'm updating this file ...
2020-11-06 09:57:05 -08:00
Yann Collet
2769e4d459 fix incorrect assert
fix #2386, reported by @Neumann-A
2020-11-06 09:44:04 -08:00
Nick Terrell
e3e0775cc8 [API] Add ZSTD_c_stable{In,Out}Buffer parameters
This commit adds the parameters and sets the value in the CCtxParams
but it does not do anything with the value.
2020-10-30 10:54:39 -07:00
Nick Terrell
2e7d174130 Reset all decompression parameters in ZSTD_DCtx_reset()
* Reset all decompression parameters in `ZSTD_DCtx_reset()` when
  resetting parameters.
* Add a test case.
2020-10-01 14:19:21 -07:00
Nick Terrell
9261476b7d [lib] Wrap customMem xor checks in parens for readability
This clarifies operator precedence, and quiets cppcheck in
the Kernel Test Robot. I think this is a slight bonus to
readability, so I am accepting the suggestion.
2020-09-23 23:26:07 -07:00
Nick Terrell
dec7fb03ec [lib] Silence -Wunused-const-variable warnings 2020-09-23 12:59:57 -07:00
Nick Terrell
79ded1b4a9 [lib] Add ZSTD_NO_UNUSED_FUNCTIONS macro to hide unused functions
The unused function definitions are hidden behind a
`#ifndef ZSTD_NO_UNUSED_FUNCTIONS` check.

Initially hiding all functions which are unused and take up more than
2KB of stack space, because these will show up as warnings in the
Linux Kernel build system.
2020-09-09 14:35:39 -07:00
Nick Terrell
f91ed5c766 [lib] s/current/curr because it collides with Linux Kernel macro 2020-09-09 14:35:39 -07:00
Nick Terrell
c465f24457 ZSTD_ prefix mem{cpy,move,set},malloc,calloc,free 2020-08-26 12:26:03 -07:00
Nick Terrell
a686d306d2 Rename ZSTD_{malloc,calloc,free} to ZSTD_custom{Malloc,Calloc,Free} 2020-08-26 12:25:08 -07:00
Nick Terrell
80f577baa2 Move standard includes to zstd_deps.h 2020-08-26 12:25:08 -07:00
Yann Collet
f82d9865b9
Merge pull request #2278 from senhuang42/ignore_checksum_advanced_param
New advanced decompression param to ignore checksums
2020-08-25 12:08:53 -07:00
Nick Terrell
614e446000
Merge pull request #2271 from terrelln/small-blocks
Small block optimizations
2020-08-24 18:54:33 -07:00
Nick Terrell
52f33a1da5 Fix compiler warnings 2020-08-24 16:09:45 -07:00
Nick Terrell
6f301a7903
Merge pull request #2272 from terrelln/dstSize_tooSmall
[fix] Always return dstSize_tooSmall when it is the case
2020-08-24 15:01:17 -07:00
Nick Terrell
6d2f750b37 Document the BMI2 default() functions 2020-08-24 14:44:33 -07:00
senhuang42
a030560d62 Add new DCtx param: validateChecksum and update unit tests 2020-08-24 17:28:00 -04:00
Nick Terrell
1302f8d676 [fix] Always return dstSize_tooSmall when it is the case 2020-08-24 13:38:13 -07:00
senhuang42
44c54a3e31 Addressing comments: more comments, cleanup, remove extra function, checksum logic 2020-08-24 16:14:19 -04:00
senhuang42
ffaa0df76d Document change in CLI for --no-check during decompression in --help menu 2020-08-24 09:49:12 -04:00
senhuang42
e3f5f9658a Added CLI tests for --no-check, fixed ignore checksum logic 2020-08-22 16:05:40 -04:00