9379 Commits

Author SHA1 Message Date
Yann Collet
05430b25a8 roll SSE implementation of row_lazy match finder
mostly for maintenance convenience.

Performance wise, there is very little change,
slightly faster for slog 3 & 4,
neutral or very slightly negative for slot 5 & 6.
2021-12-14 10:44:23 -08:00
W. Felix Handte
450fca9704 Update Regression Tests w/ New Sizes 2021-12-13 17:29:32 -05:00
W. Felix Handte
82a49c88f9 Increment Step by 1 not 2
I couldn't find a good way to spread `ip0` and `ip1` apart when we accelerate
due to incompressible inputs. (The methods I tried slowed things down quite a
bit.)

Since we aren't splaying ip0 and ip1 apart (which would be like `0_1_2_3_`, as
opposed to the `01__23__` we were actually doing), it's a big ambitious to
increment `step` by 2. Instead, let's increment it by 1, which has the benefit
sliiightly improving compression. Speed remains pretty much unchanged.
2021-12-13 16:59:33 -05:00
Nick Terrell
3e2a70b6fb
Merge pull request #2905 from 15596858998/dev_1205
add test case
2021-12-13 13:45:23 -08:00
W. Felix Handte
6ca5f42402 Rewrite step to Track Increment Between Pairs of Positions
The position updates are rewritten from `ip[N] = ip[N-1] + step` to be
`ip[N] = ip[N-2] + step`. This lets us only deal with the asymmetric spacing
of gaps at setup and then we only have to keep a single `step` variable.

This seems to work quite well on GCC and Clang!
2021-12-13 14:48:26 -05:00
W. Felix Handte
b8434cb754 Allow Templating ZSTD_fast Matchfinders on Acceleration (Lvl < -1) 2021-12-13 14:46:57 -05:00
Felix Handte
65404fe14a
Merge pull request #2923 from IAL32/patch-1
typo: Small spelling mistake in example
2021-12-13 13:15:21 -05:00
zx123123
c69d13eb99
Update playTests.sh 2021-12-13 08:58:42 +08:00
Adrian Castro
e0f9dc0dde
typo: Small spelling mistake in example
Just a couple of characters:
`main` -> `may`
2021-12-11 12:02:23 +01:00
Yann Collet
252ef866fb
Merge pull request #2922 from facebook/x32
x32 compatibility
2021-12-11 00:12:10 -08:00
Yann Collet
e1ab2200ff fixed x32 compatibility 2021-12-10 21:02:17 -08:00
Yann Collet
c94cda283c added x32 compatibility test 2021-12-10 20:56:20 -08:00
W. Felix Handte
ace6a7e746 Decompose step into Two Variables
This avoids an additional addition, at the cost of an additional variable.
2021-12-10 16:44:23 -05:00
W. Felix Handte
22501cd283 Stagger Application of stepSize in ZSTD_fast
This replicates the behavior of @terrelln's `ZSTD_fast` implementation. That
is, it always looks at adjacent pairs of positions, and only applies the
acceleration every other position. This produces a more fine-grained
acceleration.
2021-12-10 16:44:23 -05:00
Yann Collet
4cc5e2818a complete changelog with #2885 2021-12-09 09:53:45 -08:00
Yann Collet
c077b530a0
Merge pull request #2917 from facebook/change151
Update changelog for v1.5.1
2021-12-09 08:45:34 -08:00
Felix Handte
0c26d98c0d
Merge pull request #2910 from felixhandte/reject-irregular-dicts
Reject Irregular Dictionary Files
2021-12-09 11:44:37 -05:00
Yann Collet
3d738307b4 Update changelog for v1.5.1 2021-12-08 16:55:38 -08:00
Yann Collet
57383d2317
Merge pull request #2914 from facebook/xxhash081
updated xxHash to latest v0.8.1
2021-12-08 16:48:46 -08:00
Yann Collet
3ce265fea8 remove offending static assert lines
no idea why visual + clang-cl + appveyor don't like them,
I've not been able to reproduce the issue locally,
but these static assert are very unlikely to deliver a useful signal,
I can't imagine a situation where they will be wrong,
and if they are, then a ton of other things will be broken way before reaching that point.
2021-12-08 15:05:17 -08:00
Nick Terrell
8b40095b3f
Merge pull request #2916 from terrelln/issue-2906
Remove possible NULL pointer addition
2021-12-08 16:51:10 -05:00
Yann Collet
16241b7d26 altered copyright title 2021-12-08 13:18:41 -08:00
W. Felix Handte
9985e10fda Reject Irregular Dictionary Files
I hadn't seen #2890, so I wrote my own version. I like this approach a little
better, since it does an explicit check for a regular file, rather than
passing a magic value.

Addresses #2874.
2021-12-08 16:17:04 -05:00
Yann Collet
a9cd6164d7 removed declarations of XXH3 symbols when XXH_NO_XXH3 is defined
on top of implementations, which were already scoped out.
2021-12-08 12:56:16 -08:00
Yann Collet
27e706de88 replaces malloc / free / memcpy by Zstandard's version 2021-12-08 12:51:04 -08:00
Nick Terrell
b94407b6cf Remove possible NULL pointer addition
Refactor `ZSTDMT_isOverlapped()` to do NULL checks before computing the
end pointer.

Fixes #2906.
2021-12-08 12:40:40 -08:00
Nick Terrell
859e0500ab
Merge pull request #2915 from terrelln/oss-fuzz-build-fix
Fix oss-fuzz build
2021-12-08 15:32:49 -05:00
Felix Handte
33562c55de
Merge pull request #2912 from felixhandte/pkg-config-fix
Fix Up #2659; Build libzstd.pc Whenever Building the Lib on Unix
2021-12-08 15:22:56 -05:00
Nick Terrell
aa7729c9f3 Fix oss-fuzz build
Disable assembly when dataflow sanitizer is enabled.

This regressed in PR #2893, which accidentally removed the check for
dataflow sanitizer.
2021-12-08 11:01:52 -08:00
Yann Collet
fb3522a3fe fixed very minor cast warning under cygwin 2021-12-08 09:48:56 -08:00
W. Felix Handte
9f1dee8fa5 Fix Up #2659; Build libzstd.pc Whenever Building the Lib on Unix 2021-12-08 12:43:34 -05:00
Yann Collet
1c7d2c4dd5 updated xxHash to latest v0.8.1
with minor modifications directly embedded in source :
- does not compile XXH3
- namespace emulation (ZSTD_ prefix)

Incidentally fix #2824
2021-12-07 21:16:15 -08:00
binhdvo
38dfc4699e
Imply -q when stderr is not a tty (#2884)
* Imply -q when stderr is not a tty
2021-12-07 16:56:19 -05:00
Felix Handte
9118ee04c2
Merge pull request #2659 from ericonr/pc
[lib] Fix libzstd.pc for lib-mt builds
2021-12-07 14:18:38 -05:00
Nick Terrell
b6b4c9a3da
Merge pull request #2907 from Hello71/armv6-fix-legacy
Apply FORCE_MEMORY_ACCESS=1 to legacy
2021-12-06 15:41:22 -05:00
Nick Terrell
e7b0ae385e
Merge pull request #2890 from 15596858998/dec_1201
fixbug CLI's -D fails when the argument is not a regular file
2021-12-06 13:18:15 -05:00
Alex Xu (Hello71)
3d773d7013 Apply FORCE_MEMORY_ACCESS=1 to legacy
See #2633, #2881.
2021-12-05 22:51:44 -05:00
15596858998
ba7e8f1051 add test case 2021-12-05 19:12:52 +08:00
Nick Terrell
486472c453
Merge pull request #2893 from terrelln/issue-2789
[asm] Share portability macros and restrict ASM further
2021-12-03 14:07:30 -05:00
Felix Handte
d2c86ec898
Merge pull request #2897 from felixhandte/zstd-deprecated-avoid-deprecated
Avoid Using Deprecated Functions in Deprecated Code
2021-12-03 12:09:58 -05:00
Felix Handte
fd6c34f641
Merge pull request #2899 from felixhandte/cmake-disable-multithreading-android
Disable Multithreading in CMake Builds for Android
2021-12-03 12:09:39 -05:00
Nick Terrell
c284569457 [asm] Share portability macros and restrict ASM further
Move portability macros to `lib/common/portability_macros.h`. This file
only contains platform/feature detection (e.g. 0/1 macros). This file is
shared between C and ASM code, so it cannot include any C code.

Rename `HUF_` ASM macros to be `ZSTD_` prefixed, and move to the new
header.

Restrict `ZSTD_ASM_SUPPORTED` to `__GNUC__`, because we need the GAS
assembler.

Finally, only include the ASM code if we are actually going to use it.
This disables it on all Windows platforms, which should resolve the
problem brought up in Issue #2789.
2021-12-02 16:58:04 -08:00
Nick Terrell
647c1b6615
Merge pull request #2900 from terrelln/issue-2893-test
[CI] Add cmake windows build
2021-12-02 19:51:33 -05:00
Nick Terrell
014bbb29f8
Merge pull request #2898 from terrelln/issue-2862
Improve zstd_opt build speed and size
2021-12-02 19:49:43 -05:00
Nick Terrell
a74a36985a [CI] Add cmake windows build
Build on windows with cmake to ensure everything compiles.
2021-12-02 15:23:33 -08:00
Yann Collet
1bf3d8a475
Merge pull request #2896 from facebook/m68k
Zstandard compiles and run on m68k cpus
2021-12-02 14:25:45 -08:00
W. Felix Handte
4a82bc9d00 Disable Multithreading in CMake Builds for Android 2021-12-02 17:23:42 -05:00
Nick Terrell
e5bfaeede7 Improve zstd_opt build speed and size
Use the same trick as we did for zstd_lazy in PR #2828:
* Create one search function specialization for each (dictMode, mls).
* Select the search function pointer at the top of the match finder.

Additionally, we no longer inline `ZSTD_compressBlock_opt_generic` into
every function, since `dictMode` is no longer used as a template. Create
two specializations, for opt levels 0 and 2, and call one of the two
specializations.

Lastly, remove the hack that disabled inlining for zstd_opt for the
Linux Kernel, as we've gotten most of the benefit already.

Compilation time sees a ~4x reduction:

| Compiler | Flags                            | Dev Time (s) | PR Time (s) | Delta |
|----------|----------------------------------|--------------|-------------|-------|
| gcc      | -O3                              |         10.1 |         2.3 |  -77% |
| gcc      | -O3 -fsanitize=address,undefined |         61.1 |        10.2 |  -83% |
| clang    | -O3                              |          9.0 |         2.1 |  -76% |
| clang    | -O3 -fsanitize=address,undefined |         33.5 |         5.1 |  -84% |

Build size is reduced by 150KB - 200KB:

| Compiler | Dev libzstd.a Size (B) | PR libzstd.a Size (B) | Delta |
|----------|------------------------|-----------------------|-------|
| gcc      |                1327476 |               1177108 |  -11% |
| clang    |                1378324 |               1167780 |  -15% |

There is a <2% speed loss in all cases:

| Compiler | Level | Dev Speed (MB/s) | PR Speed (MB/s) | Delta  |
|----------|-------|------------------|-----------------|--------|
| gcc      |    16 |             4.78 |            4.72 | -1.25% |
| gcc      |    17 |             3.49 |            3.46 | -0.85% |
| gcc      |    18 |             2.92 |            2.86 | -2.04% |
| gcc      |    19 |             2.61 |            2.61 |  0.00% |
| clang    |    16 |             4.69 |            4.80 |  2.34% |
| clang    |    17 |             3.53 |            3.49 | -1.13% |
| clang    |    18 |             2.86 |            2.85 | -0.34% |
| clang    |    19 |             2.61 |            2.61 |  0.00% |

Fixes Issue #2862.
2021-12-02 14:19:41 -08:00
W. Felix Handte
e688317652 Fix Include Path 2021-12-02 16:53:52 -05:00
Nick Terrell
01ecd6ffc0
Merge pull request #2892 from terrelln/issue-2785
[CircleCI] Fix short-tests-0
2021-12-02 16:20:56 -05:00