Yann Collet
02ece5d59f
Merge pull request #2653 from TrianglesPCT/dev
...
Enable SSE2 compression path to work on MSVC
2021-05-17 11:20:50 -07:00
Dan Nelson
54f78e3df8
ZSTD_VecMask_next: fix incorrect variable name in fallback code path
2021-05-15 10:20:37 -05:00
TrianglesPCT
bee0ef5647
Update zstd_lazy.c
...
It put the changes back when I tried to make a separate pull request; I don't understand GitHub's interface at all.
2021-05-14 19:23:13 -06:00
TrianglesPCT
d688ab1e0c
Add files via upload
...
AVX2
2021-05-14 19:18:12 -06:00
TrianglesPCT
bb1cdd8c63
Update zstd_lazy.c
...
add space
2021-05-14 19:11:28 -06:00
TrianglesPCT
a62856bf65
Update zstd_lazy.c
...
Remove the AVX2 part
2021-05-14 19:10:24 -06:00
TrianglesPCT
8f7ea1afeb
Update zstd_lazy.c
...
Switch to other comment style
2021-05-14 19:02:34 -06:00
TrianglesPCT
0e071214b5
Update zstd_lazy.c
...
Switch to unaligned load, as I don't know if the buffer will always be aligned to 32 bytes, and compilers other than MSVC might actually use aligned loads
2021-05-14 17:03:30 -06:00
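The unaligned-load choice above can be illustrated with a minimal sketch. It assumes a row of 16 one-byte tags compared against a single tag with SSE2, with a scalar fallback when SSE2 is unavailable. The helper name `sketch_match_mask16` and the 16-entry row width are hypothetical and are not taken from zstd_lazy.c.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#  include <emmintrin.h>
#  define SKETCH_HAVE_SSE2 1
#else
#  define SKETCH_HAVE_SSE2 0
#endif

/* Hypothetical helper: returns a 16-bit mask where bit i is set
 * when tags[i] == tag. `tags` must point to at least 16 readable bytes
 * and does not need any particular alignment. */
static uint32_t sketch_match_mask16(const uint8_t* tags, uint8_t tag)
{
#if SKETCH_HAVE_SSE2
    /* _mm_loadu_si128 tolerates any address; the aligned variant
     * _mm_load_si128 may fault if `tags` is not 16-byte aligned. */
    const __m128i row    = _mm_loadu_si128((const __m128i*)(const void*)tags);
    const __m128i needle = _mm_set1_epi8((char)tag);
    return (uint32_t)_mm_movemask_epi8(_mm_cmpeq_epi8(row, needle));
#else
    /* Portable fallback: compare one byte at a time. */
    uint32_t mask = 0;
    size_t i;
    for (i = 0; i < 16; i++) {
        if (tags[i] == tag) mask |= (uint32_t)1 << i;
    }
    return mask;
#endif
}
```

Calling the helper on a pointer that is deliberately offset by one byte from the start of a buffer exercises the unaligned case on both code paths.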
TrianglesPCT
69ac124b12
Update zstd_lazy.c
2021-05-14 16:53:19 -06:00
TrianglesPCT
0b9f4bb0ff
Update zstd_lazy.c
...
use 8bit
2021-05-14 16:47:24 -06:00
TrianglesPCT
77d54eb3b3
Add files via upload
2021-05-14 16:40:32 -06:00
TrianglesPCT
25bda9053a
Add files via upload
...
MSVC support
AVX2 path
2021-05-14 16:32:04 -06:00
Nick Terrell
10b35b312b
[lib] Fix off-by-one error in repcode checks
...
The repcode checks disallowed repcodes that are equal to `windowLow`.
This is slightly inefficient, but isn't a problem on its own. Together
with the next commit, it causes non-determinism.
2021-05-13 17:05:59 -07:00
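The boundary condition described above can be isolated in a short sketch. It assumes the check reduces to a single comparison of a repcode index against `windowLow`; the real checks in zstd involve further conditions that are omitted here. Both function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical "before" check: rejects an index equal to windowLow,
 * even though windowLow is itself a valid position. */
static int sketch_rep_valid_strict(uint32_t repIndex, uint32_t windowLow)
{
    return repIndex > windowLow;
}

/* Hypothetical "after" check: accepts the lowest valid position. */
static int sketch_rep_valid_inclusive(uint32_t repIndex, uint32_t windowLow)
{
    return repIndex >= windowLow;
}
```

The two predicates differ only when `repIndex == windowLow`, which is the case the commit message describes.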
Sen Huang
e6c8a5dd40
Fix incorrect usages of repIndex across all strategies
2021-05-04 19:50:55 -04:00
felixhandte
efa6dfa729
Apply DDS adjustments to avoid assert failures
2021-04-23 16:41:00 -04:00
Sen Huang
8844f93957
Adjust nb elements to prefetch in ZSTD_row_fillHashCache()
2021-04-12 14:24:58 -04:00
Sen Huang
4d63d6e8aa
Update results.csv, add Row hash to regression test
2021-04-07 10:31:41 -07:00
Nick Terrell
4694423c4f
Add and integrate lazy row hash strategy
2021-04-07 09:53:34 -07:00
Nick Terrell
a494308ae9
[copyright][license] Switch to yearless copyright and some cleanup in the linux-kernel files
...
* Switch to yearless copyright per FB policy
* Fix up SPDX-License-Identifier lines in `contrib/linux-kernel` sources
* Add zstd copyright/license header to the `contrib/linux-kernel` sources
* Update the `tests/test-license.py` to check for yearless copyright
* Improvements to `tests/test-license.py`
* Check `contrib/linux-kernel` in `tests/test-license.py`
2021-03-30 10:30:43 -07:00
Nick Terrell
66e811d782
[license] Update year to 2021
2021-01-04 17:53:52 -05:00
W. Felix Handte
c5fab8848a
Document searchFuncs Table
2020-09-10 22:10:02 -04:00
W. Felix Handte
85a95840e4
Further Consolidate Dict Mode Checks
2020-09-10 22:10:02 -04:00
W. Felix Handte
efa33861f2
Attempt to Fix MSVC Warnings
2020-09-10 22:10:02 -04:00
W. Felix Handte
ed43832770
Simplify Match Limit Checks
...
Seems like a ~1.25% speedup.
2020-09-10 22:10:02 -04:00
W. Felix Handte
06d240b8a7
Use All Available Space in the Hash Table to Extend Chain Table Reach
...
Rather than restrict our temp chain table to 2 ** chainLog entries, this
commit uses all available space to reach further back to gather longer
chains to pack into the DDSS chain table.
2020-09-10 22:10:02 -04:00
W. Felix Handte
b2b0641ea0
Rewrite Table Fill to Retain Cache Entries Beyond Chain Window
2020-09-10 22:10:02 -04:00
W. Felix Handte
916238d9dc
Avoid Malloc in Table Fill; Pack Tmp Structure into Hash Table
2020-09-10 22:10:02 -04:00
W. Felix Handte
f42c5bddd9
Truncate Chain at Last Possible Attempt
...
Make the chain table denser?
2020-09-10 22:10:02 -04:00
W. Felix Handte
20a020edbc
Prefetch Chain Table Matches
2020-09-10 22:10:02 -04:00
W. Felix Handte
9b9feb84f2
Lay Out Chain Table Chains Contiguously
...
Rather than interleave all of the chain table entries, tying each entry's
position to the corresponding position in the input, this commit changes the
layout so that all the entries in a single chain are laid out next to each
other. The last entry in the hash table's bucket for this hash is now a packed
pointer of position + length of this chain.
This cannot be merged as written, since it allocates temporary memory inside
ZSTD_dedicatedDictSearch_lazy_loadDictionary().
2020-09-10 22:10:02 -04:00
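The "packed pointer of position + length" mentioned above could look roughly like the following sketch. The split into a 24-bit chain start and an 8-bit chain length is an assumption made for illustration, and all names are hypothetical rather than taken from the zstd sources.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_CHAIN_LEN_BITS 8u
#define SKETCH_CHAIN_LEN_MASK ((1u << SKETCH_CHAIN_LEN_BITS) - 1u)

/* Pack the start index of a contiguous chain and its length into one
 * 32-bit slot (assumed layout: high 24 bits = start, low 8 bits = length). */
static uint32_t sketch_pack_chain(uint32_t chainStart, uint32_t chainLength)
{
    assert(chainLength <= SKETCH_CHAIN_LEN_MASK);
    assert(chainStart < (1u << (32u - SKETCH_CHAIN_LEN_BITS)));
    return (chainStart << SKETCH_CHAIN_LEN_BITS) | chainLength;
}

/* Recover the index of the first entry of the chain. */
static uint32_t sketch_chain_start(uint32_t packed)
{
    return packed >> SKETCH_CHAIN_LEN_BITS;
}

/* Recover how many contiguous entries the chain holds. */
static uint32_t sketch_chain_length(uint32_t packed)
{
    return packed & SKETCH_CHAIN_LEN_MASK;
}
```

With a layout like this, a search can read the last slot of a hash bucket, unpack it, and then scan `sketch_chain_length(p)` adjacent entries starting at `sketch_chain_start(p)`.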
W. Felix Handte
66509c7bf4
Only Insert Positions Inside the Chain Window
2020-09-10 22:10:02 -04:00
W. Felix Handte
d214d8c859
Shorten Dict Mode Conditionals in Order to Improve Readability
2020-09-10 18:51:52 -04:00
W. Felix Handte
f49c1563ff
Force-Inline ZSTD_insertAndFindFirstIndex_internal()
...
Without this, gcc was declining to inline the function in `ZSTD_noDict` mode,
resulting in a ~10% slowdown.
2020-09-10 18:51:52 -04:00
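A minimal sketch of the force-inlining pattern referred to above, assuming a generic function that takes a mode argument and thin wrappers that pass a constant mode, so the compiler can drop the untaken branch in each wrapper. The macro, the enum, and the functions are hypothetical stand-ins, not the identifiers used in zstd.

```c
#include <stdint.h>

/* Hypothetical force-inline macro covering MSVC, GCC/Clang, and a fallback. */
#if defined(_MSC_VER)
#  define SKETCH_FORCE_INLINE static __forceinline
#elif defined(__GNUC__)
#  define SKETCH_FORCE_INLINE static inline __attribute__((always_inline))
#else
#  define SKETCH_FORCE_INLINE static inline
#endif

typedef enum { sketch_noDict, sketch_dedicatedDictSearch } sketch_dictMode_e;

/* Generic body: returns how many tables a search would consult.
 * When inlined with a constant `mode`, the branch folds away. */
SKETCH_FORCE_INLINE uint32_t sketch_tables_to_search(sketch_dictMode_e mode)
{
    uint32_t tables = 1;                 /* always search the main table */
    if (mode == sketch_dedicatedDictSearch) {
        tables += 1;                     /* also search the dictionary table */
    }
    return tables;
}

/* Specialized entry points, one per mode. */
static uint32_t sketch_tables_noDict(void)
{
    return sketch_tables_to_search(sketch_noDict);
}

static uint32_t sketch_tables_dds(void)
{
    return sketch_tables_to_search(sketch_dedicatedDictSearch);
}
```

Without the forced inline, a compiler is free to keep the generic function out of line, in which case the mode check stays a runtime branch.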
W. Felix Handte
cab86b074f
Clean Up Search Function Selection
2020-09-10 18:51:52 -04:00
W. Felix Handte
2ffbde0d95
Fix -Wshorten-64-to-32 Error
2020-09-10 18:51:52 -04:00
W. Felix Handte
d332f57897
Permit Matching Against Lowest Valid Position
...
This comparison was previously faulty: the lowest valid position is itself
valid, and we should therefore be allowed to match against it.
2020-09-10 18:51:52 -04:00
W. Felix Handte
7b9a755ac9
Remove Chain Limit on Hash Cache Entries; Slightly Improve Compression
...
Entries in the hashTable chain cache aren't subject to the same aliasing that
the circular chain table is subject to. As such, we don't need to stop when we
cross the chain limit. We can delve deeper. :)
2020-09-10 18:51:52 -04:00
W. Felix Handte
e8b4011b52
Split Lookups in Hash Cache and Chain Table into Two Loops
...
Sliiiight speedup.
2020-09-10 18:51:52 -04:00
W. Felix Handte
9e83c782f8
Simplify DDS Hash Table Construction
...
No need to walk the chainTable; we can just keep shifting the entries in the
hashTable.
2020-09-10 18:51:52 -04:00
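The "keep shifting the entries" idea above can be sketched as a fixed-size bucket in which each new position pushes older ones back by one slot. The bucket size of 4 and the function name are assumptions for illustration only.

```c
#include <stdint.h>

#define SKETCH_BUCKET_LOG  2u
#define SKETCH_BUCKET_SIZE (1u << SKETCH_BUCKET_LOG)

/* Insert `pos` as the newest entry of a bucket: every existing entry
 * moves one slot toward the end and the oldest entry falls off. */
static void sketch_bucket_insert(uint32_t* bucket, uint32_t pos)
{
    uint32_t i;
    for (i = SKETCH_BUCKET_SIZE - 1; i > 0; i--) {
        bucket[i] = bucket[i - 1];
    }
    bucket[0] = pos;
}
```

After inserting positions in increasing order, the bucket holds the most recent `SKETCH_BUCKET_SIZE` positions, newest first, without consulting any separate chain structure.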
W. Felix Handte
5390fee4f7
Rename and Move DD_BLOG Constant to ZSTD_LAZY_DDSS_BUCKET_LOG
2020-09-10 18:51:52 -04:00
W. Felix Handte
5e91ae27eb
Prefetch First Batch of Match Positions; +11% Speed in Level 5 w/ 1 Dict
2020-09-10 18:51:52 -04:00
W. Felix Handte
df386b3d8d
Fix Off-By-One Error in Counting DDS Search Attempts
...
This caused us to double-search the first position and fail to search the
last position in the chain, slowing down search and making it less effective.
2020-09-10 18:51:52 -04:00
W. Felix Handte
a494111385
Move Prefetch Before Insertion; Speed Up ~6%
2020-09-10 18:51:52 -04:00
W. Felix Handte
eede46a47e
Misc Refactor of DDS Search Code
2020-09-10 18:51:52 -04:00
W. Felix Handte
34b545acb0
Add a ZSTD_dedicatedDictSearch ZSTD_dictMode_e to Allow Const Propagation
...
Speed +1.5%.
2020-09-10 18:51:52 -04:00
Bimba Shrestha
e29bc3a009
using dict mls instead of src mls
2020-09-10 18:51:52 -04:00
Bimba Shrestha
145c2d12f9
add hashtable head prefetching
2020-09-10 18:51:52 -04:00
Bimba Shrestha
5d5507788d
change method name for consistency
2020-09-10 18:51:52 -04:00
Bimba Shrestha
628559d0e4
loading dict using new algorithm
2020-09-10 18:51:52 -04:00
Bimba Shrestha
22705f0c93
adding dedicatedDictSearch algorithm
2020-09-10 18:51:52 -04:00