deeper prefetching pipeline for decompressSequencesLong
pipeline increased from 4 to 8 slots.
This change substantially improves decompression speed when there are long-distance offsets.

example with enwik9 compressed at level 22 :
gcc-9   : 947 -> 1039 MB/s
clang-10: 884 ->  946 MB/s

I also checked the "cold dictionary" scenario,
and found a smaller benefit, around 2%
(measurements are noisier for this scenario).
parent 455fd1a067
commit 7ef6d7b36c
@@ -1254,9 +1254,9 @@ ZSTD_decompressSequencesLong_body(

     /* Regen sequences */
     if (nbSeq) {
-#define STORED_SEQS 4
+#define STORED_SEQS 8
 #define STORED_SEQS_MASK (STORED_SEQS-1)
-#define ADVANCED_SEQS 4
+#define ADVANCED_SEQS STORED_SEQS
         seq_t sequences[STORED_SEQS];
         int const seqAdvance = MIN(nbSeq, ADVANCED_SEQS);
         seqState_t seqState;
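
For readers unfamiliar with the technique, here is a minimal standalone sketch of a power-of-two ring buffer used as a prefetch pipeline, in the spirit of the STORED_SEQS / STORED_SEQS_MASK pair above. It is not the zstd implementation: demo_seq_t, DEMO_PREFETCH, PIPE_SLOTS and pipelined_sum are hypothetical names invented for illustration, and the "work" is reduced to summing one byte per sequence. The idea it shows is that the prefetch for each entry is issued PIPE_SLOTS iterations before the entry is read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical prefetch wrapper: uses the GCC/clang builtin when available,
 * and degrades to a no-op elsewhere. */
#if defined(__GNUC__)
#  define DEMO_PREFETCH(p) __builtin_prefetch((p), 0, 3)
#else
#  define DEMO_PREFETCH(p) ((void)(p))
#endif

#define PIPE_SLOTS 8                 /* must be a power of two */
#define PIPE_MASK  (PIPE_SLOTS - 1)  /* cheap modulo: n & PIPE_MASK */

/* Hypothetical stand-in for a decoded sequence: only a position in `table`. */
typedef struct { size_t matchPos; } demo_seq_t;

/* Sums table[seqs[i].matchPos] for all i, in order. */
static unsigned long pipelined_sum(const unsigned char* table, size_t tableSize,
                                   const demo_seq_t* seqs, size_t nbSeq)
{
    demo_seq_t ring[PIPE_SLOTS];
    size_t const advance = nbSeq < PIPE_SLOTS ? nbSeq : PIPE_SLOTS;
    unsigned long sum = 0;
    size_t n;

    /* 1. Prime: store the first `advance` entries and prefetch their targets. */
    for (n = 0; n < advance; n++) {
        assert(seqs[n].matchPos < tableSize);
        ring[n] = seqs[n];
        DEMO_PREFETCH(table + ring[n].matchPos);
    }

    /* 2. Steady state: consume the oldest slot (prefetched PIPE_SLOTS
     *    iterations ago), then refill that same slot with entry n. */
    for (; n < nbSeq; n++) {
        demo_seq_t const oldest = ring[(n - PIPE_SLOTS) & PIPE_MASK];
        sum += table[oldest.matchPos];
        assert(seqs[n].matchPos < tableSize);
        ring[n & PIPE_MASK] = seqs[n];
        DEMO_PREFETCH(table + seqs[n].matchPos);
    }

    /* 3. Drain: consume the entries still waiting in the ring. */
    for (n -= advance; n < nbSeq; n++) {
        sum += table[ring[n & PIPE_MASK].matchPos];
    }
    return sum;
}
```

A deeper ring gives each prefetch more time to complete before its data is needed, at the cost of a little more state. Whether 8 slots beats 4 on a given machine is an empirical question, which is what the figures in the commit message measure.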