Comply with suggested comments by @terrelln

created FSE_CTABLE_SIZE() and FSE_DTABLE_SIZE()
Yann Collet 2017-04-26 11:39:35 -07:00
parent 7271203bdb
commit e42afbc6fa
4 changed files with 78 additions and 75 deletions

View File

@@ -57,46 +57,46 @@
<pre><b>size_t ZSTD_compress( void* dst, size_t dstCapacity,
const void* src, size_t srcSize,
int compressionLevel);
</b><p> Compresses `src` content as a single zstd compressed frame into already allocated `dst`.
Hint : compression runs faster if `dstCapacity` >= `ZSTD_compressBound(srcSize)`.
@return : compressed size written into `dst` (<= `dstCapacity),
or an error code if it fails (which can be tested using ZSTD_isError()).
</p></pre><BR>
<pre><b>size_t ZSTD_decompress( void* dst, size_t dstCapacity,
const void* src, size_t compressedSize);
</b><p> `compressedSize` : must be the _exact_ size of some number of compressed and/or skippable frames.
`dstCapacity` is an upper bound of originalSize.
If user cannot imply a maximum upper bound, it's better to use streaming mode to decompress data.
@return : the number of bytes decompressed into `dst` (<= `dstCapacity`),
or an errorCode if it fails (which can be tested using ZSTD_isError()).
</p></pre><BR>
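
To make the one-shot API above concrete, here is a minimal round-trip sketch (the function name and buffer names are illustrative, not part of the manual):

    #include <stdlib.h>
    #include <string.h>
    #include <zstd.h>

    int roundTrip(const void* src, size_t srcSize)
    {
        size_t const bound = ZSTD_compressBound(srcSize);  /* worst-case compressed size */
        void* const cBuf = malloc(bound);
        void* const rBuf = malloc(srcSize);                /* original size known here */
        int ok = 0;
        if (cBuf && rBuf) {
            size_t const cSize = ZSTD_compress(cBuf, bound, src, srcSize, 1);
            if (!ZSTD_isError(cSize)) {
                size_t const rSize = ZSTD_decompress(rBuf, srcSize, cBuf, cSize);
                ok = !ZSTD_isError(rSize) && (rSize == srcSize)
                     && !memcmp(rBuf, src, srcSize);
            }
        }
        free(cBuf); free(rBuf);
        return ok;   /* 1 on successful round trip */
    }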
<pre><b>unsigned long long ZSTD_getDecompressedSize(const void* src, size_t srcSize);
</b><p> NOTE: This function is planned to be obsolete, in favour of ZSTD_getFrameContentSize.
ZSTD_getFrameContentSize functions the same way, returning the decompressed size of a single
frame, but distinguishes empty frames from frames with an unknown size, or errors.
Additionally, ZSTD_findDecompressedSize can be used instead. It can handle multiple
concatenated frames in one buffer, and so is more general.
As a result however, it requires more computation and entire frames to be passed to it,
as opposed to ZSTD_getFrameContentSize which requires only a single frame's header.
'src' is the start of a zstd compressed frame.
@return : content size to be decompressed, as a 64-bits value _if known_, 0 otherwise.
note 1 : decompressed size is an optional field, that may not be present, especially in streaming mode.
When `return==0`, data to decompress could be any size.
In which case, it's necessary to use streaming mode to decompress data.
Optionally, application can still use ZSTD_decompress() while relying on implied limits.
(For example, data may be necessarily cut into blocks <= 16 KB).
note 2 : decompressed size is always present when compression is done with ZSTD_compress()
note 3 : decompressed size can be very large (64-bits value),
potentially larger than what local system can handle as a single memory segment.
In which case, it's necessary to use streaming mode to decompress data.
note 4 : If source is untrusted, decompressed size could be wrong or intentionally modified.
Always ensure result fits within application's authorized limits.
Each application can set its own limits.
note 5 : when `return==0`, if precise failure cause is needed, use ZSTD_getFrameParams() to know more.
</p></pre><BR>
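
Following notes 1-4 above, a cautious caller validates the header-declared size against its own limit before allocating; a sketch, where MY_MAX_OUTPUT and the buffer names (continuing from the round-trip sketch) are illustrative:

    #define MY_MAX_OUTPUT (1ULL << 26)   /* application-chosen cap, illustrative */

    unsigned long long const contentSize = ZSTD_getDecompressedSize(cBuf, cSize);
    if (contentSize == 0 || contentSize > MY_MAX_OUTPUT) {
        /* size absent, empty frame, or beyond our authorized limit :
         * fall back to streaming decompression instead of trusting the header */
    } else {
        void* const out = malloc((size_t)contentSize);
        size_t const dSize = out ? ZSTD_decompress(out, (size_t)contentSize, cBuf, cSize) : 0;
        /* check ZSTD_isError(dSize) and dSize == contentSize before using out */
    }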
<h3>Helper functions</h3><pre></pre><b><pre>int ZSTD_maxCLevel(void); </b>/*!< maximum compression level available */<b>
@@ -106,28 +106,28 @@ const char* ZSTD_getErrorName(size_t code); </b>/*!< provides readable strin
</pre></b><BR>
<a name="Chapter4"></a><h2>Explicit memory management</h2><pre></pre>
<h3>Compression context</h3><pre> When compressing many times,
it is recommended to allocate a context just once, and re-use it for each successive compression operation.
This will make workload friendlier for system's memory.
Use one context per thread for parallel execution in multi-threaded environments.
</pre><b><pre>typedef struct ZSTD_CCtx_s ZSTD_CCtx;
ZSTD_CCtx* ZSTD_createCCtx(void);
size_t ZSTD_freeCCtx(ZSTD_CCtx* cctx);
</pre></b><BR>
<pre><b>size_t ZSTD_compressCCtx(ZSTD_CCtx* ctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize, int compressionLevel);
</b><p> Same as ZSTD_compress(), requires an allocated ZSTD_CCtx (see ZSTD_createCCtx()).
</p></pre><BR>
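
A sketch of the recommended reuse pattern (nbBuffers, inBuf, inSize, dst and dstCapacity are illustrative); the same pattern applies to the decompression context below:

    ZSTD_CCtx* const cctx = ZSTD_createCCtx();       /* allocate once */
    if (cctx != NULL) {
        for (size_t n = 0; n < nbBuffers; n++) {     /* many independent inputs */
            size_t const cSize = ZSTD_compressCCtx(cctx, dst, dstCapacity,
                                                   inBuf[n], inSize[n], 3);
            if (ZSTD_isError(cSize)) break;
            /* ... consume cSize bytes from dst ... */
        }
        ZSTD_freeCCtx(cctx);                         /* release once, at the end */
    }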
<h3>Decompression context</h3><pre> When decompressing many times,
it is recommended to allocate a context just once, and re-use it for each successive compression operation.
This will make workload friendlier for system's memory.
Use one context per thread for parallel execution in multi-threaded environments.
</pre><b><pre>typedef struct ZSTD_DCtx_s ZSTD_DCtx;
ZSTD_DCtx* ZSTD_createDCtx(void);
size_t ZSTD_freeDCtx(ZSTD_DCtx* dctx);
</pre></b><BR>
<pre><b>size_t ZSTD_decompressDCtx(ZSTD_DCtx* ctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
</b><p> Same as ZSTD_decompress(), requires an allocated ZSTD_DCtx (see ZSTD_createDCtx()).
</p></pre><BR>
<a name="Chapter5"></a><h2>Simple dictionary API</h2><pre></pre>

View File

@@ -316,6 +316,10 @@ If there is an error, the function will return an error code, which can be teste
#define FSE_CTABLE_SIZE_U32(maxTableLog, maxSymbolValue)   (1 + (1<<(maxTableLog-1)) + ((maxSymbolValue+1)*2))
#define FSE_DTABLE_SIZE_U32(maxTableLog)                   (1 + (1<<maxTableLog))
+/* or use the size to malloc() space directly. Pay attention to alignment restrictions though */
+#define FSE_CTABLE_SIZE(maxTableLog, maxSymbolValue)   (FSE_CTABLE_SIZE_U32(maxTableLog, maxSymbolValue) * sizeof(FSE_CTable))
+#define FSE_DTABLE_SIZE(maxTableLog)                   (FSE_DTABLE_SIZE_U32(maxTableLog) * sizeof(FSE_DTable))
/* *****************************************
*  FSE advanced API
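
The two new macros return sizes in bytes rather than in U32 cells, so the result can feed malloc() directly; a sketch under the alignment caveat from the added comment (the tableLog and maxSymbolValue values are placeholders):

    unsigned const maxSymbolValue = 255;
    unsigned const tableLog = 12;
    /* malloc() returns memory suitably aligned for FSE_CTable / FSE_DTable (U32-based types) */
    FSE_CTable* const ct = (FSE_CTable*) malloc(FSE_CTABLE_SIZE(tableLog, maxSymbolValue));
    FSE_DTable* const dt = (FSE_DTable*) malloc(FSE_DTABLE_SIZE(tableLog));
    /* ... build and use the tables ... */
    free(ct); free(dt);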

View File

@@ -106,7 +106,6 @@ typedef enum { set_basic, set_rle, set_compressed, set_repeat } symbolEncodingTy
#define LONGNBSEQ 0x7F00
#define MINMATCH 3
-#define EQUAL_READ32 4
#define Litbits  8
#define MaxLit ((1<<Litbits) - 1)

View File

@@ -29,9 +29,9 @@ typedef enum { ZSTDcs_created=0, ZSTDcs_init, ZSTDcs_ongoing, ZSTDcs_ending } ZS
/* entropy tables always have same size */
static size_t const hufCTable_size = HUF_CTABLE_SIZE(255);
-static size_t const litlengthCTable_size = FSE_CTABLE_SIZE_U32(LLFSELog, MaxLL) * sizeof(FSE_CTable);
-static size_t const offcodeCTable_size = FSE_CTABLE_SIZE_U32(OffFSELog, MaxOff) * sizeof(FSE_CTable);
-static size_t const matchlengthCTable_size = FSE_CTABLE_SIZE_U32(MLFSELog, MaxML) * sizeof(FSE_CTable);
+static size_t const litlengthCTable_size = FSE_CTABLE_SIZE(LLFSELog, MaxLL);
+static size_t const offcodeCTable_size = FSE_CTABLE_SIZE(OffFSELog, MaxOff);
+static size_t const matchlengthCTable_size = FSE_CTABLE_SIZE(MLFSELog, MaxML);
static size_t const entropyScratchSpace_size = HUF_WORKSPACE_SIZE;
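
For scale, a worked expansion of the first line, assuming LLFSELog == 9, MaxLL == 35 and sizeof(FSE_CTable) == 4 as on typical platforms at this revision:

    /* FSE_CTABLE_SIZE_U32(9, 35) = 1 + (1<<8) + (35+1)*2 = 329 U32 cells
     * FSE_CTABLE_SIZE(9, 35)     = 329 * sizeof(FSE_CTable) = 1316 bytes
     * i.e. byte-identical to the previous FSE_CTABLE_SIZE_U32(...) * sizeof(FSE_CTable) spelling */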
@@ -102,7 +102,7 @@ struct ZSTD_CCtx_s {
FSE_CTable* offcodeCTable;
FSE_CTable* matchlengthCTable;
FSE_CTable* litlengthCTable;
-unsigned* tmpCounters;
+unsigned* entropyScratchSpace;
};
ZSTD_CCtx* ZSTD_createCCtx(void)
@@ -317,7 +317,7 @@ static size_t ZSTD_resetCCtx_internal (ZSTD_CCtx* zc,
zc->litlengthCTable = (FSE_CTable*) ptr;
ptr = (char*)ptr + litlengthCTable_size;
assert(((size_t)ptr & 3) == 0);   /* ensure correct alignment */
-zc->tmpCounters = (unsigned*) ptr;
+zc->entropyScratchSpace = (unsigned*) ptr;
} }
/* init params */
@@ -347,8 +347,8 @@ static size_t ZSTD_resetCCtx_internal (ZSTD_CCtx* zc,
assert((char*)zc->offcodeCTable == (char*)zc->hufCTable + hufCTable_size);
assert((char*)zc->matchlengthCTable == (char*)zc->offcodeCTable + offcodeCTable_size);
assert((char*)zc->litlengthCTable == (char*)zc->matchlengthCTable + matchlengthCTable_size);
-assert((char*)zc->tmpCounters == (char*)zc->litlengthCTable + litlengthCTable_size);
-ptr = (char*)zc->tmpCounters + entropyScratchSpace_size;
+assert((char*)zc->entropyScratchSpace == (char*)zc->litlengthCTable + litlengthCTable_size);
+ptr = (char*)zc->entropyScratchSpace + entropyScratchSpace_size;
/* opt parser space */
if ((params.cParams.strategy == ZSTD_btopt) || (params.cParams.strategy == ZSTD_btopt2)) {
@@ -577,9 +577,9 @@ static size_t ZSTD_compressLiterals (ZSTD_CCtx* zc,
int const preferRepeat = zc->params.cParams.strategy < ZSTD_lazy ? srcSize <= 1024 : 0;
if (repeat == HUF_repeat_valid && lhSize == 3) singleStream = 1;
cLitSize = singleStream ? HUF_compress1X_repeat(ostart+lhSize, dstCapacity-lhSize, src, srcSize, 255, 11,
-zc->tmpCounters, entropyScratchSpace_size, zc->hufCTable, &repeat, preferRepeat)
+zc->entropyScratchSpace, entropyScratchSpace_size, zc->hufCTable, &repeat, preferRepeat)
: HUF_compress4X_repeat(ostart+lhSize, dstCapacity-lhSize, src, srcSize, 255, 11,
-zc->tmpCounters, entropyScratchSpace_size, zc->hufCTable, &repeat, preferRepeat);
+zc->entropyScratchSpace, entropyScratchSpace_size, zc->hufCTable, &repeat, preferRepeat);
if (repeat != HUF_repeat_none) { hType = set_repeat; }    /* reused the existing table */
else { zc->hufCTable_repeatMode = HUF_repeat_check; }     /* now have a table to reuse */
}
@@ -708,7 +708,7 @@ MEM_STATIC size_t ZSTD_compressSequences (ZSTD_CCtx* zc,
/* CTable for Literal Lengths */
{ U32 max = MaxLL;
-size_t const mostFrequent = FSE_countFast_wksp(count, &max, llCodeTable, nbSeq, zc->tmpCounters);
+size_t const mostFrequent = FSE_countFast_wksp(count, &max, llCodeTable, nbSeq, zc->entropyScratchSpace);
if ((mostFrequent == nbSeq) && (nbSeq > 2)) {
*op++ = llCodeTable[0];
FSE_buildCTable_rle(CTable_LitLength, (BYTE)max);
@@ -732,7 +732,7 @@ MEM_STATIC size_t ZSTD_compressSequences (ZSTD_CCtx* zc,
/* CTable for Offsets */
{ U32 max = MaxOff;
-size_t const mostFrequent = FSE_countFast_wksp(count, &max, ofCodeTable, nbSeq, zc->tmpCounters);
+size_t const mostFrequent = FSE_countFast_wksp(count, &max, ofCodeTable, nbSeq, zc->entropyScratchSpace);
if ((mostFrequent == nbSeq) && (nbSeq > 2)) {
*op++ = ofCodeTable[0];
FSE_buildCTable_rle(CTable_OffsetBits, (BYTE)max);
@@ -756,7 +756,7 @@ MEM_STATIC size_t ZSTD_compressSequences (ZSTD_CCtx* zc,
/* CTable for MatchLengths */
{ U32 max = MaxML;
-size_t const mostFrequent = FSE_countFast_wksp(count, &max, mlCodeTable, nbSeq, zc->tmpCounters);
+size_t const mostFrequent = FSE_countFast_wksp(count, &max, mlCodeTable, nbSeq, zc->entropyScratchSpace);
if ((mostFrequent == nbSeq) && (nbSeq > 2)) {
*op++ = *mlCodeTable;
FSE_buildCTable_rle(CTable_MatchLength, (BYTE)max);
@@ -1564,7 +1564,7 @@ static void ZSTD_compressBlock_doubleFast_extDict_generic(ZSTD_CCtx* ctx,
if ( (((U32)((dictLimit-1) - repIndex2) >= 3) & (repIndex2 > lowestIndex))  /* intentional overflow */
&& (MEM_read32(repMatch2) == MEM_read32(ip)) ) {
const BYTE* const repEnd2 = repIndex2 < dictLimit ? dictEnd : iend;
-size_t const repLength2 = ZSTD_count_2segments(ip+EQUAL_READ32, repMatch2+EQUAL_READ32, iend, repEnd2, lowPrefixPtr) + EQUAL_READ32;
+size_t const repLength2 = ZSTD_count_2segments(ip+4, repMatch2+4, iend, repEnd2, lowPrefixPtr) + 4;
U32 tmpOffset = offset_2; offset_2 = offset_1; offset_1 = tmpOffset;   /* swap offset_2 <=> offset_1 */
ZSTD_storeSeq(seqStorePtr, 0, anchor, 0, repLength2-MINMATCH);
hashSmall[ZSTD_hashPtr(ip, hBitsS, mls)] = current2;
@@ -1923,7 +1923,7 @@ size_t ZSTD_HcFindBestMatch_generic (
const U32 current = (U32)(ip-base);
const U32 minChain = current > chainSize ? current - chainSize : 0;
int nbAttempts=maxNbAttempts;
-size_t ml=EQUAL_READ32-1;
+size_t ml=4-1;
/* HC4 match finder */
U32 matchIndex = ZSTD_insertAndFindFirstIndex (zc, ip, mls);
@@ -1938,7 +1938,7 @@ size_t ZSTD_HcFindBestMatch_generic (
} else {
match = dictBase + matchIndex;
if (MEM_read32(match) == MEM_read32(ip))   /* assumption : matchIndex <= dictLimit-4 (by table construction) */
-currentMl = ZSTD_count_2segments(ip+EQUAL_READ32, match+EQUAL_READ32, iLimit, dictEnd, prefixStart) + EQUAL_READ32;
+currentMl = ZSTD_count_2segments(ip+4, match+4, iLimit, dictEnd, prefixStart) + 4;
}
/* save best solution */
@@ -2032,7 +2032,7 @@ void ZSTD_compressBlock_lazy_generic(ZSTD_CCtx* ctx,
/* check repCode */
if ((offset_1>0) & (MEM_read32(ip+1) == MEM_read32(ip+1 - offset_1))) {
/* repcode : we take it */
-matchLength = ZSTD_count(ip+1+EQUAL_READ32, ip+1+EQUAL_READ32-offset_1, iend) + EQUAL_READ32;
+matchLength = ZSTD_count(ip+1+4, ip+1+4-offset_1, iend) + 4;
if (depth==0) goto _storeSequence;
}
@@ -2043,7 +2043,7 @@ void ZSTD_compressBlock_lazy_generic(ZSTD_CCtx* ctx,
matchLength = ml2, start = ip, offset=offsetFound;
}
-if (matchLength < EQUAL_READ32) {
+if (matchLength < 4) {
ip += ((ip-anchor) >> g_searchStrength) + 1;   /* jump faster over incompressible sections */
continue;
}
@@ -2053,17 +2053,17 @@ void ZSTD_compressBlock_lazy_generic(ZSTD_CCtx* ctx,
while (ip<ilimit) {
ip ++;
if ((offset) && ((offset_1>0) & (MEM_read32(ip) == MEM_read32(ip - offset_1)))) {
-size_t const mlRep = ZSTD_count(ip+EQUAL_READ32, ip+EQUAL_READ32-offset_1, iend) + EQUAL_READ32;
+size_t const mlRep = ZSTD_count(ip+4, ip+4-offset_1, iend) + 4;
int const gain2 = (int)(mlRep * 3);
int const gain1 = (int)(matchLength*3 - ZSTD_highbit32((U32)offset+1) + 1);
-if ((mlRep >= EQUAL_READ32) && (gain2 > gain1))
+if ((mlRep >= 4) && (gain2 > gain1))
matchLength = mlRep, offset = 0, start = ip;
}
{ size_t offset2=99999999;
size_t const ml2 = searchMax(ctx, ip, iend, &offset2, maxSearches, mls);
int const gain2 = (int)(ml2*4 - ZSTD_highbit32((U32)offset2+1));   /* raw approx */
int const gain1 = (int)(matchLength*4 - ZSTD_highbit32((U32)offset+1) + 4);
-if ((ml2 >= EQUAL_READ32) && (gain2 > gain1)) {
+if ((ml2 >= 4) && (gain2 > gain1)) {
matchLength = ml2, offset = offset2, start = ip;
continue;   /* search a better one */
} }
@@ -2072,17 +2072,17 @@ void ZSTD_compressBlock_lazy_generic(ZSTD_CCtx* ctx,
if ((depth==2) && (ip<ilimit)) {
ip ++;
if ((offset) && ((offset_1>0) & (MEM_read32(ip) == MEM_read32(ip - offset_1)))) {
-size_t const ml2 = ZSTD_count(ip+EQUAL_READ32, ip+EQUAL_READ32-offset_1, iend) + EQUAL_READ32;
+size_t const ml2 = ZSTD_count(ip+4, ip+4-offset_1, iend) + 4;
int const gain2 = (int)(ml2 * 4);
int const gain1 = (int)(matchLength*4 - ZSTD_highbit32((U32)offset+1) + 1);
-if ((ml2 >= EQUAL_READ32) && (gain2 > gain1))
+if ((ml2 >= 4) && (gain2 > gain1))
matchLength = ml2, offset = 0, start = ip;
}
{ size_t offset2=99999999;
size_t const ml2 = searchMax(ctx, ip, iend, &offset2, maxSearches, mls);
int const gain2 = (int)(ml2*4 - ZSTD_highbit32((U32)offset2+1));   /* raw approx */
int const gain1 = (int)(matchLength*4 - ZSTD_highbit32((U32)offset+1) + 7);
-if ((ml2 >= EQUAL_READ32) && (gain2 > gain1)) {
+if ((ml2 >= 4) && (gain2 > gain1)) {
matchLength = ml2, offset = offset2, start = ip;
continue;
} } }
@@ -2110,7 +2110,7 @@ _storeSequence:
&& ((offset_2>0)
& (MEM_read32(ip) == MEM_read32(ip - offset_2)) )) {
/* store sequence */
-matchLength = ZSTD_count(ip+EQUAL_READ32, ip+EQUAL_READ32-offset_2, iend) + EQUAL_READ32;
+matchLength = ZSTD_count(ip+4, ip+4-offset_2, iend) + 4;
offset = offset_2; offset_2 = offset_1; offset_1 = (U32)offset;   /* swap repcodes */
ZSTD_storeSeq(seqStorePtr, 0, anchor, 0, matchLength-MINMATCH);
ip += matchLength;
@@ -2199,7 +2199,7 @@ void ZSTD_compressBlock_lazy_extDict_generic(ZSTD_CCtx* ctx,
if (MEM_read32(ip+1) == MEM_read32(repMatch)) {
/* repcode detected we should take it */
const BYTE* const repEnd = repIndex < dictLimit ? dictEnd : iend;
-matchLength = ZSTD_count_2segments(ip+1+EQUAL_READ32, repMatch+EQUAL_READ32, iend, repEnd, prefixStart) + EQUAL_READ32;
+matchLength = ZSTD_count_2segments(ip+1+4, repMatch+4, iend, repEnd, prefixStart) + 4;
if (depth==0) goto _storeSequence;
} }
@@ -2210,7 +2210,7 @@ void ZSTD_compressBlock_lazy_extDict_generic(ZSTD_CCtx* ctx,
matchLength = ml2, start = ip, offset=offsetFound;
}
-if (matchLength < EQUAL_READ32) {
+if (matchLength < 4) {
ip += ((ip-anchor) >> g_searchStrength) + 1;   /* jump faster over incompressible sections */
continue;
}
@@ -2229,10 +2229,10 @@ void ZSTD_compressBlock_lazy_extDict_generic(ZSTD_CCtx* ctx,
if (MEM_read32(ip) == MEM_read32(repMatch)) {
/* repcode detected */
const BYTE* const repEnd = repIndex < dictLimit ? dictEnd : iend;
-size_t const repLength = ZSTD_count_2segments(ip+EQUAL_READ32, repMatch+EQUAL_READ32, iend, repEnd, prefixStart) + EQUAL_READ32;
+size_t const repLength = ZSTD_count_2segments(ip+4, repMatch+4, iend, repEnd, prefixStart) + 4;
int const gain2 = (int)(repLength * 3);
int const gain1 = (int)(matchLength*3 - ZSTD_highbit32((U32)offset+1) + 1);
-if ((repLength >= EQUAL_READ32) && (gain2 > gain1))
+if ((repLength >= 4) && (gain2 > gain1))
matchLength = repLength, offset = 0, start = ip;
} }
@@ -2241,7 +2241,7 @@ void ZSTD_compressBlock_lazy_extDict_generic(ZSTD_CCtx* ctx,
size_t const ml2 = searchMax(ctx, ip, iend, &offset2, maxSearches, mls);
int const gain2 = (int)(ml2*4 - ZSTD_highbit32((U32)offset2+1));   /* raw approx */
int const gain1 = (int)(matchLength*4 - ZSTD_highbit32((U32)offset+1) + 4);
-if ((ml2 >= EQUAL_READ32) && (gain2 > gain1)) {
+if ((ml2 >= 4) && (gain2 > gain1)) {
matchLength = ml2, offset = offset2, start = ip;
continue;   /* search a better one */
} }
@@ -2259,10 +2259,10 @@ void ZSTD_compressBlock_lazy_extDict_generic(ZSTD_CCtx* ctx,
if (MEM_read32(ip) == MEM_read32(repMatch)) {
/* repcode detected */
const BYTE* const repEnd = repIndex < dictLimit ? dictEnd : iend;
-size_t const repLength = ZSTD_count_2segments(ip+EQUAL_READ32, repMatch+EQUAL_READ32, iend, repEnd, prefixStart) + EQUAL_READ32;
+size_t const repLength = ZSTD_count_2segments(ip+4, repMatch+4, iend, repEnd, prefixStart) + 4;
int const gain2 = (int)(repLength * 4);
int const gain1 = (int)(matchLength*4 - ZSTD_highbit32((U32)offset+1) + 1);
-if ((repLength >= EQUAL_READ32) && (gain2 > gain1))
+if ((repLength >= 4) && (gain2 > gain1))
matchLength = repLength, offset = 0, start = ip;
} }
@@ -2271,7 +2271,7 @@ void ZSTD_compressBlock_lazy_extDict_generic(ZSTD_CCtx* ctx,
size_t const ml2 = searchMax(ctx, ip, iend, &offset2, maxSearches, mls);
int const gain2 = (int)(ml2*4 - ZSTD_highbit32((U32)offset2+1));   /* raw approx */
int const gain1 = (int)(matchLength*4 - ZSTD_highbit32((U32)offset+1) + 7);
-if ((ml2 >= EQUAL_READ32) && (gain2 > gain1)) {
+if ((ml2 >= 4) && (gain2 > gain1)) {
matchLength = ml2, offset = offset2, start = ip;
continue;
} } }
@@ -2303,7 +2303,7 @@ _storeSequence:
if (MEM_read32(ip) == MEM_read32(repMatch)) {
/* repcode detected we should take it */
const BYTE* const repEnd = repIndex < dictLimit ? dictEnd : iend;
-matchLength = ZSTD_count_2segments(ip+EQUAL_READ32, repMatch+EQUAL_READ32, iend, repEnd, prefixStart) + EQUAL_READ32;
+matchLength = ZSTD_count_2segments(ip+4, repMatch+4, iend, repEnd, prefixStart) + 4;
offset = offset_2; offset_2 = offset_1; offset_1 = (U32)offset;   /* swap offset history */
ZSTD_storeSeq(seqStorePtr, 0, anchor, 0, matchLength-MINMATCH);
ip += matchLength;