Merge branch 'dev' into longOffsetMode
commit b91ddf0ae6
@@ -1,4 +1,4 @@
-<p align="center"><img src="https://raw.githubusercontent.com/facebook/zstd/readme/doc/images/zstd_logo86.png" alt="Zstandard"></p>
+<p align="center"><img src="https://raw.githubusercontent.com/facebook/zstd/dev/doc/images/zstd_logo86.png" alt="Zstandard"></p>
 
 __Zstandard__, or `zstd` as short version, is a fast lossless compression algorithm,
 targeting real-time compression scenarios at zlib-level and better compression ratios.
@@ -2,19 +2,24 @@ Zstandard Documentation
 =======================
 
 This directory contains material defining the Zstandard format,
-as well as for help using the `zstd` library.
+as well as detailed instructions to use `zstd` library.
+
+__`zstd_manual.html`__ : Documentation of `zstd.h` API, in html format.
+Click on this link: [http://zstd.net/zstd_manual.html](http://zstd.net/zstd_manual.html)
+to display documentation of latest release in readable format within a browser.
 
 __`zstd_compression_format.md`__ : This document defines the Zstandard compression format.
 Compliant decoders must adhere to this document,
 and compliant encoders must generate data that follows it.
 
+Should you look for resources to develop your own port of Zstandard algorithm,
+you may find the following resources useful :
+
 __`educational_decoder`__ : This directory contains an implementation of a Zstandard decoder,
 compliant with the Zstandard compression format.
 It can be used, for example, to better understand the format,
-or as the basis for a separate implementation a Zstandard decoder/encoder.
+or as the basis for a separate implementation of Zstandard decoder.
 
-__`zstd_manual.html`__ : Documentation on the functions found in `zstd.h`.
-See [http://zstd.net/zstd_manual.html](http://zstd.net/zstd_manual.html) for
-the manual released with the latest official `zstd` release.
+[__`decode_corpus`__](https://github.com/facebook/zstd/tree/dev/tests#decodecorpus---tool-to-generate-zstandard-frames-for-decoder-testing) :
+This tool, stored in `/tests` directory, is able to generate random valid frames,
+which is useful if you wish to test your decoder and verify it fully supports the specification.
@@ -416,7 +416,7 @@ size_t ZSTD_estimateDCtxSize(void);
   It will also consider src size to be arbitrarily "large", which is worst case.
   If srcSize is known to always be small, ZSTD_estimateCCtxSize_usingCParams() can provide a tighter estimation.
   ZSTD_estimateCCtxSize_usingCParams() can be used in tandem with ZSTD_getCParams() to create cParams from compressionLevel.
-  ZSTD_estimateCCtxSize_usingCCtxParams() can be used in tandem with ZSTD_CCtxParam_setParameter(). Only single-threaded compression is supported. This function will return an error code if ZSTD_p_nbThreads is > 1.
+  ZSTD_estimateCCtxSize_usingCCtxParams() can be used in tandem with ZSTD_CCtxParam_setParameter(). Only single-threaded compression is supported. This function will return an error code if ZSTD_p_nbWorkers is >= 1.
   Note : CCtx size estimation is only correct for single-threaded compression.
 </p></pre><BR>
 
@@ -429,7 +429,7 @@ size_t ZSTD_estimateDStreamSize_fromFrame(const void* src, size_t srcSize);
   It will also consider src size to be arbitrarily "large", which is worst case.
   If srcSize is known to always be small, ZSTD_estimateCStreamSize_usingCParams() can provide a tighter estimation.
   ZSTD_estimateCStreamSize_usingCParams() can be used in tandem with ZSTD_getCParams() to create cParams from compressionLevel.
-  ZSTD_estimateCStreamSize_usingCCtxParams() can be used in tandem with ZSTD_CCtxParam_setParameter(). Only single-threaded compression is supported. This function will return an error code if ZSTD_p_nbThreads is set to a value > 1.
+  ZSTD_estimateCStreamSize_usingCCtxParams() can be used in tandem with ZSTD_CCtxParam_setParameter(). Only single-threaded compression is supported. This function will return an error code if ZSTD_p_nbWorkers is >= 1.
   Note : CStream size estimation is only correct for single-threaded compression.
   ZSTD_DStream memory budget depends on window Size.
   This information can be passed manually, using ZSTD_estimateDStreamSize,
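For context on the estimator functions discussed in these two hunks, a minimal sketch (hedged: it assumes the v1.3.x experimental API, compiled with ZSTD_STATIC_LINKING_ONLY) of sizing a context from a compression level:

    #define ZSTD_STATIC_LINKING_ONLY   /* estimators live in the experimental section */
    #include <zstd.h>
    #include <stdio.h>

    int main(void)
    {
        int const level = 3;
        /* srcSize unknown -> treated as arbitrarily "large" (worst case) */
        ZSTD_compressionParameters const cParams =
            ZSTD_getCParams(level, 0 /* unknown srcSize */, 0 /* no dictionary */);
        size_t const cctxSize = ZSTD_estimateCCtxSize_usingCParams(cParams);
        printf("level %d : ~%u bytes of CCtx\n", level, (unsigned)cctxSize);
        return 0;
    }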
@@ -800,18 +800,13 @@ size_t ZSTD_decodingBufferSize_min(unsigned long long windowSize, unsigned long
 </b>/* multi-threading parameters */<b>
 </b>/* These parameters are only useful if multi-threading is enabled (ZSTD_MULTITHREAD).
  * They return an error otherwise. */
-    ZSTD_p_nbThreads=400,    </b>/* Select how many threads a compression job can spawn (default:1)<b>
-                              * More threads improve speed, but also increase memory usage.
-                              * Can only receive a value > 1 if ZSTD_MULTITHREAD is enabled.
-                              * Special: value 0 means "do not change nbThreads" */
-    ZSTD_p_nonBlockingMode,  </b>/* Single thread mode is by default "blocking" :<b>
-                              * it finishes its job as much as possible, and only then gives back control to caller.
-                              * In contrast, multi-thread is by default "non-blocking" :
-                              * it takes some input, flush some output if available, and immediately gives back control to caller.
-                              * Compression work is performed in parallel, within worker threads.
-                              * (note : a strong exception to this rule is when first job is called with ZSTD_e_end : it becomes blocking)
-                              * Setting this parameter to 1 will enforce non-blocking mode even when only 1 thread is selected.
-                              * It allows the caller to do other tasks while the worker thread compresses in parallel. */
+    ZSTD_p_nbWorkers=400,    </b>/* Select how many threads will be spawned to compress in parallel.<b>
+                              * When nbWorkers >= 1, triggers asynchronous mode :
+                              * ZSTD_compress_generic() consumes some input, flush some output if possible, and immediately gives back control to caller,
+                              * while compression work is performed in parallel, within worker threads.
+                              * (note : a strong exception to this rule is when first invocation sets ZSTD_e_end : it becomes a blocking call).
+                              * More workers improve speed, but also increase memory usage.
+                              * Default value is `0`, aka "single-threaded mode" : no worker is spawned, compression is performed inside Caller's thread, all invocations are blocking */
     ZSTD_p_jobSize,          </b>/* Size of a compression job. This value is only enforced in streaming (non-blocking) mode.<b>
                               * Each compression job is completed in parallel, so indirectly controls the nb of active threads.
                               * 0 means default, which is dynamically determined based on compression parameters.
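The asynchronous behaviour described by the new comment can be sketched as follows (hypothetical caller code, assuming a library built with ZSTD_MULTITHREAD and the ZSTD_compress_generic API of this era):

    #define ZSTD_STATIC_LINKING_ONLY
    #include <zstd.h>

    /* Compress one buffer with 2 worker threads; caller regains control between calls. */
    static size_t compressMT(void* dst, size_t dstCap, const void* src, size_t srcSize)
    {
        ZSTD_CCtx* const cctx = ZSTD_createCCtx();
        ZSTD_outBuffer out = { dst, dstCap, 0 };
        ZSTD_inBuffer  in  = { src, srcSize, 0 };
        size_t ret;
        ZSTD_CCtx_setParameter(cctx, ZSTD_p_compressionLevel, 5);
        ZSTD_CCtx_setParameter(cctx, ZSTD_p_nbWorkers, 2);   /* asynchronous mode */
        do {
            /* returns a hint of remaining work; 0 means the frame is complete */
            ret = ZSTD_compress_generic(cctx, &out, &in, ZSTD_e_end);
            if (ZSTD_isError(ret)) break;
            /* caller could do other tasks here while workers compress */
        } while (ret != 0);
        ZSTD_freeCCtx(cctx);
        return ZSTD_isError(ret) ? ret : out.pos;
    }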
@@ -823,7 +818,7 @@ size_t ZSTD_decodingBufferSize_min(unsigned long long windowSize, unsigned long
 </b>/* advanced parameters - may not remain available after API update */<b>
     ZSTD_p_forceMaxWindow=1100, </b>/* Force back-reference distances to remain < windowSize,<b>
                               * even when referencing into Dictionary content (default:0) */
     ZSTD_p_enableLongDistanceMatching=1200,  </b>/* Enable long distance matching.<b>
                               * This parameter is designed to improve the compression
                               * ratio for large inputs with long distance matches.
                               * This increases the memory usage as well as window size.
@@ -833,32 +828,38 @@ size_t ZSTD_decodingBufferSize_min(unsigned long long windowSize, unsigned long
                               * other LDM parameters. Setting the compression level
                               * after this parameter overrides the window log, though LDM
                               * will remain enabled until explicitly disabled. */
     ZSTD_p_ldmHashLog,       </b>/* Size of the table for long distance matching, as a power of 2.<b>
                               * Larger values increase memory usage and compression ratio, but decrease
                               * compression speed.
                               * Must be clamped between ZSTD_HASHLOG_MIN and ZSTD_HASHLOG_MAX
-                              * (default: windowlog - 7). */
-    ZSTD_p_ldmMinMatch,      </b>/* Minimum size of searched matches for long distance matcher.<b>
-                              * Larger/too small values usually decrease compression ratio.
-                              * Must be clamped between ZSTD_LDM_MINMATCH_MIN
-                              * and ZSTD_LDM_MINMATCH_MAX (default: 64). */
-    ZSTD_p_ldmBucketSizeLog, </b>/* Log size of each bucket in the LDM hash table for collision resolution.<b>
-                              * Larger values usually improve collision resolution but may decrease
-                              * compression speed.
-                              * The maximum value is ZSTD_LDM_BUCKETSIZELOG_MAX (default: 3). */
+                              * (default: windowlog - 7).
+                              * Special: value 0 means "do not change ldmHashLog". */
+    ZSTD_p_ldmMinMatch,      </b>/* Minimum size of searched matches for long distance matcher.<b>
+                              * Larger/too small values usually decrease compression ratio.
+                              * Must be clamped between ZSTD_LDM_MINMATCH_MIN
+                              * and ZSTD_LDM_MINMATCH_MAX (default: 64).
+                              * Special: value 0 means "do not change ldmMinMatch". */
+    ZSTD_p_ldmBucketSizeLog, </b>/* Log size of each bucket in the LDM hash table for collision resolution.<b>
+                              * Larger values usually improve collision resolution but may decrease
+                              * compression speed.
+                              * The maximum value is ZSTD_LDM_BUCKETSIZELOG_MAX (default: 3).
+                              * note : 0 is a valid value */
     ZSTD_p_ldmHashEveryLog,  </b>/* Frequency of inserting/looking up entries in the LDM hash table.<b>
                               * The default is MAX(0, (windowLog - ldmHashLog)) to
                               * optimize hash table usage.
                               * Larger values improve compression speed. Deviating far from the
                               * default value will likely result in a decrease in compression ratio.
-                              * Must be clamped between 0 and ZSTD_WINDOWLOG_MAX - ZSTD_HASHLOG_MIN. */
+                              * Must be clamped between 0 and ZSTD_WINDOWLOG_MAX - ZSTD_HASHLOG_MIN.
+                              * note : 0 is a valid value */
 
 } ZSTD_cParameter;
 </b></pre><BR>
 
 <pre><b>size_t ZSTD_CCtx_setParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, unsigned value);
 </b><p> Set one compression parameter, selected by enum ZSTD_cParameter.
+  Setting a parameter is generally only possible during frame initialization (before starting compression),
+  except for a few exceptions which can be updated during compression: compressionLevel, hashLog, chainLog, searchLog, minMatch, targetLength and strategy.
   Note : when `value` is an enum, cast it to unsigned for proper type checking.
-  @result : informational value (typically, the one being set, possibly corrected),
+  @result : informational value (typically, value being set clamped correctly),
            or an error code (which can be tested with ZSTD_isError()).
 </p></pre><BR>
 
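A hedged sketch of driving these LDM parameters through ZSTD_CCtx_setParameter() (the chosen windowLog value 27 is illustrative, not from the diff):

    #define ZSTD_STATIC_LINKING_ONLY
    #include <zstd.h>

    /* Configure a CCtx for long-range matching; returns 0 on success. */
    static size_t enableLDM(ZSTD_CCtx* cctx)
    {
        size_t err;
        err = ZSTD_CCtx_setParameter(cctx, ZSTD_p_enableLongDistanceMatching, 1);
        if (ZSTD_isError(err)) return err;
        /* 27 == 128 MB window; LDM pays off on inputs with distant repetitions */
        err = ZSTD_CCtx_setParameter(cctx, ZSTD_p_windowLog, 27);
        if (ZSTD_isError(err)) return err;
        /* leaving ZSTD_p_ldmHashLog etc. at 0 keeps the "do not change" defaults */
        return 0;
    }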
@@ -1000,7 +1001,7 @@ size_t ZSTD_CCtx_refPrefix_advanced(ZSTD_CCtx* cctx, const void* prefix, size_t
 </p></pre><BR>
 
 <pre><b>size_t ZSTD_resetCCtxParams(ZSTD_CCtx_params* params);
-</b><p> Reset params to default, with the default compression level.
+</b><p> Reset params to default values.
 
 </p></pre><BR>
 
@@ -1028,9 +1029,10 @@ size_t ZSTD_CCtx_refPrefix_advanced(ZSTD_CCtx* cctx, const void* prefix, size_t
 <pre><b>size_t ZSTD_CCtx_setParametersUsingCCtxParams(
         ZSTD_CCtx* cctx, const ZSTD_CCtx_params* params);
 </b><p> Apply a set of ZSTD_CCtx_params to the compression context.
-  This must be done before the dictionary is loaded.
-  The pledgedSrcSize is treated as unknown.
-  Multithreading parameters are applied only if nbThreads > 1.
+  This can be done even after compression is started,
+  if nbWorkers==0, this will have no impact until a new compression is started.
+  if nbWorkers>=1, new parameters will be picked up at next job,
+  with a few restrictions (windowLog, pledgedSrcSize, nbWorkers, jobSize, and overlapLog are not updated).
 
 </p></pre><BR>
 
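For a concrete picture, a sketch (assuming the companion ZSTD_createCCtxParams() / ZSTD_CCtxParam_setParameter() entry points from the same experimental section) of building a parameter set once and applying it:

    #define ZSTD_STATIC_LINKING_ONLY
    #include <zstd.h>

    /* Build a reusable parameter set once, then apply it to any CCtx. */
    static size_t applyPreset(ZSTD_CCtx* cctx)
    {
        size_t err;
        ZSTD_CCtx_params* const params = ZSTD_createCCtxParams();
        ZSTD_CCtxParam_setParameter(params, ZSTD_p_compressionLevel, 19);
        ZSTD_CCtxParam_setParameter(params, ZSTD_p_nbWorkers, 4);
        err = ZSTD_CCtx_setParametersUsingCCtxParams(cctx, params);
        ZSTD_freeCCtxParams(params);
        return err;   /* 0 or an error code, testable with ZSTD_isError() */
    }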
@@ -9,6 +9,7 @@
 
 # This Makefile presumes libzstd is installed, using `sudo make install`
 
+CPPFLAGS += -I../lib
 LIB = ../lib/libzstd.a
 
 .PHONY: default all clean test
@@ -46,7 +46,7 @@ static unsigned readU32FromChar(const char** stringPtr)
 
 int main(int argc, char const *argv[]) {
 
-    printf("\n Zstandard (v%u) memory usage for streaming contexts : \n\n", ZSTD_versionNumber());
+    printf("\n Zstandard (v%s) memory usage for streaming : \n\n", ZSTD_versionString());
 
     unsigned wLog = 0;
     if (argc > 1) {
@@ -69,11 +69,13 @@ int main(int argc, char const *argv[]) {
 
     /* forces compressor to use maximum memory size for given compression level,
      * by not providing any information on input size */
-    ZSTD_parameters params = ZSTD_getParams(compressionLevel, 0, 0);
+    ZSTD_parameters params = ZSTD_getParams(compressionLevel, ZSTD_CONTENTSIZE_UNKNOWN, 0);
     if (wLog) {  /* special mode : specific wLog */
         printf("Using custom compression parameter : level 1 + wLog=%u \n", wLog);
-        params = ZSTD_getParams(1, 1 << wLog, 0);
-        size_t const error = ZSTD_initCStream_advanced(cstream, NULL, 0, params, 0);
+        params = ZSTD_getParams(1 /*compressionLevel*/,
+                                1 << wLog /*estimatedSrcSize*/,
+                                0 /*no dictionary*/);
+        size_t const error = ZSTD_initCStream_advanced(cstream, NULL, 0, params, ZSTD_CONTENTSIZE_UNKNOWN);
         if (ZSTD_isError(error)) {
             printf("ZSTD_initCStream_advanced error : %s \n", ZSTD_getErrorName(error));
             return 1;
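Why the literal 0 is replaced here: in this era of the API, pledgedSrcSize==0 is being repurposed to mean "source is known to be empty", while ZSTD_CONTENTSIZE_UNKNOWN (defined as (0ULL - 1)) carries the old "size not known in advance" meaning. A hedged sketch of the corrected call:

    #define ZSTD_STATIC_LINKING_ONLY
    #include <zstd.h>

    /* pledgedSrcSize semantics in the v1.3.x streaming API :
     *   ZSTD_CONTENTSIZE_UNKNOWN -> size not known in advance (old meaning of 0)
     *   0                        -> source is promised to be empty
     * so a literal 0 would pre-size the context for an empty input. */
    size_t initUnknownSize(ZSTD_CStream* cstream, ZSTD_parameters params)
    {
        return ZSTD_initCStream_advanced(cstream, NULL, 0, params,
                                         ZSTD_CONTENTSIZE_UNKNOWN);
    }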
lib/BUCK : 13 additions
@@ -25,6 +25,9 @@ cxx_library(
     name='decompress',
     header_namespace='',
     visibility=['PUBLIC'],
+    headers=subdir_glob([
+        ('decompress', '*_impl.h'),
+    ]),
     srcs=glob(['decompress/zstd*.c']),
     deps=[
         ':common',
@@ -80,6 +83,15 @@ cxx_library(
     ]),
 )
 
+cxx_library(
+    name='cpu',
+    header_namespace='',
+    visibility=['PUBLIC'],
+    exported_headers=subdir_glob([
+        ('common', 'cpu.h'),
+    ]),
+)
+
 cxx_library(
     name='bitstream',
     header_namespace='',
@@ -196,6 +208,7 @@ cxx_library(
     deps=[
         ':bitstream',
         ':compiler',
+        ':cpu',
         ':entropy',
         ':errors',
         ':mem',
@@ -63,6 +63,25 @@
 # endif
 #endif
 
+/* target attribute */
+#if defined(__GNUC__)
+#  define TARGET_ATTRIBUTE(target) __attribute__((__target__(target)))
+#else
+#  define TARGET_ATTRIBUTE(target)
+#endif
+
+/* Enable runtime BMI2 dispatch based on the CPU.
+ * Enabled for clang & gcc >=4.8 on x86 when BMI2 isn't enabled by default.
+ */
+#ifndef DYNAMIC_BMI2
+#  if defined(__GNUC__) && (__GNUC__ >= 5 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 8)) \
+      && (defined(__x86_64__) || defined(_M_X86)) && !defined(__BMI2__)
+#    define DYNAMIC_BMI2 1
+#  else
+#    define DYNAMIC_BMI2 0
+#  endif
+#endif
+
 /* prefetch */
 #if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_I86))   /* _mm_prefetch() is not defined outside of x86/x64 */
 #  include <mmintrin.h>   /* https://msdn.microsoft.com/fr-fr/library/84szxsww(v=vs.90).aspx */
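The two macros added here enable the compile-twice-and-dispatch pattern used later in this commit. A standalone sketch of the idea (hypothetical function names, gcc/clang assumed; not part of zstd):

    #include <stddef.h>
    #include <stdint.h>

    #define TARGET_ATTRIBUTE(target) __attribute__((__target__(target)))

    /* baseline build of the hot loop */
    static uint32_t sum_default(const uint32_t* v, size_t n)
    {
        uint32_t s = 0;
        size_t i;
        for (i = 0; i < n; i++) s += v[i];
        return s;
    }

    /* same source, compiled with BMI2 enabled for this one function only */
    TARGET_ATTRIBUTE("bmi2")
    static uint32_t sum_bmi2(const uint32_t* v, size_t n)
    {
        uint32_t s = 0;
        size_t i;
        for (i = 0; i < n; i++) s += v[i];
        return s;
    }

    /* runtime dispatch on a cpuid-derived flag (see cpu.h below) */
    uint32_t sum_dispatch(const uint32_t* v, size_t n, int bmi2)
    {
        return bmi2 ? sum_bmi2(v, n) : sum_default(v, n);
    }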
lib/common/cpu.h : new file, 216 lines
@@ -0,0 +1,216 @@
+/*
+ * Copyright (c) 2018-present, Facebook, Inc.
+ * All rights reserved.
+ *
+ * This source code is licensed under both the BSD-style license (found in the
+ * LICENSE file in the root directory of this source tree) and the GPLv2 (found
+ * in the COPYING file in the root directory of this source tree).
+ * You may select, at your option, one of the above-listed licenses.
+ */
+
+#ifndef ZSTD_COMMON_CPU_H
+#define ZSTD_COMMON_CPU_H
+
+/**
+ * Implementation taken from folly/CpuId.h
+ * https://github.com/facebook/folly/blob/master/folly/CpuId.h
+ */
+
+#include <string.h>
+
+#include "mem.h"
+
+#ifdef _MSC_VER
+#include <intrin.h>
+#endif
+
+typedef struct {
+    U32 f1c;
+    U32 f1d;
+    U32 f7b;
+    U32 f7c;
+} ZSTD_cpuid_t;
+
+MEM_STATIC ZSTD_cpuid_t ZSTD_cpuid(void) {
+    U32 f1c = 0;
+    U32 f1d = 0;
+    U32 f7b = 0;
+    U32 f7c = 0;
+#ifdef _MSC_VER
+    int reg[4];
+    __cpuid((int*)reg, 0);
+    {
+        int const n = reg[0];
+        if (n >= 1) {
+            __cpuid((int*)reg, 1);
+            f1c = (U32)reg[2];
+            f1d = (U32)reg[3];
+        }
+        if (n >= 7) {
+            __cpuidex((int*)reg, 7, 0);
+            f7b = (U32)reg[1];
+            f7c = (U32)reg[2];
+        }
+    }
+#elif defined(__i386__) && defined(__PIC__) && !defined(__clang__) && defined(__GNUC__)
+    /* The following block like the normal cpuid branch below, but gcc
+     * reserves ebx for use of its pic register so we must specially
+     * handle the save and restore to avoid clobbering the register
+     */
+    U32 n;
+    __asm__(
+        "pushl %%ebx\n\t"
+        "cpuid\n\t"
+        "popl %%ebx\n\t"
+        : "=a"(n)
+        : "a"(0)
+        : "ecx", "edx");
+    if (n >= 1) {
+        U32 f1a;
+        __asm__(
+            "pushl %%ebx\n\t"
+            "cpuid\n\t"
+            "popl %%ebx\n\t"
+            : "=a"(f1a), "=c"(f1c), "=d"(f1d)
+            : "a"(1)
+            :);
+    }
+    if (n >= 7) {
+        __asm__(
+            "pushl %%ebx\n\t"
+            "cpuid\n\t"
+            "movl %%ebx, %%eax\n\r"
+            "popl %%ebx"
+            : "=a"(f7b), "=c"(f7c)
+            : "a"(7), "c"(0)
+            : "edx");
+    }
+#elif defined(__x86_64__) || defined(_M_X64) || defined(__i386__)
+    U32 n;
+    __asm__("cpuid" : "=a"(n) : "a"(0) : "ebx", "ecx", "edx");
+    if (n >= 1) {
+        U32 f1a;
+        __asm__("cpuid" : "=a"(f1a), "=c"(f1c), "=d"(f1d) : "a"(1) : "ebx");
+    }
+    if (n >= 7) {
+        U32 f7a;
+        __asm__("cpuid"
+                : "=a"(f7a), "=b"(f7b), "=c"(f7c)
+                : "a"(7), "c"(0)
+                : "edx");
+    }
+#endif
+    {
+        ZSTD_cpuid_t cpuid;
+        cpuid.f1c = f1c;
+        cpuid.f1d = f1d;
+        cpuid.f7b = f7b;
+        cpuid.f7c = f7c;
+        return cpuid;
+    }
+}
+
+#define X(name, r, bit)                                                \
+    MEM_STATIC int ZSTD_cpuid_##name(ZSTD_cpuid_t const cpuid) {       \
+        return ((cpuid.r) & (1U << bit)) != 0;                         \
+    }
+
+/* cpuid(1): Processor Info and Feature Bits. */
+#define C(name, bit) X(name, f1c, bit)
+C(sse3, 0)
+C(pclmuldq, 1)
+C(dtes64, 2)
+C(monitor, 3)
+C(dscpl, 4)
+C(vmx, 5)
+C(smx, 6)
+C(eist, 7)
+C(tm2, 8)
+C(ssse3, 9)
+C(cnxtid, 10)
+C(fma, 12)
+C(cx16, 13)
+C(xtpr, 14)
+C(pdcm, 15)
+C(pcid, 17)
+C(dca, 18)
+C(sse41, 19)
+C(sse42, 20)
+C(x2apic, 21)
+C(movbe, 22)
+C(popcnt, 23)
+C(tscdeadline, 24)
+C(aes, 25)
+C(xsave, 26)
+C(osxsave, 27)
+C(avx, 28)
+C(f16c, 29)
+C(rdrand, 30)
+#undef C
+#define D(name, bit) X(name, f1d, bit)
+D(fpu, 0)
+D(vme, 1)
+D(de, 2)
+D(pse, 3)
+D(tsc, 4)
+D(msr, 5)
+D(pae, 6)
+D(mce, 7)
+D(cx8, 8)
+D(apic, 9)
+D(sep, 11)
+D(mtrr, 12)
+D(pge, 13)
+D(mca, 14)
+D(cmov, 15)
+D(pat, 16)
+D(pse36, 17)
+D(psn, 18)
+D(clfsh, 19)
+D(ds, 21)
+D(acpi, 22)
+D(mmx, 23)
+D(fxsr, 24)
+D(sse, 25)
+D(sse2, 26)
+D(ss, 27)
+D(htt, 28)
+D(tm, 29)
+D(pbe, 31)
+#undef D
+
+/* cpuid(7): Extended Features. */
+#define B(name, bit) X(name, f7b, bit)
+B(bmi1, 3)
+B(hle, 4)
+B(avx2, 5)
+B(smep, 7)
+B(bmi2, 8)
+B(erms, 9)
+B(invpcid, 10)
+B(rtm, 11)
+B(mpx, 14)
+B(avx512f, 16)
+B(avx512dq, 17)
+B(rdseed, 18)
+B(adx, 19)
+B(smap, 20)
+B(avx512ifma, 21)
+B(pcommit, 22)
+B(clflushopt, 23)
+B(clwb, 24)
+B(avx512pf, 26)
+B(avx512er, 27)
+B(avx512cd, 28)
+B(sha, 29)
+B(avx512bw, 30)
+B(avx512vl, 31)
+#undef B
+#define C(name, bit) X(name, f7c, bit)
+C(prefetchwt1, 0)
+C(avx512vbmi, 1)
+#undef C
+
+#undef X
+
+#endif /* ZSTD_COMMON_CPU_H */
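A hedged usage sketch for this new header (the caller code is hypothetical): detect features once, then thread the flag through BMI2-dispatched entry points:

    #include "cpu.h"   /* ZSTD_cpuid(), ZSTD_cpuid_bmi2() */

    /* Detect once at startup, then reuse the flag at hot call sites. */
    static int g_bmi2 = 0;

    void init_dispatch(void)
    {
        ZSTD_cpuid_t const cpuid = ZSTD_cpuid();
        g_bmi2 = ZSTD_cpuid_bmi2(cpuid);   /* 1 iff cpuid(7).EBX bit 8 is set */
    }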
@@ -29,6 +29,7 @@ const char* ERR_getErrorString(ERR_enum code)
     case PREFIX(parameter_outOfBound): return "Parameter is out of bound";
     case PREFIX(init_missing): return "Context should be init first";
     case PREFIX(memory_allocation): return "Allocation error : not enough memory";
+    case PREFIX(workSpace_tooSmall): return "workSpace buffer is not large enough";
     case PREFIX(stage_wrong): return "Operation not authorized at current processing stage";
     case PREFIX(tableLog_tooLarge): return "tableLog requires too much memory : unsupported";
     case PREFIX(maxSymbolValue_tooLarge): return "Unsupported max Symbol Value : too large";
@@ -67,7 +67,6 @@ HUF_compress() :
     `srcSize` must be <= `HUF_BLOCKSIZE_MAX` == 128 KB.
     @return : size of compressed data (<= `dstCapacity`).
     Special values : if return == 0, srcData is not compressible => Nothing is stored within dst !!!
-                     if return == 1, srcData is a single repeated byte symbol (RLE compression).
                      if HUF_isError(return), compression failed (more details using HUF_getErrorName())
 */
 HUF_PUBLIC_API size_t HUF_compress(void* dst, size_t dstCapacity,
@@ -80,7 +79,7 @@ HUF_decompress() :
     `originalSize` : **must** be the ***exact*** size of original (uncompressed) data.
     Note : in contrast with FSE, HUF_decompress can regenerate
            RLE (cSrcSize==1) and uncompressed (cSrcSize==dstSize) data,
-           because it knows size to regenerate.
+           because it knows size to regenerate (originalSize).
     @return : size of regenerated data (== originalSize),
               or an error code, which can be tested using HUF_isError()
 */
@@ -101,6 +100,7 @@ HUF_PUBLIC_API const char* HUF_getErrorName(size_t code);   /**< provides error c
 
 /** HUF_compress2() :
  *  Same as HUF_compress(), but offers direct control over `maxSymbolValue` and `tableLog`.
+ *  `maxSymbolValue` must be <= HUF_SYMBOLVALUE_MAX .
  *  `tableLog` must be `<= HUF_TABLELOG_MAX` . */
 HUF_PUBLIC_API size_t HUF_compress2 (void* dst, size_t dstCapacity, const void* src, size_t srcSize, unsigned maxSymbolValue, unsigned tableLog);
 
@@ -129,7 +129,7 @@ HUF_PUBLIC_API size_t HUF_compress4X_wksp (void* dst, size_t dstCapacity, const
 /* ******************************************************************
  * WARNING !!
  * The following section contains advanced and experimental definitions
- * which shall never be used in the context of dll
+ * which shall never be used in the context of a dynamic library,
  * because they are not guaranteed to remain stable in the future.
  * Only consider them in association with static linking.
  *******************************************************************/
@@ -141,11 +141,11 @@ HUF_PUBLIC_API size_t HUF_compress4X_wksp (void* dst, size_t dstCapacity, const
 
 
 /* *** Constants *** */
-#define HUF_TABLELOG_MAX      12      /* max configured tableLog (for static allocation); can be modified up to HUF_ABSOLUTEMAX_TABLELOG */
-#define HUF_TABLELOG_DEFAULT  11      /* tableLog by default, when not specified */
+#define HUF_TABLELOG_MAX      12      /* max runtime value of tableLog (due to static allocation); can be modified up to HUF_ABSOLUTEMAX_TABLELOG */
+#define HUF_TABLELOG_DEFAULT  11      /* default tableLog value when none specified */
 #define HUF_SYMBOLVALUE_MAX  255
 
 #define HUF_TABLELOG_ABSOLUTEMAX  15  /* absolute limit of HUF_MAX_TABLELOG. Beyond that value, code does not work */
 #if (HUF_TABLELOG_MAX > HUF_TABLELOG_ABSOLUTEMAX)
 #  error "HUF_TABLELOG_MAX is too large !"
 #endif
@@ -223,12 +223,13 @@ typedef enum {
  *  If it uses hufTable it does not modify hufTable or repeat.
  *  If it doesn't, it sets *repeat = HUF_repeat_none, and it sets hufTable to the table used.
  *  If preferRepeat then the old table will always be used if valid. */
-size_t HUF_compress4X_repeat(void* dst, size_t dstSize, const void* src, size_t srcSize, unsigned maxSymbolValue, unsigned tableLog, void* workSpace, size_t wkspSize, HUF_CElt* hufTable, HUF_repeat* repeat, int preferRepeat);   /**< `workSpace` must be a table of at least HUF_WORKSPACE_SIZE_U32 unsigned */
+size_t HUF_compress4X_repeat(void* dst, size_t dstSize, const void* src, size_t srcSize, unsigned maxSymbolValue, unsigned tableLog, void* workSpace, size_t wkspSize, HUF_CElt* hufTable, HUF_repeat* repeat, int preferRepeat, int bmi2);   /**< `workSpace` must be a table of at least HUF_WORKSPACE_SIZE_U32 unsigned */
 
 /** HUF_buildCTable_wksp() :
  *  Same as HUF_buildCTable(), but using externally allocated scratch buffer.
- *  `workSpace` must be aligned on 4-bytes boundaries, and be at least as large as a table of 1024 unsigned.
+ *  `workSpace` must be aligned on 4-bytes boundaries, and be at least as large as a table of HUF_CTABLE_WORKSPACE_SIZE_U32 unsigned.
  */
+#define HUF_CTABLE_WORKSPACE_SIZE_U32 (2*HUF_SYMBOLVALUE_MAX +1 +1)
 size_t HUF_buildCTable_wksp (HUF_CElt* tree, const U32* count, U32 maxSymbolValue, U32 maxNbBits, void* workSpace, size_t wkspSize);
 
 /*! HUF_readStats() :
@@ -236,8 +237,8 @@ size_t HUF_buildCTable_wksp (HUF_CElt* tree, const U32* count, U32 maxSymbolValu
     `huffWeight` is destination buffer.
     @return : size read from `src` , or an error Code .
     Note : Needed by HUF_readCTable() and HUF_readDTableXn() . */
-size_t HUF_readStats(BYTE* huffWeight, size_t hwSize, U32* rankStats,
-                     U32* nbSymbolsPtr, U32* tableLogPtr,
+size_t HUF_readStats(BYTE* huffWeight, size_t hwSize,
+                     U32* rankStats, U32* nbSymbolsPtr, U32* tableLogPtr,
                      const void* src, size_t srcSize);
 
 /** HUF_readCTable() :
@@ -279,7 +280,7 @@ size_t HUF_compress1X_usingCTable(void* dst, size_t dstSize, const void* src, si
  *  If it uses hufTable it does not modify hufTable or repeat.
  *  If it doesn't, it sets *repeat = HUF_repeat_none, and it sets hufTable to the table used.
  *  If preferRepeat then the old table will always be used if valid. */
-size_t HUF_compress1X_repeat(void* dst, size_t dstSize, const void* src, size_t srcSize, unsigned maxSymbolValue, unsigned tableLog, void* workSpace, size_t wkspSize, HUF_CElt* hufTable, HUF_repeat* repeat, int preferRepeat);   /**< `workSpace` must be a table of at least HUF_WORKSPACE_SIZE_U32 unsigned */
+size_t HUF_compress1X_repeat(void* dst, size_t dstSize, const void* src, size_t srcSize, unsigned maxSymbolValue, unsigned tableLog, void* workSpace, size_t wkspSize, HUF_CElt* hufTable, HUF_repeat* repeat, int preferRepeat, int bmi2);   /**< `workSpace` must be a table of at least HUF_WORKSPACE_SIZE_U32 unsigned */
 
 size_t HUF_decompress1X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize);   /* single-symbol decoder */
 size_t HUF_decompress1X4 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize);   /* double-symbol decoder */
@@ -295,6 +296,14 @@ size_t HUF_decompress1X_usingDTable(void* dst, size_t maxDstSize, const void* cS
 size_t HUF_decompress1X2_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable);
 size_t HUF_decompress1X4_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable);
 
+/* BMI2 variants.
+ * If the CPU has BMI2 support pass bmi2=1, otherwise pass bmi2=0.
+ */
+size_t HUF_decompress1X_usingDTable_bmi2(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable, int bmi2);
+size_t HUF_decompress1X2_DCtx_wksp_bmi2(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize, int bmi2);
+size_t HUF_decompress4X_usingDTable_bmi2(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable, int bmi2);
+size_t HUF_decompress4X_hufOnly_wksp_bmi2(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize, int bmi2);
+
 #endif /* HUF_STATIC_LINKING_ONLY */
 
 #if defined (__cplusplus)
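Combining the new prototypes with cpu.h gives a call pattern like this sketch (buffer handling elided; names outside huf.h are hypothetical):

    #define HUF_STATIC_LINKING_ONLY
    #include "huf.h"
    #include "cpu.h"

    /* Decompress a HUF-compressed block, using the BMI2 fast path when available. */
    size_t decodeBlock(void* dst, size_t dstSize,
                       const void* cSrc, size_t cSrcSize,
                       const HUF_DTable* DTable)
    {
        int const bmi2 = ZSTD_cpuid_bmi2(ZSTD_cpuid());   /* detect once, or cache it */
        return HUF_decompress1X_usingDTable_bmi2(dst, dstSize, cSrc, cSrcSize, DTable, bmi2);
    }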
@@ -35,12 +35,20 @@ extern "C" {
 #  define ZSTDERRORLIB_API ZSTDERRORLIB_VISIBILITY
 #endif
 
-/*-****************************************
- *  error codes list
- *  note : this API is still considered unstable
- *         and shall not be used with a dynamic library.
- *         only static linking is allowed
- ******************************************/
+/*-*********************************************
+ *  Error codes list
+ *-*********************************************
+ *  Error codes _values_ are pinned down since v1.3.1 only.
+ *  Therefore, don't rely on values if you may link to any version < v1.3.1.
+ *
+ *  Only values < 100 are considered stable.
+ *
+ *  note 1 : this API shall be used with static linking only.
+ *           dynamic linking is not yet officially supported.
+ *  note 2 : Prefer relying on the enum than on its value whenever possible
+ *           This is the only supported way to use the error list < v1.3.1
+ *  note 3 : ZSTD_isError() is always correct, whatever the library version.
+ **********************************************/
 typedef enum {
   ZSTD_error_no_error = 0,
   ZSTD_error_GENERIC  = 1,
@@ -61,9 +69,10 @@ typedef enum {
   ZSTD_error_stage_wrong       = 60,
   ZSTD_error_init_missing      = 62,
   ZSTD_error_memory_allocation = 64,
+  ZSTD_error_workSpace_tooSmall= 66,
   ZSTD_error_dstSize_tooSmall = 70,
   ZSTD_error_srcSize_wrong    = 72,
-  /* following error codes are not stable and may be removed or changed in a future version */
+  /* following error codes are __NOT STABLE__, they can be removed or changed in future versions */
   ZSTD_error_frameIndex_tooLarge = 100,
   ZSTD_error_seekableIO          = 102,
   ZSTD_error_maxCode = 120  /* never EVER use this value directly, it can change in future versions! Use ZSTD_isError() instead */
@@ -132,14 +132,15 @@ typedef enum { set_basic, set_rle, set_compressed, set_repeat } symbolEncodingTy
 
 #define Litbits  8
 #define MaxLit ((1<<Litbits) - 1)
 #define MaxML   52
 #define MaxLL   35
 #define DefaultMaxOff 28
 #define MaxOff  31
 #define MaxSeq MAX(MaxLL, MaxML)   /* Assumption : MaxOff < MaxLL,MaxML */
 #define MLFSELog    9
 #define LLFSELog    9
 #define OffFSELog   8
+#define MaxFSELog  MAX(MAX(MLFSELog, LLFSELog), OffFSELog)
 
 static const U32 LL_bits[MaxLL+1] = { 0, 0, 0, 0, 0, 0, 0, 0,
                                       0, 0, 0, 0, 0, 0, 0, 0,
@@ -212,6 +213,12 @@ MEM_STATIC void ZSTD_wildcopy_e(void* dst, const void* src, void* dstEnd)   /* s
 /*-*******************************************
 *  Private declarations
 *********************************************/
+typedef struct rawSeq_s {
+    U32 offset;
+    U32 litLength;
+    U32 matchLength;
+} rawSeq;
+
 typedef struct seqDef_s {
     U32 offset;
     U16 litLength;
@@ -248,7 +248,7 @@ static size_t FSE_writeNCount_generic (void* header, size_t headerBufferSize,
             bitCount  -= (count<max);
             previous0  = (count==1);
             if (remaining<1) return ERROR(GENERIC);
-            while (remaining<threshold) nbBits--, threshold>>=1;
+            while (remaining<threshold) { nbBits--; threshold>>=1; }
         }
         if (bitCount>16) {
             if ((!writeIsSafe) && (out > oend - 2)) return ERROR(dstSize_tooSmall);   /* Buffer overflow */
@@ -540,7 +540,7 @@ static size_t FSE_normalizeM2(short* norm, U32 tableLog, const unsigned* count,
            find max, then give all remaining points to max */
         U32 maxV = 0, maxC = 0;
         for (s=0; s<=maxSymbolValue; s++)
-            if (count[s] > maxC) maxV=s, maxC=count[s];
+            if (count[s] > maxC) { maxV=s; maxC=count[s]; }
         norm[maxV] += (short)ToDistribute;
         return 0;
     }
@@ -548,7 +548,7 @@ static size_t FSE_normalizeM2(short* norm, U32 tableLog, const unsigned* count,
     if (total == 0) {
         /* all of the symbols were low enough for the lowOne or lowThreshold */
         for (s=0; ToDistribute > 0; s = (s+1)%(maxSymbolValue+1))
-            if (norm[s] > 0) ToDistribute--, norm[s]++;
+            if (norm[s] > 0) { ToDistribute--; norm[s]++; }
         return 0;
     }
 
@@ -604,7 +604,7 @@ size_t FSE_normalizeCount (short* normalizedCounter, unsigned tableLog,
                 U64 restToBeat = vStep * rtbTable[proba];
                 proba += (count[s]*step) - ((U64)proba<<scale) > restToBeat;
             }
-            if (proba > largestP) largestP=proba, largest=s;
+            if (proba > largestP) { largestP=proba; largest=s; }
             normalizedCounter[s] = proba;
             stillToDistribute -= proba;
     }   }
@@ -46,6 +46,7 @@
 #include <string.h>     /* memcpy, memset */
 #include <stdio.h>      /* printf (debug) */
 #include "bitstream.h"
+#include "compiler.h"
 #define FSE_STATIC_LINKING_ONLY   /* FSE_optimalTableLog_internal */
 #include "fse.h"        /* header compression */
 #define HUF_STATIC_LINKING_ONLY
@@ -322,7 +323,10 @@ static void HUF_sort(nodeElt* huffNode, const U32* count, U32 maxSymbolValue)
         U32 const c = count[n];
         U32 const r = BIT_highbit32(c+1) + 1;
         U32 pos = rank[r].current++;
-        while ((pos > rank[r].base) && (c > huffNode[pos-1].count)) huffNode[pos]=huffNode[pos-1], pos--;
+        while ((pos > rank[r].base) && (c > huffNode[pos-1].count)) {
+            huffNode[pos] = huffNode[pos-1];
+            pos--;
+        }
         huffNode[pos].count = c;
         huffNode[pos].byte  = (BYTE)n;
     }
@@ -331,10 +335,10 @@ static void HUF_sort(nodeElt* huffNode, const U32* count, U32 maxSymbolValue)
 
 /** HUF_buildCTable_wksp() :
  *  Same as HUF_buildCTable(), but using externally allocated scratch buffer.
- *  `workSpace` must be aligned on 4-bytes boundaries, and be at least as large as a table of 1024 unsigned.
+ *  `workSpace` must be aligned on 4-bytes boundaries, and be at least as large as a table of HUF_CTABLE_WORKSPACE_SIZE_U32 unsigned.
  */
 #define STARTNODE (HUF_SYMBOLVALUE_MAX+1)
-typedef nodeElt huffNodeTable[2*HUF_SYMBOLVALUE_MAX+1 +1];
+typedef nodeElt huffNodeTable[HUF_CTABLE_WORKSPACE_SIZE_U32];
 size_t HUF_buildCTable_wksp (HUF_CElt* tree, const U32* count, U32 maxSymbolValue, U32 maxNbBits, void* workSpace, size_t wkspSize)
 {
     nodeElt* const huffNode0 = (nodeElt*)workSpace;
@@ -345,9 +349,10 @@ size_t HUF_buildCTable_wksp (HUF_CElt* tree, const U32* count, U32 maxSymbolValu
     U32 nodeRoot;
 
     /* safety checks */
-    if (wkspSize < sizeof(huffNodeTable)) return ERROR(GENERIC);   /* workSpace is not large enough */
+    if (((size_t)workSpace & 3) != 0) return ERROR(GENERIC);   /* must be aligned on 4-bytes boundaries */
+    if (wkspSize < sizeof(huffNodeTable)) return ERROR(workSpace_tooSmall);
     if (maxNbBits == 0) maxNbBits = HUF_TABLELOG_DEFAULT;
-    if (maxSymbolValue > HUF_SYMBOLVALUE_MAX) return ERROR(GENERIC);
+    if (maxSymbolValue > HUF_SYMBOLVALUE_MAX) return ERROR(maxSymbolValue_tooLarge);
     memset(huffNode0, 0, sizeof(huffNodeTable));
 
     /* sort, decreasing order */
@@ -433,117 +438,69 @@ static int HUF_validateCTable(const HUF_CElt* CTable, const unsigned* count, uns
     return !bad;
 }
 
-static void HUF_encodeSymbol(BIT_CStream_t* bitCPtr, U32 symbol, const HUF_CElt* CTable)
-{
-    BIT_addBitsFast(bitCPtr, CTable[symbol].val, CTable[symbol].nbBits);
-}
-
 size_t HUF_compressBound(size_t size) { return HUF_COMPRESSBOUND(size); }
 
-#define HUF_FLUSHBITS(s)  BIT_flushBits(s)
-
-#define HUF_FLUSHBITS_1(stream) \
-    if (sizeof((stream)->bitContainer)*8 < HUF_TABLELOG_MAX*2+7) HUF_FLUSHBITS(stream)
-
-#define HUF_FLUSHBITS_2(stream) \
-    if (sizeof((stream)->bitContainer)*8 < HUF_TABLELOG_MAX*4+7) HUF_FLUSHBITS(stream)
+#define FUNCTION(fn) fn##_default
+#define TARGET
+#include "huf_compress_impl.h"
+#undef TARGET
+#undef FUNCTION
+
+#if DYNAMIC_BMI2
+
+#define FUNCTION(fn) fn##_bmi2
+#define TARGET TARGET_ATTRIBUTE("bmi2")
+#include "huf_compress_impl.h"
+#undef TARGET
+#undef FUNCTION
+
+#endif
+
+static size_t HUF_compress1X_usingCTable_internal(void* dst, size_t dstSize,
+                                   const void* src, size_t srcSize,
+                                   const HUF_CElt* CTable, const int bmi2)
+{
+#if DYNAMIC_BMI2
+    if (bmi2) {
+        return HUF_compress1X_usingCTable_internal_bmi2(dst, dstSize, src, srcSize, CTable);
+    }
+#endif
+    (void)bmi2;
+    return HUF_compress1X_usingCTable_internal_default(dst, dstSize, src, srcSize, CTable);
+}
+
+static size_t HUF_compress4X_usingCTable_internal(void* dst, size_t dstSize,
+                                   const void* src, size_t srcSize,
+                                   const HUF_CElt* CTable, const int bmi2)
+{
+#if DYNAMIC_BMI2
+    if (bmi2) {
+        return HUF_compress4X_usingCTable_internal_bmi2(dst, dstSize, src, srcSize, CTable);
+    }
+#endif
+    (void)bmi2;
+    return HUF_compress4X_usingCTable_internal_default(dst, dstSize, src, srcSize, CTable);
+}
 
 size_t HUF_compress1X_usingCTable(void* dst, size_t dstSize, const void* src, size_t srcSize, const HUF_CElt* CTable)
 {
-    const BYTE* ip = (const BYTE*) src;
-    BYTE* const ostart = (BYTE*)dst;
-    BYTE* const oend = ostart + dstSize;
-    BYTE* op = ostart;
-    size_t n;
-    BIT_CStream_t bitC;
-
-    /* init */
-    if (dstSize < 8) return 0;   /* not enough space to compress */
-    { size_t const initErr = BIT_initCStream(&bitC, op, oend-op);
-      if (HUF_isError(initErr)) return 0; }
-
-    n = srcSize & ~3;  /* join to mod 4 */
-    switch (srcSize & 3)
-    {
-        case 3 : HUF_encodeSymbol(&bitC, ip[n+ 2], CTable);
-                 HUF_FLUSHBITS_2(&bitC);
-                 /* fall-through */
-        case 2 : HUF_encodeSymbol(&bitC, ip[n+ 1], CTable);
-                 HUF_FLUSHBITS_1(&bitC);
-                 /* fall-through */
-        case 1 : HUF_encodeSymbol(&bitC, ip[n+ 0], CTable);
-                 HUF_FLUSHBITS(&bitC);
-                 /* fall-through */
-        case 0 : /* fall-through */
-        default: break;
-    }
-
-    for (; n>0; n-=4) {  /* note : n&3==0 at this stage */
-        HUF_encodeSymbol(&bitC, ip[n- 1], CTable);
-        HUF_FLUSHBITS_1(&bitC);
-        HUF_encodeSymbol(&bitC, ip[n- 2], CTable);
-        HUF_FLUSHBITS_2(&bitC);
-        HUF_encodeSymbol(&bitC, ip[n- 3], CTable);
-        HUF_FLUSHBITS_1(&bitC);
-        HUF_encodeSymbol(&bitC, ip[n- 4], CTable);
-        HUF_FLUSHBITS(&bitC);
-    }
-
-    return BIT_closeCStream(&bitC);
+    return HUF_compress1X_usingCTable_internal(dst, dstSize, src, srcSize, CTable, /* bmi2 */ 0);
 }
 
 
 size_t HUF_compress4X_usingCTable(void* dst, size_t dstSize, const void* src, size_t srcSize, const HUF_CElt* CTable)
 {
-    size_t const segmentSize = (srcSize+3)/4;   /* first 3 segments */
-    const BYTE* ip = (const BYTE*) src;
-    const BYTE* const iend = ip + srcSize;
-    BYTE* const ostart = (BYTE*) dst;
-    BYTE* const oend = ostart + dstSize;
-    BYTE* op = ostart;
-
-    if (dstSize < 6 + 1 + 1 + 1 + 8) return 0;   /* minimum space to compress successfully */
-    if (srcSize < 12) return 0;   /* no saving possible : too small input */
-    op += 6;   /* jumpTable */
-
-    { CHECK_V_F(cSize, HUF_compress1X_usingCTable(op, oend-op, ip, segmentSize, CTable) );
-      if (cSize==0) return 0;
-      MEM_writeLE16(ostart, (U16)cSize);
-      op += cSize;
-    }
-
-    ip += segmentSize;
-    { CHECK_V_F(cSize, HUF_compress1X_usingCTable(op, oend-op, ip, segmentSize, CTable) );
-      if (cSize==0) return 0;
-      MEM_writeLE16(ostart+2, (U16)cSize);
-      op += cSize;
-    }
-
-    ip += segmentSize;
-    { CHECK_V_F(cSize, HUF_compress1X_usingCTable(op, oend-op, ip, segmentSize, CTable) );
-      if (cSize==0) return 0;
-      MEM_writeLE16(ostart+4, (U16)cSize);
-      op += cSize;
-    }
-
-    ip += segmentSize;
-    { CHECK_V_F(cSize, HUF_compress1X_usingCTable(op, oend-op, ip, iend-ip, CTable) );
-      if (cSize==0) return 0;
-      op += cSize;
-    }
-
-    return op-ostart;
+    return HUF_compress4X_usingCTable_internal(dst, dstSize, src, srcSize, CTable, /* bmi2 */ 0);
 }
 
 
 static size_t HUF_compressCTable_internal(
                 BYTE* const ostart, BYTE* op, BYTE* const oend,
                 const void* src, size_t srcSize,
-                unsigned singleStream, const HUF_CElt* CTable)
+                unsigned singleStream, const HUF_CElt* CTable, const int bmi2)
 {
     size_t const cSize = singleStream ?
-                         HUF_compress1X_usingCTable(op, oend - op, src, srcSize, CTable) :
-                         HUF_compress4X_usingCTable(op, oend - op, src, srcSize, CTable);
+                         HUF_compress1X_usingCTable_internal(op, oend - op, src, srcSize, CTable, bmi2) :
+                         HUF_compress4X_usingCTable_internal(op, oend - op, src, srcSize, CTable, bmi2);
     if (HUF_isError(cSize)) { return cSize; }
     if (cSize==0) { return 0; }   /* uncompressible */
     op += cSize;
@ -552,86 +509,98 @@ static size_t HUF_compressCTable_internal(
|
|||||||
return op-ostart;
|
return op-ostart;
|
||||||
}
|
}
|
||||||
 
+typedef struct {
+    U32 count[HUF_SYMBOLVALUE_MAX + 1];
+    HUF_CElt CTable[HUF_SYMBOLVALUE_MAX + 1];
+    huffNodeTable nodeTable;
+} HUF_compress_tables_t;
 
-/* `workSpace` must be a table of at least 1024 unsigned */
+/* HUF_compress_internal() :
+ * `workSpace` must be a table of at least HUF_WORKSPACE_SIZE_U32 unsigned */
 static size_t HUF_compress_internal (
                 void* dst, size_t dstSize,
                 const void* src, size_t srcSize,
                 unsigned maxSymbolValue, unsigned huffLog,
                 unsigned singleStream,
                 void* workSpace, size_t wkspSize,
-                HUF_CElt* oldHufTable, HUF_repeat* repeat, int preferRepeat)
+                HUF_CElt* oldHufTable, HUF_repeat* repeat, int preferRepeat,
+                const int bmi2)
 {
+    HUF_compress_tables_t* const table = (HUF_compress_tables_t*)workSpace;
     BYTE* const ostart = (BYTE*)dst;
     BYTE* const oend = ostart + dstSize;
     BYTE* op = ostart;
 
-    U32* count;
-    size_t const countSize = sizeof(U32) * (HUF_SYMBOLVALUE_MAX + 1);
-    HUF_CElt* CTable;
-    size_t const CTableSize = sizeof(HUF_CElt) * (HUF_SYMBOLVALUE_MAX + 1);
-
     /* checks & inits */
-    if (wkspSize < sizeof(huffNodeTable) + countSize + CTableSize) return ERROR(GENERIC);
-    if (!srcSize) return 0;   /* Uncompressed (note : 1 means rle, so first byte must be correct) */
-    if (!dstSize) return 0;   /* cannot fit within dst budget */
+    if (((size_t)workSpace & 3) != 0) return ERROR(GENERIC);   /* must be aligned on 4-bytes boundaries */
+    if (wkspSize < sizeof(*table)) return ERROR(workSpace_tooSmall);
+    if (!srcSize) return 0;   /* Uncompressed */
+    if (!dstSize) return 0;   /* cannot fit anything within dst budget */
     if (srcSize > HUF_BLOCKSIZE_MAX) return ERROR(srcSize_wrong);   /* current block size limit */
     if (huffLog > HUF_TABLELOG_MAX) return ERROR(tableLog_tooLarge);
+    if (maxSymbolValue > HUF_SYMBOLVALUE_MAX) return ERROR(maxSymbolValue_tooLarge);
     if (!maxSymbolValue) maxSymbolValue = HUF_SYMBOLVALUE_MAX;
     if (!huffLog) huffLog = HUF_TABLELOG_DEFAULT;
 
-    count = (U32*)workSpace;
-    workSpace = (BYTE*)workSpace + countSize;
-    wkspSize -= countSize;
-    CTable = (HUF_CElt*)workSpace;
-    workSpace = (BYTE*)workSpace + CTableSize;
-    wkspSize -= CTableSize;
-
-    /* Heuristic : If we don't need to check the validity of the old table use the old table for small inputs */
+    /* Heuristic : If old table is valid, use it for small inputs */
     if (preferRepeat && repeat && *repeat == HUF_repeat_valid) {
-        return HUF_compressCTable_internal(ostart, op, oend, src, srcSize, singleStream, oldHufTable);
+        return HUF_compressCTable_internal(ostart, op, oend,
+                                           src, srcSize,
+                                           singleStream, oldHufTable, bmi2);
     }
 
     /* Scan input and build symbol stats */
-    {   CHECK_V_F(largest, FSE_count_wksp (count, &maxSymbolValue, (const BYTE*)src, srcSize, (U32*)workSpace) );
+    {   CHECK_V_F(largest, FSE_count_wksp (table->count, &maxSymbolValue, (const BYTE*)src, srcSize, table->count) );
         if (largest == srcSize) { *ostart = ((const BYTE*)src)[0]; return 1; }   /* single symbol, rle */
-        if (largest <= (srcSize >> 7)+1) return 0;   /* Fast heuristic : not compressible enough */
+        if (largest <= (srcSize >> 7)+1) return 0;   /* heuristic : probably not compressible enough */
    }
 
     /* Check validity of previous table */
-    if (repeat && *repeat == HUF_repeat_check && !HUF_validateCTable(oldHufTable, count, maxSymbolValue)) {
+    if ( repeat
+      && *repeat == HUF_repeat_check
+      && !HUF_validateCTable(oldHufTable, table->count, maxSymbolValue)) {
         *repeat = HUF_repeat_none;
     }
     /* Heuristic : use existing table for small inputs */
     if (preferRepeat && repeat && *repeat != HUF_repeat_none) {
-        return HUF_compressCTable_internal(ostart, op, oend, src, srcSize, singleStream, oldHufTable);
+        return HUF_compressCTable_internal(ostart, op, oend,
+                                           src, srcSize,
+                                           singleStream, oldHufTable, bmi2);
    }
 
     /* Build Huffman Tree */
     huffLog = HUF_optimalTableLog(huffLog, srcSize, maxSymbolValue);
-    {   CHECK_V_F(maxBits, HUF_buildCTable_wksp (CTable, count, maxSymbolValue, huffLog, workSpace, wkspSize) );
+    {   CHECK_V_F(maxBits, HUF_buildCTable_wksp(table->CTable, table->count,
+                                                maxSymbolValue, huffLog,
+                                                table->nodeTable, sizeof(table->nodeTable)) );
         huffLog = (U32)maxBits;
-        /* Zero the unused symbols so we can check it for validity */
-        memset(CTable + maxSymbolValue + 1, 0, CTableSize - (maxSymbolValue + 1) * sizeof(HUF_CElt));
+        /* Zero unused symbols in CTable, so we can check it for validity */
+        memset(table->CTable + (maxSymbolValue + 1), 0,
+               sizeof(table->CTable) - ((maxSymbolValue + 1) * sizeof(HUF_CElt)));
    }
 
     /* Write table description header */
-    {   CHECK_V_F(hSize, HUF_writeCTable (op, dstSize, CTable, maxSymbolValue, huffLog) );
-        /* Check if using the previous table will be beneficial */
+    {   CHECK_V_F(hSize, HUF_writeCTable (op, dstSize, table->CTable, maxSymbolValue, huffLog) );
+        /* Check if using previous huffman table is beneficial */
         if (repeat && *repeat != HUF_repeat_none) {
-            size_t const oldSize = HUF_estimateCompressedSize(oldHufTable, count, maxSymbolValue);
-            size_t const newSize = HUF_estimateCompressedSize(CTable, count, maxSymbolValue);
+            size_t const oldSize = HUF_estimateCompressedSize(oldHufTable, table->count, maxSymbolValue);
+            size_t const newSize = HUF_estimateCompressedSize(table->CTable, table->count, maxSymbolValue);
             if (oldSize <= hSize + newSize || hSize + 12 >= srcSize) {
-                return HUF_compressCTable_internal(ostart, op, oend, src, srcSize, singleStream, oldHufTable);
-            }
-        }
-        /* Use the new table */
+                return HUF_compressCTable_internal(ostart, op, oend,
+                                                   src, srcSize,
+                                                   singleStream, oldHufTable, bmi2);
+        }   }
 
+        /* Use the new huffman table */
         if (hSize + 12ul >= srcSize) { return 0; }
         op += hSize;
         if (repeat) { *repeat = HUF_repeat_none; }
-        if (oldHufTable) { memcpy(oldHufTable, CTable, CTableSize); }  /* Save the new table */
+        if (oldHufTable)
+            memcpy(oldHufTable, table->CTable, sizeof(table->CTable));  /* Save new table */
    }
-    return HUF_compressCTable_internal(ostart, op, oend, src, srcSize, singleStream, CTable);
+    return HUF_compressCTable_internal(ostart, op, oend,
+                                       src, srcSize,
+                                       singleStream, table->CTable, bmi2);
 }
 
 
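HUF_compress_internal() no longer carves its scratch space by hand: the whole area is viewed as a single HUF_compress_tables_t, so the two new entry checks (4-byte alignment, wkspSize >= sizeof(*table)) fully describe the workspace contract. A minimal caller sketch, assuming HUF_WORKSPACE_SIZE_U32 (used by the updated HUF_compress1X/HUF_compress2 below) is sized to cover sizeof(HUF_compress_tables_t):

    #include "huf.h"   /* HUF_compress4X_wksp, HUF_TABLELOG_DEFAULT (HUF_STATIC_LINKING_ONLY assumed) */

    size_t compressWithStackWorkspace(void* dst, size_t dstCapacity,
                                      const void* src, size_t srcSize)
    {
        /* an `unsigned` array is 4-byte aligned, satisfying the
         * ((size_t)workSpace & 3) == 0 check above */
        unsigned wksp[HUF_WORKSPACE_SIZE_U32];
        return HUF_compress4X_wksp(dst, dstCapacity, src, srcSize,
                                   255, HUF_TABLELOG_DEFAULT,
                                   wksp, sizeof(wksp));
    }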
@@ -640,52 +609,70 @@ size_t HUF_compress1X_wksp (void* dst, size_t dstSize,
                       unsigned maxSymbolValue, unsigned huffLog,
                       void* workSpace, size_t wkspSize)
 {
-    return HUF_compress_internal(dst, dstSize, src, srcSize, maxSymbolValue, huffLog, 1 /* single stream */, workSpace, wkspSize, NULL, NULL, 0);
+    return HUF_compress_internal(dst, dstSize, src, srcSize,
+                                 maxSymbolValue, huffLog, 1 /*single stream*/,
+                                 workSpace, wkspSize,
+                                 NULL, NULL, 0, 0 /*bmi2*/);
 }
 
 size_t HUF_compress1X_repeat (void* dst, size_t dstSize,
                       const void* src, size_t srcSize,
                       unsigned maxSymbolValue, unsigned huffLog,
                       void* workSpace, size_t wkspSize,
-                      HUF_CElt* hufTable, HUF_repeat* repeat, int preferRepeat)
+                      HUF_CElt* hufTable, HUF_repeat* repeat, int preferRepeat, int bmi2)
 {
-    return HUF_compress_internal(dst, dstSize, src, srcSize, maxSymbolValue, huffLog, 1 /* single stream */, workSpace, wkspSize, hufTable, repeat, preferRepeat);
+    return HUF_compress_internal(dst, dstSize, src, srcSize,
+                                 maxSymbolValue, huffLog, 1 /*single stream*/,
+                                 workSpace, wkspSize, hufTable,
+                                 repeat, preferRepeat, bmi2);
 }
 
 size_t HUF_compress1X (void* dst, size_t dstSize,
                  const void* src, size_t srcSize,
                  unsigned maxSymbolValue, unsigned huffLog)
 {
-    unsigned workSpace[1024];
+    unsigned workSpace[HUF_WORKSPACE_SIZE_U32];
     return HUF_compress1X_wksp(dst, dstSize, src, srcSize, maxSymbolValue, huffLog, workSpace, sizeof(workSpace));
 }
 
+/* HUF_compress4X_wksp():
+ * compress input using 4 streams.
+ * provide workspace to generate compression tables */
 size_t HUF_compress4X_wksp (void* dst, size_t dstSize,
                       const void* src, size_t srcSize,
                       unsigned maxSymbolValue, unsigned huffLog,
                       void* workSpace, size_t wkspSize)
 {
-    return HUF_compress_internal(dst, dstSize, src, srcSize, maxSymbolValue, huffLog, 0 /* 4 streams */, workSpace, wkspSize, NULL, NULL, 0);
+    return HUF_compress_internal(dst, dstSize, src, srcSize,
+                                 maxSymbolValue, huffLog, 0 /*4 streams*/,
+                                 workSpace, wkspSize,
+                                 NULL, NULL, 0, 0 /*bmi2*/);
 }
 
+/* HUF_compress4X_repeat():
+ * compress input using 4 streams.
+ * re-use an existing huffman compression table */
 size_t HUF_compress4X_repeat (void* dst, size_t dstSize,
                       const void* src, size_t srcSize,
                       unsigned maxSymbolValue, unsigned huffLog,
                       void* workSpace, size_t wkspSize,
-                      HUF_CElt* hufTable, HUF_repeat* repeat, int preferRepeat)
+                      HUF_CElt* hufTable, HUF_repeat* repeat, int preferRepeat, int bmi2)
 {
-    return HUF_compress_internal(dst, dstSize, src, srcSize, maxSymbolValue, huffLog, 0 /* 4 streams */, workSpace, wkspSize, hufTable, repeat, preferRepeat);
+    return HUF_compress_internal(dst, dstSize, src, srcSize,
+                                 maxSymbolValue, huffLog, 0 /* 4 streams */,
+                                 workSpace, wkspSize,
+                                 hufTable, repeat, preferRepeat, bmi2);
 }
 
 size_t HUF_compress2 (void* dst, size_t dstSize,
                 const void* src, size_t srcSize,
                 unsigned maxSymbolValue, unsigned huffLog)
 {
-    unsigned workSpace[1024];
+    unsigned workSpace[HUF_WORKSPACE_SIZE_U32];
     return HUF_compress4X_wksp(dst, dstSize, src, srcSize, maxSymbolValue, huffLog, workSpace, sizeof(workSpace));
 }
 
 size_t HUF_compress (void* dst, size_t maxDstSize, const void* src, size_t srcSize)
 {
-    return HUF_compress2(dst, maxDstSize, src, (U32)srcSize, 255, HUF_TABLELOG_DEFAULT);
+    return HUF_compress2(dst, maxDstSize, src, srcSize, 255, HUF_TABLELOG_DEFAULT);
 }
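With bmi2 appended, the repeat-table entry points now take eleven arguments; a short sketch of the intended block-to-block flow (the caller-side variable names are illustrative, and the bmi2 flag would come from a cached CPUID probe as the cctx changes below show — 0 is always safe):

    /* persistent across blocks */
    static HUF_CElt   hufTable[HUF_SYMBOLVALUE_MAX + 1];
    static HUF_repeat repeatState = HUF_repeat_none;
    static unsigned   wksp[HUF_WORKSPACE_SIZE_U32];

    size_t compressOneBlock(void* dst, size_t dstCapacity,
                            const void* src, size_t srcSize, int bmi2)
    {
        /* preferRepeat==0 : let the oldSize/newSize estimate above decide
         * whether re-using the previous table beats rebuilding it */
        return HUF_compress4X_repeat(dst, dstCapacity, src, srcSize,
                                     255, HUF_TABLELOG_DEFAULT,
                                     wksp, sizeof(wksp),
                                     hufTable, &repeatState, 0, bmi2);
    }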

lib/compress/huf_compress_impl.h  (new file, 120 lines)
@@ -0,0 +1,120 @@
+/*
+ * Copyright (c) 2018-present, Facebook, Inc.
+ * All rights reserved.
+ *
+ * This source code is licensed under both the BSD-style license (found in the
+ * LICENSE file in the root directory of this source tree) and the GPLv2 (found
+ * in the COPYING file in the root directory of this source tree).
+ * You may select, at your option, one of the above-listed licenses.
+ */
+
+#ifndef FUNCTION
+#  error "FUNCTION(name) must be defined"
+#endif
+
+#ifndef TARGET
+#  error "TARGET must be defined"
+#endif
+
+
+static void FUNCTION(HUF_encodeSymbol)(BIT_CStream_t* bitCPtr, U32 symbol, const HUF_CElt* CTable)
+{
+    BIT_addBitsFast(bitCPtr, CTable[symbol].val, CTable[symbol].nbBits);
+}
+
+#define HUF_FLUSHBITS(s)  BIT_flushBits(s)
+
+#define HUF_FLUSHBITS_1(stream) \
+    if (sizeof((stream)->bitContainer)*8 < HUF_TABLELOG_MAX*2+7) HUF_FLUSHBITS(stream)
+
+#define HUF_FLUSHBITS_2(stream) \
+    if (sizeof((stream)->bitContainer)*8 < HUF_TABLELOG_MAX*4+7) HUF_FLUSHBITS(stream)
+
+static TARGET
+size_t FUNCTION(HUF_compress1X_usingCTable_internal)(void* dst, size_t dstSize, const void* src, size_t srcSize, const HUF_CElt* CTable)
+{
+    const BYTE* ip = (const BYTE*) src;
+    BYTE* const ostart = (BYTE*)dst;
+    BYTE* const oend = ostart + dstSize;
+    BYTE* op = ostart;
+    size_t n;
+    BIT_CStream_t bitC;
+
+    /* init */
+    if (dstSize < 8) return 0;   /* not enough space to compress */
+    { size_t const initErr = BIT_initCStream(&bitC, op, oend-op);
+      if (HUF_isError(initErr)) return 0; }
+
+    n = srcSize & ~3;  /* join to mod 4 */
+    switch (srcSize & 3)
+    {
+        case 3 : FUNCTION(HUF_encodeSymbol)(&bitC, ip[n+ 2], CTable);
+                 HUF_FLUSHBITS_2(&bitC);
+                 /* fall-through */
+        case 2 : FUNCTION(HUF_encodeSymbol)(&bitC, ip[n+ 1], CTable);
+                 HUF_FLUSHBITS_1(&bitC);
+                 /* fall-through */
+        case 1 : FUNCTION(HUF_encodeSymbol)(&bitC, ip[n+ 0], CTable);
+                 HUF_FLUSHBITS(&bitC);
+                 /* fall-through */
+        case 0 : /* fall-through */
+        default: break;
+    }
+
+    for (; n>0; n-=4) {  /* note : n&3==0 at this stage */
+        FUNCTION(HUF_encodeSymbol)(&bitC, ip[n- 1], CTable);
+        HUF_FLUSHBITS_1(&bitC);
+        FUNCTION(HUF_encodeSymbol)(&bitC, ip[n- 2], CTable);
+        HUF_FLUSHBITS_2(&bitC);
+        FUNCTION(HUF_encodeSymbol)(&bitC, ip[n- 3], CTable);
+        HUF_FLUSHBITS_1(&bitC);
+        FUNCTION(HUF_encodeSymbol)(&bitC, ip[n- 4], CTable);
+        HUF_FLUSHBITS(&bitC);
+    }
+
+    return BIT_closeCStream(&bitC);
+}
+
+
+static TARGET
+size_t FUNCTION(HUF_compress4X_usingCTable_internal)(void* dst, size_t dstSize, const void* src, size_t srcSize, const HUF_CElt* CTable)
+{
+    size_t const segmentSize = (srcSize+3)/4;   /* first 3 segments */
+    const BYTE* ip = (const BYTE*) src;
+    const BYTE* const iend = ip + srcSize;
+    BYTE* const ostart = (BYTE*) dst;
+    BYTE* const oend = ostart + dstSize;
+    BYTE* op = ostart;
+
+    if (dstSize < 6 + 1 + 1 + 1 + 8) return 0;   /* minimum space to compress successfully */
+    if (srcSize < 12) return 0;   /* no saving possible : too small input */
+    op += 6;   /* jumpTable */
+
+    {   CHECK_V_F(cSize, FUNCTION(HUF_compress1X_usingCTable_internal)(op, oend-op, ip, segmentSize, CTable) );
+        if (cSize==0) return 0;
+        MEM_writeLE16(ostart, (U16)cSize);
+        op += cSize;
+    }
+
+    ip += segmentSize;
+    {   CHECK_V_F(cSize, FUNCTION(HUF_compress1X_usingCTable_internal)(op, oend-op, ip, segmentSize, CTable) );
+        if (cSize==0) return 0;
+        MEM_writeLE16(ostart+2, (U16)cSize);
+        op += cSize;
+    }
+
+    ip += segmentSize;
+    {   CHECK_V_F(cSize, FUNCTION(HUF_compress1X_usingCTable_internal)(op, oend-op, ip, segmentSize, CTable) );
+        if (cSize==0) return 0;
+        MEM_writeLE16(ostart+4, (U16)cSize);
+        op += cSize;
+    }
+
+    ip += segmentSize;
+    {   CHECK_V_F(cSize, FUNCTION(HUF_compress1X_usingCTable_internal)(op, oend-op, ip, iend-ip, CTable) );
+        if (cSize==0) return 0;
+        op += cSize;
+    }
+
+    return op-ostart;
+}
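This header is a textual template: FUNCTION() renames every definition and TARGET carries a function attribute, so including it twice stamps out a portable copy and a BMI2-tuned copy of the same encoders. The include site in huf_compress.c is outside the visible hunks, but by analogy with the zstd_compress_impl.h usage later in this commit it would plausibly look like:

    #define FUNCTION(fn) fn##_default
    #define TARGET
    #include "huf_compress_impl.h"   /* emits HUF_compress1X/4X_usingCTable_internal_default */
    #undef TARGET
    #undef FUNCTION

    #if DYNAMIC_BMI2
    #define FUNCTION(fn) fn##_bmi2
    #define TARGET TARGET_ATTRIBUTE("bmi2")
    #include "huf_compress_impl.h"   /* same bodies, compiled with BMI2 enabled */
    #undef TARGET
    #undef FUNCTION
    #endif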
lib/compress/zstd_compress.c :
@@ -21,6 +21,7 @@
 *  Dependencies
 ***************************************/
 #include <string.h>         /* memset */
+#include "cpu.h"
 #include "mem.h"
 #define FSE_STATIC_LINKING_ONLY   /* FSE_encodeSymbol */
 #include "fse.h"
@@ -73,6 +74,7 @@ ZSTD_CCtx* ZSTD_createCCtx_advanced(ZSTD_customMem customMem)
         cctx->customMem = customMem;
         cctx->requestedParams.compressionLevel = ZSTD_CLEVEL_DEFAULT;
         cctx->requestedParams.fParams.contentSizeFlag = 1;
+        cctx->bmi2 = ZSTD_cpuid_bmi2(ZSTD_cpuid());
         return cctx;
     }
 }
@@ -96,6 +98,7 @@ ZSTD_CCtx* ZSTD_initStaticCCtx(void *workspace, size_t workspaceSize)
         void* const ptr = cctx->blockState.nextCBlock + 1;
         cctx->entropyWorkspace = (U32*)ptr;
     }
+    cctx->bmi2 = ZSTD_cpuid_bmi2(ZSTD_cpuid());
     return cctx;
 }
 
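Both construction paths now probe the CPU once and cache the answer, so hot paths only ever branch on a plain int. The helpers come from the new cpu.h dependency; a standalone sketch of the same check:

    #include "cpu.h"   /* ZSTD_cpuid_t, ZSTD_cpuid(), ZSTD_cpuid_bmi2() */

    static int hostHasBMI2(void)
    {
        /* ZSTD_cpuid() snapshots the CPUID feature leaves;
         * ZSTD_cpuid_bmi2() extracts the BMI2 bit from that snapshot */
        ZSTD_cpuid_t const cid = ZSTD_cpuid();
        return ZSTD_cpuid_bmi2(cid);
    }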
@@ -140,8 +143,6 @@ size_t ZSTD_sizeof_CStream(const ZSTD_CStream* zcs)
 /* private API call, for dictBuilder only */
 const seqStore_t* ZSTD_getSeqStore(const ZSTD_CCtx* ctx) { return &(ctx->seqStore); }
 
-#define ZSTD_CLEVEL_CUSTOM 999
-
 static ZSTD_compressionParameters ZSTD_getCParamsFromCCtxParams(
         ZSTD_CCtx_params CCtxParams, U64 srcSizeHint, size_t dictSize)
 {
@@ -160,13 +161,6 @@ static void ZSTD_cLevelToCCtxParams_srcSize(ZSTD_CCtx_params* CCtxParams, U64 sr
     CCtxParams->compressionLevel = ZSTD_CLEVEL_CUSTOM;
 }
 
-static void ZSTD_cLevelToCParams(ZSTD_CCtx* cctx)
-{
-    DEBUGLOG(4, "ZSTD_cLevelToCParams: level=%i", cctx->requestedParams.compressionLevel);
-    ZSTD_cLevelToCCtxParams_srcSize(
-            &cctx->requestedParams, cctx->pledgedSrcSizePlusOne-1);
-}
-
 static void ZSTD_cLevelToCCtxParams(ZSTD_CCtx_params* CCtxParams)
 {
     DEBUGLOG(4, "ZSTD_cLevelToCCtxParams");
@@ -246,10 +240,48 @@ static ZSTD_CCtx_params ZSTD_assignParamsToCCtxParams(
         return ERROR(parameter_outOfBound);  \
 }   }
 
 
+static int ZSTD_isUpdateAuthorized(ZSTD_cParameter param)
+{
+    switch(param)
+    {
+    case ZSTD_p_compressionLevel:
+    case ZSTD_p_hashLog:
+    case ZSTD_p_chainLog:
+    case ZSTD_p_searchLog:
+    case ZSTD_p_minMatch:
+    case ZSTD_p_targetLength:
+    case ZSTD_p_compressionStrategy:
+        return 1;
+
+    case ZSTD_p_format:
+    case ZSTD_p_windowLog:
+    case ZSTD_p_contentSizeFlag:
+    case ZSTD_p_checksumFlag:
+    case ZSTD_p_dictIDFlag:
+    case ZSTD_p_forceMaxWindow :
+    case ZSTD_p_nbWorkers:
+    case ZSTD_p_jobSize:
+    case ZSTD_p_overlapSizeLog:
+    case ZSTD_p_enableLongDistanceMatching:
+    case ZSTD_p_ldmHashLog:
+    case ZSTD_p_ldmMinMatch:
+    case ZSTD_p_ldmBucketSizeLog:
+    case ZSTD_p_ldmHashEveryLog:
+    default:
+        return 0;
+    }
+}
+
 size_t ZSTD_CCtx_setParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, unsigned value)
 {
     DEBUGLOG(4, "ZSTD_CCtx_setParameter (%u, %u)", (U32)param, value);
-    if (cctx->streamStage != zcss_init) return ERROR(stage_wrong);
+    if (cctx->streamStage != zcss_init) {
+        if (ZSTD_isUpdateAuthorized(param)) {
+            cctx->cParamsChanged = 1;
+        } else {
+            return ERROR(stage_wrong);
+    }   }
+
     switch(param)
     {
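The practical effect of ZSTD_isUpdateAuthorized(): once a streaming compression has started (streamStage != zcss_init), parameters in its "return 1" group may still be changed — they are only flagged via cParamsChanged — while structural parameters keep failing. A sketch:

    void retuneMidStream(ZSTD_CCtx* cctx)
    {
        /* in the "return 1" group above : accepted, applied to a later job */
        size_t const ok  = ZSTD_CCtx_setParameter(cctx, ZSTD_p_compressionLevel, 19);
        /* in the "return 0" group : still rejected with ERROR(stage_wrong) */
        size_t const err = ZSTD_CCtx_setParameter(cctx, ZSTD_p_checksumFlag, 1);
        (void)ok; (void)err;
    }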
@@ -268,7 +300,9 @@ size_t ZSTD_CCtx_setParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, unsigned v
     case ZSTD_p_targetLength:
     case ZSTD_p_compressionStrategy:
         if (cctx->cdict) return ERROR(stage_wrong);
-        if (value>0) ZSTD_cLevelToCParams(cctx);   /* Can optimize if srcSize is known */
+        if (value>0) {
+            ZSTD_cLevelToCCtxParams_srcSize(&cctx->requestedParams, cctx->pledgedSrcSizePlusOne-1);   /* Optimize cParams when srcSize is known */
+        }
         return ZSTD_CCtxParam_setParameter(&cctx->requestedParams, param, value);
 
     case ZSTD_p_contentSizeFlag:
@@ -281,20 +315,20 @@ size_t ZSTD_CCtx_setParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, unsigned v
      *  default : 0 when using a CDict, 1 when using a Prefix */
         return ZSTD_CCtxParam_setParameter(&cctx->requestedParams, param, value);
 
-    case ZSTD_p_nbThreads:
-        if ((value > 1) && cctx->staticSize) {
+    case ZSTD_p_nbWorkers:
+        if ((value>0) && cctx->staticSize) {
             return ERROR(parameter_unsupported);  /* MT not compatible with static alloc */
         }
         return ZSTD_CCtxParam_setParameter(&cctx->requestedParams, param, value);
 
-    case ZSTD_p_nonBlockingMode:
     case ZSTD_p_jobSize:
     case ZSTD_p_overlapSizeLog:
         return ZSTD_CCtxParam_setParameter(&cctx->requestedParams, param, value);
 
     case ZSTD_p_enableLongDistanceMatching:
         if (cctx->cdict) return ERROR(stage_wrong);
-        if (value>0) ZSTD_cLevelToCParams(cctx);
+        if (value>0)
+            ZSTD_cLevelToCCtxParams_srcSize(&cctx->requestedParams, cctx->pledgedSrcSizePlusOne-1);   /* Optimize cParams when srcSize is known */
         return ZSTD_CCtxParam_setParameter(&cctx->requestedParams, param, value);
 
     case ZSTD_p_ldmHashLog:
@@ -403,21 +437,12 @@ size_t ZSTD_CCtxParam_setParameter(
         CCtxParams->forceWindow = (value > 0);
         return CCtxParams->forceWindow;
 
-    case ZSTD_p_nbThreads :
-        if (value == 0) return CCtxParams->nbThreads;
+    case ZSTD_p_nbWorkers :
 #ifndef ZSTD_MULTITHREAD
-        if (value > 1) return ERROR(parameter_unsupported);
-        return 1;
+        if (value > 0) return ERROR(parameter_unsupported);
+        return 0;
 #else
-        return ZSTDMT_CCtxParam_setNbThreads(CCtxParams, value);
-#endif
-
-    case ZSTD_p_nonBlockingMode :
-#ifndef ZSTD_MULTITHREAD
-        return ERROR(parameter_unsupported);
-#else
-        CCtxParams->nonBlockingMode = (value>0);
-        return CCtxParams->nonBlockingMode;
+        return ZSTDMT_CCtxParam_setNbWorkers(CCtxParams, value);
 #endif
 
     case ZSTD_p_jobSize :
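Note the convention change buried in this rename: nbThreads counted all threads (1 meant single-threaded), while nbWorkers counts extra worker threads (0 means single-threaded, blocking; any value > 0 engages ZSTDMT, which also absorbs the removed nonBlockingMode switch). In caller terms:

    void configureWorkers(ZSTD_CCtx* cctx)
    {
        /* old API equivalent: ZSTD_p_nbThreads = 1 */
        ZSTD_CCtx_setParameter(cctx, ZSTD_p_nbWorkers, 0);  /* single-threaded, blocking */
        /* old API equivalent: ZSTD_p_nbThreads = 3 (roughly) */
        ZSTD_CCtx_setParameter(cctx, ZSTD_p_nbWorkers, 2);  /* two ZSTDMT workers */
    }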
@@ -476,6 +501,9 @@ size_t ZSTD_CCtxParam_setParameter(
 /** ZSTD_CCtx_setParametersUsingCCtxParams() :
  *  just applies `params` into `cctx`
  *  no action is performed, parameters are merely stored.
+ *  If ZSTDMT is enabled, parameters are pushed to cctx->mtctx.
+ *  This is possible even if a compression is ongoing.
+ *  In which case, new parameters will be applied on the fly, starting with next compression job.
  */
 size_t ZSTD_CCtx_setParametersUsingCCtxParams(
         ZSTD_CCtx* cctx, const ZSTD_CCtx_params* params)
@@ -484,7 +512,6 @@ size_t ZSTD_CCtx_setParametersUsingCCtxParams(
     if (cctx->cdict) return ERROR(stage_wrong);
 
     cctx->requestedParams = *params;
-
     return 0;
 }
 
@@ -680,7 +707,7 @@ static size_t ZSTD_sizeof_matchState(ZSTD_compressionParameters const* cParams,
 size_t ZSTD_estimateCCtxSize_usingCCtxParams(const ZSTD_CCtx_params* params)
 {
     /* Estimate CCtx size is supported for single-threaded compression only. */
-    if (params->nbThreads > 1) { return ERROR(GENERIC); }
+    if (params->nbWorkers > 0) { return ERROR(GENERIC); }
     {   ZSTD_compressionParameters const cParams =
                 ZSTD_getCParamsFromCCtxParams(*params, 0, 0);
         size_t const blockSize = MIN(ZSTD_BLOCKSIZE_MAX, (size_t)1 << cParams.windowLog);
@@ -691,12 +718,11 @@ size_t ZSTD_estimateCCtxSize_usingCCtxParams(const ZSTD_CCtx_params* params)
         size_t const blockStateSpace = 2 * sizeof(ZSTD_compressedBlockState_t);
         size_t const matchStateSize = ZSTD_sizeof_matchState(&params->cParams, /* forCCtx */ 1);
 
-        size_t const ldmSpace = params->ldmParams.enableLdm ?
-                ZSTD_ldm_getTableSize(params->ldmParams.hashLog,
-                                      params->ldmParams.bucketSizeLog) : 0;
+        size_t const ldmSpace = ZSTD_ldm_getTableSize(params->ldmParams);
+        size_t const ldmSeqSpace = ZSTD_ldm_getMaxNbSeq(params->ldmParams, blockSize) * sizeof(rawSeq);
 
         size_t const neededSpace = entropySpace + blockStateSpace + tokenSpace +
-                                   matchStateSize + ldmSpace;
+                                   matchStateSize + ldmSpace + ldmSeqSpace;
 
         DEBUGLOG(5, "sizeof(ZSTD_CCtx) : %u", (U32)sizeof(ZSTD_CCtx));
         DEBUGLOG(5, "estimate workSpace : %u", (U32)neededSpace);
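The estimate still covers only the single-threaded layout, now including both LDM regions (table plus sequence buffer). A guard sketch for callers that size a static context — this is an internal-code view, since ZSTD_CCtx_params fields are private outside this file:

    static size_t safeEstimate(const ZSTD_CCtx_params* params)
    {
        if (params->nbWorkers > 0) return 0;   /* the query below would return an error code */
        return ZSTD_estimateCCtxSize_usingCCtxParams(params);
    }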
@@ -729,7 +755,7 @@ size_t ZSTD_estimateCCtxSize(int compressionLevel)
 
 size_t ZSTD_estimateCStreamSize_usingCCtxParams(const ZSTD_CCtx_params* params)
 {
-    if (params->nbThreads > 1) { return ERROR(GENERIC); }
+    if (params->nbWorkers > 0) { return ERROR(GENERIC); }
     {   size_t const CCtxSize = ZSTD_estimateCCtxSize_usingCCtxParams(params);
         size_t const blockSize = MIN(ZSTD_BLOCKSIZE_MAX, (size_t)1 << params->cParams.windowLog);
         size_t const inBuffSize = ((size_t)1 << params->cParams.windowLog) + blockSize;
@@ -768,7 +794,7 @@ size_t ZSTD_estimateCStreamSize(int compressionLevel) {
 ZSTD_frameProgression ZSTD_getFrameProgression(const ZSTD_CCtx* cctx)
 {
 #ifdef ZSTD_MULTITHREAD
-    if ((cctx->appliedParams.nbThreads > 1) || (cctx->appliedParams.nonBlockingMode)) {
+    if (cctx->appliedParams.nbWorkers > 0) {
         return ZSTDMT_getFrameProgression(cctx->mtctx);
     }
 #endif
@@ -857,13 +883,9 @@ static void ZSTD_reset_compressedBlockState(ZSTD_compressedBlockState_t* bs)
  */
 static void ZSTD_invalidateMatchState(ZSTD_matchState_t* ms)
 {
-    size_t const endT = (size_t)(ms->nextSrc - ms->base);
-    U32 const end = (U32)endT;
-    assert(endT < (3U<<30));
-
-    ms->lowLimit = end;
-    ms->dictLimit = end;
-    ms->nextToUpdate = end + 1;
+    ZSTD_window_clear(&ms->window);
+
+    ms->nextToUpdate = ms->window.dictLimit + 1;
     ms->loadedDictEnd = 0;
     ms->opt.litLengthSum = 0;  /* force reset of btopt stats */
 }
@@ -905,10 +927,8 @@ static void* ZSTD_reset_matchState(ZSTD_matchState_t* ms, void* ptr, ZSTD_compre
 
     assert(((size_t)ptr & 3) == 0);
 
-    ms->nextSrc = NULL;
-    ms->base = NULL;
-    ms->dictBase = NULL;
     ms->hashLog3 = hashLog3;
+    memset(&ms->window, 0, sizeof(ms->window));
     ZSTD_invalidateMatchState(ms);
 
     /* opt parser space */
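These two hunks are the heart of the window refactor: the loose base/dictBase/dictLimit/lowLimit fields move behind a ZSTD_window_t and are manipulated only through ZSTD_window_* helpers. An illustrative mirror of the structure and of the extDict query used elsewhere in this commit (field names follow the new code; the real definition lives in an internal header):

    typedef struct {
        const BYTE* base;       /* virtual start : index of ptr is (U32)(ptr - base) */
        const BYTE* dictBase;   /* base of the old / extDict segment */
        U32 dictLimit;          /* index where the current segment begins */
        U32 lowLimit;           /* lowest index still valid for matches */
        const BYTE* nextSrc;    /* next expected input position */
    } Window;   /* illustrative stand-in for ZSTD_window_t */

    /* extDict mode becomes a single query instead of comparing raw fields */
    static int window_hasExtDict(Window const w) { return w.lowLimit < w.dictLimit; }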
@@ -963,7 +983,8 @@ static size_t ZSTD_resetCCtx_internal(ZSTD_CCtx* zc,
 
     if (params.ldmParams.enableLdm) {
         /* Adjust long distance matching parameters */
-        ZSTD_ldm_adjustParameters(&params.ldmParams, params.cParams.windowLog);
+        params.ldmParams.windowLog = params.cParams.windowLog;
+        ZSTD_ldm_adjustParameters(&params.ldmParams, &params.cParams);
         assert(params.ldmParams.hashLog >= params.ldmParams.bucketSizeLog);
         assert(params.ldmParams.hashEveryLog < 32);
         zc->ldmState.hashPower =
@@ -978,17 +999,19 @@ static size_t ZSTD_resetCCtx_internal(ZSTD_CCtx* zc,
         size_t const buffOutSize = (zbuff==ZSTDb_buffered) ? ZSTD_compressBound(blockSize)+1 : 0;
         size_t const buffInSize = (zbuff==ZSTDb_buffered) ? windowSize + blockSize : 0;
         size_t const matchStateSize = ZSTD_sizeof_matchState(&params.cParams, /* forCCtx */ 1);
+        size_t const maxNbLdmSeq = ZSTD_ldm_getMaxNbSeq(params.ldmParams, blockSize);
         void* ptr;
 
         /* Check if workSpace is large enough, alloc a new one if needed */
         {   size_t const entropySpace = HUF_WORKSPACE_SIZE;
             size_t const blockStateSpace = 2 * sizeof(ZSTD_compressedBlockState_t);
             size_t const bufferSpace = buffInSize + buffOutSize;
-            size_t const ldmSpace = params.ldmParams.enableLdm
-                ? ZSTD_ldm_getTableSize(params.ldmParams.hashLog, params.ldmParams.bucketSizeLog)
-                : 0;
+            size_t const ldmSpace = ZSTD_ldm_getTableSize(params.ldmParams);
+            size_t const ldmSeqSpace = maxNbLdmSeq * sizeof(rawSeq);
             size_t const neededSpace = entropySpace + blockStateSpace + ldmSpace +
-                                       matchStateSize + tokenSpace + bufferSpace;
+                                       ldmSeqSpace + matchStateSize + tokenSpace +
+                                       bufferSpace;
             DEBUGLOG(4, "Need %uKB workspace, including %uKB for match state, and %uKB for buffers",
                         (U32)(neededSpace>>10), (U32)(matchStateSize>>10), (U32)(bufferSpace>>10));
             DEBUGLOG(4, "windowSize: %u - blockSize: %u", (U32)windowSize, (U32)blockSize);
@@ -1043,7 +1066,12 @@ static size_t ZSTD_resetCCtx_internal(ZSTD_CCtx* zc,
             assert(((size_t)ptr & 3) == 0);   /* ensure ptr is properly aligned */
             zc->ldmState.hashTable = (ldmEntry_t*)ptr;
             ptr = zc->ldmState.hashTable + ldmHSize;
+            zc->ldmSequences = (rawSeq*)ptr;
+            ptr = zc->ldmSequences + maxNbLdmSeq;
+
+            memset(&zc->ldmState.window, 0, sizeof(zc->ldmState.window));
         }
+        assert(((size_t)ptr & 3) == 0);  /* ensure ptr is properly aligned */
 
         ptr = ZSTD_reset_matchState(&zc->blockState.matchState, ptr, &params.cParams, crp, /* forCCtx */ 1);
 
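The allocation pattern here is a plain pointer bump over one contiguous workspace: each region claims its bytes and leaves ptr at the next free position, with alignment asserted rather than fixed. Distilled into a helper (illustrative only; the diff open-codes it):

    static void* carve(void** ptr, size_t bytes)
    {
        void* const region = *ptr;            /* this region starts here */
        *ptr = (char*)(*ptr) + bytes;         /* next region starts right after */
        return region;
    }
    /* mirroring the hunk:
     *   zc->ldmState.hashTable = (ldmEntry_t*)carve(&ptr, ldmHSize * sizeof(ldmEntry_t));
     *   zc->ldmSequences       = (rawSeq*)carve(&ptr, maxNbLdmSeq * sizeof(rawSeq));   */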
@@ -1083,7 +1111,7 @@ static size_t ZSTD_resetCCtx_internal(ZSTD_CCtx* zc,
 void ZSTD_invalidateRepCodes(ZSTD_CCtx* cctx) {
     int i;
     for (i=0; i<ZSTD_REP_NUM; i++) cctx->blockState.prevCBlock->rep[i] = 0;
-    assert(/* !extDict */ cctx->blockState.matchState.lowLimit == cctx->blockState.matchState.dictLimit);
+    assert(!ZSTD_window_hasExtDict(cctx->blockState.matchState.window));
 }
 
 static size_t ZSTD_resetCCtx_usingCDict(ZSTD_CCtx* cctx,
@@ -1125,13 +1153,9 @@ static size_t ZSTD_resetCCtx_usingCDict(ZSTD_CCtx* cctx,
     {
         ZSTD_matchState_t const* srcMatchState = &cdict->matchState;
         ZSTD_matchState_t* dstMatchState = &cctx->blockState.matchState;
+        dstMatchState->window       = srcMatchState->window;
         dstMatchState->nextToUpdate = srcMatchState->nextToUpdate;
         dstMatchState->nextToUpdate3= srcMatchState->nextToUpdate3;
-        dstMatchState->nextSrc      = srcMatchState->nextSrc;
-        dstMatchState->base         = srcMatchState->base;
-        dstMatchState->dictBase     = srcMatchState->dictBase;
-        dstMatchState->dictLimit    = srcMatchState->dictLimit;
-        dstMatchState->lowLimit     = srcMatchState->lowLimit;
         dstMatchState->loadedDictEnd= srcMatchState->loadedDictEnd;
     }
     cctx->dictID = cdict->dictID;
@@ -1186,13 +1210,9 @@ static size_t ZSTD_copyCCtx_internal(ZSTD_CCtx* dstCCtx,
     {
         ZSTD_matchState_t const* srcMatchState = &srcCCtx->blockState.matchState;
         ZSTD_matchState_t* dstMatchState = &dstCCtx->blockState.matchState;
+        dstMatchState->window       = srcMatchState->window;
         dstMatchState->nextToUpdate = srcMatchState->nextToUpdate;
         dstMatchState->nextToUpdate3= srcMatchState->nextToUpdate3;
-        dstMatchState->nextSrc      = srcMatchState->nextSrc;
-        dstMatchState->base         = srcMatchState->base;
-        dstMatchState->dictBase     = srcMatchState->dictBase;
-        dstMatchState->dictLimit    = srcMatchState->dictLimit;
-        dstMatchState->lowLimit     = srcMatchState->lowLimit;
         dstMatchState->loadedDictEnd= srcMatchState->loadedDictEnd;
     }
     dstCCtx->dictID = srcCCtx->dictID;
@@ -1260,19 +1280,6 @@ static void ZSTD_reduceTable_btlazy2(U32* const table, U32 const size, U32 const
     ZSTD_reduceTable_internal(table, size, reducerValue, 1);
 }
 
-
-/*! ZSTD_ldm_reduceTable() :
- *  reduce table indexes by `reducerValue` */
-static void ZSTD_ldm_reduceTable(ldmEntry_t* const table, U32 const size,
-                                 U32 const reducerValue)
-{
-    U32 u;
-    for (u = 0; u < size; u++) {
-        if (table[u].offset < reducerValue) table[u].offset = 0;
-        else table[u].offset -= reducerValue;
-    }
-}
-
 /*! ZSTD_reduceIndex() :
  *   rescale all indexes to avoid future overflow (indexes are U32) */
 static void ZSTD_reduceIndex (ZSTD_CCtx* zc, const U32 reducerValue)
@@ -1294,11 +1301,6 @@ static void ZSTD_reduceIndex (ZSTD_CCtx* zc, const U32 reducerValue)
         U32 const h3Size = (U32)1 << ms->hashLog3;
         ZSTD_reduceTable(ms->hashTable3, h3Size, reducerValue);
     }
-
-    if (zc->appliedParams.ldmParams.enableLdm) {
-        U32 const ldmHSize = (U32)1 << zc->appliedParams.ldmParams.hashLog;
-        ZSTD_ldm_reduceTable(zc->ldmState.hashTable, ldmHSize, reducerValue);
-    }
 }
 
 
@@ -1377,7 +1379,7 @@ static size_t ZSTD_compressLiterals (ZSTD_entropyCTables_t const* prevEntropy,
                                      ZSTD_strategy strategy,
                                      void* dst, size_t dstCapacity,
                                      const void* src, size_t srcSize,
-                                     U32* workspace)
+                                     U32* workspace, const int bmi2)
 {
     size_t const minGain = ZSTD_minGain(srcSize);
     size_t const lhSize = 3 + (srcSize >= 1 KB) + (srcSize >= 16 KB);
@@ -1402,9 +1404,9 @@ static size_t ZSTD_compressLiterals (ZSTD_entropyCTables_t const* prevEntropy,
         int const preferRepeat = strategy < ZSTD_lazy ? srcSize <= 1024 : 0;
         if (repeat == HUF_repeat_valid && lhSize == 3) singleStream = 1;
         cLitSize = singleStream ? HUF_compress1X_repeat(ostart+lhSize, dstCapacity-lhSize, src, srcSize, 255, 11,
-                                      workspace, HUF_WORKSPACE_SIZE, (HUF_CElt*)nextEntropy->hufCTable, &repeat, preferRepeat)
+                                      workspace, HUF_WORKSPACE_SIZE, (HUF_CElt*)nextEntropy->hufCTable, &repeat, preferRepeat, bmi2)
                                 : HUF_compress4X_repeat(ostart+lhSize, dstCapacity-lhSize, src, srcSize, 255, 11,
-                                      workspace, HUF_WORKSPACE_SIZE, (HUF_CElt*)nextEntropy->hufCTable, &repeat, preferRepeat);
+                                      workspace, HUF_WORKSPACE_SIZE, (HUF_CElt*)nextEntropy->hufCTable, &repeat, preferRepeat, bmi2);
         if (repeat != HUF_repeat_none) {
             /* reused the existing table */
             hType = set_repeat;
@@ -1559,95 +1561,52 @@ size_t ZSTD_buildCTable(void* dst, size_t dstCapacity,
     }
 }
 
-MEM_STATIC
+#define FUNCTION(fn) fn##_default
+#define TARGET
+#include "zstd_compress_impl.h"
+#undef TARGET
+#undef FUNCTION
+
+#if DYNAMIC_BMI2
+
+#define FUNCTION(fn) fn##_bmi2
+#define TARGET TARGET_ATTRIBUTE("bmi2")
+#include "zstd_compress_impl.h"
+#undef TARGET
+#undef FUNCTION
+
+#endif
+
 size_t ZSTD_encodeSequences(
             void* dst, size_t dstCapacity,
             FSE_CTable const* CTable_MatchLength, BYTE const* mlCodeTable,
             FSE_CTable const* CTable_OffsetBits, BYTE const* ofCodeTable,
             FSE_CTable const* CTable_LitLength, BYTE const* llCodeTable,
-            seqDef const* sequences, size_t nbSeq, int longOffsets)
+            seqDef const* sequences, size_t nbSeq, int longOffsets, int bmi2)
 {
-    BIT_CStream_t blockStream;
-    FSE_CState_t  stateMatchLength;
-    FSE_CState_t  stateOffsetBits;
-    FSE_CState_t  stateLitLength;
-
-    CHECK_E(BIT_initCStream(&blockStream, dst, dstCapacity), dstSize_tooSmall); /* not enough space remaining */
-
-    /* first symbols */
-    FSE_initCState2(&stateMatchLength, CTable_MatchLength, mlCodeTable[nbSeq-1]);
-    FSE_initCState2(&stateOffsetBits,  CTable_OffsetBits,  ofCodeTable[nbSeq-1]);
-    FSE_initCState2(&stateLitLength,   CTable_LitLength,   llCodeTable[nbSeq-1]);
-    BIT_addBits(&blockStream, sequences[nbSeq-1].litLength, LL_bits[llCodeTable[nbSeq-1]]);
-    if (MEM_32bits()) BIT_flushBits(&blockStream);
-    BIT_addBits(&blockStream, sequences[nbSeq-1].matchLength, ML_bits[mlCodeTable[nbSeq-1]]);
-    if (MEM_32bits()) BIT_flushBits(&blockStream);
-    if (longOffsets) {
-        U32 const ofBits = ofCodeTable[nbSeq-1];
-        int const extraBits = ofBits - MIN(ofBits, STREAM_ACCUMULATOR_MIN-1);
-        if (extraBits) {
-            BIT_addBits(&blockStream, sequences[nbSeq-1].offset, extraBits);
-            BIT_flushBits(&blockStream);
-        }
-        BIT_addBits(&blockStream, sequences[nbSeq-1].offset >> extraBits,
-                    ofBits - extraBits);
-    } else {
-        BIT_addBits(&blockStream, sequences[nbSeq-1].offset, ofCodeTable[nbSeq-1]);
-    }
-    BIT_flushBits(&blockStream);
-
-    {   size_t n;
-        for (n=nbSeq-2 ; n<nbSeq ; n--) {      /* intentional underflow */
-            BYTE const llCode = llCodeTable[n];
-            BYTE const ofCode = ofCodeTable[n];
-            BYTE const mlCode = mlCodeTable[n];
-            U32  const llBits = LL_bits[llCode];
-            U32  const ofBits = ofCode;
-            U32  const mlBits = ML_bits[mlCode];
-            DEBUGLOG(6, "encoding: litlen:%2u - matchlen:%2u - offCode:%7u",
-                        sequences[n].litLength,
-                        sequences[n].matchLength + MINMATCH,
-                        sequences[n].offset);                               /* 32b*/  /* 64b*/
-                                                                            /* (7)*/  /* (7)*/
-            FSE_encodeSymbol(&blockStream, &stateOffsetBits, ofCode);       /* 15 */  /* 15 */
-            FSE_encodeSymbol(&blockStream, &stateMatchLength, mlCode);      /* 24 */  /* 24 */
-            if (MEM_32bits()) BIT_flushBits(&blockStream);                  /* (7)*/
-            FSE_encodeSymbol(&blockStream, &stateLitLength, llCode);        /* 16 */  /* 33 */
-            if (MEM_32bits() || (ofBits+mlBits+llBits >= 64-7-(LLFSELog+MLFSELog+OffFSELog)))
-                BIT_flushBits(&blockStream);                                /* (7)*/
-            BIT_addBits(&blockStream, sequences[n].litLength, llBits);
-            if (MEM_32bits() && ((llBits+mlBits)>24)) BIT_flushBits(&blockStream);
-            BIT_addBits(&blockStream, sequences[n].matchLength, mlBits);
-            if (MEM_32bits() || (ofBits+mlBits+llBits > 56)) BIT_flushBits(&blockStream);
-            if (longOffsets) {
-                int const extraBits = ofBits - MIN(ofBits, STREAM_ACCUMULATOR_MIN-1);
-                if (extraBits) {
-                    BIT_addBits(&blockStream, sequences[n].offset, extraBits);
-                    BIT_flushBits(&blockStream);                            /* (7)*/
-                }
-                BIT_addBits(&blockStream, sequences[n].offset >> extraBits,
-                            ofBits - extraBits);                            /* 31 */
-            } else {
-                BIT_addBits(&blockStream, sequences[n].offset, ofBits);     /* 31 */
-            }
-            BIT_flushBits(&blockStream);                                    /* (7)*/
-    }   }
-
-    FSE_flushCState(&blockStream, &stateMatchLength);
-    FSE_flushCState(&blockStream, &stateOffsetBits);
-    FSE_flushCState(&blockStream, &stateLitLength);
-
-    {   size_t const streamSize = BIT_closeCStream(&blockStream);
-        if (streamSize==0) return ERROR(dstSize_tooSmall);   /* not enough space */
-        return streamSize;
-    }
+#if DYNAMIC_BMI2
+    if (bmi2) {
+        return ZSTD_encodeSequences_bmi2(dst, dstCapacity,
+                                         CTable_MatchLength, mlCodeTable,
+                                         CTable_OffsetBits, ofCodeTable,
+                                         CTable_LitLength, llCodeTable,
+                                         sequences, nbSeq, longOffsets);
+    }
+#endif
+    (void)bmi2;
+    return ZSTD_encodeSequences_default(dst, dstCapacity,
+                                        CTable_MatchLength, mlCodeTable,
+                                        CTable_OffsetBits, ofCodeTable,
+                                        CTable_LitLength, llCodeTable,
+                                        sequences, nbSeq, longOffsets);
 }
 
 MEM_STATIC size_t ZSTD_compressSequences_internal(seqStore_t* seqStorePtr,
                           ZSTD_entropyCTables_t const* prevEntropy,
                           ZSTD_entropyCTables_t* nextEntropy,
                           ZSTD_compressionParameters const* cParams,
-                          void* dst, size_t dstCapacity, U32* workspace)
+                          void* dst, size_t dstCapacity, U32* workspace,
+                          const int bmi2)
 {
     const int longOffsets = cParams->windowLog > STREAM_ACCUMULATOR_MIN;
     U32 count[MaxSeq+1];
@@ -1672,7 +1631,7 @@ MEM_STATIC size_t ZSTD_compressSequences_internal(seqStore_t* seqStorePtr,
         size_t const litSize = seqStorePtr->lit - literals;
         size_t const cSize = ZSTD_compressLiterals(prevEntropy, nextEntropy,
                                     cParams->strategy, op, dstCapacity, literals, litSize,
-                                    workspace);
+                                    workspace, bmi2);
         if (ZSTD_isError(cSize))
             return cSize;
         assert(cSize <= dstCapacity);
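ZSTD_encodeSequences() is now a thin runtime dispatcher over two template-stamped bodies; the idiom generalizes to any hot function touched by this commit. A sketch of the pattern, where fn_default/fn_bmi2 stand in for the two copies produced by the FUNCTION()/TARGET includes above:

    static size_t dispatch(int bmi2 /* cached cctx->bmi2, from ZSTD_cpuid_bmi2() */)
    {
    #if DYNAMIC_BMI2
        if (bmi2) return fn_bmi2();   /* BMI2-compiled copy, selected at runtime */
    #endif
        (void)bmi2;                   /* flag is unused when not dispatching */
        return fn_default();          /* portable copy, always compiled */
    }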
||||||
@ -1752,7 +1711,7 @@ MEM_STATIC size_t ZSTD_compressSequences_internal(seqStore_t* seqStorePtr,
|
|||||||
CTable_OffsetBits, ofCodeTable,
|
CTable_OffsetBits, ofCodeTable,
|
||||||
CTable_LitLength, llCodeTable,
|
CTable_LitLength, llCodeTable,
|
||||||
sequences, nbSeq,
|
sequences, nbSeq,
|
||||||
longOffsets);
|
longOffsets, bmi2);
|
||||||
if (ZSTD_isError(bitstreamSize)) return bitstreamSize;
|
if (ZSTD_isError(bitstreamSize)) return bitstreamSize;
|
||||||
op += bitstreamSize;
|
op += bitstreamSize;
|
||||||
}
|
}
|
||||||
@ -1765,11 +1724,11 @@ MEM_STATIC size_t ZSTD_compressSequences(seqStore_t* seqStorePtr,
|
|||||||
ZSTD_entropyCTables_t* nextEntropy,
|
ZSTD_entropyCTables_t* nextEntropy,
|
||||||
ZSTD_compressionParameters const* cParams,
|
ZSTD_compressionParameters const* cParams,
|
||||||
void* dst, size_t dstCapacity,
|
void* dst, size_t dstCapacity,
|
||||||
size_t srcSize, U32* workspace)
|
size_t srcSize, U32* workspace, int bmi2)
|
||||||
{
|
{
|
||||||
size_t const cSize = ZSTD_compressSequences_internal(
|
size_t const cSize = ZSTD_compressSequences_internal(
|
||||||
seqStorePtr, prevEntropy, nextEntropy, cParams, dst, dstCapacity,
|
seqStorePtr, prevEntropy, nextEntropy, cParams, dst, dstCapacity,
|
||||||
workspace);
|
workspace, bmi2);
|
||||||
/* If the srcSize <= dstCapacity, then there is enough space to write a
|
/* If the srcSize <= dstCapacity, then there is enough space to write a
|
||||||
* raw uncompressed block. Since we ran out of space, the block must not
|
* raw uncompressed block. Since we ran out of space, the block must not
|
||||||
* be compressible, so fall back to a raw uncompressed block.
|
* be compressible, so fall back to a raw uncompressed block.
|
||||||
@ -1836,41 +1795,46 @@ static size_t ZSTD_compressBlock_internal(ZSTD_CCtx* zc,
|
|||||||
void* dst, size_t dstCapacity,
|
void* dst, size_t dstCapacity,
|
||||||
const void* src, size_t srcSize)
|
const void* src, size_t srcSize)
|
||||||
{
|
{
|
||||||
|
ZSTD_matchState_t* const ms = &zc->blockState.matchState;
|
||||||
DEBUGLOG(5, "ZSTD_compressBlock_internal (dstCapacity=%u, dictLimit=%u, nextToUpdate=%u)",
|
DEBUGLOG(5, "ZSTD_compressBlock_internal (dstCapacity=%u, dictLimit=%u, nextToUpdate=%u)",
|
||||||
(U32)dstCapacity, zc->blockState.matchState.dictLimit, zc->blockState.matchState.nextToUpdate);
|
(U32)dstCapacity, ms->window.dictLimit, ms->nextToUpdate);
|
||||||
if (srcSize < MIN_CBLOCK_SIZE+ZSTD_blockHeaderSize+1)
|
if (srcSize < MIN_CBLOCK_SIZE+ZSTD_blockHeaderSize+1)
|
||||||
return 0; /* don't even attempt compression below a certain srcSize */
|
return 0; /* don't even attempt compression below a certain srcSize */
|
||||||
ZSTD_resetSeqStore(&(zc->seqStore));
|
ZSTD_resetSeqStore(&(zc->seqStore));
|
||||||
|
|
||||||
/* limited update after a very long match */
|
/* limited update after a very long match */
|
||||||
{ const BYTE* const base = zc->blockState.matchState.base;
|
{ const BYTE* const base = ms->window.base;
|
||||||
const BYTE* const istart = (const BYTE*)src;
|
const BYTE* const istart = (const BYTE*)src;
|
||||||
const U32 current = (U32)(istart-base);
|
const U32 current = (U32)(istart-base);
|
||||||
if (current > zc->blockState.matchState.nextToUpdate + 384)
|
if (current > ms->nextToUpdate + 384)
|
||||||
zc->blockState.matchState.nextToUpdate = current - MIN(192, (U32)(current - zc->blockState.matchState.nextToUpdate - 384));
|
ms->nextToUpdate = current - MIN(192, (U32)(current - ms->nextToUpdate - 384));
|
||||||
}
|
}
|
||||||
|
|
||||||
/* select and store sequences */
|
/* select and store sequences */
|
||||||
{ U32 const extDict = zc->blockState.matchState.lowLimit < zc->blockState.matchState.dictLimit;
|
{ U32 const extDict = ZSTD_window_hasExtDict(ms->window);
|
||||||
size_t lastLLSize;
|
size_t lastLLSize;
|
||||||
{ int i; for (i = 0; i < ZSTD_REP_NUM; ++i) zc->blockState.nextCBlock->rep[i] = zc->blockState.prevCBlock->rep[i]; }
|
{ int i; for (i = 0; i < ZSTD_REP_NUM; ++i) zc->blockState.nextCBlock->rep[i] = zc->blockState.prevCBlock->rep[i]; }
|
||||||
if (zc->appliedParams.ldmParams.enableLdm) {
|
if (zc->appliedParams.ldmParams.enableLdm) {
|
||||||
typedef size_t (*ZSTD_ldmBlockCompressor)(
|
size_t const nbSeq =
|
||||||
ldmState_t* ldms, ZSTD_matchState_t* ms, seqStore_t* seqStore,
|
ZSTD_ldm_generateSequences(&zc->ldmState, zc->ldmSequences,
|
||||||
U32 rep[ZSTD_REP_NUM], ZSTD_CCtx_params const* params,
|
&zc->appliedParams.ldmParams,
|
||||||
void const* src, size_t srcSize);
|
src, srcSize, extDict);
|
||||||
ZSTD_ldmBlockCompressor const ldmBlockCompressor = extDict ? ZSTD_compressBlock_ldm_extDict : ZSTD_compressBlock_ldm;
|
lastLLSize =
|
||||||
lastLLSize = ldmBlockCompressor(&zc->ldmState, &zc->blockState.matchState, &zc->seqStore, zc->blockState.nextCBlock->rep, &zc->appliedParams, src, srcSize);
|
ZSTD_ldm_blockCompress(zc->ldmSequences, nbSeq,
|
||||||
|
ms, &zc->seqStore,
|
||||||
|
zc->blockState.nextCBlock->rep,
|
||||||
|
&zc->appliedParams.cParams,
|
||||||
|
src, srcSize, extDict);
|
||||||
} else { /* not long range mode */
|
} else { /* not long range mode */
|
||||||
ZSTD_blockCompressor const blockCompressor = ZSTD_selectBlockCompressor(zc->appliedParams.cParams.strategy, extDict);
|
ZSTD_blockCompressor const blockCompressor = ZSTD_selectBlockCompressor(zc->appliedParams.cParams.strategy, extDict);
|
||||||
lastLLSize = blockCompressor(&zc->blockState.matchState, &zc->seqStore, zc->blockState.nextCBlock->rep, &zc->appliedParams.cParams, src, srcSize);
|
lastLLSize = blockCompressor(ms, &zc->seqStore, zc->blockState.nextCBlock->rep, &zc->appliedParams.cParams, src, srcSize);
|
||||||
}
|
}
|
||||||
{ const BYTE* const lastLiterals = (const BYTE*)src + srcSize - lastLLSize;
|
{ const BYTE* const lastLiterals = (const BYTE*)src + srcSize - lastLLSize;
|
||||||
ZSTD_storeLastLiterals(&zc->seqStore, lastLiterals, lastLLSize);
|
ZSTD_storeLastLiterals(&zc->seqStore, lastLiterals, lastLLSize);
|
||||||
} }
|
} }
|
||||||
|
|
||||||
/* encode sequences and literals */
|
/* encode sequences and literals */
|
||||||
{ size_t const cSize = ZSTD_compressSequences(&zc->seqStore, &zc->blockState.prevCBlock->entropy, &zc->blockState.nextCBlock->entropy, &zc->appliedParams.cParams, dst, dstCapacity, srcSize, zc->entropyWorkspace);
|
{ size_t const cSize = ZSTD_compressSequences(&zc->seqStore, &zc->blockState.prevCBlock->entropy, &zc->blockState.nextCBlock->entropy, &zc->appliedParams.cParams, dst, dstCapacity, srcSize, zc->entropyWorkspace, zc->bmi2);
|
||||||
if (ZSTD_isError(cSize) || cSize == 0) return cSize;
|
if (ZSTD_isError(cSize) || cSize == 0) return cSize;
|
||||||
/* confirm repcodes and entropy tables */
|
/* confirm repcodes and entropy tables */
|
||||||
{ ZSTD_compressedBlockState_t* const tmp = zc->blockState.prevCBlock;
|
{ ZSTD_compressedBlockState_t* const tmp = zc->blockState.prevCBlock;
|
||||||
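Note: the hunk above replaces the single ldmBlockCompressor callback with a two-phase flow: long-distance matches are first materialized as rawSeq entries, then replayed through the block compressor. A minimal sketch of that contract, using a hypothetical compressOneBlock wrapper (only the two ZSTD_ldm_* signatures are taken from this diff; error handling elided):

    /* Sketch only: hypothetical wrapper showing the two-phase LDM contract. */
    static size_t compressOneBlock(ZSTD_CCtx* zc, ZSTD_matchState_t* ms,
                                   void const* src, size_t srcSize, U32 extDict)
    {
        /* Phase 1: scan the input once, emitting raw (offset, litLength,
         * matchLength) triples into the pre-allocated zc->ldmSequences array. */
        size_t const nbSeq = ZSTD_ldm_generateSequences(&zc->ldmState, zc->ldmSequences,
                                                        &zc->appliedParams.ldmParams,
                                                        src, srcSize, extDict);
        /* Phase 2: replay those sequences, letting the regular match finder
         * compress the literal gaps between long-distance matches. */
        return ZSTD_ldm_blockCompress(zc->ldmSequences, nbSeq, ms, &zc->seqStore,
                                      zc->blockState.nextCBlock->rep,
                                      &zc->appliedParams.cParams,
                                      src, srcSize, extDict);
    }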
@@ -1914,52 +1878,19 @@ static size_t ZSTD_compress_frameChunk (ZSTD_CCtx* cctx,
             return ERROR(dstSize_tooSmall);   /* not enough space to store compressed block */
         if (remaining < blockSize) blockSize = remaining;

-        /* preemptive overflow correction:
-         * 1. correction is large enough:
-         *    lowLimit > (3<<29) ==> current > 3<<29 + 1<<windowLog - blockSize
-         *    1<<windowLog <= newCurrent < 1<<chainLog + 1<<windowLog
-         *
-         *    current - newCurrent
-         *    > (3<<29 + 1<<windowLog - blockSize) - (1<<windowLog + 1<<chainLog)
-         *    > (3<<29 - blockSize) - (1<<chainLog)
-         *    > (3<<29 - blockSize) - (1<<30)   (NOTE: chainLog <= 30)
-         *    > 1<<29 - 1<<17
-         *
-         * 2. (ip+blockSize - cctx->base) doesn't overflow:
-         *    In 32 bit mode we limit windowLog to 30 so we don't get
-         *    differences larger than 1<<31-1.
-         * 3. cctx->lowLimit < 1<<32:
-         *    windowLog <= 31 ==> 3<<29 + 1<<windowLog < 7<<29 < 1<<32.
-         */
-        if (ms->lowLimit > (3U<<29)) {
-            U32 const cycleMask = ((U32)1 << ZSTD_cycleLog(cctx->appliedParams.cParams.chainLog, cctx->appliedParams.cParams.strategy)) - 1;
-            U32 const current = (U32)(ip - cctx->blockState.matchState.base);
-            U32 const newCurrent = (current & cycleMask) + ((U32)1 << cctx->appliedParams.cParams.windowLog);
-            U32 const correction = current - newCurrent;
+        if (ZSTD_window_needOverflowCorrection(ms->window)) {
+            U32 const cycleLog = ZSTD_cycleLog(cctx->appliedParams.cParams.chainLog, cctx->appliedParams.cParams.strategy);
+            U32 const correction = ZSTD_window_correctOverflow(&ms->window, cycleLog, maxDist, ip);
             ZSTD_STATIC_ASSERT(ZSTD_CHAINLOG_MAX <= 30);
             ZSTD_STATIC_ASSERT(ZSTD_WINDOWLOG_MAX_32 <= 30);
             ZSTD_STATIC_ASSERT(ZSTD_WINDOWLOG_MAX <= 31);
-            assert(current > newCurrent);
-            assert(correction > 1<<28); /* Loose bound, should be about 1<<29 */
             ZSTD_reduceIndex(cctx, correction);
-            ms->base += correction;
-            ms->dictBase += correction;
-            ms->lowLimit -= correction;
-            ms->dictLimit -= correction;
             if (ms->nextToUpdate < correction) ms->nextToUpdate = 0;
             else ms->nextToUpdate -= correction;
-            DEBUGLOG(4, "Correction of 0x%x bytes to lowLimit=0x%x", correction, ms->lowLimit);
-        }
-        /* enforce maxDist */
-        if ((U32)(ip+blockSize - ms->base) > ms->loadedDictEnd + maxDist) {
-            U32 const newLowLimit = (U32)(ip+blockSize - ms->base) - maxDist;
-            if (ms->lowLimit < newLowLimit) ms->lowLimit = newLowLimit;
-            if (ms->dictLimit < ms->lowLimit)
-                DEBUGLOG(5, "ZSTD_compress_frameChunk : update dictLimit from %u to %u ",
-                            ms->dictLimit, ms->lowLimit);
-            if (ms->dictLimit < ms->lowLimit) ms->dictLimit = ms->lowLimit;
-            if (ms->nextToUpdate < ms->lowLimit) ms->nextToUpdate = ms->lowLimit;
         }
+        ZSTD_window_enforceMaxDist(&ms->window, ip + blockSize, ms->loadedDictEnd + maxDist);
+        if (ms->nextToUpdate < ms->window.lowLimit) ms->nextToUpdate = ms->window.lowLimit;

         {   size_t cSize = ZSTD_compressBlock_internal(cctx,
                             op+ZSTD_blockHeaderSize, dstCapacity-ZSTD_blockHeaderSize,
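Note: the rebasing arithmetic now lives in ZSTD_window_correctOverflow() (added in the header hunk further down). It is easiest to check with concrete numbers; a worked example, assuming cycleLog = 20 and maxDist = 1<<23 (hypothetical values, chosen so that (maxDist & cycleMask) == 0):

    U32 const cycleMask  = (1U << 20) - 1;        /* 0x000FFFFF */
    U32 const current    = 0xC0A12345;            /* index past 3<<29: correction due */
    U32 const newCurrent = (current & cycleMask) + (1U << 23);
                                                  /* 0x00012345 + 0x00800000 = 0x00812345 */
    U32 const correction = current - newCurrent;  /* 0xC0200000, subtracted from every index;
                                                   * its low 20 bits are zero, so every index
                                                   * keeps the same position within its cycle. */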
@@ -2051,38 +1982,12 @@ size_t ZSTD_writeLastEmptyBlock(void* dst, size_t dstCapacity)
 }


-static void ZSTD_manageWindowContinuity(ZSTD_matchState_t* ms, void const* src, size_t srcSize)
-{
-    const BYTE* const ip = (const BYTE*) src;
-
-    /* Check if blocks follow each other */
-    if (src != ms->nextSrc) {
-        /* not contiguous */
-        size_t const distanceFromBase = (size_t)(ms->nextSrc - ms->base);
-        DEBUGLOG(5, "ZSTD_manageWindowContinuity: non contiguous blocks, new segment starts at %u", ms->dictLimit);
-        ms->lowLimit = ms->dictLimit;
-        assert(distanceFromBase == (size_t)(U32)distanceFromBase);  /* should never overflow */
-        ms->dictLimit = (U32)distanceFromBase;
-        ms->dictBase = ms->base;
-        ms->base = ip - distanceFromBase;
-        ms->nextToUpdate = ms->dictLimit;
-        if (ms->dictLimit - ms->lowLimit < HASH_READ_SIZE) ms->lowLimit = ms->dictLimit;   /* too small extDict */
-    }
-    ms->nextSrc = ip + srcSize;
-    /* if input and dictionary overlap : reduce dictionary (area presumed modified by input) */
-    if ((ip+srcSize > ms->dictBase + ms->lowLimit) & (ip < ms->dictBase + ms->dictLimit)) {
-        ptrdiff_t const highInputIdx = (ip + srcSize) - ms->dictBase;
-        U32 const lowLimitMax = (highInputIdx > (ptrdiff_t)ms->dictLimit) ? ms->dictLimit : (U32)highInputIdx;
-        ms->lowLimit = lowLimitMax;
-    }
-}
-
-
 static size_t ZSTD_compressContinue_internal (ZSTD_CCtx* cctx,
                         void* dst, size_t dstCapacity,
                         const void* src, size_t srcSize,
                         U32 frame, U32 lastFrameChunk)
 {
+    ZSTD_matchState_t* ms = &cctx->blockState.matchState;
     size_t fhSize = 0;

     DEBUGLOG(5, "ZSTD_compressContinue_internal, stage: %u, srcSize: %u",
@@ -2100,7 +2005,11 @@ static size_t ZSTD_compressContinue_internal (ZSTD_CCtx* cctx,

     if (!srcSize) return fhSize;  /* do not generate an empty block if no input */

-    ZSTD_manageWindowContinuity(&cctx->blockState.matchState, src, srcSize);
+    if (!ZSTD_window_update(&ms->window, src, srcSize)) {
+        ms->nextToUpdate = ms->window.dictLimit;
+    }
+    if (cctx->appliedParams.ldmParams.enableLdm)
+        ZSTD_window_update(&cctx->ldmState.window, src, srcSize);

     DEBUGLOG(5, "ZSTD_compressContinue_internal (blockSize=%u)", (U32)cctx->blockSize);
     {   size_t const cSize = frame ?
@@ -2152,15 +2061,8 @@ static size_t ZSTD_loadDictionaryContent(ZSTD_matchState_t* ms, ZSTD_CCtx_params
     const BYTE* const iend = ip + srcSize;
     ZSTD_compressionParameters const* cParams = &params->cParams;

-    /* input becomes current prefix */
-    ms->lowLimit = ms->dictLimit;
-    ms->dictLimit = (U32)(ms->nextSrc - ms->base);
-    ms->dictBase = ms->base;
-    ms->base = ip - ms->dictLimit;
-    ms->nextToUpdate = ms->dictLimit;
-    ms->loadedDictEnd = params->forceWindow ? 0 : (U32)(iend - ms->base);
-
-    ms->nextSrc = iend;
+    ZSTD_window_update(&ms->window, src, srcSize);
     if (srcSize <= HASH_READ_SIZE) return 0;

     switch(params->cParams.strategy)
@@ -2190,7 +2092,7 @@ static size_t ZSTD_loadDictionaryContent(ZSTD_matchState_t* ms, ZSTD_CCtx_params
         assert(0);  /* not possible : not a valid strategy id */
     }

-    ms->nextToUpdate = (U32)(iend - ms->base);
+    ms->nextToUpdate = (U32)(iend - ms->window.base);
     return 0;
 }

@@ -2513,7 +2415,8 @@ size_t ZSTD_compress_advanced_internal(
         const void* dict,size_t dictSize,
         ZSTD_CCtx_params params)
 {
-    DEBUGLOG(4, "ZSTD_compress_advanced_internal");
+    DEBUGLOG(4, "ZSTD_compress_advanced_internal (srcSize:%u)",
+                (U32)srcSize);
     CHECK_F( ZSTD_compressBegin_internal(cctx, dict, dictSize, ZSTD_dm_auto, NULL,
                                          params, srcSize, ZSTDb_not_buffered) );
     return ZSTD_compressEnd(cctx, dst, dstCapacity, src, srcSize);
@@ -2993,7 +2896,7 @@ MEM_STATIC size_t ZSTD_limitCopy(void* dst, size_t dstCapacity,

 /** ZSTD_compressStream_generic():
  *  internal function for all *compressStream*() variants and *compress_generic()
- *  non-static, because can be called from zstdmt.c
+ *  non-static, because can be called from zstdmt_compress.c
  * @return : hint size for next input */
 size_t ZSTD_compressStream_generic(ZSTD_CStream* zcs,
                                    ZSTD_outBuffer* output,
@@ -3172,28 +3075,28 @@ size_t ZSTD_compress_generic (ZSTD_CCtx* cctx,

 #ifdef ZSTD_MULTITHREAD
         if ((cctx->pledgedSrcSizePlusOne-1) <= ZSTDMT_JOBSIZE_MIN) {
-            params.nbThreads = 1; /* do not invoke multi-threading when src size is too small */
-            params.nonBlockingMode = 0;
+            params.nbWorkers = 0; /* do not invoke multi-threading when src size is too small */
         }
-        if ((params.nbThreads > 1) | (params.nonBlockingMode == 1)) {
-            if (cctx->mtctx == NULL || (params.nbThreads != ZSTDMT_getNbThreads(cctx->mtctx))) {
-                DEBUGLOG(4, "ZSTD_compress_generic: creating new mtctx for nbThreads=%u",
-                            params.nbThreads);
+        if (params.nbWorkers > 0) {
+            /* mt context creation */
+            if (cctx->mtctx == NULL || (params.nbWorkers != ZSTDMT_getNbWorkers(cctx->mtctx))) {
+                DEBUGLOG(4, "ZSTD_compress_generic: creating new mtctx for nbWorkers=%u",
+                            params.nbWorkers);
                 if (cctx->mtctx != NULL)
-                    DEBUGLOG(4, "ZSTD_compress_generic: previous nbThreads was %u",
-                                ZSTDMT_getNbThreads(cctx->mtctx));
+                    DEBUGLOG(4, "ZSTD_compress_generic: previous nbWorkers was %u",
+                                ZSTDMT_getNbWorkers(cctx->mtctx));
                 ZSTDMT_freeCCtx(cctx->mtctx);
-                cctx->mtctx = ZSTDMT_createCCtx_advanced(params.nbThreads, cctx->customMem);
+                cctx->mtctx = ZSTDMT_createCCtx_advanced(params.nbWorkers, cctx->customMem);
                 if (cctx->mtctx == NULL) return ERROR(memory_allocation);
             }
-            DEBUGLOG(4, "call ZSTDMT_initCStream_internal as nbThreads=%u", params.nbThreads);
+            /* mt compression */
+            DEBUGLOG(4, "call ZSTDMT_initCStream_internal as nbWorkers=%u", params.nbWorkers);
             CHECK_F( ZSTDMT_initCStream_internal(
                         cctx->mtctx,
                         prefixDict.dict, prefixDict.dictSize, ZSTD_dm_rawContent,
                         cctx->cdict, params, cctx->pledgedSrcSizePlusOne-1) );
             cctx->streamStage = zcss_load;
-            cctx->appliedParams.nbThreads = params.nbThreads;
-            cctx->appliedParams.nonBlockingMode = params.nonBlockingMode;
+            cctx->appliedParams.nbWorkers = params.nbWorkers;
         } else
 #endif
         {   CHECK_F( ZSTD_resetCStream_internal(
@@ -3201,19 +3104,23 @@ size_t ZSTD_compress_generic (ZSTD_CCtx* cctx,
                             prefixDict.dictMode, cctx->cdict, params,
                             cctx->pledgedSrcSizePlusOne-1) );
             assert(cctx->streamStage == zcss_load);
-            assert(cctx->appliedParams.nbThreads <= 1);
+            assert(cctx->appliedParams.nbWorkers == 0);
     }   }

     /* compression stage */
 #ifdef ZSTD_MULTITHREAD
-    if ((cctx->appliedParams.nbThreads > 1) | (cctx->appliedParams.nonBlockingMode==1)) {
-        size_t const flushMin = ZSTDMT_compressStream_generic(cctx->mtctx, output, input, endOp);
-        if ( ZSTD_isError(flushMin)
-          || (endOp == ZSTD_e_end && flushMin == 0) ) {  /* compression completed */
-            ZSTD_startNewCompression(cctx);
+    if (cctx->appliedParams.nbWorkers > 0) {
+        if (cctx->cParamsChanged) {
+            ZSTDMT_updateCParams_whileCompressing(cctx->mtctx, cctx->requestedParams.compressionLevel, cctx->requestedParams.cParams);
+            cctx->cParamsChanged = 0;
         }
-        return flushMin;
-    }
+        {   size_t const flushMin = ZSTDMT_compressStream_generic(cctx->mtctx, output, input, endOp);
+            if ( ZSTD_isError(flushMin)
+              || (endOp == ZSTD_e_end && flushMin == 0) ) {  /* compression completed */
+                ZSTD_startNewCompression(cctx);
+            }
+            return flushMin;
+    }   }
 #endif
     CHECK_F( ZSTD_compressStream_generic(cctx, output, input, endOp) );
     DEBUGLOG(5, "completed ZSTD_compress_generic");
@@ -3239,7 +3146,7 @@ size_t ZSTD_compress_generic_simpleArgs (
 /*======   Finalize   ======*/

 /*! ZSTD_flushStream() :
  * @return : amount of data remaining to flush */
 size_t ZSTD_flushStream(ZSTD_CStream* zcs, ZSTD_outBuffer* output)
 {
     ZSTD_inBuffer input = { NULL, 0, 0 };
lib/compress/zstd_compress_impl.h (new file, 106 lines)
@@ -0,0 +1,106 @@
+/*
+ * Copyright (c) 2018-present, Facebook, Inc.
+ * All rights reserved.
+ *
+ * This source code is licensed under both the BSD-style license (found in the
+ * LICENSE file in the root directory of this source tree) and the GPLv2 (found
+ * in the COPYING file in the root directory of this source tree).
+ * You may select, at your option, one of the above-listed licenses.
+ */
+
+#ifndef FUNCTION
+#  error "FUNCTION(name) must be defined"
+#endif
+
+#ifndef TARGET
+#  error "TARGET must be defined"
+#endif
+
+
+MEM_STATIC TARGET
+size_t FUNCTION(ZSTD_encodeSequences)(
+            void* dst, size_t dstCapacity,
+            FSE_CTable const* CTable_MatchLength, BYTE const* mlCodeTable,
+            FSE_CTable const* CTable_OffsetBits, BYTE const* ofCodeTable,
+            FSE_CTable const* CTable_LitLength, BYTE const* llCodeTable,
+            seqDef const* sequences, size_t nbSeq, int longOffsets)
+{
+    BIT_CStream_t blockStream;
+    FSE_CState_t  stateMatchLength;
+    FSE_CState_t  stateOffsetBits;
+    FSE_CState_t  stateLitLength;
+
+    CHECK_E(BIT_initCStream(&blockStream, dst, dstCapacity), dstSize_tooSmall); /* not enough space remaining */
+
+    /* first symbols */
+    FSE_initCState2(&stateMatchLength, CTable_MatchLength, mlCodeTable[nbSeq-1]);
+    FSE_initCState2(&stateOffsetBits,  CTable_OffsetBits,  ofCodeTable[nbSeq-1]);
+    FSE_initCState2(&stateLitLength,   CTable_LitLength,   llCodeTable[nbSeq-1]);
+    BIT_addBits(&blockStream, sequences[nbSeq-1].litLength, LL_bits[llCodeTable[nbSeq-1]]);
+    if (MEM_32bits()) BIT_flushBits(&blockStream);
+    BIT_addBits(&blockStream, sequences[nbSeq-1].matchLength, ML_bits[mlCodeTable[nbSeq-1]]);
+    if (MEM_32bits()) BIT_flushBits(&blockStream);
+    if (longOffsets) {
+        U32 const ofBits = ofCodeTable[nbSeq-1];
+        int const extraBits = ofBits - MIN(ofBits, STREAM_ACCUMULATOR_MIN-1);
+        if (extraBits) {
+            BIT_addBits(&blockStream, sequences[nbSeq-1].offset, extraBits);
+            BIT_flushBits(&blockStream);
+        }
+        BIT_addBits(&blockStream, sequences[nbSeq-1].offset >> extraBits,
+                    ofBits - extraBits);
+    } else {
+        BIT_addBits(&blockStream, sequences[nbSeq-1].offset, ofCodeTable[nbSeq-1]);
+    }
+    BIT_flushBits(&blockStream);
+
+    {   size_t n;
+        for (n=nbSeq-2 ; n<nbSeq ; n--) {      /* intentional underflow */
+            BYTE const llCode = llCodeTable[n];
+            BYTE const ofCode = ofCodeTable[n];
+            BYTE const mlCode = mlCodeTable[n];
+            U32  const llBits = LL_bits[llCode];
+            U32  const ofBits = ofCode;
+            U32  const mlBits = ML_bits[mlCode];
+            DEBUGLOG(6, "encoding: litlen:%2u - matchlen:%2u - offCode:%7u",
+                        sequences[n].litLength,
+                        sequences[n].matchLength + MINMATCH,
+                        sequences[n].offset);
+                                                                            /* 32b*/  /* 64b*/
+                                                                            /* (7)*/  /* (7)*/
+            FSE_encodeSymbol(&blockStream, &stateOffsetBits, ofCode);       /* 15 */  /* 15 */
+            FSE_encodeSymbol(&blockStream, &stateMatchLength, mlCode);      /* 24 */  /* 24 */
+            if (MEM_32bits()) BIT_flushBits(&blockStream);                  /* (7)*/
+            FSE_encodeSymbol(&blockStream, &stateLitLength, llCode);        /* 16 */  /* 33 */
+            if (MEM_32bits() || (ofBits+mlBits+llBits >= 64-7-(LLFSELog+MLFSELog+OffFSELog)))
+                BIT_flushBits(&blockStream);                                /* (7)*/
+            BIT_addBits(&blockStream, sequences[n].litLength, llBits);
+            if (MEM_32bits() && ((llBits+mlBits)>24)) BIT_flushBits(&blockStream);
+            BIT_addBits(&blockStream, sequences[n].matchLength, mlBits);
+            if (MEM_32bits() || (ofBits+mlBits+llBits > 56)) BIT_flushBits(&blockStream);
+            if (longOffsets) {
+                int const extraBits = ofBits - MIN(ofBits, STREAM_ACCUMULATOR_MIN-1);
+                if (extraBits) {
+                    BIT_addBits(&blockStream, sequences[n].offset, extraBits);
+                    BIT_flushBits(&blockStream);                            /* (7)*/
+                }
+                BIT_addBits(&blockStream, sequences[n].offset >> extraBits,
+                            ofBits - extraBits);                            /* 31 */
+            } else {
+                BIT_addBits(&blockStream, sequences[n].offset, ofBits);     /* 31 */
+            }
+            BIT_flushBits(&blockStream);                                    /* (7)*/
+    }   }
+
+    DEBUGLOG(6, "ZSTD_encodeSequences: flushing ML state with %u bits", stateMatchLength.stateLog);
+    FSE_flushCState(&blockStream, &stateMatchLength);
+    DEBUGLOG(6, "ZSTD_encodeSequences: flushing Off state with %u bits", stateOffsetBits.stateLog);
+    FSE_flushCState(&blockStream, &stateOffsetBits);
+    DEBUGLOG(6, "ZSTD_encodeSequences: flushing LL state with %u bits", stateLitLength.stateLog);
+    FSE_flushCState(&blockStream, &stateLitLength);
+
+    {   size_t const streamSize = BIT_closeCStream(&blockStream);
+        if (streamSize==0) return ERROR(dstSize_tooSmall);   /* not enough space */
+        return streamSize;
+    }
+}
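Note: this new header is a function template rather than a normal translation unit: it is meant to be #included more than once, each time with different FUNCTION/TARGET macros, so the same encoder body is compiled once per codegen target. The instantiation site is not part of this excerpt; a plausible sketch, assuming a GCC-style target attribute for the BMI2 build and a hypothetical DYNAMIC_BMI2 feature gate:

    /* Sketch only: hypothetical instantiation site for zstd_compress_impl.h. */
    #define FUNCTION(name) name##_default
    #define TARGET
    #include "zstd_compress_impl.h"   /* stamps out ZSTD_encodeSequences_default */
    #undef FUNCTION
    #undef TARGET

    #if DYNAMIC_BMI2                  /* hypothetical: "this compiler can emit BMI2" */
    #define FUNCTION(name) name##_bmi2
    #define TARGET __attribute__((__target__("bmi2")))
    #include "zstd_compress_impl.h"   /* stamps out ZSTD_encodeSequences_bmi2 */
    #undef FUNCTION
    #undef TARGET
    #endif

A caller would then select between the two variants at run time using the cctx->bmi2 flag added later in this diff.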
@@ -30,8 +30,9 @@ extern "C" {
 /*-*************************************
 *  Constants
 ***************************************/
-static const U32 g_searchStrength = 8;
+#define kSearchStrength      8
 #define HASH_READ_SIZE 8
+#define ZSTD_CLEVEL_CUSTOM 999
 #define ZSTD_DUBT_UNSORTED_MARK 1   /* For btlazy2 strategy, index 1 now means "unsorted".
                                        It could be confused for a real successor at index "1", if sorted as larger than its predecessor.
                                        It's not a big deal though : candidate will just be sorted again.
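Note: kSearchStrength drives the acceleration over incompressible data used throughout this diff (the recurring `ip += ((ip-anchor) >> kSearchStrength) + 1` pattern): the longer the current run of failed matches, the larger the forward hop. A self-contained illustration of the step growth, assuming the value 8 defined above (BYTE comes from zstd's mem.h):

    /* Illustration only: probe-step growth of the match search heuristic.
     * The distance from the last stored sequence (anchor) sets the hop size:
     * 1 byte for miss runs below 256B, 2 bytes at 256B, 4 bytes at 768B, ... */
    static size_t nextStep(const BYTE* ip, const BYTE* anchor)
    {
        return ((size_t)(ip - anchor) >> 8) + 1;   /* 8 == kSearchStrength */
    }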
@@ -109,10 +110,14 @@ typedef struct {
     BYTE const* dictBase;    /* extDict indexes relative to this position */
     U32 dictLimit;           /* below that point, need extDict */
     U32 lowLimit;            /* below that point, no more data */
-    U32 nextToUpdate;        /* index from which to continue table update */
-    U32 nextToUpdate3;       /* index from which to continue table update */
-    U32 hashLog3;            /* dispatch table : larger == faster, more memory */
-    U32 loadedDictEnd;       /* index of end of dictionary */
+} ZSTD_window_t;
+
+typedef struct {
+    ZSTD_window_t window;    /* State for window round buffer management */
+    U32 loadedDictEnd;       /* index of end of dictionary */
+    U32 nextToUpdate;        /* index from which to continue table update */
+    U32 nextToUpdate3;       /* index from which to continue table update */
+    U32 hashLog3;            /* dispatch table : larger == faster, more memory */
     U32* hashTable;
     U32* hashTable3;
     U32* chainTable;
@@ -131,6 +136,7 @@ typedef struct {
 } ldmEntry_t;

 typedef struct {
+    ZSTD_window_t window;   /* State for the window round buffer management */
     ldmEntry_t* hashTable;
     BYTE* bucketOffsets;    /* Next position in bucket to insert entry */
     U64 hashPower;          /* Used to compute the rolling hash.
@@ -143,6 +149,7 @@ typedef struct {
     U32 bucketSizeLog;       /* Log bucket size for collision resolution, at most 8 */
     U32 minMatchLength;      /* Minimum match length */
     U32 hashEveryLog;        /* Log number of entries to skip */
+    U32 windowLog;           /* Window log for the LDM */
 } ldmParams_t;

 struct ZSTD_CCtx_params_s {
@@ -151,12 +158,11 @@ struct ZSTD_CCtx_params_s {
     ZSTD_frameParameters fParams;

     int compressionLevel;
-    U32 forceWindow;           /* force back-references to respect limit of
+    int forceWindow;           /* force back-references to respect limit of
                                 * 1<<wLog, even for dictionary */

     /* Multithreading: used to pass parameters to mtctx */
-    U32 nbThreads;
-    int nonBlockingMode;       /* will trigger ZSTDMT even with nbThreads==1 */
+    unsigned nbWorkers;
     unsigned jobSize;
     unsigned overlapSizeLog;

@@ -165,14 +171,15 @@ struct ZSTD_CCtx_params_s {

     /* For use with createCCtxParams() and freeCCtxParams() only */
     ZSTD_customMem customMem;

 };  /* typedef'd to ZSTD_CCtx_params within "zstd.h" */

 struct ZSTD_CCtx_s {
     ZSTD_compressionStage_e stage;
-    U32 dictID;
+    int cParamsChanged;    /* == 1 if cParams(except wlog) or compression level are changed in requestedParams. Triggers transmission of new params to ZSTDMT (if available) then reset to 0. */
+    int bmi2;              /* == 1 if the CPU supports BMI2 and 0 otherwise. CPU support is determined dynamically once per context lifetime. */
     ZSTD_CCtx_params requestedParams;
     ZSTD_CCtx_params appliedParams;
+    U32 dictID;
     void* workSpace;
     size_t workSpaceSize;
     size_t blockSize;
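Note: the new bmi2 field caches a one-time CPU feature probe so that hot loops can branch once per context lifetime rather than once per call. The probe itself is outside this excerpt; a minimal sketch of how such a flag can be filled on x86, assuming GCC/Clang builtins (the real zstd code uses its own cpu-detection helper):

    /* Sketch only: one-time BMI2 capability probe for cctx->bmi2. */
    static int probe_bmi2(void)
    {
    #if defined(__GNUC__) && defined(__x86_64__)
        return __builtin_cpu_supports("bmi2");
    #else
        return 0;   /* unknown CPU: fall back to the portable encoder */
    #endif
    }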
@@ -185,6 +192,7 @@ struct ZSTD_CCtx_s {

     seqStore_t seqStore;      /* sequences storage ptrs */
     ldmState_t ldmState;      /* long distance matching state */
+    rawSeq* ldmSequences;     /* Storage for the ldm output sequences */
     ZSTD_blockState_t blockState;
     U32* entropyWorkspace;  /* entropy workspace of HUF_WORKSPACE_SIZE bytes */

@@ -439,6 +447,159 @@ MEM_STATIC size_t ZSTD_hashPtr(const void* p, U32 hBits, U32 mls)
     }
 }

+/*-*************************************
+*  Round buffer management
+***************************************/
+#define ZSTD_LOWLIMIT_MAX (3U << 29)  /* Max lowLimit allowed */
+/* Maximum chunk size before overflow correction needs to be called again */
+#define ZSTD_CHUNKSIZE_MAX                                                     \
+    ( ((U32)-1)                  /* Maximum ending current index */            \
+    - (1U << ZSTD_WINDOWLOG_MAX) /* Max distance from lowLimit to current */   \
+    - ZSTD_LOWLIMIT_MAX)         /* Maximum beginning lowLimit */
+
+/**
+ * ZSTD_window_clear():
+ * Clears the window containing the history by simply setting it to empty.
+ */
+MEM_STATIC void ZSTD_window_clear(ZSTD_window_t* window)
+{
+    size_t const endT = (size_t)(window->nextSrc - window->base);
+    U32 const end = (U32)endT;
+
+    window->lowLimit = end;
+    window->dictLimit = end;
+}
+
+/**
+ * ZSTD_window_hasExtDict():
+ * Returns non-zero if the window has a non-empty extDict.
+ */
+MEM_STATIC U32 ZSTD_window_hasExtDict(ZSTD_window_t const window)
+{
+    return window.lowLimit < window.dictLimit;
+}
+
+/**
+ * ZSTD_window_needOverflowCorrection():
+ * Returns non-zero if the indices are getting too large and need overflow
+ * protection.
+ */
+MEM_STATIC U32 ZSTD_window_needOverflowCorrection(ZSTD_window_t const window)
+{
+    return window.lowLimit > ZSTD_LOWLIMIT_MAX;
+}
+
+/**
+ * ZSTD_window_correctOverflow():
+ * Reduces the indices to protect from index overflow.
+ * Returns the correction made to the indices, which must be applied to every
+ * stored index.
+ *
+ * The least significant cycleLog bits of the indices must remain the same,
+ * which may be 0. Every index up to maxDist in the past must be valid.
+ * NOTE: (maxDist & cycleMask) must be zero.
+ */
+MEM_STATIC U32 ZSTD_window_correctOverflow(ZSTD_window_t* window, U32 cycleLog,
+                                           U32 maxDist, void const* src)
+{
+    /* preemptive overflow correction:
+     * 1. correction is large enough:
+     *    lowLimit > (3<<29) ==> current > 3<<29 + 1<<windowLog
+     *    1<<windowLog <= newCurrent < 1<<chainLog + 1<<windowLog
+     *
+     *    current - newCurrent
+     *    > (3<<29 + 1<<windowLog) - (1<<windowLog + 1<<chainLog)
+     *    > (3<<29) - (1<<chainLog)
+     *    > (3<<29) - (1<<30)             (NOTE: chainLog <= 30)
+     *    > 1<<29
+     *
+     * 2. (ip+ZSTD_CHUNKSIZE_MAX - cctx->base) doesn't overflow:
+     *    After correction, current is less than (1<<chainLog + 1<<windowLog).
+     *    In 64-bit mode we are safe, because we have 64-bit ptrdiff_t.
+     *    In 32-bit mode we are safe, because (chainLog <= 29), so
+     *    ip+ZSTD_CHUNKSIZE_MAX - cctx->base < 1<<32.
+     * 3. (cctx->lowLimit + 1<<windowLog) < 1<<32:
+     *    windowLog <= 31 ==> 3<<29 + 1<<windowLog < 7<<29 < 1<<32.
+     */
+    U32 const cycleMask = (1U << cycleLog) - 1;
+    U32 const current = (U32)((BYTE const*)src - window->base);
+    U32 const newCurrent = (current & cycleMask) + maxDist;
+    U32 const correction = current - newCurrent;
+    assert((maxDist & cycleMask) == 0);
+    assert(current > newCurrent);
+    /* Loose bound, should be around 1<<29 (see above) */
+    assert(correction > 1<<28);
+
+    window->base += correction;
+    window->dictBase += correction;
+    window->lowLimit -= correction;
+    window->dictLimit -= correction;
+
+    DEBUGLOG(4, "Correction of 0x%x bytes to lowLimit=0x%x", correction,
+             window->lowLimit);
+    return correction;
+}
+
+/**
+ * ZSTD_window_enforceMaxDist():
+ * Sets lowLimit such that indices earlier than (srcEnd - base) - lowLimit are
+ * invalid. This allows a simple check index >= lowLimit to see if it is valid.
+ * Source pointers past srcEnd are not guaranteed to be valid.
+ */
+MEM_STATIC void ZSTD_window_enforceMaxDist(ZSTD_window_t* window,
+                                           void const* srcEnd, U32 maxDist)
+{
+    U32 const current = (U32)((BYTE const*)srcEnd - window->base);
+    if (current > maxDist) {
+        U32 const newLowLimit = current - maxDist;
+        if (window->lowLimit < newLowLimit) window->lowLimit = newLowLimit;
+        if (window->dictLimit < window->lowLimit) {
+            DEBUGLOG(5, "Update dictLimit from %u to %u", window->dictLimit,
+                     window->lowLimit);
+            window->dictLimit = window->lowLimit;
+        }
+    }
+}
+
+/**
+ * ZSTD_window_update():
+ * Updates the window by appending [src, src + srcSize) to the window.
+ * If it is not contiguous, the current prefix becomes the extDict, and we
+ * forget about the extDict. Handles overlap of the prefix and extDict.
+ * Returns non-zero if the segment is contiguous.
+ */
+MEM_STATIC U32 ZSTD_window_update(ZSTD_window_t* window,
+                                  void const* src, size_t srcSize)
+{
+    BYTE const* const ip = (BYTE const*)src;
+    U32 contiguous = 1;
+    /* Check if blocks follow each other */
+    if (src != window->nextSrc) {
+        /* not contiguous */
+        size_t const distanceFromBase = (size_t)(window->nextSrc - window->base);
+        DEBUGLOG(5, "Non contiguous blocks, new segment starts at %u",
+                 window->dictLimit);
+        window->lowLimit = window->dictLimit;
+        assert(distanceFromBase == (size_t)(U32)distanceFromBase);  /* should never overflow */
+        window->dictLimit = (U32)distanceFromBase;
+        window->dictBase = window->base;
+        window->base = ip - distanceFromBase;
+        // ms->nextToUpdate = window->dictLimit;
+        if (window->dictLimit - window->lowLimit < HASH_READ_SIZE) window->lowLimit = window->dictLimit;   /* too small extDict */
+        contiguous = 0;
+    }
+    window->nextSrc = ip + srcSize;
+    /* if input and dictionary overlap : reduce dictionary (area presumed modified by input) */
+    if ( (ip+srcSize > window->dictBase + window->lowLimit)
+       & (ip < window->dictBase + window->dictLimit)) {
+        ptrdiff_t const highInputIdx = (ip + srcSize) - window->dictBase;
+        U32 const lowLimitMax = (highInputIdx > (ptrdiff_t)window->dictLimit) ? window->dictLimit : (U32)highInputIdx;
+        window->lowLimit = lowLimitMax;
+    }
+    return contiguous;
+}
+
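Note: taken together, these helpers define the round-buffer lifecycle that the callers earlier in this diff follow. A condensed sketch of the per-block maintenance order, reusing the names (ms, cctx, ip, blockSize, maxDist) from ZSTD_compressContinue_internal and ZSTD_compress_frameChunk above; in the real code the first step happens once per compressContinue call and the rest once per block:

    /* Sketch: window maintenance order, mirroring the callers above. */
    if (!ZSTD_window_update(&ms->window, ip, blockSize)) {
        ms->nextToUpdate = ms->window.dictLimit;          /* discontinuity: restart table fill */
    }
    if (ZSTD_window_needOverflowCorrection(ms->window)) { /* indices near 3<<29: rebase */
        U32 const cycleLog = ZSTD_cycleLog(cctx->appliedParams.cParams.chainLog,
                                           cctx->appliedParams.cParams.strategy);
        U32 const correction = ZSTD_window_correctOverflow(&ms->window, cycleLog, maxDist, ip);
        ZSTD_reduceIndex(cctx, correction);               /* shift every stored table index */
    }
    ZSTD_window_enforceMaxDist(&ms->window, ip + blockSize, maxDist);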
 #if defined (__cplusplus)
 }
 #endif
@@ -21,7 +21,7 @@ void ZSTD_fillDoubleHashTable(ZSTD_matchState_t* ms,
     U32 const mls = cParams->searchLength;
     U32* const hashSmall = ms->chainTable;
     U32 const hBitsS = cParams->chainLog;
-    const BYTE* const base = ms->base;
+    const BYTE* const base = ms->window.base;
     const BYTE* ip = base + ms->nextToUpdate;
     const BYTE* const iend = ((const BYTE*)end) - HASH_READ_SIZE;
     const U32 fastHashFillStep = 3;
@@ -55,11 +55,11 @@ size_t ZSTD_compressBlock_doubleFast_generic(
     const U32 hBitsL = cParams->hashLog;
     U32* const hashSmall = ms->chainTable;
     const U32 hBitsS = cParams->chainLog;
-    const BYTE* const base = ms->base;
+    const BYTE* const base = ms->window.base;
     const BYTE* const istart = (const BYTE*)src;
     const BYTE* ip = istart;
     const BYTE* anchor = istart;
-    const U32 lowestIndex = ms->dictLimit;
+    const U32 lowestIndex = ms->window.dictLimit;
     const BYTE* const lowest = base + lowestIndex;
     const BYTE* const iend = istart + srcSize;
     const BYTE* const ilimit = iend - HASH_READ_SIZE;
@@ -113,7 +113,7 @@ size_t ZSTD_compressBlock_doubleFast_generic(
                 while (((ip>anchor) & (match>lowest)) && (ip[-1] == match[-1])) { ip--; match--; mLength++; } /* catch up */
             }
         } else {
-            ip += ((ip-anchor) >> g_searchStrength) + 1;
+            ip += ((ip-anchor) >> kSearchStrength) + 1;
             continue;
         }

@@ -187,14 +187,14 @@ static size_t ZSTD_compressBlock_doubleFast_extDict_generic(
     U32 const hBitsL = cParams->hashLog;
     U32* const hashSmall = ms->chainTable;
     U32 const hBitsS = cParams->chainLog;
-    const BYTE* const base = ms->base;
-    const BYTE* const dictBase = ms->dictBase;
+    const BYTE* const base = ms->window.base;
+    const BYTE* const dictBase = ms->window.dictBase;
     const BYTE* const istart = (const BYTE*)src;
     const BYTE* ip = istart;
     const BYTE* anchor = istart;
-    const U32   lowestIndex = ms->lowLimit;
+    const U32   lowestIndex = ms->window.lowLimit;
     const BYTE* const dictStart = dictBase + lowestIndex;
-    const U32   dictLimit = ms->dictLimit;
+    const U32   dictLimit = ms->window.dictLimit;
     const BYTE* const lowPrefixPtr = base + dictLimit;
     const BYTE* const dictEnd = dictBase + dictLimit;
     const BYTE* const iend = istart + srcSize;
@@ -264,7 +264,7 @@ static size_t ZSTD_compressBlock_doubleFast_extDict_generic(
             ZSTD_storeSeq(seqStore, ip-anchor, anchor, offset + ZSTD_REP_MOVE, mLength-MINMATCH);

         } else {
-            ip += ((ip-anchor) >> g_searchStrength) + 1;
+            ip += ((ip-anchor) >> kSearchStrength) + 1;
             continue;
     }   }

@@ -19,7 +19,7 @@ void ZSTD_fillHashTable(ZSTD_matchState_t* ms,
     U32* const hashTable = ms->hashTable;
     U32 const hBits = cParams->hashLog;
     U32 const mls = cParams->searchLength;
-    const BYTE* const base = ms->base;
+    const BYTE* const base = ms->window.base;
     const BYTE* ip = base + ms->nextToUpdate;
     const BYTE* const iend = ((const BYTE*)end) - HASH_READ_SIZE;
     const U32 fastHashFillStep = 3;
@@ -45,11 +45,11 @@ size_t ZSTD_compressBlock_fast_generic(
         U32 const hlog, U32 const mls)
 {
     U32* const hashTable = ms->hashTable;
-    const BYTE* const base = ms->base;
+    const BYTE* const base = ms->window.base;
     const BYTE* const istart = (const BYTE*)src;
     const BYTE* ip = istart;
     const BYTE* anchor = istart;
-    const U32   lowestIndex = ms->dictLimit;
+    const U32   lowestIndex = ms->window.dictLimit;
     const BYTE* const lowest = base + lowestIndex;
     const BYTE* const iend = istart + srcSize;
     const BYTE* const ilimit = iend - HASH_READ_SIZE;
@@ -79,7 +79,7 @@ size_t ZSTD_compressBlock_fast_generic(
         } else {
             U32 offset;
             if ( (matchIndex <= lowestIndex) || (MEM_read32(match) != MEM_read32(ip)) ) {
-                ip += ((ip-anchor) >> g_searchStrength) + 1;
+                ip += ((ip-anchor) >> kSearchStrength) + 1;
                 continue;
             }
             mLength = ZSTD_count(ip+4, match+4, iend) + 4;
@@ -149,14 +149,14 @@ static size_t ZSTD_compressBlock_fast_extDict_generic(
         U32 const hlog, U32 const mls)
 {
     U32* hashTable = ms->hashTable;
-    const BYTE* const base = ms->base;
-    const BYTE* const dictBase = ms->dictBase;
+    const BYTE* const base = ms->window.base;
+    const BYTE* const dictBase = ms->window.dictBase;
     const BYTE* const istart = (const BYTE*)src;
     const BYTE* ip = istart;
     const BYTE* anchor = istart;
-    const U32   lowestIndex = ms->lowLimit;
+    const U32   lowestIndex = ms->window.lowLimit;
     const BYTE* const dictStart = dictBase + lowestIndex;
-    const U32   dictLimit = ms->dictLimit;
+    const U32   dictLimit = ms->window.dictLimit;
     const BYTE* const lowPrefixPtr = base + dictLimit;
     const BYTE* const dictEnd = dictBase + dictLimit;
     const BYTE* const iend = istart + srcSize;
@@ -185,7 +185,7 @@ static size_t ZSTD_compressBlock_fast_extDict_generic(
         } else {
             if ( (matchIndex < lowestIndex) ||
                  (MEM_read32(match) != MEM_read32(ip)) ) {
-                ip += ((ip-anchor) >> g_searchStrength) + 1;
+                ip += ((ip-anchor) >> kSearchStrength) + 1;
                 continue;
             }
             {   const BYTE* matchEnd = matchIndex < dictLimit ? dictEnd : iend;
@@ -28,17 +28,17 @@ void ZSTD_updateDUBT(
     U32 const btLog  = cParams->chainLog - 1;
     U32 const btMask = (1 << btLog) - 1;

-    const BYTE* const base = ms->base;
+    const BYTE* const base = ms->window.base;
     U32 const target = (U32)(ip - base);
     U32 idx = ms->nextToUpdate;

     if (idx != target)
         DEBUGLOG(7, "ZSTD_updateDUBT, from %u to %u (dictLimit:%u)",
-                    idx, target, ms->dictLimit);
+                    idx, target, ms->window.dictLimit);
     assert(ip + 8 <= iend);   /* condition for ZSTD_hashPtr */
     (void)iend;

-    assert(idx >= ms->dictLimit);   /* condition for valid base+idx */
+    assert(idx >= ms->window.dictLimit);   /* condition for valid base+idx */
     for ( ; idx < target ; idx++) {
         size_t const h = ZSTD_hashPtr(base + idx, hashLog, mls);   /* assumption : ip + 8 <= iend */
         U32 const matchIndex = hashTable[h];
@@ -68,9 +68,9 @@ static void ZSTD_insertDUBT1(
     U32 const btLog  = cParams->chainLog - 1;
     U32 const btMask = (1 << btLog) - 1;
     size_t commonLengthSmaller=0, commonLengthLarger=0;
-    const BYTE* const base = ms->base;
-    const BYTE* const dictBase = ms->dictBase;
-    const U32 dictLimit = ms->dictLimit;
+    const BYTE* const base = ms->window.base;
+    const BYTE* const dictBase = ms->window.dictBase;
+    const U32 dictLimit = ms->window.dictLimit;
     const BYTE* const ip = (current>=dictLimit) ? base + current : dictBase + current;
     const BYTE* const iend = (current>=dictLimit) ? inputEnd : dictBase + dictLimit;
     const BYTE* const dictEnd = dictBase + dictLimit;
@@ -80,7 +80,7 @@ static void ZSTD_insertDUBT1(
     U32* largerPtr  = smallerPtr + 1;
     U32 matchIndex = *smallerPtr;
     U32 dummy32;   /* to be nullified at the end */
-    U32 const windowLow = ms->lowLimit;
+    U32 const windowLow = ms->window.lowLimit;

     DEBUGLOG(8, "ZSTD_insertDUBT1(%u) (dictLimit=%u, lowLimit=%u)",
                 current, dictLimit, windowLow);
@@ -150,9 +150,9 @@ static size_t ZSTD_DUBT_findBestMatch (
     size_t const h = ZSTD_hashPtr(ip, hashLog, mls);
     U32 matchIndex = hashTable[h];

-    const BYTE* const base = ms->base;
+    const BYTE* const base = ms->window.base;
     U32 const current = (U32)(ip-base);
-    U32 const windowLow = ms->lowLimit;
+    U32 const windowLow = ms->window.lowLimit;

     U32* const bt = ms->chainTable;
     U32 const btLog  = cParams->chainLog - 1;
@@ -203,8 +203,8 @@ static size_t ZSTD_DUBT_findBestMatch (

     /* find longest match */
     {   size_t commonLengthSmaller=0, commonLengthLarger=0;
-        const BYTE* const dictBase = ms->dictBase;
-        const U32 dictLimit = ms->dictLimit;
+        const BYTE* const dictBase = ms->window.dictBase;
+        const U32 dictLimit = ms->window.dictLimit;
         const BYTE* const dictEnd = dictBase + dictLimit;
         const BYTE* const prefixStart = base + dictLimit;
         U32* smallerPtr = bt + 2*(current&btMask);
@@ -279,7 +279,7 @@ static size_t ZSTD_BtFindBestMatch (
                         const U32 mls /* template */)
 {
     DEBUGLOG(7, "ZSTD_BtFindBestMatch");
-    if (ip < ms->base + ms->nextToUpdate) return 0;   /* skipped area */
+    if (ip < ms->window.base + ms->nextToUpdate) return 0;   /* skipped area */
     ZSTD_updateDUBT(ms, cParams, ip, iLimit, mls);
     return ZSTD_DUBT_findBestMatch(ms, cParams, ip, iLimit, offsetPtr, mls, 0);
 }
@@ -309,7 +309,7 @@ static size_t ZSTD_BtFindBestMatch_extDict (
                         const U32 mls)
 {
     DEBUGLOG(7, "ZSTD_BtFindBestMatch_extDict");
-    if (ip < ms->base + ms->nextToUpdate) return 0;   /* skipped area */
+    if (ip < ms->window.base + ms->nextToUpdate) return 0;   /* skipped area */
     ZSTD_updateDUBT(ms, cParams, ip, iLimit, mls);
     return ZSTD_DUBT_findBestMatch(ms, cParams, ip, iLimit, offsetPtr, mls, 1);
 }
@@ -347,7 +347,7 @@ static U32 ZSTD_insertAndFindFirstIndex_internal(
     const U32 hashLog = cParams->hashLog;
     U32* const chainTable = ms->chainTable;
     const U32 chainMask = (1 << cParams->chainLog) - 1;
-    const BYTE* const base = ms->base;
+    const BYTE* const base = ms->window.base;
     const U32 target = (U32)(ip - base);
     U32 idx = ms->nextToUpdate;

@@ -381,12 +381,12 @@ size_t ZSTD_HcFindBestMatch_generic (
     U32* const chainTable = ms->chainTable;
     const U32 chainSize = (1 << cParams->chainLog);
     const U32 chainMask = chainSize-1;
-    const BYTE* const base = ms->base;
-    const BYTE* const dictBase = ms->dictBase;
-    const U32 dictLimit = ms->dictLimit;
+    const BYTE* const base = ms->window.base;
+    const BYTE* const dictBase = ms->window.dictBase;
+    const U32 dictLimit = ms->window.dictLimit;
     const BYTE* const prefixStart = base + dictLimit;
     const BYTE* const dictEnd = dictBase + dictLimit;
-    const U32 lowLimit = ms->lowLimit;
+    const U32 lowLimit = ms->window.lowLimit;
     const U32 current = (U32)(ip-base);
     const U32 minChain = current > chainSize ? current - chainSize : 0;
     U32 nbAttempts = 1U << cParams->searchLog;
@@ -471,7 +471,7 @@ size_t ZSTD_compressBlock_lazy_generic(
     const BYTE* anchor = istart;
     const BYTE* const iend = istart + srcSize;
     const BYTE* const ilimit = iend - 8;
-    const BYTE* const base = ms->base + ms->dictLimit;
+    const BYTE* const base = ms->window.base + ms->window.dictLimit;

     typedef size_t (*searchMax_f)(
                         ZSTD_matchState_t* ms, ZSTD_compressionParameters const* cParams,
@@ -508,7 +508,7 @@ size_t ZSTD_compressBlock_lazy_generic(
         }

         if (matchLength < 4) {
-            ip += ((ip-anchor) >> g_searchStrength) + 1;   /* jump faster over incompressible sections */
+            ip += ((ip-anchor) >> kSearchStrength) + 1;   /* jump faster over incompressible sections */
             continue;
         }

@@ -635,13 +635,13 @@ size_t ZSTD_compressBlock_lazy_extDict_generic(
     const BYTE* anchor = istart;
     const BYTE* const iend = istart + srcSize;
     const BYTE* const ilimit = iend - 8;
-    const BYTE* const base = ms->base;
-    const U32 dictLimit = ms->dictLimit;
-    const U32 lowestIndex = ms->lowLimit;
+    const BYTE* const base = ms->window.base;
+    const U32 dictLimit = ms->window.dictLimit;
+    const U32 lowestIndex = ms->window.lowLimit;
     const BYTE* const prefixStart = base + dictLimit;
-    const BYTE* const dictBase = ms->dictBase;
+    const BYTE* const dictBase = ms->window.dictBase;
     const BYTE* const dictEnd  = dictBase + dictLimit;
-    const BYTE* const dictStart  = dictBase + ms->lowLimit;
+    const BYTE* const dictStart  = dictBase + lowestIndex;

     typedef size_t (*searchMax_f)(
                         ZSTD_matchState_t* ms, ZSTD_compressionParameters const* cParams,
@@ -681,7 +681,7 @@ size_t ZSTD_compressBlock_lazy_extDict_generic(
         }

         if (matchLength < 4) {
-            ip += ((ip-anchor) >> g_searchStrength) + 1;   /* jump faster over incompressible sections */
+            ip += ((ip-anchor) >> kSearchStrength) + 1;   /* jump faster over incompressible sections */
             continue;
         }

@@ -28,8 +28,17 @@ size_t ZSTD_ldm_initializeParameters(ldmParams_t* params, U32 enableLdm)
     return 0;
 }

-void ZSTD_ldm_adjustParameters(ldmParams_t* params, U32 windowLog)
+void ZSTD_ldm_adjustParameters(ldmParams_t* params,
+                               ZSTD_compressionParameters const* cParams)
 {
+    U32 const windowLog = cParams->windowLog;
+    if (cParams->strategy >= ZSTD_btopt) {
+        /* Get out of the way of the optimal parser */
+        U32 const minMatch = MAX(cParams->targetLength, params->minMatchLength);
+        assert(minMatch >= ZSTD_LDM_MINMATCH_MIN);
+        assert(minMatch <= ZSTD_LDM_MINMATCH_MAX);
+        params->minMatchLength = minMatch;
+    }
     if (params->hashLog == 0) {
         params->hashLog = MAX(ZSTD_HASHLOG_MIN, windowLog - LDM_HASH_RLOG);
         assert(params->hashLog <= ZSTD_HASHLOG_MAX);
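Note: the default LDM hash sizing above is a fixed ratio of the window. A worked example, assuming LDM_HASH_RLOG == 7 (its definition sits outside this excerpt, so treat the value as an assumption): a 128 MB window gives windowLog = 27, hence hashLog = MAX(ZSTD_HASHLOG_MIN, 27 - 7) = 20, i.e. one million hash slots, roughly one long-distance hash entry per 128 bytes of window.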
@@ -41,12 +50,19 @@ void ZSTD_ldm_adjustParameters(ldmParams_t* params, U32 windowLog)
     params->bucketSizeLog = MIN(params->bucketSizeLog, params->hashLog);
 }

-size_t ZSTD_ldm_getTableSize(U32 hashLog, U32 bucketSizeLog) {
-    size_t const ldmHSize = ((size_t)1) << hashLog;
-    size_t const ldmBucketSizeLog = MIN(bucketSizeLog, hashLog);
+size_t ZSTD_ldm_getTableSize(ldmParams_t params)
+{
+    size_t const ldmHSize = ((size_t)1) << params.hashLog;
+    size_t const ldmBucketSizeLog = MIN(params.bucketSizeLog, params.hashLog);
     size_t const ldmBucketSize =
-        ((size_t)1) << (hashLog - ldmBucketSizeLog);
-    return ldmBucketSize + (ldmHSize * (sizeof(ldmEntry_t)));
+        ((size_t)1) << (params.hashLog - ldmBucketSizeLog);
+    size_t const totalSize = ldmBucketSize + ldmHSize * sizeof(ldmEntry_t);
+    return params.enableLdm ? totalSize : 0;
+}
+
+size_t ZSTD_ldm_getMaxNbSeq(ldmParams_t params, size_t maxChunkSize)
+{
+    return params.enableLdm ? (maxChunkSize / params.minMatchLength) : 0;
 }

 /** ZSTD_ldm_getSmallHash() :
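Note: the new signatures are easiest to sanity-check with concrete parameters. A worked example, assuming params.hashLog = 20, params.bucketSizeLog = 3, sizeof(ldmEntry_t) = 8 (an assumption: a U32 offset plus a U32 checksum), minMatchLength = 64, and enableLdm = 1:

    /* ZSTD_ldm_getTableSize():
     *   ldmHSize      = 1 << 20            = 1,048,576 entries
     *   ldmBucketSize = 1 << (20 - 3)      =   131,072 bucket-offset bytes
     *   totalSize     = 131,072 + 1,048,576 * 8  (a little over 8 MB)
     * ZSTD_ldm_getMaxNbSeq(), for a 128 KB chunk:
     *   131072 / 64 = 2048 rawSeq slots reserved in cctx->ldmSequences. */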
@@ -215,12 +231,12 @@ static size_t ZSTD_ldm_fillFastTables(ZSTD_matchState_t* ms,
     {
     case ZSTD_fast:
         ZSTD_fillHashTable(ms, cParams, iend);
-        ms->nextToUpdate = (U32)(iend - ms->base);
+        ms->nextToUpdate = (U32)(iend - ms->window.base);
         break;

     case ZSTD_dfast:
         ZSTD_fillDoubleHashTable(ms, cParams, iend);
-        ms->nextToUpdate = (U32)(iend - ms->base);
+        ms->nextToUpdate = (U32)(iend - ms->window.base);
         break;

     case ZSTD_greedy:
@@ -271,57 +287,61 @@ static U64 ZSTD_ldm_fillLdmHashTable(ldmState_t* state,
  * (after a long match, only update tables a limited amount). */
 static void ZSTD_ldm_limitTableUpdate(ZSTD_matchState_t* ms, const BYTE* anchor)
 {
-    U32 const current = (U32)(anchor - ms->base);
+    U32 const current = (U32)(anchor - ms->window.base);
     if (current > ms->nextToUpdate + 1024) {
         ms->nextToUpdate =
             current - MIN(512, current - ms->nextToUpdate - 1024);
     }
 }

-size_t ZSTD_compressBlock_ldm(
-        ldmState_t* ldmState, ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM],
-        ZSTD_CCtx_params const* params, void const* src, size_t srcSize)
+static size_t ZSTD_ldm_generateSequences_internal(
+        ldmState_t* ldmState, rawSeq* sequences,
+        ldmParams_t const* params, void const* src, size_t srcSize,
+        int const extDict)
 {
-    ZSTD_compressionParameters const* cParams = &params->cParams;
-    const ldmParams_t ldmParams = params->ldmParams;
-    const U64 hashPower = ldmState->hashPower;
-    const U32 hBits = ldmParams.hashLog - ldmParams.bucketSizeLog;
-    const U32 ldmBucketSize = ((U32)1 << ldmParams.bucketSizeLog);
-    const U32 ldmTagMask = ((U32)1 << ldmParams.hashEveryLog) - 1;
-    const BYTE* const base = ms->base;
-    const BYTE* const istart = (const BYTE*)src;
-    const BYTE* ip = istart;
-    const BYTE* anchor = istart;
-    const U32 lowestIndex = ms->dictLimit;
-    const BYTE* const lowest = base + lowestIndex;
-    const BYTE* const iend = istart + srcSize;
-    const BYTE* const ilimit = iend - MAX(ldmParams.minMatchLength, HASH_READ_SIZE);
-    const ZSTD_blockCompressor blockCompressor =
-        ZSTD_selectBlockCompressor(cParams->strategy, 0);
+    rawSeq const* const sequencesStart = sequences;
+    /* LDM parameters */
+    U32 const minMatchLength = params->minMatchLength;
+    U64 const hashPower = ldmState->hashPower;
+    U32 const hBits = params->hashLog - params->bucketSizeLog;
+    U32 const ldmBucketSize = 1U << params->bucketSizeLog;
+    U32 const hashEveryLog = params->hashEveryLog;
+    U32 const ldmTagMask = (1U << params->hashEveryLog) - 1;
+    /* Prefix and extDict parameters */
+    U32 const dictLimit = ldmState->window.dictLimit;
+    U32 const lowestIndex = extDict ? ldmState->window.lowLimit : dictLimit;
+    BYTE const* const base = ldmState->window.base;
+    BYTE const* const dictBase = extDict ? ldmState->window.dictBase : NULL;
+    BYTE const* const dictStart = extDict ? dictBase + lowestIndex : NULL;
+    BYTE const* const dictEnd = extDict ? dictBase + dictLimit : NULL;
+    BYTE const* const lowPrefixPtr = base + dictLimit;
+    /* Input bounds */
+    BYTE const* const istart = (BYTE const*)src;
+    BYTE const* const iend = istart + srcSize;
+    BYTE const* const ilimit = iend - MAX(minMatchLength, HASH_READ_SIZE);
+    /* Input positions */
+    BYTE const* anchor = istart;
+    BYTE const* ip = istart;
+    /* Rolling hash */
+    BYTE const* lastHashed = NULL;
     U64 rollingHash = 0;
-    const BYTE* lastHashed = NULL;
-    size_t i, lastLiterals;

-    /* Main Search Loop */
-    while (ip < ilimit) {   /* < instead of <=, because repcode check at (ip+1) */
+    while (ip <= ilimit) {
         size_t mLength;
         U32 const current = (U32)(ip - base);
         size_t forwardMatchLength = 0, backwardMatchLength = 0;
         ldmEntry_t* bestEntry = NULL;
         if (ip != istart) {
             rollingHash = ZSTD_ldm_updateHash(rollingHash, lastHashed[0],
-                                              lastHashed[ldmParams.minMatchLength],
+                                              lastHashed[minMatchLength],
                                               hashPower);
         } else {
-            rollingHash = ZSTD_ldm_getRollingHash(ip, ldmParams.minMatchLength);
+            rollingHash = ZSTD_ldm_getRollingHash(ip, minMatchLength);
         }
         lastHashed = ip;

         /* Do not insert and do not look for a match */
-        if (ZSTD_ldm_getTag(rollingHash, hBits, ldmParams.hashEveryLog) !=
-                ldmTagMask) {
+        if (ZSTD_ldm_getTag(rollingHash, hBits, hashEveryLog) != ldmTagMask) {
             ip++;
             continue;
         }
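The gate above rides on a Rabin-Karp style rolling hash: hashPower caches prime^(minMatchLength-1) (see ZSTD_ldm_getHashPower in the header diff below), so sliding the window one byte costs a subtract, a multiply, and an add. A standalone sketch of that update rule; the prime constant here is illustrative, not zstd's prime8bytes:

    #include <stdint.h>

    #define PRIME 0x9E3779B185EBCA87ULL   /* illustrative, not prime8bytes */

    /* hashPower = PRIME^(n-1), precomputed once for window length n. */
    static uint64_t hashPowerFor(unsigned n)
    {
        uint64_t p = 1;
        while (--n) p *= PRIME;
        return p;
    }

    /* Rolling update: drop `out` (oldest byte), admit `in` (newest byte).
     * H' = (H - out*PRIME^(n-1)) * PRIME + in, all modulo 2^64. */
    static uint64_t updateHash(uint64_t h, uint8_t out, uint8_t in,
                               uint64_t hashPower)
    {
        return (h - out * hashPower) * PRIME + in;
    }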
@@ -331,27 +351,49 @@ size_t ZSTD_compressBlock_ldm(
         ldmEntry_t* const bucket =
             ZSTD_ldm_getBucket(ldmState,
                                ZSTD_ldm_getSmallHash(rollingHash, hBits),
-                               ldmParams);
+                               *params);
         ldmEntry_t* cur;
         size_t bestMatchLength = 0;
         U32 const checksum = ZSTD_ldm_getChecksum(rollingHash, hBits);

         for (cur = bucket; cur < bucket + ldmBucketSize; ++cur) {
-            const BYTE* const pMatch = cur->offset + base;
             size_t curForwardMatchLength, curBackwardMatchLength,
                    curTotalMatchLength;
             if (cur->checksum != checksum || cur->offset <= lowestIndex) {
                 continue;
             }
-            curForwardMatchLength = ZSTD_count(ip, pMatch, iend);
-            if (curForwardMatchLength < ldmParams.minMatchLength) {
-                continue;
+            if (extDict) {
+                BYTE const* const curMatchBase =
+                    cur->offset < dictLimit ? dictBase : base;
+                BYTE const* const pMatch = curMatchBase + cur->offset;
+                BYTE const* const matchEnd =
+                    cur->offset < dictLimit ? dictEnd : iend;
+                BYTE const* const lowMatchPtr =
+                    cur->offset < dictLimit ? dictStart : lowPrefixPtr;
+
+                curForwardMatchLength = ZSTD_count_2segments(
+                                            ip, pMatch, iend,
+                                            matchEnd, lowPrefixPtr);
+                if (curForwardMatchLength < minMatchLength) {
+                    continue;
+                }
+                curBackwardMatchLength =
+                    ZSTD_ldm_countBackwardsMatch(ip, anchor, pMatch,
+                                                 lowMatchPtr);
+                curTotalMatchLength = curForwardMatchLength +
+                                      curBackwardMatchLength;
+            } else { /* !extDict */
+                BYTE const* const pMatch = base + cur->offset;
+                curForwardMatchLength = ZSTD_count(ip, pMatch, iend);
+                if (curForwardMatchLength < minMatchLength) {
+                    continue;
+                }
+                curBackwardMatchLength =
+                    ZSTD_ldm_countBackwardsMatch(ip, anchor, pMatch,
+                                                 lowPrefixPtr);
+                curTotalMatchLength = curForwardMatchLength +
+                                      curBackwardMatchLength;
             }
-            curBackwardMatchLength = ZSTD_ldm_countBackwardsMatch(
-                                         ip, anchor, pMatch, lowest);
-            curTotalMatchLength = curForwardMatchLength +
-                                  curBackwardMatchLength;

             if (curTotalMatchLength > bestMatchLength) {
                 bestMatchLength = curTotalMatchLength;
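The extDict path above uses ZSTD_count_2segments() to measure a match that may begin in the dictionary segment and continue into the current prefix. A minimal model of that two-segment walk (names and code here are a hedged sketch, not the zstd implementation):

    #include <stddef.h>

    typedef unsigned char BYTE;

    /* Count matching bytes between ip and match, where match lives in a
     * segment ending at mEnd; if the match runs off that segment, continue
     * comparing against the start of the current prefix (lowPrefixPtr). */
    static size_t count2segments(const BYTE* ip, const BYTE* match,
                                 const BYTE* iEnd, const BYTE* mEnd,
                                 const BYTE* lowPrefixPtr)
    {
        size_t n = 0;
        /* first segment: dictionary side */
        while (match + n < mEnd && ip + n < iEnd && ip[n] == match[n]) n++;
        if (match + n < mEnd) return n;   /* mismatch inside the dict segment */
        /* second segment: wrap to the prefix and keep counting */
        {   const BYTE* p = lowPrefixPtr;
            while (ip + n < iEnd && ip[n] == *p) { n++; p++; }
        }
        return n;
    }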
@@ -366,7 +408,7 @@ size_t ZSTD_compressBlock_ldm(
         if (bestEntry == NULL) {
             ZSTD_ldm_makeEntryAndInsertByTag(ldmState, rollingHash,
                                              hBits, current,
-                                             ldmParams);
+                                             *params);
             ip++;
             continue;
         }
@@ -375,280 +417,203 @@ size_t ZSTD_compressBlock_ldm(
         mLength = forwardMatchLength + backwardMatchLength;
         ip -= backwardMatchLength;

-        /* Call the block compressor on the remaining literals */
         {
+            /* Store the sequence:
+             * ip = current - backwardMatchLength
+             * The match is at (bestEntry->offset - backwardMatchLength)
+             */
             U32 const matchIndex = bestEntry->offset;
-            const BYTE* const match = base + matchIndex - backwardMatchLength;
-            U32 const offset = (U32)(ip - match);
+            U32 const offset = current - matchIndex;

-            /* Fill tables for block compressor */
-            ZSTD_ldm_limitTableUpdate(ms, anchor);
-            ZSTD_ldm_fillFastTables(ms, cParams, anchor);
-
-            /* Call block compressor and get remaining literals */
-            lastLiterals = blockCompressor(ms, seqStore, rep, cParams, anchor, ip - anchor);
-            ms->nextToUpdate = (U32)(ip - base);
-
-            /* Update repToConfirm with the new offset */
-            for (i = ZSTD_REP_NUM - 1; i > 0; i--)
-                rep[i] = rep[i-1];
-            rep[0] = offset;
-
-            /* Store the sequence with the leftover literals */
-            ZSTD_storeSeq(seqStore, lastLiterals, ip - lastLiterals,
-                          offset + ZSTD_REP_MOVE, mLength - MINMATCH);
+            sequences->litLength = (U32)(ip - anchor);
+            sequences->matchLength = (U32)mLength;
+            sequences->offset = offset;
+            ++sequences;
         }

         /* Insert the current entry into the hash table */
         ZSTD_ldm_makeEntryAndInsertByTag(ldmState, rollingHash, hBits,
                                          (U32)(lastHashed - base),
-                                         ldmParams);
+                                         *params);

         assert(ip + backwardMatchLength == lastHashed);

         /* Fill the hash table from lastHashed+1 to ip+mLength*/
         /* Heuristic: don't need to fill the entire table at end of block */
-        if (ip + mLength < ilimit) {
+        if (ip + mLength <= ilimit) {
             rollingHash = ZSTD_ldm_fillLdmHashTable(
                               ldmState, rollingHash, lastHashed,
-                              ip + mLength, base, hBits, ldmParams);
+                              ip + mLength, base, hBits, *params);
             lastHashed = ip + mLength - 1;
         }
         ip += mLength;
         anchor = ip;
-        /* Check immediate repcode */
-        while ( (ip < ilimit)
-             && ( (rep[1] > 0) && (rep[1] <= (U32)(ip-lowest))
-             && (MEM_read32(ip) == MEM_read32(ip - rep[1])) )) {
-
-            size_t const rLength = ZSTD_count(ip+4, ip+4-rep[1],
-                                              iend) + 4;
-            /* Swap repToConfirm[1] <=> repToConfirm[0] */
-            {
-                U32 const tmpOff = rep[1];
-                rep[1] = rep[0];
-                rep[0] = tmpOff;
-            }
-
-            ZSTD_storeSeq(seqStore, 0, anchor, 0, rLength-MINMATCH);
-
-            /* Fill the hash table from lastHashed+1 to ip+rLength*/
-            if (ip + rLength < ilimit) {
-                rollingHash = ZSTD_ldm_fillLdmHashTable(
-                                  ldmState, rollingHash, lastHashed,
-                                  ip + rLength, base, hBits, ldmParams);
-                lastHashed = ip + rLength - 1;
-            }
-            ip += rLength;
-            anchor = ip;
-        }
     }

-    ZSTD_ldm_limitTableUpdate(ms, anchor);
-    ZSTD_ldm_fillFastTables(ms, cParams, anchor);
-
-    lastLiterals = blockCompressor(ms, seqStore, rep, cParams, anchor, iend - anchor);
-    ms->nextToUpdate = (U32)(iend - base);
-
-    /* Return the last literals size */
-    return lastLiterals;
+    /* Return the number of sequences generated */
+    return sequences - sequencesStart;
 }

-size_t ZSTD_compressBlock_ldm_extDict(
-        ldmState_t* ldmState, ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM],
-        ZSTD_CCtx_params const* params, void const* src, size_t srcSize)
+/*! ZSTD_ldm_reduceTable() :
+ *  reduce table indexes by `reducerValue` */
+static void ZSTD_ldm_reduceTable(ldmEntry_t* const table, U32 const size,
+                                 U32 const reducerValue)
 {
-    const ldmParams_t ldmParams = params->ldmParams;
-    ZSTD_compressionParameters const* cParams = &params->cParams;
-    const U64 hashPower = ldmState->hashPower;
-    const U32 hBits = ldmParams.hashLog - ldmParams.bucketSizeLog;
-    const U32 ldmBucketSize = ((U32)1 << ldmParams.bucketSizeLog);
-    const U32 ldmTagMask = ((U32)1 << ldmParams.hashEveryLog) - 1;
-    const BYTE* const base = ms->base;
-    const BYTE* const dictBase = ms->dictBase;
-    const BYTE* const istart = (const BYTE*)src;
-    const BYTE* ip = istart;
-    const BYTE* anchor = istart;
-    const U32 lowestIndex = ms->lowLimit;
-    const BYTE* const dictStart = dictBase + lowestIndex;
-    const U32 dictLimit = ms->dictLimit;
-    const BYTE* const lowPrefixPtr = base + dictLimit;
-    const BYTE* const dictEnd = dictBase + dictLimit;
-    const BYTE* const iend = istart + srcSize;
-    const BYTE* const ilimit = iend - MAX(ldmParams.minMatchLength, HASH_READ_SIZE);
-
-    const ZSTD_blockCompressor blockCompressor =
-        ZSTD_selectBlockCompressor(cParams->strategy, 1);
-    U64 rollingHash = 0;
-    const BYTE* lastHashed = NULL;
-    size_t i, lastLiterals;
+    U32 u;
+    for (u = 0; u < size; u++) {
+        if (table[u].offset < reducerValue) table[u].offset = 0;
+        else table[u].offset -= reducerValue;
+    }
+}
+
+size_t ZSTD_ldm_generateSequences(
+        ldmState_t* ldmState, rawSeq* sequences,
+        ldmParams_t const* params, void const* src, size_t srcSize,
+        int const extDict)
+{
+    U32 const maxDist = 1U << params->windowLog;
+    BYTE const* const istart = (BYTE const*)src;
+    size_t const kMaxChunkSize = 1 << 20;
+    size_t const nbChunks = (srcSize / kMaxChunkSize) + ((srcSize % kMaxChunkSize) != 0);
+    size_t nbSeq = 0;
+    size_t chunk;

-    /* Search Loop */
-    while (ip < ilimit) {  /* < instead of <=, because (ip+1) */
-        size_t mLength;
-        const U32 current = (U32)(ip-base);
-        size_t forwardMatchLength = 0, backwardMatchLength = 0;
-        ldmEntry_t* bestEntry = NULL;
-        if (ip != istart) {
-            rollingHash = ZSTD_ldm_updateHash(rollingHash, lastHashed[0],
-                                              lastHashed[ldmParams.minMatchLength],
-                                              hashPower);
+    assert(ZSTD_CHUNKSIZE_MAX >= kMaxChunkSize);
+    /* Check that ZSTD_window_update() has been called for this chunk prior
+     * to passing it to this function.
+     */
+    assert(ldmState->window.nextSrc >= (BYTE const*)src + srcSize);
+    for (chunk = 0; chunk < nbChunks; ++chunk) {
+        size_t const chunkStart = chunk * kMaxChunkSize;
+        size_t const chunkEnd = MIN(chunkStart + kMaxChunkSize, srcSize);
+        size_t const chunkSize = chunkEnd - chunkStart;
+        assert(chunkStart < srcSize);
+        if (ZSTD_window_needOverflowCorrection(ldmState->window)) {
+            U32 const ldmHSize = 1U << params->hashLog;
+            U32 const correction = ZSTD_window_correctOverflow(
+                &ldmState->window, /* cycleLog */ 0, maxDist, src);
+            ZSTD_ldm_reduceTable(ldmState->hashTable, ldmHSize, correction);
+        }
+        /* kMaxChunkSize should be small enough that we don't lose too much of
+         * the window through early invalidation.
+         * TODO: * Test the chunk size.
+         *       * Try invalidation after the sequence generation and test
+         *         the offset against maxDist directly.
+         */
+        ZSTD_window_enforceMaxDist(&ldmState->window, istart + chunkEnd,
+                                   maxDist);
+        nbSeq += ZSTD_ldm_generateSequences_internal(
+            ldmState, sequences + nbSeq, params, istart + chunkStart, chunkSize,
+            extDict);
+    }
+    return nbSeq;
+}

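To make the chunking concrete: nbChunks is a ceiling division, so a one-byte tail still costs a chunk. A standalone check of that arithmetic, using the same 1 MiB kMaxChunkSize as above:

    #include <assert.h>
    #include <stddef.h>

    /* Ceiling division used by the chunk loop above: every chunk is at most
     * kMaxChunkSize bytes, and a partial tail chunk still counts. */
    static size_t numChunks(size_t srcSize, size_t kMaxChunkSize)
    {
        return (srcSize / kMaxChunkSize) + ((srcSize % kMaxChunkSize) != 0);
    }

    int main(void)
    {
        size_t const oneMiB = (size_t)1 << 20;
        assert(numChunks(5 * oneMiB, oneMiB) == 5);      /* exact multiple */
        assert(numChunks(5 * oneMiB + 1, oneMiB) == 6);  /* one-byte tail  */
        assert(numChunks(1, oneMiB) == 1);               /* tiny input     */
        return 0;
    }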
+#if 0
+/**
+ * If the sequence length is longer than remaining then the sequence is split
+ * between this block and the next.
+ *
+ * Returns the current sequence to handle, or if the rest of the block should
+ * be literals, it returns a sequence with offset == 0.
+ */
+static rawSeq maybeSplitSequence(rawSeq* sequences, size_t* nbSeq,
+                                 size_t const seq, size_t const remaining,
+                                 U32 const minMatch)
+{
+    rawSeq sequence = sequences[seq];
+    assert(sequence.offset > 0);
+    /* Handle partial sequences */
+    if (remaining <= sequence.litLength) {
+        /* Split the literals that we have out of the sequence.
+         * They will become the last literals of this block.
+         * The next block starts off with the remaining literals.
+         */
+        sequences[seq].litLength -= remaining;
+        *nbSeq = seq;
+        sequence.offset = 0;
+    } else if (remaining < sequence.litLength + sequence.matchLength) {
+        /* Split the match up into two sequences. One in this block, and one
+         * in the next with no literals. If either match would be shorter
+         * than searchLength we omit it.
+         */
+        U32 const matchPrefix = remaining - sequence.litLength;
+        U32 const matchSuffix = sequence.matchLength - matchPrefix;
+
+        assert(remaining > sequence.litLength);
+        assert(matchPrefix < sequence.matchLength);
+        assert(matchPrefix + matchSuffix == sequence.matchLength);
+        /* Update the current sequence */
+        sequence.matchLength = matchPrefix;
+        /* Update the next sequence when long enough, otherwise omit it. */
+        if (matchSuffix >= minMatch) {
+            sequences[seq].litLength = 0;
+            sequences[seq].matchLength = matchSuffix;
+            *nbSeq = seq;
         } else {
-            rollingHash = ZSTD_ldm_getRollingHash(ip, ldmParams.minMatchLength);
+            sequences[seq + 1].litLength += matchSuffix;
+            *nbSeq = seq + 1;
         }
-        lastHashed = ip;
-
-        if (ZSTD_ldm_getTag(rollingHash, hBits, ldmParams.hashEveryLog) !=
-                ldmTagMask) {
-            /* Don't insert and don't look for a match */
-            ip++;
-            continue;
+        if (sequence.matchLength < minMatch) {
+            /* Skip the current sequence if it is too short */
+            sequence.offset = 0;
         }
+    }
+    return sequence;
+}
+#endif

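The splitting arithmetic in maybeSplitSequence() checks out by hand: with remaining = 100 bytes left in the block and a sequence of litLength = 40, matchLength = 120, matchPrefix = 100 - 40 = 60 stays in this block and matchSuffix = 120 - 60 = 60 carries into the next block with no literals. A minimal standalone verification of that arithmetic (like the #if 0 sketch above, not compiled into the library):

    #include <assert.h>

    typedef unsigned U32;

    int main(void)
    {
        U32 const remaining = 100, litLength = 40, matchLength = 120;
        U32 const matchPrefix = remaining - litLength;       /* 60 in this block */
        U32 const matchSuffix = matchLength - matchPrefix;   /* 60 in the next   */
        assert(litLength < remaining);
        assert(remaining < litLength + matchLength);         /* split case        */
        assert(matchPrefix + matchSuffix == matchLength);
        assert(litLength + matchPrefix == remaining);        /* block exactly full */
        return 0;
    }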
-        /* Get the best entry and compute the match lengths */
+size_t ZSTD_ldm_blockCompress(rawSeq const* sequences, size_t nbSeq,
+        ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM],
+        ZSTD_compressionParameters const* cParams, void const* src, size_t srcSize,
+        int const extDict)
+{
+    ZSTD_blockCompressor const blockCompressor =
+        ZSTD_selectBlockCompressor(cParams->strategy, extDict);
+    BYTE const* const base = ms->window.base;
+    /* Input bounds */
+    BYTE const* const istart = (BYTE const*)src;
+    BYTE const* const iend = istart + srcSize;
+    /* Input positions */
+    BYTE const* ip = istart;
+    size_t seq;
+    /* Loop through each sequence and apply the block compressor to the lits */
+    for (seq = 0; seq < nbSeq; ++seq) {
+        rawSeq const sequence = sequences[seq];
+        int i;
+
+        if (sequence.offset == 0)
+            break;
+
+        assert(ip + sequence.litLength + sequence.matchLength <= iend);
+
+        /* Fill tables for block compressor */
+        ZSTD_ldm_limitTableUpdate(ms, ip);
+        ZSTD_ldm_fillFastTables(ms, cParams, ip);
+        /* Run the block compressor */
         {
-            ldmEntry_t* const bucket =
-                ZSTD_ldm_getBucket(ldmState,
-                                   ZSTD_ldm_getSmallHash(rollingHash, hBits),
-                                   ldmParams);
-            ldmEntry_t* cur;
-            size_t bestMatchLength = 0;
-            U32 const checksum = ZSTD_ldm_getChecksum(rollingHash, hBits);
-
-            for (cur = bucket; cur < bucket + ldmBucketSize; ++cur) {
-                const BYTE* const curMatchBase =
-                    cur->offset < dictLimit ? dictBase : base;
-                const BYTE* const pMatch = curMatchBase + cur->offset;
-                const BYTE* const matchEnd =
-                    cur->offset < dictLimit ? dictEnd : iend;
-                const BYTE* const lowMatchPtr =
-                    cur->offset < dictLimit ? dictStart : lowPrefixPtr;
-                size_t curForwardMatchLength, curBackwardMatchLength,
-                       curTotalMatchLength;
-
-                if (cur->checksum != checksum || cur->offset <= lowestIndex) {
-                    continue;
-                }
-
-                curForwardMatchLength = ZSTD_count_2segments(
-                                            ip, pMatch, iend,
-                                            matchEnd, lowPrefixPtr);
-                if (curForwardMatchLength < ldmParams.minMatchLength) {
-                    continue;
-                }
-                curBackwardMatchLength = ZSTD_ldm_countBackwardsMatch(
-                                             ip, anchor, pMatch, lowMatchPtr);
-                curTotalMatchLength = curForwardMatchLength +
-                                      curBackwardMatchLength;
-
-                if (curTotalMatchLength > bestMatchLength) {
-                    bestMatchLength = curTotalMatchLength;
-                    forwardMatchLength = curForwardMatchLength;
-                    backwardMatchLength = curBackwardMatchLength;
-                    bestEntry = cur;
-                }
-            }
-        }
-
-        /* No match found -- continue searching */
-        if (bestEntry == NULL) {
-            ZSTD_ldm_makeEntryAndInsertByTag(ldmState, rollingHash, hBits,
-                                             (U32)(lastHashed - base),
-                                             ldmParams);
-            ip++;
-            continue;
-        }
-
-        /* Match found */
-        mLength = forwardMatchLength + backwardMatchLength;
-        ip -= backwardMatchLength;
-
-        /* Call the block compressor on the remaining literals */
-        {
-            /* ip = current - backwardMatchLength
-             * The match is at (bestEntry->offset - backwardMatchLength) */
-            U32 const matchIndex = bestEntry->offset;
-            U32 const offset = current - matchIndex;
-
-            /* Fill the hash table for the block compressor */
-            ZSTD_ldm_limitTableUpdate(ms, anchor);
-            ZSTD_ldm_fillFastTables(ms, cParams, anchor);
-
-            /* Call block compressor and get remaining literals */
-            lastLiterals = blockCompressor(ms, seqStore, rep, cParams, anchor, ip - anchor);
+            size_t const newLitLength =
+                blockCompressor(ms, seqStore, rep, cParams, ip,
+                                sequence.litLength);
+            ip += sequence.litLength;
             ms->nextToUpdate = (U32)(ip - base);
-            /* Update repToConfirm with the new offset */
+            /* Update the repcodes */
             for (i = ZSTD_REP_NUM - 1; i > 0; i--)
                 rep[i] = rep[i-1];
-            rep[0] = offset;
-            /* Store the sequence with the leftover literals */
-            ZSTD_storeSeq(seqStore, lastLiterals, ip - lastLiterals,
-                          offset + ZSTD_REP_MOVE, mLength - MINMATCH);
+            rep[0] = sequence.offset;
+            /* Store the sequence */
+            ZSTD_storeSeq(seqStore, newLitLength, ip - newLitLength,
+                          sequence.offset + ZSTD_REP_MOVE,
+                          sequence.matchLength - MINMATCH);
+            ip += sequence.matchLength;
         }

-        /* Insert the current entry into the hash table */
-        ZSTD_ldm_makeEntryAndInsertByTag(ldmState, rollingHash, hBits,
-                                         (U32)(lastHashed - base),
-                                         ldmParams);
-
-        /* Fill the hash table from lastHashed+1 to ip+mLength */
-        assert(ip + backwardMatchLength == lastHashed);
-        if (ip + mLength < ilimit) {
-            rollingHash = ZSTD_ldm_fillLdmHashTable(
-                              ldmState, rollingHash, lastHashed,
-                              ip + mLength, base, hBits,
-                              ldmParams);
-            lastHashed = ip + mLength - 1;
-        }
-        ip += mLength;
-        anchor = ip;
-
-        /* check immediate repcode */
-        while (ip < ilimit) {
-            U32 const current2 = (U32)(ip-base);
-            U32 const repIndex2 = current2 - rep[1];
-            const BYTE* repMatch2 = repIndex2 < dictLimit ?
-                                    dictBase + repIndex2 : base + repIndex2;
-            if ( (((U32)((dictLimit-1) - repIndex2) >= 3) &
-                        (repIndex2 > lowestIndex))  /* intentional overflow */
-               && (MEM_read32(repMatch2) == MEM_read32(ip)) ) {
-                const BYTE* const repEnd2 = repIndex2 < dictLimit ?
-                                            dictEnd : iend;
-                size_t const repLength2 =
-                    ZSTD_count_2segments(ip+4, repMatch2+4, iend,
-                                         repEnd2, lowPrefixPtr) + 4;
-
-                U32 tmpOffset = rep[1];
-                rep[1] = rep[0];
-                rep[0] = tmpOffset;
-
-                ZSTD_storeSeq(seqStore, 0, anchor, 0, repLength2-MINMATCH);
-
-                /* Fill the hash table from lastHashed+1 to ip+repLength2*/
-                if (ip + repLength2 < ilimit) {
-                    rollingHash = ZSTD_ldm_fillLdmHashTable(
-                                      ldmState, rollingHash, lastHashed,
-                                      ip + repLength2, base, hBits,
-                                      ldmParams);
-                    lastHashed = ip + repLength2 - 1;
-                }
-                ip += repLength2;
-                anchor = ip;
-                continue;
-            }
-            break;
-        }
         }
     }

-    ZSTD_ldm_limitTableUpdate(ms, anchor);
-    ZSTD_ldm_fillFastTables(ms, cParams, anchor);
-    /* Call the block compressor one last time on the last literals */
-    lastLiterals = blockCompressor(ms, seqStore, rep, cParams, anchor, iend - anchor);
-    ms->nextToUpdate = (U32)(iend - base);
-    /* Return the last literals size */
-    return lastLiterals;
+    ZSTD_ldm_limitTableUpdate(ms, ip);
+    ZSTD_ldm_fillFastTables(ms, cParams, ip);
+    {   size_t const lastLiterals = blockCompressor(ms, seqStore, rep, cParams,
+                                                    ip, iend - ip);
+        ms->nextToUpdate = (U32)(iend - base);
+        return lastLiterals;
+    }
 }
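Stepping back, this refactor splits LDM into two phases: ZSTD_ldm_generateSequences() emits raw (litLength, matchLength, offset) triples, and ZSTD_ldm_blockCompress() replays them, handing each literal run to the secondary block compressor. A standalone model of that contract; the rawSeq field names come from this diff, but the replay driver is hypothetical:

    #include <stdio.h>

    typedef unsigned U32;

    /* Field names mirror the rawSeq usage in this diff. */
    typedef struct {
        U32 offset;       /* match offset; 0 terminates the sequence list */
        U32 litLength;    /* literals preceding the match                 */
        U32 matchLength;  /* long-distance match length                   */
    } rawSeq;

    /* Hypothetical replay loop: literals go to a secondary compressor,
     * matches are emitted directly. */
    static void replay(const rawSeq* seqs, size_t nbSeq)
    {
        size_t pos = 0, i;
        for (i = 0; i < nbSeq && seqs[i].offset != 0; ++i) {
            printf("lits  [%zu, %zu) -> secondary compressor\n",
                   pos, pos + seqs[i].litLength);
            pos += seqs[i].litLength;
            printf("match len=%u off=%u at %zu\n",
                   seqs[i].matchLength, seqs[i].offset, pos);
            pos += seqs[i].matchLength;
        }
        printf("tail literals from %zu -> secondary compressor\n", pos);
    }

    int main(void)
    {
        rawSeq const seqs[] = { {100000u, 40u, 512u}, {0u, 0u, 0u} };
        replay(seqs, 2);
        return 0;
    }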
@@ -24,34 +24,59 @@ extern "C" {
 #define ZSTD_LDM_DEFAULT_WINDOW_LOG ZSTD_WINDOWLOG_DEFAULTMAX
 #define ZSTD_LDM_HASHEVERYLOG_NOTSET 9999

-/** ZSTD_compressBlock_ldm_generic() :
+/**
+ * ZSTD_ldm_generateSequences():
  *
- *  This is a block compressor intended for long distance matching.
+ * Generates the sequences using the long distance match finder.
+ * The sequences completely parse a prefix of the source, but leave off the last
+ * literals. Returns the number of sequences generated into `sequences`. The
+ * user must have called ZSTD_window_update() for all of the input they have,
+ * even if they pass it to ZSTD_ldm_generateSequences() in chunks.
  *
- *  The function searches for matches of length at least
- *  ldmParams.minMatchLength using a hash table in cctx->ldmState.
- *  Matches can be at a distance of up to cParams.windowLog.
- *
- *  Upon finding a match, the unmatched literals are compressed using a
- *  ZSTD_blockCompressor (depending on the strategy in the compression
- *  parameters), which stores the matched sequences. The "long distance"
- *  match is then stored with the remaining literals from the
- *  ZSTD_blockCompressor. */
-size_t ZSTD_compressBlock_ldm(
-        ldmState_t* ldms, ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM],
-        ZSTD_CCtx_params const* params, void const* src, size_t srcSize);
+ * NOTE: The source may be any size, assuming it doesn't overflow the hash table
+ * indices, and the output sequences table is large enough.
+ */
+size_t ZSTD_ldm_generateSequences(
+        ldmState_t* ldms, rawSeq* sequences,
+        ldmParams_t const* params, void const* src, size_t srcSize,
+        int const extDict);

-size_t ZSTD_compressBlock_ldm_extDict(
-        ldmState_t* ldms, ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM],
-        ZSTD_CCtx_params const* params, void const* src, size_t srcSize);
+/**
+ * ZSTD_ldm_blockCompress():
+ *
+ * Compresses a block using the predefined sequences, along with a secondary
+ * block compressor. The literals section of every sequence is passed to the
+ * secondary block compressor, and those sequences are interspersed with the
+ * predefined sequences. Returns the length of the last literals.
+ * `nbSeq` is the number of sequences available in `sequences`.
+ *
+ * NOTE: The source must be at most the maximum block size, but the predefined
+ * sequences can be any size, and may be longer than the block. In the case that
+ * they are longer than the block, the last sequences may need to be split into
+ * two. We handle that case correctly, and update `sequences` and `nbSeq`
+ * appropriately.
+ */
+size_t ZSTD_ldm_blockCompress(rawSeq const* sequences, size_t nbSeq,
+        ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM],
+        ZSTD_compressionParameters const* cParams, void const* src, size_t srcSize,
+        int const extDict);

 /** ZSTD_ldm_initializeParameters() :
  *  Initialize the long distance matching parameters to their default values. */
 size_t ZSTD_ldm_initializeParameters(ldmParams_t* params, U32 enableLdm);

 /** ZSTD_ldm_getTableSize() :
- *  Estimate the space needed for long distance matching tables. */
-size_t ZSTD_ldm_getTableSize(U32 hashLog, U32 bucketSizeLog);
+ *  Estimate the space needed for long distance matching tables, or 0 if LDM is
+ *  disabled.
+ */
+size_t ZSTD_ldm_getTableSize(ldmParams_t params);
+
+/** ZSTD_ldm_getMaxNbSeq() :
+ *  Return an upper bound on the number of sequences that can be produced by
+ *  the long distance matcher, or 0 if LDM is disabled.
+ */
+size_t ZSTD_ldm_getMaxNbSeq(ldmParams_t params, size_t maxChunkSize);

-/** ZSTD_ldm_getTableSize() :
+/** ZSTD_ldm_getHashPower() :
  *  Return prime8bytes^(minMatchLength-1) */
@@ -62,8 +87,12 @@ U64 ZSTD_ldm_getHashPower(U32 minMatchLength);
  *  windowLog and params->hashLog.
  *
  *  Ensures that params->bucketSizeLog is <= params->hashLog (setting it to
- *  params->hashLog if it is not). */
-void ZSTD_ldm_adjustParameters(ldmParams_t* params, U32 windowLog);
+ *  params->hashLog if it is not).
+ *
+ *  Ensures that minMatchLength >= targetLength during optimal parsing.
+ */
+void ZSTD_ldm_adjustParameters(ldmParams_t* params,
+                               ZSTD_compressionParameters const* cParams);

 #if defined (__cplusplus)
 }
@@ -247,7 +247,7 @@ static U32 ZSTD_insertAndFindFirstIndexHash3 (ZSTD_matchState_t* ms, const BYTE*
 {
     U32* const hashTable3 = ms->hashTable3;
     U32 const hashLog3 = ms->hashLog3;
-    const BYTE* const base = ms->base;
+    const BYTE* const base = ms->window.base;
     U32 idx = ms->nextToUpdate3;
     U32 const target = ms->nextToUpdate3 = (U32)(ip - base);
     size_t const hash3 = ZSTD_hash3Ptr(ip, hashLog3);
@@ -281,9 +281,9 @@ static U32 ZSTD_insertBt1(
     U32 const btMask = (1 << btLog) - 1;
     U32 matchIndex = hashTable[h];
     size_t commonLengthSmaller=0, commonLengthLarger=0;
-    const BYTE* const base = ms->base;
-    const BYTE* const dictBase = ms->dictBase;
-    const U32 dictLimit = ms->dictLimit;
+    const BYTE* const base = ms->window.base;
+    const BYTE* const dictBase = ms->window.dictBase;
+    const U32 dictLimit = ms->window.dictLimit;
     const BYTE* const dictEnd = dictBase + dictLimit;
     const BYTE* const prefixStart = base + dictLimit;
     const BYTE* match;
@@ -292,7 +292,7 @@ static U32 ZSTD_insertBt1(
     U32* smallerPtr = bt + 2*(current&btMask);
     U32* largerPtr  = smallerPtr + 1;
     U32 dummy32;   /* to be nullified at the end */
-    U32 const windowLow = ms->lowLimit;
+    U32 const windowLow = ms->window.lowLimit;
     U32 matchEndIdx = current+8+1;
     size_t bestLength = 8;
     U32 nbCompares = 1U << cParams->searchLog;
@@ -383,7 +383,7 @@ void ZSTD_updateTree_internal(
                                 const BYTE* const ip, const BYTE* const iend,
                                 const U32 mls, const U32 extDict)
 {
-    const BYTE* const base = ms->base;
+    const BYTE* const base = ms->window.base;
     U32 const target = (U32)(ip - base);
     U32 idx = ms->nextToUpdate;
     DEBUGLOG(7, "ZSTD_updateTree_internal, from %u to %u  (extDict:%u)",
@@ -409,7 +409,7 @@ U32 ZSTD_insertBtAndGetAllMatches (
                         ZSTD_match_t* matches, const U32 lengthToBeat, U32 const mls /* template */)
 {
     U32 const sufficient_len = MIN(cParams->targetLength, ZSTD_OPT_NUM -1);
-    const BYTE* const base = ms->base;
+    const BYTE* const base = ms->window.base;
     U32 const current = (U32)(ip-base);
     U32 const hashLog = cParams->hashLog;
     U32 const minMatch = (mls==3) ? 3 : 4;
@@ -420,12 +420,12 @@ U32 ZSTD_insertBtAndGetAllMatches (
     U32 const btLog  = cParams->chainLog - 1;
     U32 const btMask = (1U << btLog) - 1;
     size_t commonLengthSmaller=0, commonLengthLarger=0;
-    const BYTE* const dictBase = ms->dictBase;
-    U32 const dictLimit = ms->dictLimit;
+    const BYTE* const dictBase = ms->window.dictBase;
+    U32 const dictLimit = ms->window.dictLimit;
     const BYTE* const dictEnd = dictBase + dictLimit;
     const BYTE* const prefixStart = base + dictLimit;
     U32 const btLow = btMask >= current ? 0 : current - btMask;
-    U32 const windowLow = ms->lowLimit;
+    U32 const windowLow = ms->window.lowLimit;
     U32* smallerPtr = bt + 2*(current&btMask);
     U32* largerPtr  = bt + 2*(current&btMask) + 1;
     U32 matchEndIdx = current+8+1;   /* farthest referenced position of any match => detects repetitive patterns */
@@ -566,7 +566,7 @@ FORCE_INLINE_TEMPLATE U32 ZSTD_BtGetAllMatches (
 {
     U32 const matchLengthSearch = cParams->searchLength;
     DEBUGLOG(7, "ZSTD_BtGetAllMatches");
-    if (ip < ms->base + ms->nextToUpdate) return 0;   /* skipped area */
+    if (ip < ms->window.base + ms->nextToUpdate) return 0;   /* skipped area */
     ZSTD_updateTree_internal(ms, cParams, ip, iHighLimit, matchLengthSearch, extDict);
     switch(matchLengthSearch)
     {
@@ -675,8 +675,8 @@ size_t ZSTD_compressBlock_opt_generic(ZSTD_matchState_t* ms,seqStore_t* seqStore
     const BYTE* anchor = istart;
     const BYTE* const iend = istart + srcSize;
     const BYTE* const ilimit = iend - 8;
-    const BYTE* const base = ms->base;
-    const BYTE* const prefixStart = base + ms->dictLimit;
+    const BYTE* const base = ms->window.base;
+    const BYTE* const prefixStart = base + ms->window.dictLimit;

     U32 const sufficient_len = MIN(cParams->targetLength, ZSTD_OPT_NUM -1);
     U32 const minMatch = (cParams->searchLength == 3) ? 3 : 4;
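These hunks are mechanical: the match-state fields base/dictBase/dictLimit/lowLimit move into a single window structure shared with the LDM state, so window validation and overflow correction live in one place. The exact ZSTD_window_t layout is not shown in this diff; a sketch consistent with the fields it references (base, dictBase, dictLimit, lowLimit, nextSrc), with the caveat that the real struct may hold more:

    typedef unsigned U32;
    typedef unsigned char BYTE;

    /* Sketch of the consolidated window, limited to the fields this diff
     * actually touches. */
    typedef struct {
        const BYTE* base;       /* index 0 maps to base                */
        const BYTE* dictBase;   /* base for indices below dictLimit    */
        U32 dictLimit;          /* first index of the current prefix   */
        U32 lowLimit;           /* lowest still-valid index            */
        const BYTE* nextSrc;    /* next expected input position        */
    } window_sketch_t;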
@@ -10,7 +10,7 @@


 /* ======   Tuning parameters   ====== */
-#define ZSTDMT_NBTHREADS_MAX 200
+#define ZSTDMT_NBWORKERS_MAX 200
 #define ZSTDMT_JOBSIZE_MAX  (MEM_32bits() ? (512 MB) : (2 GB))  /* note : limited by `jobSize` type, which is `unsigned` */
 #define ZSTDMT_OVERLAPLOG_DEFAULT 6

@@ -97,9 +97,9 @@ typedef struct ZSTDMT_bufferPool_s {
     buffer_t bTable[1];   /* variable size */
 } ZSTDMT_bufferPool;

-static ZSTDMT_bufferPool* ZSTDMT_createBufferPool(unsigned nbThreads, ZSTD_customMem cMem)
+static ZSTDMT_bufferPool* ZSTDMT_createBufferPool(unsigned nbWorkers, ZSTD_customMem cMem)
 {
-    unsigned const maxNbBuffers = 2*nbThreads + 3;
+    unsigned const maxNbBuffers = 2*nbWorkers + 3;
     ZSTDMT_bufferPool* const bufPool = (ZSTDMT_bufferPool*)ZSTD_calloc(
         sizeof(ZSTDMT_bufferPool) + (maxNbBuffers-1) * sizeof(buffer_t), cMem);
     if (bufPool==NULL) return NULL;
@@ -236,23 +236,24 @@ static void ZSTDMT_freeCCtxPool(ZSTDMT_CCtxPool* pool)
 }

 /* ZSTDMT_createCCtxPool() :
- * implies nbThreads >= 1 , checked by caller ZSTDMT_createCCtx() */
-static ZSTDMT_CCtxPool* ZSTDMT_createCCtxPool(unsigned nbThreads,
+ * implies nbWorkers >= 1 , checked by caller ZSTDMT_createCCtx() */
+static ZSTDMT_CCtxPool* ZSTDMT_createCCtxPool(unsigned nbWorkers,
                                               ZSTD_customMem cMem)
 {
     ZSTDMT_CCtxPool* const cctxPool = (ZSTDMT_CCtxPool*) ZSTD_calloc(
-        sizeof(ZSTDMT_CCtxPool) + (nbThreads-1)*sizeof(ZSTD_CCtx*), cMem);
+        sizeof(ZSTDMT_CCtxPool) + (nbWorkers-1)*sizeof(ZSTD_CCtx*), cMem);
+    assert(nbWorkers > 0);
     if (!cctxPool) return NULL;
     if (ZSTD_pthread_mutex_init(&cctxPool->poolMutex, NULL)) {
         ZSTD_free(cctxPool, cMem);
         return NULL;
     }
     cctxPool->cMem = cMem;
-    cctxPool->totalCCtx = nbThreads;
+    cctxPool->totalCCtx = nbWorkers;
     cctxPool->availCCtx = 1;   /* at least one cctx for single-thread mode */
     cctxPool->cctx[0] = ZSTD_createCCtx_advanced(cMem);
     if (!cctxPool->cctx[0]) { ZSTDMT_freeCCtxPool(cctxPool); return NULL; }
-    DEBUGLOG(3, "cctxPool created, with %u threads", nbThreads);
+    DEBUGLOG(3, "cctxPool created, with %u workers", nbWorkers);
     return cctxPool;
 }

@@ -260,15 +261,16 @@ static ZSTDMT_CCtxPool* ZSTDMT_createCCtxPool(unsigned nbThreads,
 static size_t ZSTDMT_sizeof_CCtxPool(ZSTDMT_CCtxPool* cctxPool)
 {
     ZSTD_pthread_mutex_lock(&cctxPool->poolMutex);
-    {   unsigned const nbThreads = cctxPool->totalCCtx;
+    {   unsigned const nbWorkers = cctxPool->totalCCtx;
         size_t const poolSize = sizeof(*cctxPool)
-                                + (nbThreads-1)*sizeof(ZSTD_CCtx*);
+                                + (nbWorkers-1) * sizeof(ZSTD_CCtx*);
         unsigned u;
         size_t totalCCtxSize = 0;
-        for (u=0; u<nbThreads; u++) {
+        for (u=0; u<nbWorkers; u++) {
             totalCCtxSize += ZSTD_sizeof_CCtx(cctxPool->cctx[u]);
         }
         ZSTD_pthread_mutex_unlock(&cctxPool->poolMutex);
+        assert(nbWorkers > 0);
         return poolSize + totalCCtxSize;
     }
 }
@@ -295,8 +297,8 @@ static void ZSTDMT_releaseCCtx(ZSTDMT_CCtxPool* pool, ZSTD_CCtx* cctx)
     if (pool->availCCtx < pool->totalCCtx)
         pool->cctx[pool->availCCtx++] = cctx;
     else {
-        /* pool overflow : should not happen, since totalCCtx==nbThreads */
-        DEBUGLOG(5, "CCtx pool overflow : free cctx");
+        /* pool overflow : should not happen, since totalCCtx==nbWorkers */
+        DEBUGLOG(4, "CCtx pool overflow : free cctx");
         ZSTD_freeCCtx(cctx);
     }
     ZSTD_pthread_mutex_unlock(&pool->poolMutex);
@@ -502,52 +504,51 @@ static ZSTDMT_jobDescription* ZSTDMT_createJobsTable(U32* nbJobsPtr, ZSTD_custom
     return jobTable;
 }

-/* ZSTDMT_CCtxParam_setNbThreads():
+/* ZSTDMT_CCtxParam_setNbWorkers():
  * Internal use only */
-size_t ZSTDMT_CCtxParam_setNbThreads(ZSTD_CCtx_params* params, unsigned nbThreads)
+size_t ZSTDMT_CCtxParam_setNbWorkers(ZSTD_CCtx_params* params, unsigned nbWorkers)
 {
-    if (nbThreads > ZSTDMT_NBTHREADS_MAX) nbThreads = ZSTDMT_NBTHREADS_MAX;
-    if (nbThreads < 1) nbThreads = 1;
-    params->nbThreads = nbThreads;
+    if (nbWorkers > ZSTDMT_NBWORKERS_MAX) nbWorkers = ZSTDMT_NBWORKERS_MAX;
+    params->nbWorkers = nbWorkers;
     params->overlapSizeLog = ZSTDMT_OVERLAPLOG_DEFAULT;
     params->jobSize = 0;
-    return nbThreads;
+    return nbWorkers;
 }

-ZSTDMT_CCtx* ZSTDMT_createCCtx_advanced(unsigned nbThreads, ZSTD_customMem cMem)
+ZSTDMT_CCtx* ZSTDMT_createCCtx_advanced(unsigned nbWorkers, ZSTD_customMem cMem)
 {
     ZSTDMT_CCtx* mtctx;
-    U32 nbJobs = nbThreads + 2;
-    DEBUGLOG(3, "ZSTDMT_createCCtx_advanced (nbThreads = %u)", nbThreads);
+    U32 nbJobs = nbWorkers + 2;
+    DEBUGLOG(3, "ZSTDMT_createCCtx_advanced (nbWorkers = %u)", nbWorkers);

-    if (nbThreads < 1) return NULL;
-    nbThreads = MIN(nbThreads , ZSTDMT_NBTHREADS_MAX);
+    if (nbWorkers < 1) return NULL;
+    nbWorkers = MIN(nbWorkers , ZSTDMT_NBWORKERS_MAX);
     if ((cMem.customAlloc!=NULL) ^ (cMem.customFree!=NULL))
         /* invalid custom allocator */
         return NULL;

     mtctx = (ZSTDMT_CCtx*) ZSTD_calloc(sizeof(ZSTDMT_CCtx), cMem);
     if (!mtctx) return NULL;
-    ZSTDMT_CCtxParam_setNbThreads(&mtctx->params, nbThreads);
+    ZSTDMT_CCtxParam_setNbWorkers(&mtctx->params, nbWorkers);
     mtctx->cMem = cMem;
     mtctx->allJobsCompleted = 1;
-    mtctx->factory = POOL_create_advanced(nbThreads, 0, cMem);
+    mtctx->factory = POOL_create_advanced(nbWorkers, 0, cMem);
     mtctx->jobs = ZSTDMT_createJobsTable(&nbJobs, cMem);
     assert(nbJobs > 0); assert((nbJobs & (nbJobs - 1)) == 0);  /* ensure nbJobs is a power of 2 */
     mtctx->jobIDMask = nbJobs - 1;
-    mtctx->bufPool = ZSTDMT_createBufferPool(nbThreads, cMem);
-    mtctx->cctxPool = ZSTDMT_createCCtxPool(nbThreads, cMem);
+    mtctx->bufPool = ZSTDMT_createBufferPool(nbWorkers, cMem);
+    mtctx->cctxPool = ZSTDMT_createCCtxPool(nbWorkers, cMem);
     if (!mtctx->factory | !mtctx->jobs | !mtctx->bufPool | !mtctx->cctxPool) {
         ZSTDMT_freeCCtx(mtctx);
         return NULL;
     }
-    DEBUGLOG(3, "mt_cctx created, for %u threads", nbThreads);
+    DEBUGLOG(3, "mt_cctx created, for %u threads", nbWorkers);
     return mtctx;
 }

-ZSTDMT_CCtx* ZSTDMT_createCCtx(unsigned nbThreads)
+ZSTDMT_CCtx* ZSTDMT_createCCtx(unsigned nbWorkers)
 {
-    return ZSTDMT_createCCtx_advanced(nbThreads, ZSTD_defaultCMem);
+    return ZSTDMT_createCCtx_advanced(nbWorkers, ZSTD_defaultCMem);
 }


@@ -649,8 +650,8 @@ size_t ZSTDMT_setMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSTDMT_parameter parameter,
     }
 }

-/* Sets parameters relevant to the compression job, initializing others to
- * default values. Notably, nbThreads should probably be zero. */
+/* Sets parameters relevant to the compression job,
+ * initializing others to default values. */
 static ZSTD_CCtx_params ZSTDMT_initJobCCtxParams(ZSTD_CCtx_params const params)
 {
     ZSTD_CCtx_params jobParams;
@@ -664,13 +665,29 @@ static ZSTD_CCtx_params ZSTDMT_initJobCCtxParams(ZSTD_CCtx_params const params)
     return jobParams;
 }

-/* ZSTDMT_getNbThreads():
+/*! ZSTDMT_updateCParams_whileCompressing() :
+ *  Update compression level and parameters (except wlog)
+ *  while compression is ongoing.
+ *  New parameters will be applied to next compression job. */
+void ZSTDMT_updateCParams_whileCompressing(ZSTDMT_CCtx* mtctx, int compressionLevel, ZSTD_compressionParameters cParams)
+{
+    U32 const saved_wlog = mtctx->params.cParams.windowLog;   /* Do not modify windowLog while compressing */
+    DEBUGLOG(5, "ZSTDMT_updateCParams_whileCompressing (level:%i)",
+                compressionLevel);
+    mtctx->params.compressionLevel = compressionLevel;
+    if (compressionLevel != ZSTD_CLEVEL_CUSTOM)
+        cParams = ZSTD_getCParams(compressionLevel, mtctx->frameContentSize, 0 /* dictSize */ );
+    cParams.windowLog = saved_wlog;
+    mtctx->params.cParams = cParams;
+}
+
+/* ZSTDMT_getNbWorkers():
  * @return nb threads currently active in mtctx.
  * mtctx must be valid */
-unsigned ZSTDMT_getNbThreads(const ZSTDMT_CCtx* mtctx)
+unsigned ZSTDMT_getNbWorkers(const ZSTDMT_CCtx* mtctx)
 {
     assert(mtctx != NULL);
-    return mtctx->params.nbThreads;
+    return mtctx->params.nbWorkers;
 }

 /* ZSTDMT_getFrameProgression():
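Worth noting in ZSTDMT_updateCParams_whileCompressing(): windowLog is pinned because the overlap buffers and job sizing were derived from it at init time; everything else can change between jobs. The save-overwrite-restore move in isolation, with illustrative names only:

    /* The pin-restore pattern used above: a field that must stay stable
     * mid-operation is saved, the struct is replaced wholesale, and the
     * pinned field is written back. Names here are illustrative. */
    typedef struct { unsigned windowLog; int level; } params_sketch_t;

    static void updateKeepingWindowLog(params_sketch_t* live, params_sketch_t next)
    {
        unsigned const saved_wlog = live->windowLog;  /* buffers sized off it  */
        next.windowLog = saved_wlog;                  /* ignore caller's value */
        *live = next;                                 /* everything else moves */
    }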
@@ -709,15 +726,15 @@ ZSTD_frameProgression ZSTDMT_getFrameProgression(ZSTDMT_CCtx* mtctx)
 /* =====   Multi-threaded compression   ===== */
 /* ------------------------------------------ */

-static unsigned ZSTDMT_computeNbJobs(size_t srcSize, unsigned windowLog, unsigned nbThreads) {
-    assert(nbThreads>0);
+static unsigned ZSTDMT_computeNbJobs(size_t srcSize, unsigned windowLog, unsigned nbWorkers) {
+    assert(nbWorkers>0);
     {   size_t const jobSizeTarget = (size_t)1 << (windowLog + 2);
         size_t const jobMaxSize = jobSizeTarget << 2;
-        size_t const passSizeMax = jobMaxSize * nbThreads;
+        size_t const passSizeMax = jobMaxSize * nbWorkers;
         unsigned const multiplier = (unsigned)(srcSize / passSizeMax) + 1;
-        unsigned const nbJobsLarge = multiplier * nbThreads;
+        unsigned const nbJobsLarge = multiplier * nbWorkers;
         unsigned const nbJobsMax = (unsigned)(srcSize / jobSizeTarget) + 1;
-        unsigned const nbJobsSmall = MIN(nbJobsMax, nbThreads);
+        unsigned const nbJobsSmall = MIN(nbJobsMax, nbWorkers);
         return (multiplier>1) ? nbJobsLarge : nbJobsSmall;
 }   }

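A worked example makes the job-count heuristic concrete. With windowLog = 22, jobSizeTarget = 16 MiB and jobMaxSize = 64 MiB; with 4 workers, passSizeMax = 256 MiB. A 1 GiB input then gives multiplier = 5 and 20 jobs, while an 8 MiB input falls back to a single job. A standalone copy of the logic to check those numbers:

    #include <assert.h>
    #include <stddef.h>

    #define MIN(a,b) ((a)<(b)?(a):(b))

    /* Standalone copy of the job-count logic above, for a worked example. */
    static unsigned computeNbJobs(size_t srcSize, unsigned windowLog,
                                  unsigned nbWorkers)
    {
        size_t const jobSizeTarget = (size_t)1 << (windowLog + 2);
        size_t const jobMaxSize = jobSizeTarget << 2;
        size_t const passSizeMax = jobMaxSize * nbWorkers;
        unsigned const multiplier = (unsigned)(srcSize / passSizeMax) + 1;
        unsigned const nbJobsLarge = multiplier * nbWorkers;
        unsigned const nbJobsMax = (unsigned)(srcSize / jobSizeTarget) + 1;
        unsigned const nbJobsSmall = MIN(nbJobsMax, nbWorkers);
        return (multiplier > 1) ? nbJobsLarge : nbJobsSmall;
    }

    int main(void)
    {
        size_t const MiB = (size_t)1 << 20;
        assert(computeNbJobs(1024 * MiB, 22, 4) == 20);  /* 5 passes * 4 workers */
        assert(computeNbJobs(8 * MiB, 22, 4) == 1);      /* small input: 1 job   */
        return 0;
    }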
@@ -734,7 +751,7 @@ static size_t ZSTDMT_compress_advanced_internal(
     ZSTD_CCtx_params const jobParams = ZSTDMT_initJobCCtxParams(params);
     unsigned const overlapRLog = (params.overlapSizeLog>9) ? 0 : 9-params.overlapSizeLog;
     size_t const overlapSize = (overlapRLog>=9) ? 0 : (size_t)1 << (params.cParams.windowLog - overlapRLog);
-    unsigned const nbJobs = ZSTDMT_computeNbJobs(srcSize, params.cParams.windowLog, params.nbThreads);
+    unsigned const nbJobs = ZSTDMT_computeNbJobs(srcSize, params.cParams.windowLog, params.nbWorkers);
     size_t const proposedJobSize = (srcSize + (nbJobs-1)) / nbJobs;
     size_t const avgJobSize = (((proposedJobSize-1) & 0x1FFFF) < 0x7FFF) ? proposedJobSize + 0xFFFF : proposedJobSize;   /* avoid too small last block */
     const char* const srcStart = (const char*)src;
@@ -742,13 +759,13 @@ static size_t ZSTDMT_compress_advanced_internal(
     unsigned const compressWithinDst = (dstCapacity >= ZSTD_compressBound(srcSize)) ? nbJobs : (unsigned)(dstCapacity / ZSTD_compressBound(avgJobSize));  /* presumes avgJobSize >= 256 KB, which should be the case */
     size_t frameStartPos = 0, dstBufferPos = 0;
     XXH64_state_t xxh64;
-    assert(jobParams.nbThreads == 0);
-    assert(mtctx->cctxPool->totalCCtx == params.nbThreads);
+    assert(jobParams.nbWorkers == 0);
+    assert(mtctx->cctxPool->totalCCtx == params.nbWorkers);

     DEBUGLOG(4, "ZSTDMT_compress_advanced_internal: nbJobs=%2u (rawSize=%u bytes; fixedSize=%u) ",
                 nbJobs, (U32)proposedJobSize, (U32)avgJobSize);

-    if ((nbJobs==1) | (params.nbThreads<=1)) {   /* fallback to single-thread mode : this is a blocking invocation anyway */
+    if ((nbJobs==1) | (params.nbWorkers<=1)) {   /* fallback to single-thread mode : this is a blocking invocation anyway */
         ZSTD_CCtx* const cctx = mtctx->cctxPool->cctx[0];
         if (cdict) return ZSTD_compress_usingCDict_advanced(cctx, dst, dstCapacity, src, srcSize, cdict, jobParams.fParams);
         return ZSTD_compress_advanced_internal(cctx, dst, dstCapacity, src, srcSize, NULL, 0, jobParams);
@@ -856,7 +873,7 @@ size_t ZSTDMT_compress_advanced(ZSTDMT_CCtx* mtctx,
                                void* dst, size_t dstCapacity,
                          const void* src, size_t srcSize,
                          const ZSTD_CDict* cdict,
-                               ZSTD_parameters const params,
+                               ZSTD_parameters params,
                                unsigned overlapLog)
 {
     ZSTD_CCtx_params cctxParams = mtctx->params;
@ -892,12 +909,14 @@ size_t ZSTDMT_initCStream_internal(
|
|||||||
const ZSTD_CDict* cdict, ZSTD_CCtx_params params,
|
const ZSTD_CDict* cdict, ZSTD_CCtx_params params,
|
||||||
unsigned long long pledgedSrcSize)
|
unsigned long long pledgedSrcSize)
|
||||||
{
|
{
|
||||||
DEBUGLOG(4, "ZSTDMT_initCStream_internal (pledgedSrcSize=%u)", (U32)pledgedSrcSize);
|
DEBUGLOG(4, "ZSTDMT_initCStream_internal (pledgedSrcSize=%u, nbWorkers=%u, cctxPool=%u)",
|
||||||
|
(U32)pledgedSrcSize, params.nbWorkers, mtctx->cctxPool->totalCCtx);
|
||||||
/* params are supposed to be fully validated at this point */
|
/* params are supposed to be fully validated at this point */
|
||||||
assert(!ZSTD_isError(ZSTD_checkCParams(params.cParams)));
|
assert(!ZSTD_isError(ZSTD_checkCParams(params.cParams)));
|
||||||
assert(!((dict) && (cdict))); /* either dict or cdict, not both */
|
assert(!((dict) && (cdict))); /* either dict or cdict, not both */
|
||||||
assert(mtctx->cctxPool->totalCCtx == params.nbThreads);
|
assert(mtctx->cctxPool->totalCCtx == params.nbWorkers);
|
||||||
mtctx->singleBlockingThread = (pledgedSrcSize <= ZSTDMT_JOBSIZE_MIN); /* do not trigger multi-threading when srcSize is too small */
|
|
||||||
|
/* init */
|
||||||
if (params.jobSize == 0) {
|
if (params.jobSize == 0) {
|
||||||
if (params.cParams.windowLog >= 29)
|
if (params.cParams.windowLog >= 29)
|
||||||
params.jobSize = ZSTDMT_JOBSIZE_MAX;
|
params.jobSize = ZSTDMT_JOBSIZE_MAX;
|
||||||
@ -906,15 +925,17 @@ size_t ZSTDMT_initCStream_internal(
|
|||||||
}
|
}
|
||||||
if (params.jobSize > ZSTDMT_JOBSIZE_MAX) params.jobSize = ZSTDMT_JOBSIZE_MAX;
|
if (params.jobSize > ZSTDMT_JOBSIZE_MAX) params.jobSize = ZSTDMT_JOBSIZE_MAX;
|
||||||
|
|
||||||
|
mtctx->singleBlockingThread = (pledgedSrcSize <= ZSTDMT_JOBSIZE_MIN); /* do not trigger multi-threading when srcSize is too small */
|
||||||
if (mtctx->singleBlockingThread) {
|
if (mtctx->singleBlockingThread) {
|
||||||
ZSTD_CCtx_params const singleThreadParams = ZSTDMT_initJobCCtxParams(params);
|
ZSTD_CCtx_params const singleThreadParams = ZSTDMT_initJobCCtxParams(params);
|
||||||
DEBUGLOG(4, "ZSTDMT_initCStream_internal: switch to single blocking thread mode");
|
DEBUGLOG(4, "ZSTDMT_initCStream_internal: switch to single blocking thread mode");
|
||||||
assert(singleThreadParams.nbThreads == 0);
|
assert(singleThreadParams.nbWorkers == 0);
|
||||||
return ZSTD_initCStream_internal(mtctx->cctxPool->cctx[0],
|
return ZSTD_initCStream_internal(mtctx->cctxPool->cctx[0],
|
||||||
dict, dictSize, cdict,
|
dict, dictSize, cdict,
|
||||||
singleThreadParams, pledgedSrcSize);
|
singleThreadParams, pledgedSrcSize);
|
||||||
}
|
}
|
||||||
DEBUGLOG(4, "ZSTDMT_initCStream_internal: %u threads", params.nbThreads);
|
|
||||||
|
DEBUGLOG(4, "ZSTDMT_initCStream_internal: %u workers", params.nbWorkers);
|
||||||
|
|
||||||
if (mtctx->allJobsCompleted == 0) { /* previous compression not correctly finished */
|
if (mtctx->allJobsCompleted == 0) { /* previous compression not correctly finished */
|
||||||
ZSTDMT_waitForAllJobsCompleted(mtctx);
|
ZSTDMT_waitForAllJobsCompleted(mtctx);
|
||||||
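The nbThreads to nbWorkers rename running through these hunks is more than cosmetic: nbWorkers counts worker threads in addition to the calling thread, so nbWorkers == 0 now unambiguously denotes the blocking single-thread mode that the fallbacks above select, which is what the recurring assert(...nbWorkers == 0) on job parameters checks. A minimal usage sketch against the declarations that appear in the header hunks below; the include path and error handling are illustrative, not part of this patch:

    #include <stddef.h>
    #include "zstdmt_compress.h"   /* private ZSTDMT header; path illustrative */

    size_t compress_mt(void* dst, size_t dstCapacity, const void* src, size_t srcSize)
    {
        ZSTDMT_CCtx* const mtctx = ZSTDMT_createCCtx(4);   /* 4 workers + the calling thread */
        size_t const cSize = ZSTDMT_compressCCtx(mtctx, dst, dstCapacity, src, srcSize, 3 /* level */);
        ZSTDMT_freeCCtx(mtctx);
        return cSize;   /* may be an error code; check with ZSTD_isError() in real code */
    }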
@@ -993,8 +1014,6 @@ size_t ZSTDMT_initCStream_usingCDict(ZSTDMT_CCtx* mtctx,
 size_t ZSTDMT_resetCStream(ZSTDMT_CCtx* mtctx, unsigned long long pledgedSrcSize)
 {
     if (!pledgedSrcSize) pledgedSrcSize = ZSTD_CONTENTSIZE_UNKNOWN;
-    if (mtctx->params.nbThreads==1)
-        return ZSTD_resetCStream(mtctx->cctxPool->cctx[0], pledgedSrcSize);
     return ZSTDMT_initCStream_internal(mtctx, NULL, 0, ZSTD_dm_auto, 0, mtctx->params,
                                        pledgedSrcSize);
 }

@@ -30,15 +30,15 @@

 /* === Memory management === */
 typedef struct ZSTDMT_CCtx_s ZSTDMT_CCtx;
-ZSTDLIB_API ZSTDMT_CCtx* ZSTDMT_createCCtx(unsigned nbThreads);
-ZSTDLIB_API ZSTDMT_CCtx* ZSTDMT_createCCtx_advanced(unsigned nbThreads,
+ZSTDLIB_API ZSTDMT_CCtx* ZSTDMT_createCCtx(unsigned nbWorkers);
+ZSTDLIB_API ZSTDMT_CCtx* ZSTDMT_createCCtx_advanced(unsigned nbWorkers,
                                                     ZSTD_customMem cMem);
 ZSTDLIB_API size_t ZSTDMT_freeCCtx(ZSTDMT_CCtx* mtctx);

 ZSTDLIB_API size_t ZSTDMT_sizeof_CCtx(ZSTDMT_CCtx* mtctx);


-/* === Simple buffer-to-butter one-pass function === */
+/* === Simple one-pass compression function === */

 ZSTDLIB_API size_t ZSTDMT_compressCCtx(ZSTDMT_CCtx* mtctx,
                                        void* dst, size_t dstCapacity,

@@ -50,7 +50,7 @@ ZSTDLIB_API size_t ZSTDMT_compressCCtx(ZSTDMT_CCtx* mtctx,
 /* === Streaming functions === */

 ZSTDLIB_API size_t ZSTDMT_initCStream(ZSTDMT_CCtx* mtctx, int compressionLevel);
-ZSTDLIB_API size_t ZSTDMT_resetCStream(ZSTDMT_CCtx* mtctx, unsigned long long pledgedSrcSize);  /**< if srcSize is not known at reset time, use ZSTD_CONTENTSIZE_UNKNOWN. Note: for compatibility with older programs, 0 means the same as ZSTD_CONTENTSIZE_UNKNOWN, but it may change in the future, to mean "empty" */
+ZSTDLIB_API size_t ZSTDMT_resetCStream(ZSTDMT_CCtx* mtctx, unsigned long long pledgedSrcSize);  /**< if srcSize is not known at reset time, use ZSTD_CONTENTSIZE_UNKNOWN. Note: for compatibility with older programs, 0 means the same as ZSTD_CONTENTSIZE_UNKNOWN, but it will change in the future to mean "empty" */

 ZSTDLIB_API size_t ZSTDMT_compressStream(ZSTDMT_CCtx* mtctx, ZSTD_outBuffer* output, ZSTD_inBuffer* input);

@@ -68,7 +68,7 @@ ZSTDLIB_API size_t ZSTDMT_compress_advanced(ZSTDMT_CCtx* mtctx,
                                             void* dst, size_t dstCapacity,
                                             const void* src, size_t srcSize,
                                             const ZSTD_CDict* cdict,
-                                            ZSTD_parameters const params,
+                                            ZSTD_parameters params,
                                             unsigned overlapLog);

 ZSTDLIB_API size_t ZSTDMT_initCStream_advanced(ZSTDMT_CCtx* mtctx,

@@ -109,19 +109,28 @@ ZSTDLIB_API size_t ZSTDMT_compressStream_generic(ZSTDMT_CCtx* mtctx,
                                                  ZSTD_EndDirective endOp);


-/* === Private definitions; never ever use directly === */
+/* ========================================================
+ * ===          Private interface, for use by ZSTD_compress.c   ===
+ * ===          Not exposed in libzstd. Never invoke directly   ===
+ * ======================================================== */

 size_t ZSTDMT_CCtxParam_setMTCtxParameter(ZSTD_CCtx_params* params, ZSTDMT_parameter parameter, unsigned value);

-/* ZSTDMT_CCtxParam_setNbThreads()
- * Set nbThreads, and clamp it correctly,
- * also reset jobSize and overlapLog */
-size_t ZSTDMT_CCtxParam_setNbThreads(ZSTD_CCtx_params* params, unsigned nbThreads);
+/* ZSTDMT_CCtxParam_setNbWorkers()
+ * Set nbWorkers, and clamp it.
+ * Also reset jobSize and overlapLog */
+size_t ZSTDMT_CCtxParam_setNbWorkers(ZSTD_CCtx_params* params, unsigned nbWorkers);

-/* ZSTDMT_getNbThreads():
+/*! ZSTDMT_updateCParams_whileCompressing() :
+ *  Update compression level and parameters (except wlog)
+ *  while compression is ongoing.
+ *  New parameters will be applied to next compression job. */
+void ZSTDMT_updateCParams_whileCompressing(ZSTDMT_CCtx* mtctx, int compressionLevel, ZSTD_compressionParameters cParams);
+
+/* ZSTDMT_getNbWorkers():
  * @return nb threads currently active in mtctx.
  * mtctx must be valid */
-unsigned ZSTDMT_getNbThreads(const ZSTDMT_CCtx* mtctx);
+unsigned ZSTDMT_getNbWorkers(const ZSTDMT_CCtx* mtctx);

 /* ZSTDMT_getFrameProgression():
  * tells how much data has been consumed (input) and produced (output) for current frame.
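The reorganized private section above is the bridge used by the advanced API in ZSTD_compress.c: ZSTDMT_updateCParams_whileCompressing() takes effect at job granularity (the comment's "applied to next compression job"), and ZSTDMT_getFrameProgression() exposes consumed/produced counters for the frame in flight. A hypothetical model of the job-granularity semantic, names invented for illustration:

    /* Parameters are snapshotted when a job is created, so an update
     * never affects a job that is already running. */
    typedef struct { int compressionLevel; } SharedParams;

    static SharedParams pending;             /* written by the update call */

    static SharedParams snapshotForNextJob(void)   /* read once per new job */
    {
        return pending;                      /* job n+1 sees the update, job n does not */
    }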
@@ -49,18 +49,19 @@
 ****************************************************************/
 #define HUF_isError ERR_isError
 #define HUF_STATIC_ASSERT(c) { enum { HUF_static_assert = 1/(int)(!!(c)) }; }   /* use only *after* variable declarations */
+#define CHECK_F(f) { size_t const err_ = (f); if (HUF_isError(err_)) return err_; }


 /* **************************************************************
 *  Byte alignment for workSpace management
 ****************************************************************/
 #define HUF_ALIGN(x, a)         HUF_ALIGN_MASK((x), (a) - 1)
 #define HUF_ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))


 /*-***************************/
 /*  generic DTableDesc       */
 /*-***************************/

 typedef struct { BYTE maxTableLog; BYTE tableType; BYTE tableLog; BYTE reserved; } DTableDesc;

 static DTableDesc HUF_getDTableDesc(const HUF_DTable* table)
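CHECK_F, added above, folds the library's recurring call, test, return-on-error dance into one statement; the later hunks in this file apply it throughout. Shown on a call site taken from this very diff:

    /* before */
    {   size_t const errorCode = BIT_initDStream(&bitD, cSrc, cSrcSize);
        if (HUF_isError(errorCode)) return errorCode; }

    /* after */
    CHECK_F( BIT_initDStream(&bitD, cSrc, cSrcSize) );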
@@ -74,7 +75,6 @@ static DTableDesc HUF_getDTableDesc(const HUF_DTable* table)
 /*-***************************/
 /*  single-symbol decoding   */
 /*-***************************/

 typedef struct { BYTE byte; BYTE nbBits; } HUF_DEltX2;   /* single-symbol decoding */

 size_t HUF_readDTableX2_wksp(HUF_DTable* DTable, const void* src, size_t srcSize, void* workSpace, size_t wkspSize)

@@ -94,10 +94,7 @@ size_t HUF_readDTableX2_wksp(HUF_DTable* DTable, const void* src, size_t srcSize,
     huffWeight = (BYTE *)((U32 *)workSpace + spaceUsed32);
     spaceUsed32 += HUF_ALIGN(HUF_SYMBOLVALUE_MAX + 1, sizeof(U32)) >> 2;

-    if ((spaceUsed32 << 2) > wkspSize)
-        return ERROR(tableLog_tooLarge);
-    workSpace = (U32 *)workSpace + spaceUsed32;
-    wkspSize -= (spaceUsed32 << 2);
+    if ((spaceUsed32 << 2) > wkspSize) return ERROR(tableLog_tooLarge);

     HUF_STATIC_ASSERT(sizeof(DTableDesc) == sizeof(HUF_DTable));
     /* memset(huffWeight, 0, sizeof(huffWeight)); */   /* is not necessary, even though some analyzer complain ... */
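Context for the hunk above: the readDTable functions carve their scratch arrays out of a caller-provided workSpace by bumping a U32 cursor. huffWeight is the final carve, so updating workSpace and wkspSize after the last bounds check was dead code; the hunk deletes it and compacts the check to one line. The carving pattern, as a self-contained sketch with illustrative sizes:

    #include <stddef.h>
    typedef unsigned int U32;
    #define ALIGN_U32(bytes) (((bytes) + 3) & ~(size_t)3)   /* stand-in for HUF_ALIGN(x, sizeof(U32)) */

    int carve(void* workSpace, size_t wkspSize)
    {
        U32 spaceUsed32 = 0;
        unsigned char* const weights = (unsigned char*)((U32*)workSpace + spaceUsed32);
        spaceUsed32 += (U32)(ALIGN_U32(256) >> 2);      /* reserve 256 bytes, counted in U32 units */
        if ((spaceUsed32 << 2) > wkspSize) return -1;   /* one check after the last carve suffices */
        (void)weights;
        return 0;
    }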
@@ -144,8 +141,10 @@ size_t HUF_readDTableX2(HUF_DTable* DTable, const void* src, size_t srcSize)
                                workSpace, sizeof(workSpace));
 }

+typedef struct { U16 sequence; BYTE nbBits; BYTE length; } HUF_DEltX4;   /* double-symbols decoding */
+
-static BYTE HUF_decodeSymbolX2(BIT_DStream_t* Dstream, const HUF_DEltX2* dt, const U32 dtLog)
+FORCE_INLINE_TEMPLATE BYTE
+HUF_decodeSymbolX2(BIT_DStream_t* Dstream, const HUF_DEltX2* dt, const U32 dtLog)
 {
     size_t const val = BIT_lookBitsFast(Dstream, dtLog);   /* note : dtLog >= 1 */
     BYTE const c = dt[val].byte;
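For context on the function being re-tagged FORCE_INLINE_TEMPLATE above: single-symbol (X2) decoding is a plain table lookup. Peek the next dtLog bits, read the decoded byte, then consume only the code's true length, so a short code leaves its surplus bits for the next symbol. A stand-alone sketch of that step, not library code:

    #include <stdint.h>
    #include <stddef.h>

    typedef struct { uint8_t byte; uint8_t nbBits; } DEltX2_sketch;

    /* `peek` holds the next dtLog bits of the stream. */
    static uint8_t decodeOne(const DEltX2_sketch* dt, size_t peek, unsigned* bitsConsumed)
    {
        *bitsConsumed = dt[peek].nbBits;   /* may be < dtLog for short codes */
        return dt[peek].byte;              /* every index sharing this code's prefix stores the same symbol */
    }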
@@ -156,7 +155,7 @@ static BYTE HUF_decodeSymbolX2(BIT_DStream_t* Dstream, const HUF_DEltX2* dt, const U32 dtLog)
 #define HUF_DECODE_SYMBOLX2_0(ptr, DStreamPtr) \
     *ptr++ = HUF_decodeSymbolX2(DStreamPtr, dt, dtLog)

 #define HUF_DECODE_SYMBOLX2_1(ptr, DStreamPtr) \
     if (MEM_64bits() || (HUF_TABLELOG_MAX<=12)) \
         HUF_DECODE_SYMBOLX2_0(ptr, DStreamPtr)

@@ -164,30 +163,33 @@ static BYTE HUF_decodeSymbolX2(BIT_DStream_t* Dstream, const HUF_DEltX2* dt, const U32 dtLog)
     if (MEM_64bits()) \
         HUF_DECODE_SYMBOLX2_0(ptr, DStreamPtr)

-HINT_INLINE size_t HUF_decodeStreamX2(BYTE* p, BIT_DStream_t* const bitDPtr, BYTE* const pEnd, const HUF_DEltX2* const dt, const U32 dtLog)
+HINT_INLINE size_t
+HUF_decodeStreamX2(BYTE* p, BIT_DStream_t* const bitDPtr, BYTE* const pEnd, const HUF_DEltX2* const dt, const U32 dtLog)
 {
     BYTE* const pStart = p;

     /* up to 4 symbols at a time */
-    while ((BIT_reloadDStream(bitDPtr) == BIT_DStream_unfinished) && (p <= pEnd-4)) {
+    while ((BIT_reloadDStream(bitDPtr) == BIT_DStream_unfinished) & (p < pEnd-3)) {
         HUF_DECODE_SYMBOLX2_2(p, bitDPtr);
         HUF_DECODE_SYMBOLX2_1(p, bitDPtr);
         HUF_DECODE_SYMBOLX2_2(p, bitDPtr);
         HUF_DECODE_SYMBOLX2_0(p, bitDPtr);
     }

-    /* closer to the end */
-    while ((BIT_reloadDStream(bitDPtr) == BIT_DStream_unfinished) && (p < pEnd))
-        HUF_DECODE_SYMBOLX2_0(p, bitDPtr);
+    /* [0-3] symbols remaining */
+    if (MEM_32bits())
+        while ((BIT_reloadDStream(bitDPtr) == BIT_DStream_unfinished) & (p < pEnd))
+            HUF_DECODE_SYMBOLX2_0(p, bitDPtr);

-    /* no more data to retrieve from bitstream, hence no need to reload */
+    /* no more data to retrieve from bitstream, no need to reload */
     while (p < pEnd)
         HUF_DECODE_SYMBOLX2_0(p, bitDPtr);

     return pEnd-pStart;
 }

-static size_t HUF_decompress1X2_usingDTable_internal(
+FORCE_INLINE_TEMPLATE size_t
+HUF_decompress1X2_usingDTable_internal_body(
           void* dst,  size_t dstSize,
     const void* cSrc, size_t cSrcSize,
     const HUF_DTable* DTable)
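A recurring micro-change in the hunk above swaps `&&` for `&` in loop guards. Both operands are already materialized 0/1 values (an enum equality and a pointer comparison), so the bitwise form gives the same result while removing the short-circuit branch; the unconditional evaluation it forces is harmless here, since BIT_reloadDStream() must run each iteration anyway. The equivalence, demonstrated exhaustively:

    #include <stdio.h>

    int main(void)
    {
        int a, b;
        for (a = 0; a <= 1; a++)
            for (b = 0; b <= 1; b++)
                printf("a=%d b=%d : a&&b=%d  a&b=%d\n", a, b, a && b, a & b);
        return 0;   /* the two columns agree for all 0/1 operands */
    }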
@@ -200,58 +202,17 @@ static size_t HUF_decompress1X2_usingDTable_internal(
     DTableDesc const dtd = HUF_getDTableDesc(DTable);
     U32 const dtLog = dtd.tableLog;

-    { size_t const errorCode = BIT_initDStream(&bitD, cSrc, cSrcSize);
-      if (HUF_isError(errorCode)) return errorCode; }
+    CHECK_F( BIT_initDStream(&bitD, cSrc, cSrcSize) );

     HUF_decodeStreamX2(op, &bitD, oend, dt, dtLog);

-    /* check */
     if (!BIT_endOfDStream(&bitD)) return ERROR(corruption_detected);

     return dstSize;
 }

-size_t HUF_decompress1X2_usingDTable(
-          void* dst,  size_t dstSize,
-    const void* cSrc, size_t cSrcSize,
-    const HUF_DTable* DTable)
-{
-    DTableDesc dtd = HUF_getDTableDesc(DTable);
-    if (dtd.tableType != 0) return ERROR(GENERIC);
-    return HUF_decompress1X2_usingDTable_internal(dst, dstSize, cSrc, cSrcSize, DTable);
-}
-
-size_t HUF_decompress1X2_DCtx_wksp(HUF_DTable* DCtx, void* dst, size_t dstSize,
-                                   const void* cSrc, size_t cSrcSize,
-                                   void* workSpace, size_t wkspSize)
-{
-    const BYTE* ip = (const BYTE*) cSrc;
-
-    size_t const hSize = HUF_readDTableX2_wksp(DCtx, cSrc, cSrcSize, workSpace, wkspSize);
-    if (HUF_isError(hSize)) return hSize;
-    if (hSize >= cSrcSize) return ERROR(srcSize_wrong);
-    ip += hSize; cSrcSize -= hSize;
-
-    return HUF_decompress1X2_usingDTable_internal (dst, dstSize, ip, cSrcSize, DCtx);
-}
-
-
-size_t HUF_decompress1X2_DCtx(HUF_DTable* DCtx, void* dst, size_t dstSize,
-                              const void* cSrc, size_t cSrcSize)
-{
-    U32 workSpace[HUF_DECOMPRESS_WORKSPACE_SIZE_U32];
-    return HUF_decompress1X2_DCtx_wksp(DCtx, dst, dstSize, cSrc, cSrcSize,
-                                       workSpace, sizeof(workSpace));
-}
-
-size_t HUF_decompress1X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
-{
-    HUF_CREATE_STATIC_DTABLEX2(DTable, HUF_TABLELOG_MAX);
-    return HUF_decompress1X2_DCtx (DTable, dst, dstSize, cSrc, cSrcSize);
-}
-
-
-static size_t HUF_decompress4X2_usingDTable_internal(
+FORCE_INLINE_TEMPLATE size_t
+HUF_decompress4X2_usingDTable_internal_body(
           void* dst,  size_t dstSize,
     const void* cSrc, size_t cSrcSize,
     const HUF_DTable* DTable)

@@ -286,23 +247,19 @@ static size_t HUF_decompress4X2_usingDTable_internal(
     BYTE* op2 = opStart2;
     BYTE* op3 = opStart3;
     BYTE* op4 = opStart4;
-    U32 endSignal;
+    U32 endSignal = BIT_DStream_unfinished;
     DTableDesc const dtd = HUF_getDTableDesc(DTable);
     U32 const dtLog = dtd.tableLog;

     if (length4 > cSrcSize) return ERROR(corruption_detected);   /* overflow */
-    { size_t const errorCode = BIT_initDStream(&bitD1, istart1, length1);
-      if (HUF_isError(errorCode)) return errorCode; }
-    { size_t const errorCode = BIT_initDStream(&bitD2, istart2, length2);
-      if (HUF_isError(errorCode)) return errorCode; }
-    { size_t const errorCode = BIT_initDStream(&bitD3, istart3, length3);
-      if (HUF_isError(errorCode)) return errorCode; }
-    { size_t const errorCode = BIT_initDStream(&bitD4, istart4, length4);
-      if (HUF_isError(errorCode)) return errorCode; }
+    CHECK_F( BIT_initDStream(&bitD1, istart1, length1) );
+    CHECK_F( BIT_initDStream(&bitD2, istart2, length2) );
+    CHECK_F( BIT_initDStream(&bitD3, istart3, length3) );
+    CHECK_F( BIT_initDStream(&bitD4, istart4, length4) );

-    /* 16-32 symbols per loop (4-8 symbols per stream) */
+    /* up to 16 symbols per loop (4 symbols per stream) in 64-bit mode */
     endSignal = BIT_reloadDStream(&bitD1) | BIT_reloadDStream(&bitD2) | BIT_reloadDStream(&bitD3) | BIT_reloadDStream(&bitD4);
-    for ( ; (endSignal==BIT_DStream_unfinished) && (op4<(oend-7)) ; ) {
+    while ( (endSignal==BIT_DStream_unfinished) && (op4<(oend-3)) ) {
         HUF_DECODE_SYMBOLX2_2(op1, &bitD1);
         HUF_DECODE_SYMBOLX2_2(op2, &bitD2);
         HUF_DECODE_SYMBOLX2_2(op3, &bitD3);

@@ -319,10 +276,15 @@ static size_t HUF_decompress4X2_usingDTable_internal(
         HUF_DECODE_SYMBOLX2_0(op2, &bitD2);
         HUF_DECODE_SYMBOLX2_0(op3, &bitD3);
         HUF_DECODE_SYMBOLX2_0(op4, &bitD4);
-        endSignal = BIT_reloadDStream(&bitD1) | BIT_reloadDStream(&bitD2) | BIT_reloadDStream(&bitD3) | BIT_reloadDStream(&bitD4);
+        BIT_reloadDStream(&bitD1);
+        BIT_reloadDStream(&bitD2);
+        BIT_reloadDStream(&bitD3);
+        BIT_reloadDStream(&bitD4);
     }

     /* check corruption */
+    /* note : should not be necessary : op# advance in lock step, and we control op4.
+     *        but curiously, binary generated by gcc 7.2 & 7.3 with -mbmi2 runs faster when >=1 test is present */
     if (op1 > opStart2) return ERROR(corruption_detected);
     if (op2 > opStart3) return ERROR(corruption_detected);
     if (op3 > opStart4) return ERROR(corruption_detected);

@@ -335,8 +297,8 @@ static size_t HUF_decompress4X2_usingDTable_internal(
     HUF_decodeStreamX2(op4, &bitD4, oend, dt, dtLog);

     /* check */
-    endSignal = BIT_endOfDStream(&bitD1) & BIT_endOfDStream(&bitD2) & BIT_endOfDStream(&bitD3) & BIT_endOfDStream(&bitD4);
-    if (!endSignal) return ERROR(corruption_detected);
+    { U32 const endCheck = BIT_endOfDStream(&bitD1) & BIT_endOfDStream(&bitD2) & BIT_endOfDStream(&bitD3) & BIT_endOfDStream(&bitD4);
+      if (!endCheck) return ERROR(corruption_detected); }

     /* decoded size */
     return dstSize;

@@ -344,6 +306,279 @@ static size_t HUF_decompress4X2_usingDTable_internal(
 }

+
+FORCE_INLINE_TEMPLATE U32
+HUF_decodeSymbolX4(void* op, BIT_DStream_t* DStream, const HUF_DEltX4* dt, const U32 dtLog)
+{
+    size_t const val = BIT_lookBitsFast(DStream, dtLog);   /* note : dtLog >= 1 */
+    memcpy(op, dt+val, 2);
+    BIT_skipBits(DStream, dt[val].nbBits);
+    return dt[val].length;
+}
+
+FORCE_INLINE_TEMPLATE U32
+HUF_decodeLastSymbolX4(void* op, BIT_DStream_t* DStream, const HUF_DEltX4* dt, const U32 dtLog)
+{
+    size_t const val = BIT_lookBitsFast(DStream, dtLog);   /* note : dtLog >= 1 */
+    memcpy(op, dt+val, 1);
+    if (dt[val].length==1) BIT_skipBits(DStream, dt[val].nbBits);
+    else {
+        if (DStream->bitsConsumed < (sizeof(DStream->bitContainer)*8)) {
+            BIT_skipBits(DStream, dt[val].nbBits);
+            if (DStream->bitsConsumed > (sizeof(DStream->bitContainer)*8))
+                /* ugly hack; works only because it's the last symbol. Note : can't easily extract nbBits from just this symbol */
+                DStream->bitsConsumed = (sizeof(DStream->bitContainer)*8);
+    }   }
+    return 1;
+}
+
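The HUF_DEltX4 entries being introduced here pack up to two decoded bytes per lookup: `sequence` holds both bytes pre-ordered, `nbBits` the bits to consume, and `length` how many output bytes are actually valid. HUF_decodeSymbolX4() always stores 2 bytes but advances the output pointer by `length`, which keeps the hot path branchless; a stray padding byte is simply overwritten by the next store, and that is exactly why the final output position needs the dedicated 1-byte variant above. A self-contained demonstration of the trick:

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        unsigned char out[8] = {0};
        unsigned char const AB[2] = { 'A', 'B' };   /* entry emitting 2 symbols */
        unsigned char const C_[2] = { 'C', '!' };   /* entry emitting 1 symbol; '!' is padding */
        unsigned char* op = out;
        memcpy(op, AB, 2); op += 2;   /* "AB" */
        memcpy(op, C_, 2); op += 1;   /* memory holds "ABC!", but op only advances past 'C' */
        memcpy(op, AB, 2); op += 2;   /* the '!' is overwritten: "ABCAB" */
        printf("%.5s\n", out);        /* prints ABCAB */
        return 0;
    }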
+#define HUF_DECODE_SYMBOLX4_0(ptr, DStreamPtr) \
+    ptr += HUF_decodeSymbolX4(ptr, DStreamPtr, dt, dtLog)
+
+#define HUF_DECODE_SYMBOLX4_1(ptr, DStreamPtr) \
+    if (MEM_64bits() || (HUF_TABLELOG_MAX<=12)) \
+        ptr += HUF_decodeSymbolX4(ptr, DStreamPtr, dt, dtLog)
+
+#define HUF_DECODE_SYMBOLX4_2(ptr, DStreamPtr) \
+    if (MEM_64bits()) \
+        ptr += HUF_decodeSymbolX4(ptr, DStreamPtr, dt, dtLog)
+
+HINT_INLINE size_t
+HUF_decodeStreamX4(BYTE* p, BIT_DStream_t* bitDPtr, BYTE* const pEnd,
+                   const HUF_DEltX4* const dt, const U32 dtLog)
+{
+    BYTE* const pStart = p;
+
+    /* up to 8 symbols at a time */
+    while ((BIT_reloadDStream(bitDPtr) == BIT_DStream_unfinished) & (p < pEnd-(sizeof(bitDPtr->bitContainer)-1))) {
+        HUF_DECODE_SYMBOLX4_2(p, bitDPtr);
+        HUF_DECODE_SYMBOLX4_1(p, bitDPtr);
+        HUF_DECODE_SYMBOLX4_2(p, bitDPtr);
+        HUF_DECODE_SYMBOLX4_0(p, bitDPtr);
+    }
+
+    /* closer to end : up to 2 symbols at a time */
+    while ((BIT_reloadDStream(bitDPtr) == BIT_DStream_unfinished) & (p <= pEnd-2))
+        HUF_DECODE_SYMBOLX4_0(p, bitDPtr);
+
+    while (p <= pEnd-2)
+        HUF_DECODE_SYMBOLX4_0(p, bitDPtr);   /* no need to reload : reached the end of DStream */
+
+    if (p < pEnd)
+        p += HUF_decodeLastSymbolX4(p, bitDPtr, dt, dtLog);
+
+    return p-pStart;
+}
+
+FORCE_INLINE_TEMPLATE size_t
+HUF_decompress1X4_usingDTable_internal_body(
+          void* dst,  size_t dstSize,
+    const void* cSrc, size_t cSrcSize,
+    const HUF_DTable* DTable)
+{
+    BIT_DStream_t bitD;
+
+    /* Init */
+    CHECK_F( BIT_initDStream(&bitD, cSrc, cSrcSize) );
+
+    /* decode */
+    {   BYTE* const ostart = (BYTE*) dst;
+        BYTE* const oend = ostart + dstSize;
+        const void* const dtPtr = DTable+1;   /* force compiler to not use strict-aliasing */
+        const HUF_DEltX4* const dt = (const HUF_DEltX4*)dtPtr;
+        DTableDesc const dtd = HUF_getDTableDesc(DTable);
+        HUF_decodeStreamX4(ostart, &bitD, oend, dt, dtd.tableLog);
+    }
+
+    /* check */
+    if (!BIT_endOfDStream(&bitD)) return ERROR(corruption_detected);
+
+    /* decoded size */
+    return dstSize;
+}
+
+
+FORCE_INLINE_TEMPLATE size_t
+HUF_decompress4X4_usingDTable_internal_body(
+          void* dst,  size_t dstSize,
+    const void* cSrc, size_t cSrcSize,
+    const HUF_DTable* DTable)
+{
+    if (cSrcSize < 10) return ERROR(corruption_detected);   /* strict minimum : jump table + 1 byte per stream */
+
+    {   const BYTE* const istart = (const BYTE*) cSrc;
+        BYTE* const ostart = (BYTE*) dst;
+        BYTE* const oend = ostart + dstSize;
+        const void* const dtPtr = DTable+1;
+        const HUF_DEltX4* const dt = (const HUF_DEltX4*)dtPtr;
+
+        /* Init */
+        BIT_DStream_t bitD1;
+        BIT_DStream_t bitD2;
+        BIT_DStream_t bitD3;
+        BIT_DStream_t bitD4;
+        size_t const length1 = MEM_readLE16(istart);
+        size_t const length2 = MEM_readLE16(istart+2);
+        size_t const length3 = MEM_readLE16(istart+4);
+        size_t const length4 = cSrcSize - (length1 + length2 + length3 + 6);
+        const BYTE* const istart1 = istart + 6;   /* jumpTable */
+        const BYTE* const istart2 = istart1 + length1;
+        const BYTE* const istart3 = istart2 + length2;
+        const BYTE* const istart4 = istart3 + length3;
+        size_t const segmentSize = (dstSize+3) / 4;
+        BYTE* const opStart2 = ostart + segmentSize;
+        BYTE* const opStart3 = opStart2 + segmentSize;
+        BYTE* const opStart4 = opStart3 + segmentSize;
+        BYTE* op1 = ostart;
+        BYTE* op2 = opStart2;
+        BYTE* op3 = opStart3;
+        BYTE* op4 = opStart4;
+        U32 endSignal;
+        DTableDesc const dtd = HUF_getDTableDesc(DTable);
+        U32 const dtLog = dtd.tableLog;
+
+        if (length4 > cSrcSize) return ERROR(corruption_detected);   /* overflow */
+        CHECK_F( BIT_initDStream(&bitD1, istart1, length1) );
+        CHECK_F( BIT_initDStream(&bitD2, istart2, length2) );
+        CHECK_F( BIT_initDStream(&bitD3, istart3, length3) );
+        CHECK_F( BIT_initDStream(&bitD4, istart4, length4) );
+
+        /* 16-32 symbols per loop (4-8 symbols per stream) */
+        endSignal = BIT_reloadDStream(&bitD1) | BIT_reloadDStream(&bitD2) | BIT_reloadDStream(&bitD3) | BIT_reloadDStream(&bitD4);
+        for ( ; (endSignal==BIT_DStream_unfinished) & (op4<(oend-(sizeof(bitD4.bitContainer)-1))) ; ) {
+            HUF_DECODE_SYMBOLX4_2(op1, &bitD1);
+            HUF_DECODE_SYMBOLX4_2(op2, &bitD2);
+            HUF_DECODE_SYMBOLX4_2(op3, &bitD3);
+            HUF_DECODE_SYMBOLX4_2(op4, &bitD4);
+            HUF_DECODE_SYMBOLX4_1(op1, &bitD1);
+            HUF_DECODE_SYMBOLX4_1(op2, &bitD2);
+            HUF_DECODE_SYMBOLX4_1(op3, &bitD3);
+            HUF_DECODE_SYMBOLX4_1(op4, &bitD4);
+            HUF_DECODE_SYMBOLX4_2(op1, &bitD1);
+            HUF_DECODE_SYMBOLX4_2(op2, &bitD2);
+            HUF_DECODE_SYMBOLX4_2(op3, &bitD3);
+            HUF_DECODE_SYMBOLX4_2(op4, &bitD4);
+            HUF_DECODE_SYMBOLX4_0(op1, &bitD1);
+            HUF_DECODE_SYMBOLX4_0(op2, &bitD2);
+            HUF_DECODE_SYMBOLX4_0(op3, &bitD3);
+            HUF_DECODE_SYMBOLX4_0(op4, &bitD4);
+
+            endSignal = BIT_reloadDStream(&bitD1) | BIT_reloadDStream(&bitD2) | BIT_reloadDStream(&bitD3) | BIT_reloadDStream(&bitD4);
+        }
+
+        /* check corruption */
+        if (op1 > opStart2) return ERROR(corruption_detected);
+        if (op2 > opStart3) return ERROR(corruption_detected);
+        if (op3 > opStart4) return ERROR(corruption_detected);
+        /* note : op4 already verified within main loop */
+
+        /* finish bitStreams one by one */
+        HUF_decodeStreamX4(op1, &bitD1, opStart2, dt, dtLog);
+        HUF_decodeStreamX4(op2, &bitD2, opStart3, dt, dtLog);
+        HUF_decodeStreamX4(op3, &bitD3, opStart4, dt, dtLog);
+        HUF_decodeStreamX4(op4, &bitD4, oend,     dt, dtLog);
+
+        /* check */
+        { U32 const endCheck = BIT_endOfDStream(&bitD1) & BIT_endOfDStream(&bitD2) & BIT_endOfDStream(&bitD3) & BIT_endOfDStream(&bitD4);
+          if (!endCheck) return ERROR(corruption_detected); }
+
+        /* decoded size */
+        return dstSize;
+    }
+}
+
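A recap of the 4-stream framing that both 4X decoders rely on: a 6-byte jump table of three little-endian 16-bit lengths prefixes the payload, the fourth stream's length is whatever remains, and the output is cut into four equal segments so the streams decode independently (and, in the interleaved hot loop above, with instruction-level parallelism). A stand-alone parser for that header, written to the same conventions but not taken from the library:

    #include <stddef.h>
    #include <stdint.h>

    static uint16_t readLE16(const uint8_t* p) { return (uint16_t)(p[0] | (p[1] << 8)); }

    /* Returns 0 on success and fills the start offset of each of the 4 streams. */
    int splitStreams(const uint8_t* src, size_t srcSize, size_t start[4])
    {
        size_t len1, len2, len3;
        if (srcSize < 10) return -1;                       /* 6-byte jump table + >=1 byte per stream */
        len1 = readLE16(src);
        len2 = readLE16(src+2);
        len3 = readLE16(src+4);
        if (len1 + len2 + len3 + 6 > srcSize) return -1;   /* stream 4 would underflow */
        start[0] = 6;
        start[1] = start[0] + len1;
        start[2] = start[1] + len2;
        start[3] = start[2] + len3;                        /* stream 4 runs to the end of src */
        return 0;
    }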
+
+typedef size_t (*HUF_decompress_usingDTable_t)(void *dst, size_t dstSize,
+                                               const void *cSrc,
+                                               size_t cSrcSize,
+                                               const HUF_DTable *DTable);
+#if DYNAMIC_BMI2
+
+#define X(fn)                                                       \
+                                                                    \
+    static size_t fn##_default(                                     \
+                  void* dst,  size_t dstSize,                       \
+            const void* cSrc, size_t cSrcSize,                      \
+            const HUF_DTable* DTable)                               \
+    {                                                               \
+        return fn##_body(dst, dstSize, cSrc, cSrcSize, DTable);     \
+    }                                                               \
+                                                                    \
+    static TARGET_ATTRIBUTE("bmi2") size_t fn##_bmi2(               \
+                  void* dst,  size_t dstSize,                       \
+            const void* cSrc, size_t cSrcSize,                      \
+            const HUF_DTable* DTable)                               \
+    {                                                               \
+        return fn##_body(dst, dstSize, cSrc, cSrcSize, DTable);     \
+    }                                                               \
+                                                                    \
+    static size_t fn(void* dst, size_t dstSize, void const* cSrc,   \
+                     size_t cSrcSize, HUF_DTable const* DTable, int bmi2) \
+    {                                                               \
+        if (bmi2) {                                                 \
+            return fn##_bmi2(dst, dstSize, cSrc, cSrcSize, DTable); \
+        }                                                           \
+        return fn##_default(dst, dstSize, cSrc, cSrcSize, DTable);  \
+    }
+
+#else
+
+#define X(fn)                                                       \
+    static size_t fn(void* dst, size_t dstSize, void const* cSrc,   \
+                     size_t cSrcSize, HUF_DTable const* DTable, int bmi2) \
+    {                                                               \
+        (void)bmi2;                                                 \
+        return fn##_body(dst, dstSize, cSrc, cSrcSize, DTable);     \
+    }
+
+#endif
+
+X(HUF_decompress1X2_usingDTable_internal)
+X(HUF_decompress4X2_usingDTable_internal)
+X(HUF_decompress1X4_usingDTable_internal)
+X(HUF_decompress4X4_usingDTable_internal)
+
+#undef X
+
+
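This X() macro is the heart of the patch's BMI2 strategy: each `_body` routine above is FORCE_INLINE_TEMPLATE precisely so that one expansion of X(fn) stamps out two complete compilations of the decoder, a portable `_default` and a `TARGET_ATTRIBUTE("bmi2")` `_bmi2`, plus a two-line dispatcher on the runtime flag. Without DYNAMIC_BMI2 a single generic body is emitted and the flag is ignored. The same pattern reduced to a toy, GCC/Clang on x86 only, illustrative rather than library code:

    #include <stddef.h>

    static inline __attribute__((always_inline))
    size_t sum_body(const unsigned char* p, size_t n)
    {
        size_t s = 0, i;
        for (i = 0; i < n; i++) s += p[i];
        return s;
    }

    static size_t sum_default(const unsigned char* p, size_t n) { return sum_body(p, n); }

    static __attribute__((target("bmi2")))
    size_t sum_bmi2(const unsigned char* p, size_t n)
    {
        return sum_body(p, n);   /* same source, recompiled with BMI2 instructions enabled */
    }

    size_t sum(const unsigned char* p, size_t n, int bmi2)
    {
        return bmi2 ? sum_bmi2(p, n) : sum_default(p, n);   /* runtime dispatch */
    }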
+size_t HUF_decompress1X2_usingDTable(
+          void* dst,  size_t dstSize,
+    const void* cSrc, size_t cSrcSize,
+    const HUF_DTable* DTable)
+{
+    DTableDesc dtd = HUF_getDTableDesc(DTable);
+    if (dtd.tableType != 0) return ERROR(GENERIC);
+    return HUF_decompress1X2_usingDTable_internal(dst, dstSize, cSrc, cSrcSize, DTable, /* bmi2 */ 0);
+}
+
+size_t HUF_decompress1X2_DCtx_wksp(HUF_DTable* DCtx, void* dst, size_t dstSize,
+                                   const void* cSrc, size_t cSrcSize,
+                                   void* workSpace, size_t wkspSize)
+{
+    const BYTE* ip = (const BYTE*) cSrc;
+
+    size_t const hSize = HUF_readDTableX2_wksp(DCtx, cSrc, cSrcSize, workSpace, wkspSize);
+    if (HUF_isError(hSize)) return hSize;
+    if (hSize >= cSrcSize) return ERROR(srcSize_wrong);
+    ip += hSize; cSrcSize -= hSize;
+
+    return HUF_decompress1X2_usingDTable_internal(dst, dstSize, ip, cSrcSize, DCtx, /* bmi2 */ 0);
+}
+
+
+size_t HUF_decompress1X2_DCtx(HUF_DTable* DCtx, void* dst, size_t dstSize,
+                              const void* cSrc, size_t cSrcSize)
+{
+    U32 workSpace[HUF_DECOMPRESS_WORKSPACE_SIZE_U32];
+    return HUF_decompress1X2_DCtx_wksp(DCtx, dst, dstSize, cSrc, cSrcSize,
+                                       workSpace, sizeof(workSpace));
+}
+
+size_t HUF_decompress1X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
+{
+    HUF_CREATE_STATIC_DTABLEX2(DTable, HUF_TABLELOG_MAX);
+    return HUF_decompress1X2_DCtx (DTable, dst, dstSize, cSrc, cSrcSize);
+}
+
 size_t HUF_decompress4X2_usingDTable(
           void* dst,  size_t dstSize,
     const void* cSrc, size_t cSrcSize,

@@ -351,13 +586,12 @@ size_t HUF_decompress4X2_usingDTable(
 {
     DTableDesc dtd = HUF_getDTableDesc(DTable);
     if (dtd.tableType != 0) return ERROR(GENERIC);
-    return HUF_decompress4X2_usingDTable_internal(dst, dstSize, cSrc, cSrcSize, DTable);
+    return HUF_decompress4X2_usingDTable_internal(dst, dstSize, cSrc, cSrcSize, DTable, /* bmi2 */ 0);
 }

-size_t HUF_decompress4X2_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize,
-                                   const void* cSrc, size_t cSrcSize,
-                                   void* workSpace, size_t wkspSize)
+static size_t HUF_decompress4X2_DCtx_wksp_bmi2(HUF_DTable* dctx, void* dst, size_t dstSize,
+                                   const void* cSrc, size_t cSrcSize,
+                                   void* workSpace, size_t wkspSize, int bmi2)
 {
     const BYTE* ip = (const BYTE*) cSrc;

@@ -367,7 +601,14 @@ size_t HUF_decompress4X2_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize,
     if (hSize >= cSrcSize) return ERROR(srcSize_wrong);
     ip += hSize; cSrcSize -= hSize;

-    return HUF_decompress4X2_usingDTable_internal (dst, dstSize, ip, cSrcSize, dctx);
+    return HUF_decompress4X2_usingDTable_internal(dst, dstSize, ip, cSrcSize, dctx, bmi2);
+}
+
+size_t HUF_decompress4X2_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize,
+                                   const void* cSrc, size_t cSrcSize,
+                                   void* workSpace, size_t wkspSize)
+{
+    return HUF_decompress4X2_DCtx_wksp_bmi2(dctx, dst, dstSize, cSrc, cSrcSize, workSpace, wkspSize, 0);
 }


@@ -387,8 +628,6 @@ size_t HUF_decompress4X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
 /* *************************/
 /* double-symbols decoding */
 /* *************************/
-typedef struct { U16 sequence; BYTE nbBits; BYTE length; } HUF_DEltX4;   /* double-symbols decoding */
-
 typedef struct { BYTE symbol; BYTE weight; } sortedSymbol_t;

 /* HUF_fillDTableX4Level2() :

@@ -508,10 +747,7 @@ size_t HUF_readDTableX4_wksp(HUF_DTable* DTable, const void* src,
     weightList = (BYTE *)((U32 *)workSpace + spaceUsed32);
     spaceUsed32 += HUF_ALIGN(HUF_SYMBOLVALUE_MAX + 1, sizeof(U32)) >> 2;

-    if ((spaceUsed32 << 2) > wkspSize)
-        return ERROR(tableLog_tooLarge);
-    workSpace = (U32 *)workSpace + spaceUsed32;
-    wkspSize -= (spaceUsed32 << 2);
+    if ((spaceUsed32 << 2) > wkspSize) return ERROR(tableLog_tooLarge);

     rankStart = rankStart0 + 1;
     memset(rankStats, 0, sizeof(U32) * (2 * HUF_TABLELOG_MAX + 2 + 1));

@@ -588,95 +824,6 @@ size_t HUF_readDTableX4(HUF_DTable* DTable, const void* src, size_t srcSize)
                            workSpace, sizeof(workSpace));
 }

-static U32 HUF_decodeSymbolX4(void* op, BIT_DStream_t* DStream, const HUF_DEltX4* dt, const U32 dtLog)
-{
-    size_t const val = BIT_lookBitsFast(DStream, dtLog);   /* note : dtLog >= 1 */
-    memcpy(op, dt+val, 2);
-    BIT_skipBits(DStream, dt[val].nbBits);
-    return dt[val].length;
-}
-
-static U32 HUF_decodeLastSymbolX4(void* op, BIT_DStream_t* DStream, const HUF_DEltX4* dt, const U32 dtLog)
-{
-    size_t const val = BIT_lookBitsFast(DStream, dtLog);   /* note : dtLog >= 1 */
-    memcpy(op, dt+val, 1);
-    if (dt[val].length==1) BIT_skipBits(DStream, dt[val].nbBits);
-    else {
-        if (DStream->bitsConsumed < (sizeof(DStream->bitContainer)*8)) {
-            BIT_skipBits(DStream, dt[val].nbBits);
-            if (DStream->bitsConsumed > (sizeof(DStream->bitContainer)*8))
-                /* ugly hack; works only because it's the last symbol. Note : can't easily extract nbBits from just this symbol */
-                DStream->bitsConsumed = (sizeof(DStream->bitContainer)*8);
-    }   }
-    return 1;
-}
-
-
-#define HUF_DECODE_SYMBOLX4_0(ptr, DStreamPtr) \
-    ptr += HUF_decodeSymbolX4(ptr, DStreamPtr, dt, dtLog)
-
-#define HUF_DECODE_SYMBOLX4_1(ptr, DStreamPtr) \
-    if (MEM_64bits() || (HUF_TABLELOG_MAX<=12)) \
-        ptr += HUF_decodeSymbolX4(ptr, DStreamPtr, dt, dtLog)
-
-#define HUF_DECODE_SYMBOLX4_2(ptr, DStreamPtr) \
-    if (MEM_64bits()) \
-        ptr += HUF_decodeSymbolX4(ptr, DStreamPtr, dt, dtLog)
-
-HINT_INLINE size_t HUF_decodeStreamX4(BYTE* p, BIT_DStream_t* bitDPtr, BYTE* const pEnd, const HUF_DEltX4* const dt, const U32 dtLog)
-{
-    BYTE* const pStart = p;
-
-    /* up to 8 symbols at a time */
-    while ((BIT_reloadDStream(bitDPtr) == BIT_DStream_unfinished) & (p < pEnd-(sizeof(bitDPtr->bitContainer)-1))) {
-        HUF_DECODE_SYMBOLX4_2(p, bitDPtr);
-        HUF_DECODE_SYMBOLX4_1(p, bitDPtr);
-        HUF_DECODE_SYMBOLX4_2(p, bitDPtr);
-        HUF_DECODE_SYMBOLX4_0(p, bitDPtr);
-    }
-
-    /* closer to end : up to 2 symbols at a time */
-    while ((BIT_reloadDStream(bitDPtr) == BIT_DStream_unfinished) & (p <= pEnd-2))
-        HUF_DECODE_SYMBOLX4_0(p, bitDPtr);
-
-    while (p <= pEnd-2)
-        HUF_DECODE_SYMBOLX4_0(p, bitDPtr);   /* no need to reload : reached the end of DStream */
-
-    if (p < pEnd)
-        p += HUF_decodeLastSymbolX4(p, bitDPtr, dt, dtLog);
-
-    return p-pStart;
-}
-
-
-static size_t HUF_decompress1X4_usingDTable_internal(
-          void* dst,  size_t dstSize,
-    const void* cSrc, size_t cSrcSize,
-    const HUF_DTable* DTable)
-{
-    BIT_DStream_t bitD;
-
-    /* Init */
-    { size_t const errorCode = BIT_initDStream(&bitD, cSrc, cSrcSize);
-      if (HUF_isError(errorCode)) return errorCode;
-    }
-
-    /* decode */
-    {   BYTE* const ostart = (BYTE*) dst;
-        BYTE* const oend = ostart + dstSize;
-        const void* const dtPtr = DTable+1;   /* force compiler to not use strict-aliasing */
-        const HUF_DEltX4* const dt = (const HUF_DEltX4*)dtPtr;
-        DTableDesc const dtd = HUF_getDTableDesc(DTable);
-        HUF_decodeStreamX4(ostart, &bitD, oend, dt, dtd.tableLog);
-    }
-
-    /* check */
-    if (!BIT_endOfDStream(&bitD)) return ERROR(corruption_detected);
-
-    /* decoded size */
-    return dstSize;
-}
-
 size_t HUF_decompress1X4_usingDTable(
           void* dst,  size_t dstSize,
     const void* cSrc, size_t cSrcSize,

@@ -684,7 +831,7 @@ size_t HUF_decompress1X4_usingDTable(
 {
     DTableDesc dtd = HUF_getDTableDesc(DTable);
     if (dtd.tableType != 1) return ERROR(GENERIC);
-    return HUF_decompress1X4_usingDTable_internal(dst, dstSize, cSrc, cSrcSize, DTable);
+    return HUF_decompress1X4_usingDTable_internal(dst, dstSize, cSrc, cSrcSize, DTable, /* bmi2 */ 0);
 }

 size_t HUF_decompress1X4_DCtx_wksp(HUF_DTable* DCtx, void* dst, size_t dstSize,

@@ -699,7 +846,7 @@ size_t HUF_decompress1X4_DCtx_wksp(HUF_DTable* DCtx, void* dst, size_t dstSize,
     if (hSize >= cSrcSize) return ERROR(srcSize_wrong);
     ip += hSize; cSrcSize -= hSize;

-    return HUF_decompress1X4_usingDTable_internal (dst, dstSize, ip, cSrcSize, DCtx);
+    return HUF_decompress1X4_usingDTable_internal(dst, dstSize, ip, cSrcSize, DCtx, /* bmi2 */ 0);
 }


@@ -717,99 +864,6 @@ size_t HUF_decompress1X4 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
     return HUF_decompress1X4_DCtx(DTable, dst, dstSize, cSrc, cSrcSize);
 }

-static size_t HUF_decompress4X4_usingDTable_internal(
-          void* dst,  size_t dstSize,
-    const void* cSrc, size_t cSrcSize,
-    const HUF_DTable* DTable)
-{
-    if (cSrcSize < 10) return ERROR(corruption_detected);   /* strict minimum : jump table + 1 byte per stream */
-
-    {   const BYTE* const istart = (const BYTE*) cSrc;
-        BYTE* const ostart = (BYTE*) dst;
-        BYTE* const oend = ostart + dstSize;
-        const void* const dtPtr = DTable+1;
-        const HUF_DEltX4* const dt = (const HUF_DEltX4*)dtPtr;
-
-        /* Init */
-        BIT_DStream_t bitD1;
-        BIT_DStream_t bitD2;
-        BIT_DStream_t bitD3;
-        BIT_DStream_t bitD4;
-        size_t const length1 = MEM_readLE16(istart);
-        size_t const length2 = MEM_readLE16(istart+2);
-        size_t const length3 = MEM_readLE16(istart+4);
-        size_t const length4 = cSrcSize - (length1 + length2 + length3 + 6);
-        const BYTE* const istart1 = istart + 6;   /* jumpTable */
-        const BYTE* const istart2 = istart1 + length1;
-        const BYTE* const istart3 = istart2 + length2;
-        const BYTE* const istart4 = istart3 + length3;
-        size_t const segmentSize = (dstSize+3) / 4;
-        BYTE* const opStart2 = ostart + segmentSize;
-        BYTE* const opStart3 = opStart2 + segmentSize;
-        BYTE* const opStart4 = opStart3 + segmentSize;
-        BYTE* op1 = ostart;
-        BYTE* op2 = opStart2;
-        BYTE* op3 = opStart3;
-        BYTE* op4 = opStart4;
-        U32 endSignal;
-        DTableDesc const dtd = HUF_getDTableDesc(DTable);
-        U32 const dtLog = dtd.tableLog;
-
-        if (length4 > cSrcSize) return ERROR(corruption_detected);   /* overflow */
-        { size_t const errorCode = BIT_initDStream(&bitD1, istart1, length1);
-          if (HUF_isError(errorCode)) return errorCode; }
-        { size_t const errorCode = BIT_initDStream(&bitD2, istart2, length2);
-          if (HUF_isError(errorCode)) return errorCode; }
-        { size_t const errorCode = BIT_initDStream(&bitD3, istart3, length3);
-          if (HUF_isError(errorCode)) return errorCode; }
-        { size_t const errorCode = BIT_initDStream(&bitD4, istart4, length4);
-          if (HUF_isError(errorCode)) return errorCode; }
-
-        /* 16-32 symbols per loop (4-8 symbols per stream) */
-        endSignal = BIT_reloadDStream(&bitD1) | BIT_reloadDStream(&bitD2) | BIT_reloadDStream(&bitD3) | BIT_reloadDStream(&bitD4);
-        for ( ; (endSignal==BIT_DStream_unfinished) & (op4<(oend-(sizeof(bitD4.bitContainer)-1))) ; ) {
-            HUF_DECODE_SYMBOLX4_2(op1, &bitD1);
-            HUF_DECODE_SYMBOLX4_2(op2, &bitD2);
-            HUF_DECODE_SYMBOLX4_2(op3, &bitD3);
-            HUF_DECODE_SYMBOLX4_2(op4, &bitD4);
-            HUF_DECODE_SYMBOLX4_1(op1, &bitD1);
-            HUF_DECODE_SYMBOLX4_1(op2, &bitD2);
-            HUF_DECODE_SYMBOLX4_1(op3, &bitD3);
-            HUF_DECODE_SYMBOLX4_1(op4, &bitD4);
-            HUF_DECODE_SYMBOLX4_2(op1, &bitD1);
-            HUF_DECODE_SYMBOLX4_2(op2, &bitD2);
-            HUF_DECODE_SYMBOLX4_2(op3, &bitD3);
-            HUF_DECODE_SYMBOLX4_2(op4, &bitD4);
-            HUF_DECODE_SYMBOLX4_0(op1, &bitD1);
-            HUF_DECODE_SYMBOLX4_0(op2, &bitD2);
-            HUF_DECODE_SYMBOLX4_0(op3, &bitD3);
-            HUF_DECODE_SYMBOLX4_0(op4, &bitD4);
-
-            endSignal = BIT_reloadDStream(&bitD1) | BIT_reloadDStream(&bitD2) | BIT_reloadDStream(&bitD3) | BIT_reloadDStream(&bitD4);
-        }
-
-        /* check corruption */
-        if (op1 > opStart2) return ERROR(corruption_detected);
-        if (op2 > opStart3) return ERROR(corruption_detected);
-        if (op3 > opStart4) return ERROR(corruption_detected);
-        /* note : op4 already verified within main loop */
-
-        /* finish bitStreams one by one */
-        HUF_decodeStreamX4(op1, &bitD1, opStart2, dt, dtLog);
-        HUF_decodeStreamX4(op2, &bitD2, opStart3, dt, dtLog);
-        HUF_decodeStreamX4(op3, &bitD3, opStart4, dt, dtLog);
-        HUF_decodeStreamX4(op4, &bitD4, oend,     dt, dtLog);
-
-        /* check */
-        { U32 const endCheck = BIT_endOfDStream(&bitD1) & BIT_endOfDStream(&bitD2) & BIT_endOfDStream(&bitD3) & BIT_endOfDStream(&bitD4);
-          if (!endCheck) return ERROR(corruption_detected); }
-
-        /* decoded size */
-        return dstSize;
-    }
-}
-
 size_t HUF_decompress4X4_usingDTable(
           void* dst,  size_t dstSize,
     const void* cSrc, size_t cSrcSize,

@@ -817,13 +871,12 @@ size_t HUF_decompress4X4_usingDTable(
 {
     DTableDesc dtd = HUF_getDTableDesc(DTable);
     if (dtd.tableType != 1) return ERROR(GENERIC);
-    return HUF_decompress4X4_usingDTable_internal(dst, dstSize, cSrc, cSrcSize, DTable);
+    return HUF_decompress4X4_usingDTable_internal(dst, dstSize, cSrc, cSrcSize, DTable, /* bmi2 */ 0);
 }

-size_t HUF_decompress4X4_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize,
-                                   const void* cSrc, size_t cSrcSize,
-                                   void* workSpace, size_t wkspSize)
+static size_t HUF_decompress4X4_DCtx_wksp_bmi2(HUF_DTable* dctx, void* dst, size_t dstSize,
-                                   const void* cSrc, size_t cSrcSize,
+                                   const void* cSrc, size_t cSrcSize,
+                                   void* workSpace, size_t wkspSize, int bmi2)
 {
     const BYTE* ip = (const BYTE*) cSrc;

@@ -833,7 +886,14 @@ size_t HUF_decompress4X4_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize,
     if (hSize >= cSrcSize) return ERROR(srcSize_wrong);
     ip += hSize; cSrcSize -= hSize;

-    return HUF_decompress4X4_usingDTable_internal(dst, dstSize, ip, cSrcSize, dctx);
+    return HUF_decompress4X4_usingDTable_internal(dst, dstSize, ip, cSrcSize, dctx, bmi2);
+}
+
+size_t HUF_decompress4X4_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize,
+                                   const void* cSrc, size_t cSrcSize,
+                                   void* workSpace, size_t wkspSize)
+{
+    return HUF_decompress4X4_DCtx_wksp_bmi2(dctx, dst, dstSize, cSrc, cSrcSize, workSpace, wkspSize, /* bmi2 */ 0);
 }


@@ -861,8 +921,8 @@ size_t HUF_decompress1X_usingDTable(void* dst, size_t maxDstSize,
                                     const HUF_DTable* DTable)
 {
     DTableDesc const dtd = HUF_getDTableDesc(DTable);
-    return dtd.tableType ? HUF_decompress1X4_usingDTable_internal(dst, maxDstSize, cSrc, cSrcSize, DTable) :
-                           HUF_decompress1X2_usingDTable_internal(dst, maxDstSize, cSrc, cSrcSize, DTable);
+    return dtd.tableType ? HUF_decompress1X4_usingDTable_internal(dst, maxDstSize, cSrc, cSrcSize, DTable, /* bmi2 */ 0) :
+                           HUF_decompress1X2_usingDTable_internal(dst, maxDstSize, cSrc, cSrcSize, DTable, /* bmi2 */ 0);
 }

 size_t HUF_decompress4X_usingDTable(void* dst, size_t maxDstSize,

@@ -870,8 +930,8 @@ size_t HUF_decompress4X_usingDTable(void* dst, size_t maxDstSize,
                                     const HUF_DTable* DTable)
 {
     DTableDesc const dtd = HUF_getDTableDesc(DTable);
-    return dtd.tableType ? HUF_decompress4X4_usingDTable_internal(dst, maxDstSize, cSrc, cSrcSize, DTable) :
-                           HUF_decompress4X2_usingDTable_internal(dst, maxDstSize, cSrc, cSrcSize, DTable);
+    return dtd.tableType ? HUF_decompress4X4_usingDTable_internal(dst, maxDstSize, cSrc, cSrcSize, DTable, /* bmi2 */ 0) :
+                           HUF_decompress4X2_usingDTable_internal(dst, maxDstSize, cSrc, cSrcSize, DTable, /* bmi2 */ 0);
 }


@@ -994,3 +1054,42 @@ size_t HUF_decompress1X_DCtx(HUF_DTable* dctx, void* dst, size_t dstSize,
     return HUF_decompress1X_DCtx_wksp(dctx, dst, dstSize, cSrc, cSrcSize,
                                       workSpace, sizeof(workSpace));
 }
+
+
+size_t HUF_decompress1X_usingDTable_bmi2(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable, int bmi2)
+{
+    DTableDesc const dtd = HUF_getDTableDesc(DTable);
+    return dtd.tableType ? HUF_decompress1X4_usingDTable_internal(dst, maxDstSize, cSrc, cSrcSize, DTable, bmi2) :
+                           HUF_decompress1X2_usingDTable_internal(dst, maxDstSize, cSrc, cSrcSize, DTable, bmi2);
+}
+
+size_t HUF_decompress1X2_DCtx_wksp_bmi2(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize, int bmi2)
+{
+    const BYTE* ip = (const BYTE*) cSrc;
+
+    size_t const hSize = HUF_readDTableX2_wksp(dctx, cSrc, cSrcSize, workSpace, wkspSize);
+    if (HUF_isError(hSize)) return hSize;
+    if (hSize >= cSrcSize) return ERROR(srcSize_wrong);
+    ip += hSize; cSrcSize -= hSize;
+
+    return HUF_decompress1X2_usingDTable_internal(dst, dstSize, ip, cSrcSize, dctx, bmi2);
+}
+
+size_t HUF_decompress4X_usingDTable_bmi2(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable, int bmi2)
+{
+    DTableDesc const dtd = HUF_getDTableDesc(DTable);
+    return dtd.tableType ? HUF_decompress4X4_usingDTable_internal(dst, maxDstSize, cSrc, cSrcSize, DTable, bmi2) :
+                           HUF_decompress4X2_usingDTable_internal(dst, maxDstSize, cSrc, cSrcSize, DTable, bmi2);
+}
+
+size_t HUF_decompress4X_hufOnly_wksp_bmi2(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize, int bmi2)
+{
+    /* validation checks */
+    if (dstSize == 0) return ERROR(dstSize_tooSmall);
+    if (cSrcSize == 0) return ERROR(corruption_detected);
+
+    {   U32 const algoNb = HUF_selectDecoder(dstSize, cSrcSize);
+        return algoNb ? HUF_decompress4X4_DCtx_wksp_bmi2(dctx, dst, dstSize, cSrc, cSrcSize, workSpace, wkspSize, bmi2) :
+                        HUF_decompress4X2_DCtx_wksp_bmi2(dctx, dst, dstSize, cSrc, cSrcSize, workSpace, wkspSize, bmi2);
+    }
+}
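The `_bmi2` entry points added above extend the surface without breaking it: every historical signature is kept and now forwards a conservative `/* bmi2 */ 0`, while the decompressor hunks that follow pass the flag detected at context creation. The compatibility pattern, reduced to a toy:

    #include <stddef.h>

    static size_t work(int bmi2) { return bmi2 ? 2 : 1; }

    size_t api_bmi2(int bmi2) { return work(bmi2); }   /* new extended entry point */
    size_t api(void)          { return api_bmi2(0); }  /* legacy signature, unchanged behavior */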
|
@ -43,6 +43,7 @@
|
|||||||
* Dependencies
|
* Dependencies
|
||||||
*********************************************************/
|
*********************************************************/
|
||||||
#include <string.h> /* memcpy, memmove, memset */
|
#include <string.h> /* memcpy, memmove, memset */
|
||||||
|
#include "cpu.h"
|
||||||
#include "mem.h" /* low level memory routines */
|
#include "mem.h" /* low level memory routines */
|
||||||
#define FSE_STATIC_LINKING_ONLY
|
#define FSE_STATIC_LINKING_ONLY
|
||||||
#include "fse.h"
|
#include "fse.h"
|
||||||
@@ -80,10 +81,25 @@ typedef enum { ZSTDds_getFrameHeaderSize, ZSTDds_decodeFrameHeader,
 typedef enum { zdss_init=0, zdss_loadHeader,
                zdss_read, zdss_load, zdss_flush } ZSTD_dStreamStage;
 
 
 typedef struct {
-    FSE_DTable LLTable[FSE_DTABLE_SIZE_U32(LLFSELog)];
-    FSE_DTable OFTable[FSE_DTABLE_SIZE_U32(OffFSELog)];
-    FSE_DTable MLTable[FSE_DTABLE_SIZE_U32(MLFSELog)];
+    U32 fastMode;
+    U32 tableLog;
+} ZSTD_seqSymbol_header;
+
+typedef struct {
+    U16  nextState;
+    BYTE nbAdditionalBits;
+    BYTE nbBits;
+    U32  baseValue;
+} ZSTD_seqSymbol;
+
+#define SEQSYMBOL_TABLE_SIZE(log)   (1 + (1<<log))
+
+typedef struct {
+    ZSTD_seqSymbol LLTable[SEQSYMBOL_TABLE_SIZE(LLFSELog)];
+    ZSTD_seqSymbol OFTable[SEQSYMBOL_TABLE_SIZE(OffFSELog)];
+    ZSTD_seqSymbol MLTable[SEQSYMBOL_TABLE_SIZE(MLFSELog)];
     HUF_DTable hufTable[HUF_DTABLE_SIZE(HufLog)];  /* can accommodate HUF_decompress4X */
     U32 workspace[HUF_DECOMPRESS_WORKSPACE_SIZE_U32];
     U32 rep[ZSTD_REP_NUM];
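
Worth noting: the four fields of `ZSTD_seqSymbol` pack into a single 8-byte cell (2 + 1 + 1 + 4, with no padding on common ABIs), so one table load fetches everything a decode step needs. A compile-time sanity check, as a sketch (the assertion macro is the one zstd already uses elsewhere in this diff; it must sit inside a function body):

static void checkSeqSymbolLayout(void)
{
    /* sketch : verify the cell stays one 8-byte load wide (typical ABIs) */
    ZSTD_STATIC_ASSERT(sizeof(ZSTD_seqSymbol) == 8);
}
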
@@ -91,9 +107,9 @@ typedef struct {
 
 struct ZSTD_DCtx_s
 {
-    const FSE_DTable* LLTptr;
-    const FSE_DTable* MLTptr;
-    const FSE_DTable* OFTptr;
+    const ZSTD_seqSymbol* LLTptr;
+    const ZSTD_seqSymbol* MLTptr;
+    const ZSTD_seqSymbol* OFTptr;
     const HUF_DTable* HUFptr;
     ZSTD_entropyDTables_t entropy;
     const void* previousDstEnd;   /* detect continuity */
@@ -116,6 +132,7 @@ struct ZSTD_DCtx_s
     size_t litSize;
     size_t rleSize;
     size_t staticSize;
+    int bmi2;   /* == 1 if the CPU supports BMI2 and 0 otherwise. CPU support is determined dynamically once per context lifetime. */
 
     /* streaming */
     ZSTD_DDict* ddictLocal;
@@ -173,6 +190,7 @@ static void ZSTD_initDCtx_internal(ZSTD_DCtx* dctx)
     dctx->inBuffSize  = 0;
     dctx->outBuffSize = 0;
     dctx->streamStage = zdss_init;
+    dctx->bmi2 = ZSTD_cpuid_bmi2(ZSTD_cpuid());
 }
 
 ZSTD_DCtx* ZSTD_initStaticDCtx(void *workspace, size_t workspaceSize)
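
`ZSTD_cpuid_bmi2(ZSTD_cpuid())` runs once per context lifetime, so the CPUID cost never lands on the per-block path. For reference, a standalone detection sketch (recent GCC/Clang only, using `<cpuid.h>`; the helper name is hypothetical and is not the implementation in `cpu.h`) — BMI2 is reported in CPUID leaf 7, sub-leaf 0, EBX bit 8:

#include <cpuid.h>   /* GCC/Clang: __get_cpuid_count() */

static int detect_bmi2(void)
{
    unsigned eax, ebx, ecx, edx;
    /* leaf 7, sub-leaf 0; returns 0 if the leaf is unsupported */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) return 0;
    return (ebx >> 8) & 1;   /* EBX bit 8 == BMI2 */
}
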
@@ -571,13 +589,13 @@ size_t ZSTD_decodeLiteralsBlock(ZSTD_DCtx* dctx,
 
             if (HUF_isError((litEncType==set_repeat) ?
                             ( singleStream ?
-                                HUF_decompress1X_usingDTable(dctx->litBuffer, litSize, istart+lhSize, litCSize, dctx->HUFptr) :
-                                HUF_decompress4X_usingDTable(dctx->litBuffer, litSize, istart+lhSize, litCSize, dctx->HUFptr) ) :
+                                HUF_decompress1X_usingDTable_bmi2(dctx->litBuffer, litSize, istart+lhSize, litCSize, dctx->HUFptr, dctx->bmi2) :
+                                HUF_decompress4X_usingDTable_bmi2(dctx->litBuffer, litSize, istart+lhSize, litCSize, dctx->HUFptr, dctx->bmi2) ) :
                             ( singleStream ?
-                                HUF_decompress1X2_DCtx_wksp(dctx->entropy.hufTable, dctx->litBuffer, litSize, istart+lhSize, litCSize,
-                                                            dctx->entropy.workspace, sizeof(dctx->entropy.workspace)) :
-                                HUF_decompress4X_hufOnly_wksp(dctx->entropy.hufTable, dctx->litBuffer, litSize, istart+lhSize, litCSize,
-                                                              dctx->entropy.workspace, sizeof(dctx->entropy.workspace)))))
+                                HUF_decompress1X2_DCtx_wksp_bmi2(dctx->entropy.hufTable, dctx->litBuffer, litSize, istart+lhSize, litCSize,
+                                                                 dctx->entropy.workspace, sizeof(dctx->entropy.workspace), dctx->bmi2) :
+                                HUF_decompress4X_hufOnly_wksp_bmi2(dctx->entropy.hufTable, dctx->litBuffer, litSize, istart+lhSize, litCSize,
+                                                                   dctx->entropy.workspace, sizeof(dctx->entropy.workspace), dctx->bmi2))))
                 return ERROR(corruption_detected);
 
             dctx->litPtr = dctx->litBuffer;
@@ -652,98 +670,219 @@ size_t ZSTD_decodeLiteralsBlock(ZSTD_DCtx* dctx,
     }
 }
 
-typedef union {
-    FSE_decode_t realData;
-    FSE_DTable dtable;
-    U32 alignedBy4;
-} FSE_decode_t4;
+/* Default FSE distribution tables.
+ * These are pre-calculated FSE decoding tables using default distributions as defined in the specification :
+ * https://github.com/facebook/zstd/blob/master/doc/zstd_compression_format.md#default-distributions
+ * They were generated programmatically with the following method :
+ * - start from default distributions, present in /lib/common/zstd_internal.h
+ * - generate tables normally, using ZSTD_buildFSETable()
+ * - printout the content of tables
+ * - prettify output, report below, test with fuzzer to ensure it's correct */
 
 /* Default FSE distribution table for Literal Lengths */
-static const FSE_decode_t4 LL_defaultDTable[(1<<LL_DEFAULTNORMLOG)+1] = {
-    { { LL_DEFAULTNORMLOG, 1, 1 } },  /* header : tableLog, fastMode, fastMode */
-    /* base, symbol, bits */
-    { {  0,  0,  4 } }, { { 16,  0,  4 } }, { { 32,  1,  5 } }, { {  0,  3,  5 } },
-    { {  0,  4,  5 } }, { {  0,  6,  5 } }, { {  0,  7,  5 } }, { {  0,  9,  5 } },
-    { {  0, 10,  5 } }, { {  0, 12,  5 } }, { {  0, 14,  6 } }, { {  0, 16,  5 } },
-    { {  0, 18,  5 } }, { {  0, 19,  5 } }, { {  0, 21,  5 } }, { {  0, 22,  5 } },
-    { {  0, 24,  5 } }, { { 32, 25,  5 } }, { {  0, 26,  5 } }, { {  0, 27,  6 } },
-    { {  0, 29,  6 } }, { {  0, 31,  6 } }, { { 32,  0,  4 } }, { {  0,  1,  4 } },
-    { {  0,  2,  5 } }, { { 32,  4,  5 } }, { {  0,  5,  5 } }, { { 32,  7,  5 } },
-    { {  0,  8,  5 } }, { { 32, 10,  5 } }, { {  0, 11,  5 } }, { {  0, 13,  6 } },
-    { { 32, 16,  5 } }, { {  0, 17,  5 } }, { { 32, 19,  5 } }, { {  0, 20,  5 } },
-    { { 32, 22,  5 } }, { {  0, 23,  5 } }, { {  0, 25,  4 } }, { { 16, 25,  4 } },
-    { { 32, 26,  5 } }, { {  0, 28,  6 } }, { {  0, 30,  6 } }, { { 48,  0,  4 } },
-    { { 16,  1,  4 } }, { { 32,  2,  5 } }, { { 32,  3,  5 } }, { { 32,  5,  5 } },
-    { { 32,  6,  5 } }, { { 32,  8,  5 } }, { { 32,  9,  5 } }, { { 32, 11,  5 } },
-    { { 32, 12,  5 } }, { {  0, 15,  6 } }, { { 32, 17,  5 } }, { { 32, 18,  5 } },
-    { { 32, 20,  5 } }, { { 32, 21,  5 } }, { { 32, 23,  5 } }, { { 32, 24,  5 } },
-    { {  0, 35,  6 } }, { {  0, 34,  6 } }, { {  0, 33,  6 } }, { {  0, 32,  6 } },
+static const ZSTD_seqSymbol LL_defaultDTable[(1<<LL_DEFAULTNORMLOG)+1] = {
+     {  1,  1,  1, LL_DEFAULTNORMLOG},  /* header : fastMode, tableLog */
+     /* nextState, nbAddBits, nbBits, baseVal */
+     {  0,  0,  4,    0},  { 16,  0,  4,    0},
+     { 32,  0,  5,    1},  {  0,  0,  5,    3},
+     {  0,  0,  5,    4},  {  0,  0,  5,    6},
+     {  0,  0,  5,    7},  {  0,  0,  5,    9},
+     {  0,  0,  5,   10},  {  0,  0,  5,   12},
+     {  0,  0,  6,   14},  {  0,  1,  5,   16},
+     {  0,  1,  5,   20},  {  0,  1,  5,   22},
+     {  0,  2,  5,   28},  {  0,  3,  5,   32},
+     {  0,  4,  5,   48},  { 32,  6,  5,   64},
+     {  0,  7,  5,  128},  {  0,  8,  6,  256},
+     {  0, 10,  6, 1024},  {  0, 12,  6, 4096},
+     { 32,  0,  4,    0},  {  0,  0,  4,    1},
+     {  0,  0,  5,    2},  { 32,  0,  5,    4},
+     {  0,  0,  5,    5},  { 32,  0,  5,    7},
+     {  0,  0,  5,    8},  { 32,  0,  5,   10},
+     {  0,  0,  5,   11},  {  0,  0,  6,   13},
+     { 32,  1,  5,   16},  {  0,  1,  5,   18},
+     { 32,  1,  5,   22},  {  0,  2,  5,   24},
+     { 32,  3,  5,   32},  {  0,  3,  5,   40},
+     {  0,  6,  4,   64},  { 16,  6,  4,   64},
+     { 32,  7,  5,  128},  {  0,  9,  6,  512},
+     {  0, 11,  6, 2048},  { 48,  0,  4,    0},
+     { 16,  0,  4,    1},  { 32,  0,  5,    2},
+     { 32,  0,  5,    3},  { 32,  0,  5,    5},
+     { 32,  0,  5,    6},  { 32,  0,  5,    8},
+     { 32,  0,  5,    9},  { 32,  0,  5,   11},
+     { 32,  0,  5,   12},  {  0,  0,  6,   15},
+     { 32,  1,  5,   18},  { 32,  1,  5,   20},
+     { 32,  2,  5,   24},  { 32,  2,  5,   28},
+     { 32,  3,  5,   40},  { 32,  4,  5,   48},
+     {  0, 16,  6,65536},  {  0, 15,  6,32768},
+     {  0, 14,  6,16384},  {  0, 13,  6, 8192},
 };   /* LL_defaultDTable */
+
+/* Default FSE distribution table for Offset Codes */
+static const ZSTD_seqSymbol OF_defaultDTable[(1<<OF_DEFAULTNORMLOG)+1] = {
+    {  1,  1,  1, OF_DEFAULTNORMLOG},  /* header : fastMode, tableLog */
+    /* nextState, nbAddBits, nbBits, baseVal */
+    {  0,  0,  5,    0},     {  0,  6,  4,   61},
+    {  0,  9,  5,  509},     {  0, 15,  5,32765},
+    {  0, 21,  5,2097149},   {  0,  3,  5,    5},
+    {  0,  7,  4,  125},     {  0, 12,  5, 4093},
+    {  0, 18,  5,262141},    {  0, 23,  5,8388605},
+    {  0,  5,  5,   29},     {  0,  8,  4,  253},
+    {  0, 14,  5,16381},     {  0, 20,  5,1048573},
+    {  0,  2,  5,    1},     { 16,  7,  4,  125},
+    {  0, 11,  5, 2045},     {  0, 17,  5,131069},
+    {  0, 22,  5,4194301},   {  0,  4,  5,   13},
+    { 16,  8,  4,  253},     {  0, 13,  5, 8189},
+    {  0, 19,  5,524285},    {  0,  1,  5,    1},
+    { 16,  6,  4,   61},     {  0, 10,  5, 1021},
+    {  0, 16,  5,65533},     {  0, 28,  5,268435453},
+    {  0, 27,  5,134217725}, {  0, 26,  5,67108861},
+    {  0, 25,  5,33554429},  {  0, 24,  5,16777213},
+};   /* OF_defaultDTable */
+
 
 /* Default FSE distribution table for Match Lengths */
-static const FSE_decode_t4 ML_defaultDTable[(1<<ML_DEFAULTNORMLOG)+1] = {
-    { { ML_DEFAULTNORMLOG, 1, 1 } },  /* header : tableLog, fastMode, fastMode */
-    /* base, symbol, bits */
-    { {  0,  0,  6 } }, { {  0,  1,  4 } }, { { 32,  2,  5 } }, { {  0,  3,  5 } },
-    { {  0,  5,  5 } }, { {  0,  6,  5 } }, { {  0,  8,  5 } }, { {  0, 10,  6 } },
-    { {  0, 13,  6 } }, { {  0, 16,  6 } }, { {  0, 19,  6 } }, { {  0, 22,  6 } },
-    { {  0, 25,  6 } }, { {  0, 28,  6 } }, { {  0, 31,  6 } }, { {  0, 33,  6 } },
-    { {  0, 35,  6 } }, { {  0, 37,  6 } }, { {  0, 39,  6 } }, { {  0, 41,  6 } },
-    { {  0, 43,  6 } }, { {  0, 45,  6 } }, { { 16,  1,  4 } }, { {  0,  2,  4 } },
-    { { 32,  3,  5 } }, { {  0,  4,  5 } }, { { 32,  6,  5 } }, { {  0,  7,  5 } },
-    { {  0,  9,  6 } }, { {  0, 12,  6 } }, { {  0, 15,  6 } }, { {  0, 18,  6 } },
-    { {  0, 21,  6 } }, { {  0, 24,  6 } }, { {  0, 27,  6 } }, { {  0, 30,  6 } },
-    { {  0, 32,  6 } }, { {  0, 34,  6 } }, { {  0, 36,  6 } }, { {  0, 38,  6 } },
-    { {  0, 40,  6 } }, { {  0, 42,  6 } }, { {  0, 44,  6 } }, { { 32,  1,  4 } },
-    { { 48,  1,  4 } }, { { 16,  2,  4 } }, { { 32,  4,  5 } }, { { 32,  5,  5 } },
-    { { 32,  7,  5 } }, { { 32,  8,  5 } }, { {  0, 11,  6 } }, { {  0, 14,  6 } },
-    { {  0, 17,  6 } }, { {  0, 20,  6 } }, { {  0, 23,  6 } }, { {  0, 26,  6 } },
-    { {  0, 29,  6 } }, { {  0, 52,  6 } }, { {  0, 51,  6 } }, { {  0, 50,  6 } },
-    { {  0, 49,  6 } }, { {  0, 48,  6 } }, { {  0, 47,  6 } }, { {  0, 46,  6 } },
+static const ZSTD_seqSymbol ML_defaultDTable[(1<<ML_DEFAULTNORMLOG)+1] = {
+    {  1,  1,  1, ML_DEFAULTNORMLOG},  /* header : fastMode, tableLog */
+    /* nextState, nbAddBits, nbBits, baseVal */
+    {  0,  0,  6,    3},  {  0,  0,  4,    4},
+    { 32,  0,  5,    5},  {  0,  0,  5,    6},
+    {  0,  0,  5,    8},  {  0,  0,  5,    9},
+    {  0,  0,  5,   11},  {  0,  0,  6,   13},
+    {  0,  0,  6,   16},  {  0,  0,  6,   19},
+    {  0,  0,  6,   22},  {  0,  0,  6,   25},
+    {  0,  0,  6,   28},  {  0,  0,  6,   31},
+    {  0,  0,  6,   34},  {  0,  1,  6,   37},
+    {  0,  1,  6,   41},  {  0,  2,  6,   47},
+    {  0,  3,  6,   59},  {  0,  4,  6,   83},
+    {  0,  7,  6,  131},  {  0,  9,  6,  515},
+    { 16,  0,  4,    4},  {  0,  0,  4,    5},
+    { 32,  0,  5,    6},  {  0,  0,  5,    7},
+    { 32,  0,  5,    9},  {  0,  0,  5,   10},
+    {  0,  0,  6,   12},  {  0,  0,  6,   15},
+    {  0,  0,  6,   18},  {  0,  0,  6,   21},
+    {  0,  0,  6,   24},  {  0,  0,  6,   27},
+    {  0,  0,  6,   30},  {  0,  0,  6,   33},
+    {  0,  1,  6,   35},  {  0,  1,  6,   39},
+    {  0,  2,  6,   43},  {  0,  3,  6,   51},
+    {  0,  4,  6,   67},  {  0,  5,  6,   99},
+    {  0,  8,  6,  259},  { 32,  0,  4,    4},
+    { 48,  0,  4,    4},  { 16,  0,  4,    5},
+    { 32,  0,  5,    7},  { 32,  0,  5,    8},
+    { 32,  0,  5,   10},  { 32,  0,  5,   11},
+    {  0,  0,  6,   14},  {  0,  0,  6,   17},
+    {  0,  0,  6,   20},  {  0,  0,  6,   23},
+    {  0,  0,  6,   26},  {  0,  0,  6,   29},
+    {  0,  0,  6,   32},  {  0, 16,  6,65539},
+    {  0, 15,  6,32771},  {  0, 14,  6,16387},
+    {  0, 13,  6, 8195},  {  0, 12,  6, 4099},
+    {  0, 11,  6, 2051},  {  0, 10,  6, 1027},
 };   /* ML_defaultDTable */
 
-/* Default FSE distribution table for Offset Codes */
-static const FSE_decode_t4 OF_defaultDTable[(1<<OF_DEFAULTNORMLOG)+1] = {
-    { { OF_DEFAULTNORMLOG, 1, 1 } },  /* header : tableLog, fastMode, fastMode */
-    /* base, symbol, bits */
-    { {  0,  0,  5 } }, { {  0,  6,  4 } },
-    { {  0,  9,  5 } }, { {  0, 15,  5 } },
-    { {  0, 21,  5 } }, { {  0,  3,  5 } },
-    { {  0,  7,  4 } }, { {  0, 12,  5 } },
-    { {  0, 18,  5 } }, { {  0, 23,  5 } },
-    { {  0,  5,  5 } }, { {  0,  8,  4 } },
-    { {  0, 14,  5 } }, { {  0, 20,  5 } },
-    { {  0,  2,  5 } }, { { 16,  7,  4 } },
-    { {  0, 11,  5 } }, { {  0, 17,  5 } },
-    { {  0, 22,  5 } }, { {  0,  4,  5 } },
-    { { 16,  8,  4 } }, { {  0, 13,  5 } },
-    { {  0, 19,  5 } }, { {  0,  1,  5 } },
-    { { 16,  6,  4 } }, { {  0, 10,  5 } },
-    { {  0, 16,  5 } }, { {  0, 28,  5 } },
-    { {  0, 27,  5 } }, { {  0, 26,  5 } },
-    { {  0, 25,  5 } }, { {  0, 24,  5 } },
-};   /* OF_defaultDTable */
+static void ZSTD_buildSeqTable_rle(ZSTD_seqSymbol* dt, U32 baseValue, U32 nbAddBits)
+{
+    void* ptr = dt;
+    ZSTD_seqSymbol_header* const DTableH = (ZSTD_seqSymbol_header*)ptr;
+    ZSTD_seqSymbol* const cell = dt + 1;
+
+    DTableH->tableLog = 0;
+    DTableH->fastMode = 0;
+
+    cell->nbBits = 0;
+    cell->nextState = 0;
+    assert(nbAddBits < 255);
+    cell->nbAdditionalBits = (BYTE)nbAddBits;
+    cell->baseValue = baseValue;
+}
+
+
+/* ZSTD_buildFSETable() :
+ * generate FSE decoding table for one symbol (ll, ml or off) */
+static void
+ZSTD_buildFSETable(ZSTD_seqSymbol* dt,
+            const short* normalizedCounter, unsigned maxSymbolValue,
+            const U32* baseValue, const U32* nbAdditionalBits,
+            unsigned tableLog)
+{
+    ZSTD_seqSymbol* const tableDecode = dt+1;
+    U16 symbolNext[MaxSeq+1];
+
+    U32 const maxSV1 = maxSymbolValue + 1;
+    U32 const tableSize = 1 << tableLog;
+    U32 highThreshold = tableSize-1;
+
+    /* Sanity Checks */
+    assert(maxSymbolValue <= MaxSeq);
+    assert(tableLog <= MaxFSELog);
+
+    /* Init, lay down lowprob symbols */
+    {   ZSTD_seqSymbol_header DTableH;
+        DTableH.tableLog = tableLog;
+        DTableH.fastMode = 1;
+        {   S16 const largeLimit= (S16)(1 << (tableLog-1));
+            U32 s;
+            for (s=0; s<maxSV1; s++) {
+                if (normalizedCounter[s]==-1) {
+                    tableDecode[highThreshold--].baseValue = s;
+                    symbolNext[s] = 1;
+                } else {
+                    if (normalizedCounter[s] >= largeLimit) DTableH.fastMode=0;
+                    symbolNext[s] = normalizedCounter[s];
+        }   }   }
+        memcpy(dt, &DTableH, sizeof(DTableH));
+    }
+
+    /* Spread symbols */
+    {   U32 const tableMask = tableSize-1;
+        U32 const step = FSE_TABLESTEP(tableSize);
+        U32 s, position = 0;
+        for (s=0; s<maxSV1; s++) {
+            int i;
+            for (i=0; i<normalizedCounter[s]; i++) {
+                tableDecode[position].baseValue = s;
+                position = (position + step) & tableMask;
+                while (position > highThreshold) position = (position + step) & tableMask;   /* lowprob area */
+        }   }
+        assert(position == 0);   /* position must reach all cells once, otherwise normalizedCounter is incorrect */
+    }
+
+    /* Build Decoding table */
+    {   U32 u;
+        for (u=0; u<tableSize; u++) {
+            U32 const symbol = tableDecode[u].baseValue;
+            U32 const nextState = symbolNext[symbol]++;
+            tableDecode[u].nbBits = (BYTE) (tableLog - BIT_highbit32(nextState) );
+            tableDecode[u].nextState = (U16) ( (nextState << tableDecode[u].nbBits) - tableSize);
+            assert(nbAdditionalBits[symbol] < 255);
+            tableDecode[u].nbAdditionalBits = (BYTE)nbAdditionalBits[symbol];
+            tableDecode[u].baseValue = baseValue[symbol];
+    }   }
+}
+
 
 /*! ZSTD_buildSeqTable() :
  * @return : nb bytes read from src,
- *           or an error code if it fails, testable with ZSTD_isError()
- */
-static size_t ZSTD_buildSeqTable(FSE_DTable* DTableSpace, const FSE_DTable** DTablePtr,
+ *           or an error code if it fails */
+static size_t ZSTD_buildSeqTable(ZSTD_seqSymbol* DTableSpace, const ZSTD_seqSymbol** DTablePtr,
                                  symbolEncodingType_e type, U32 max, U32 maxLog,
                                  const void* src, size_t srcSize,
-                                 const FSE_decode_t4* defaultTable, U32 flagRepeatTable)
+                                 const U32* baseValue, const U32* nbAdditionalBits,
+                                 const ZSTD_seqSymbol* defaultTable, U32 flagRepeatTable)
 {
     switch(type)
     {
     case set_rle :
         if (!srcSize) return ERROR(srcSize_wrong);
         if ( (*(const BYTE*)src) > max) return ERROR(corruption_detected);
-        FSE_buildDTable_rle(DTableSpace, *(const BYTE*)src);
+        {   U32 const symbol = *(const BYTE*)src;
+            U32 const baseline = baseValue[symbol];
+            U32 const nbBits = nbAdditionalBits[symbol];
+            ZSTD_buildSeqTable_rle(DTableSpace, baseline, nbBits);
+        }
         *DTablePtr = DTableSpace;
         return 1;
     case set_basic :
-        *DTablePtr = &defaultTable->dtable;
+        *DTablePtr = defaultTable;
         return 0;
     case set_repeat:
         if (!flagRepeatTable) return ERROR(corruption_detected);
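
The generation method described in the comment above the new default tables can be reproduced in a few lines. A hypothetical printer sketch, assuming the default normalized distribution `LL_defaultNorm` exposed by `/lib/common/zstd_internal.h` (as the comment states) and the `LL_base`/`LL_bits` side tables used elsewhere in this commit:

#include <stdio.h>

/* sketch : rebuild the LL default table and print it in the layout used above */
static void printLLDefaultDTable(void)
{
    static ZSTD_seqSymbol dt[SEQSYMBOL_TABLE_SIZE(LL_DEFAULTNORMLOG)];
    ZSTD_buildFSETable(dt, LL_defaultNorm, MaxLL, LL_base, LL_bits, LL_DEFAULTNORMLOG);
    {   int u;
        for (u = 1; u <= (1 << LL_DEFAULTNORMLOG); u++)
            printf("{ %2u, %2u, %u, %5u },\n",
                   (unsigned)dt[u].nextState, (unsigned)dt[u].nbAdditionalBits,
                   (unsigned)dt[u].nbBits, (unsigned)dt[u].baseValue);
    }
}
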
@@ -754,7 +893,7 @@ static size_t ZSTD_buildSeqTable(FSE_DTable* DTableSpace, const FSE_DTable** DTablePtr,
         size_t const headerSize = FSE_readNCount(norm, &max, &tableLog, src, srcSize);
         if (FSE_isError(headerSize)) return ERROR(corruption_detected);
         if (tableLog > maxLog) return ERROR(corruption_detected);
-        FSE_buildDTable(DTableSpace, norm, max, tableLog);
+        ZSTD_buildFSETable(DTableSpace, norm, max, baseValue, nbAdditionalBits, tableLog);
         *DTablePtr = DTableSpace;
         return headerSize;
     }
@@ -764,6 +903,35 @@ static size_t ZSTD_buildSeqTable(FSE_DTable* DTableSpace, const FSE_DTable** DTablePtr,
     }
 }
 
+static const U32 LL_base[MaxLL+1] = {
+                 0,    1,    2,     3,     4,     5,     6,      7,
+                 8,    9,   10,    11,    12,    13,    14,     15,
+                16,   18,   20,    22,    24,    28,    32,     40,
+                48,   64, 0x80, 0x100, 0x200, 0x400, 0x800, 0x1000,
+                0x2000, 0x4000, 0x8000, 0x10000 };
+
+static const U32 OF_base[MaxOff+1] = {
+                 0,        1,       1,       5,     0xD,     0x1D,     0x3D,     0x7D,
+                 0xFD,   0x1FD,   0x3FD,   0x7FD,   0xFFD,   0x1FFD,   0x3FFD,   0x7FFD,
+                 0xFFFD, 0x1FFFD, 0x3FFFD, 0x7FFFD, 0xFFFFD, 0x1FFFFD, 0x3FFFFD, 0x7FFFFD,
+                 0xFFFFFD, 0x1FFFFFD, 0x3FFFFFD, 0x7FFFFFD, 0xFFFFFFD, 0x1FFFFFFD, 0x3FFFFFFD, 0x7FFFFFFD };
+
+static const U32 OF_bits[MaxOff+1] = {
+                     0,  1,  2,  3,  4,  5,  6,  7,
+                     8,  9, 10, 11, 12, 13, 14, 15,
+                    16, 17, 18, 19, 20, 21, 22, 23,
+                    24, 25, 26, 27, 28, 29, 30, 31 };
+
+static const U32 ML_base[MaxML+1] = {
+                     3,  4,  5,    6,     7,     8,     9,    10,
+                    11, 12, 13,   14,    15,    16,    17,    18,
+                    19, 20, 21,   22,    23,    24,    25,    26,
+                    27, 28, 29,   30,    31,    32,    33,    34,
+                    35, 37, 39,   41,    43,    47,    51,    59,
+                    67, 83, 99, 0x83, 0x103, 0x203, 0x403, 0x803,
+                    0x1003, 0x2003, 0x4003, 0x8003, 0x10003 };
+
 size_t ZSTD_decodeSeqHeaders(ZSTD_DCtx* dctx, int* nbSeqPtr,
                              const void* src, size_t srcSize)
 {
@@ -800,19 +968,27 @@ size_t ZSTD_decodeSeqHeaders(ZSTD_DCtx* dctx, int* nbSeqPtr,
     /* Build DTables */
     {   size_t const llhSize = ZSTD_buildSeqTable(dctx->entropy.LLTable, &dctx->LLTptr,
                                                   LLtype, MaxLL, LLFSELog,
-                                                  ip, iend-ip, LL_defaultDTable, dctx->fseEntropy);
+                                                  ip, iend-ip,
+                                                  LL_base, LL_bits,
+                                                  LL_defaultDTable, dctx->fseEntropy);
         if (ZSTD_isError(llhSize)) return ERROR(corruption_detected);
         ip += llhSize;
     }
 
     {   size_t const ofhSize = ZSTD_buildSeqTable(dctx->entropy.OFTable, &dctx->OFTptr,
                                                   OFtype, MaxOff, OffFSELog,
-                                                  ip, iend-ip, OF_defaultDTable, dctx->fseEntropy);
+                                                  ip, iend-ip,
+                                                  OF_base, OF_bits,
+                                                  OF_defaultDTable, dctx->fseEntropy);
         if (ZSTD_isError(ofhSize)) return ERROR(corruption_detected);
         ip += ofhSize;
     }
 
     {   size_t const mlhSize = ZSTD_buildSeqTable(dctx->entropy.MLTable, &dctx->MLTptr,
                                                   MLtype, MaxML, MLFSELog,
-                                                  ip, iend-ip, ML_defaultDTable, dctx->fseEntropy);
+                                                  ip, iend-ip,
+                                                  ML_base, ML_bits,
+                                                  ML_defaultDTable, dctx->fseEntropy);
         if (ZSTD_isError(mlhSize)) return ERROR(corruption_detected);
         ip += mlhSize;
     }
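
Passing `LL_base`/`LL_bits` (and the OF/ML equivalents) into `ZSTD_buildSeqTable()` is what lets the new table cells carry `baseValue` and `nbAdditionalBits` directly: the hot loop reads one cell per symbol instead of indexing separate side tables. A reduced sketch of the per-field decode step this enables (it mirrors the logic in `zstd_decompress_impl.h` further down; the helper name is illustrative):

/* one cell lookup yields the base value and the extra-bit count in a single load */
static U32 decodeFieldValue(const ZSTD_seqSymbol* table, size_t state, BIT_DStream_t* bitD)
{
    ZSTD_seqSymbol const cell = table[state];
    U32 extra = 0;
    if (cell.nbAdditionalBits > 0)   /* BIT_readBitsFast() requires nbBits > 0 */
        extra = (U32)BIT_readBitsFast(bitD, cell.nbAdditionalBits);
    return cell.baseValue + extra;
}
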
@@ -829,11 +1005,16 @@ typedef struct {
     const BYTE* match;
 } seq_t;
 
+typedef struct {
+    size_t state;
+    const ZSTD_seqSymbol* table;
+} ZSTD_fseState;
+
 typedef struct {
     BIT_DStream_t DStream;
-    FSE_DState_t stateLL;
-    FSE_DState_t stateOffb;
-    FSE_DState_t stateML;
+    ZSTD_fseState stateLL;
+    ZSTD_fseState stateOffb;
+    ZSTD_fseState stateML;
     size_t prevOffset[ZSTD_REP_NUM];
     const BYTE* prefixStart;
     const BYTE* dictEnd;
@@ -888,119 +1069,6 @@ size_t ZSTD_execSequenceLast7(BYTE* op,
 }
 
 
-typedef enum { ZSTD_lo_isRegularOffset, ZSTD_lo_isLongOffset=1 } ZSTD_longOffset_e;
-
-/* We need to add at most (ZSTD_WINDOWLOG_MAX_32 - 1) bits to read the maximum
- * offset bits. But we can only read at most (STREAM_ACCUMULATOR_MIN_32 - 1)
- * bits before reloading. This value is the maximum number of bytes we read
- * after reloading when we are decoding long offets.
- */
-#define LONG_OFFSETS_MAX_EXTRA_BITS_32                       \
-    (ZSTD_WINDOWLOG_MAX_32 > STREAM_ACCUMULATOR_MIN_32       \
-        ? ZSTD_WINDOWLOG_MAX_32 - STREAM_ACCUMULATOR_MIN_32  \
-        : 0)
-
-static seq_t ZSTD_decodeSequence(seqState_t* seqState, const ZSTD_longOffset_e longOffsets)
-{
-    seq_t seq;
-
-    U32 const llCode = FSE_peekSymbol(&seqState->stateLL);
-    U32 const mlCode = FSE_peekSymbol(&seqState->stateML);
-    U32 const ofCode = FSE_peekSymbol(&seqState->stateOffb);   /* <= MaxOff, by table construction */
-
-    U32 const llBits = LL_bits[llCode];
-    U32 const mlBits = ML_bits[mlCode];
-    U32 const ofBits = ofCode;
-    U32 const totalBits = llBits+mlBits+ofBits;
-
-    static const U32 LL_base[MaxLL+1] = {
-                     0,    1,    2,     3,     4,     5,     6,      7,
-                     8,    9,   10,    11,    12,    13,    14,     15,
-                    16,   18,   20,    22,    24,    28,    32,     40,
-                    48,   64, 0x80, 0x100, 0x200, 0x400, 0x800, 0x1000,
-                    0x2000, 0x4000, 0x8000, 0x10000 };
-
-    static const U32 ML_base[MaxML+1] = {
-                     3,  4,  5,    6,     7,     8,     9,    10,
-                    11, 12, 13,   14,    15,    16,    17,    18,
-                    19, 20, 21,   22,    23,    24,    25,    26,
-                    27, 28, 29,   30,    31,    32,    33,    34,
-                    35, 37, 39,   41,    43,    47,    51,    59,
-                    67, 83, 99, 0x83, 0x103, 0x203, 0x403, 0x803,
-                    0x1003, 0x2003, 0x4003, 0x8003, 0x10003 };
-
-    static const U32 OF_base[MaxOff+1] = {
-                     0,        1,       1,       5,     0xD,     0x1D,     0x3D,     0x7D,
-                     0xFD,   0x1FD,   0x3FD,   0x7FD,   0xFFD,   0x1FFD,   0x3FFD,   0x7FFD,
-                     0xFFFD, 0x1FFFD, 0x3FFFD, 0x7FFFD, 0xFFFFD, 0x1FFFFD, 0x3FFFFD, 0x7FFFFD,
-                     0xFFFFFD, 0x1FFFFFD, 0x3FFFFFD, 0x7FFFFFD, 0xFFFFFFD, 0x1FFFFFFD, 0x3FFFFFFD, 0x7FFFFFFD };
-
-    /* sequence */
-    {   size_t offset;
-        if (!ofCode)
-            offset = 0;
-        else {
-            ZSTD_STATIC_ASSERT(ZSTD_lo_isLongOffset == 1);
-            ZSTD_STATIC_ASSERT(LONG_OFFSETS_MAX_EXTRA_BITS_32 == 5);
-            assert(ofBits <= MaxOff);
-            if (MEM_32bits() && longOffsets && (ofBits >= STREAM_ACCUMULATOR_MIN_32)) {
-                U32 const extraBits = ofBits - MIN(ofBits, 32 - seqState->DStream.bitsConsumed);
-                offset = OF_base[ofCode] + (BIT_readBitsFast(&seqState->DStream, ofBits - extraBits) << extraBits);
-                BIT_reloadDStream(&seqState->DStream);
-                if (extraBits) offset += BIT_readBitsFast(&seqState->DStream, extraBits);
-                assert(extraBits <= LONG_OFFSETS_MAX_EXTRA_BITS_32);   /* to avoid another reload */
-            } else {
-                offset = OF_base[ofCode] + BIT_readBitsFast(&seqState->DStream, ofBits/*>0*/);   /* <= (ZSTD_WINDOWLOG_MAX-1) bits */
-                if (MEM_32bits()) BIT_reloadDStream(&seqState->DStream);
-            }
-        }
-
-        if (ofCode <= 1) {
-            offset += (llCode==0);
-            if (offset) {
-                size_t temp = (offset==3) ? seqState->prevOffset[0] - 1 : seqState->prevOffset[offset];
-                temp += !temp;   /* 0 is not valid; input is corrupted; force offset to 1 */
-                if (offset != 1) seqState->prevOffset[2] = seqState->prevOffset[1];
-                seqState->prevOffset[1] = seqState->prevOffset[0];
-                seqState->prevOffset[0] = offset = temp;
-            } else {   /* offset == 0 */
-                offset = seqState->prevOffset[0];
-            }
-        } else {
-            seqState->prevOffset[2] = seqState->prevOffset[1];
-            seqState->prevOffset[1] = seqState->prevOffset[0];
-            seqState->prevOffset[0] = offset;
-        }
-        seq.offset = offset;
-    }
-
-    seq.matchLength = ML_base[mlCode]
-                    + ((mlCode>31) ? BIT_readBitsFast(&seqState->DStream, mlBits/*>0*/) : 0);   /* <= 16 bits */
-    if (MEM_32bits() && (mlBits+llBits >= STREAM_ACCUMULATOR_MIN_32-LONG_OFFSETS_MAX_EXTRA_BITS_32))
-        BIT_reloadDStream(&seqState->DStream);
-    if (MEM_64bits() && (totalBits >= STREAM_ACCUMULATOR_MIN_64-(LLFSELog+MLFSELog+OffFSELog)))
-        BIT_reloadDStream(&seqState->DStream);
-    /* Ensure there are enough bits to read the rest of data in 64-bit mode. */
-    ZSTD_STATIC_ASSERT(16+LLFSELog+MLFSELog+OffFSELog < STREAM_ACCUMULATOR_MIN_64);
-
-    seq.litLength = LL_base[llCode]
-                  + ((llCode>15) ? BIT_readBitsFast(&seqState->DStream, llBits/*>0*/) : 0);   /* <= 16 bits */
-    if (MEM_32bits())
-        BIT_reloadDStream(&seqState->DStream);
-
-    DEBUGLOG(6, "seq: litL=%u, matchL=%u, offset=%u",
-                (U32)seq.litLength, (U32)seq.matchLength, (U32)seq.offset);
-
-    /* ANS state update */
-    FSE_updateState(&seqState->stateLL, &seqState->DStream);    /* <= 9 bits */
-    FSE_updateState(&seqState->stateML, &seqState->DStream);    /* <= 9 bits */
-    if (MEM_32bits()) BIT_reloadDStream(&seqState->DStream);    /* <= 18 bits */
-    FSE_updateState(&seqState->stateOffb, &seqState->DStream);  /* <= 8 bits */
-
-    return seq;
-}
-
 
 HINT_INLINE
 size_t ZSTD_execSequence(BYTE* op,
                          BYTE* const oend, seq_t sequence,
@@ -1082,165 +1150,6 @@ size_t ZSTD_execSequence(BYTE* op,
 }
 
-static size_t ZSTD_decompressSequences(
-                               ZSTD_DCtx* dctx,
-                               void* dst, size_t maxDstSize,
-                               const void* seqStart, size_t seqSize, int nbSeq,
-                               const ZSTD_longOffset_e isLongOffset)
-{
-    const BYTE* ip = (const BYTE*)seqStart;
-    const BYTE* const iend = ip + seqSize;
-    BYTE* const ostart = (BYTE* const)dst;
-    BYTE* const oend = ostart + maxDstSize;
-    BYTE* op = ostart;
-    const BYTE* litPtr = dctx->litPtr;
-    const BYTE* const litEnd = litPtr + dctx->litSize;
-    const BYTE* const base = (const BYTE*) (dctx->base);
-    const BYTE* const vBase = (const BYTE*) (dctx->vBase);
-    const BYTE* const dictEnd = (const BYTE*) (dctx->dictEnd);
-
-    DEBUGLOG(5, "ZSTD_decompressSequences");
-
-    /* Regen sequences */
-    if (nbSeq) {
-        seqState_t seqState;
-        dctx->fseEntropy = 1;
-        { U32 i; for (i=0; i<ZSTD_REP_NUM; i++) seqState.prevOffset[i] = dctx->entropy.rep[i]; }
-        CHECK_E(BIT_initDStream(&seqState.DStream, ip, iend-ip), corruption_detected);
-        FSE_initDState(&seqState.stateLL, &seqState.DStream, dctx->LLTptr);
-        FSE_initDState(&seqState.stateOffb, &seqState.DStream, dctx->OFTptr);
-        FSE_initDState(&seqState.stateML, &seqState.DStream, dctx->MLTptr);
-
-        for ( ; (BIT_reloadDStream(&(seqState.DStream)) <= BIT_DStream_completed) && nbSeq ; ) {
-            nbSeq--;
-            {   seq_t const sequence = ZSTD_decodeSequence(&seqState, isLongOffset);
-                size_t const oneSeqSize = ZSTD_execSequence(op, oend, sequence, &litPtr, litEnd, base, vBase, dictEnd);
-                DEBUGLOG(6, "regenerated sequence size : %u", (U32)oneSeqSize);
-                if (ZSTD_isError(oneSeqSize)) return oneSeqSize;
-                op += oneSeqSize;
-        }   }
-
-        /* check if reached exact end */
-        DEBUGLOG(5, "after decode loop, remaining nbSeq : %i", nbSeq);
-        if (nbSeq) return ERROR(corruption_detected);
-        /* save reps for next block */
-        { U32 i; for (i=0; i<ZSTD_REP_NUM; i++) dctx->entropy.rep[i] = (U32)(seqState.prevOffset[i]); }
-    }
-
-    /* last literal segment */
-    {   size_t const lastLLSize = litEnd - litPtr;
-        if (lastLLSize > (size_t)(oend-op)) return ERROR(dstSize_tooSmall);
-        memcpy(op, litPtr, lastLLSize);
-        op += lastLLSize;
-    }
-
-    return op-ostart;
-}
-
-
-HINT_INLINE
-seq_t ZSTD_decodeSequenceLong(seqState_t* seqState, ZSTD_longOffset_e const longOffsets)
-{
-    seq_t seq;
-
-    U32 const llCode = FSE_peekSymbol(&seqState->stateLL);
-    U32 const mlCode = FSE_peekSymbol(&seqState->stateML);
-    U32 const ofCode = FSE_peekSymbol(&seqState->stateOffb);   /* <= MaxOff, by table construction */
-
-    U32 const llBits = LL_bits[llCode];
-    U32 const mlBits = ML_bits[mlCode];
-    U32 const ofBits = ofCode;
-    U32 const totalBits = llBits+mlBits+ofBits;
-
-    static const U32 LL_base[MaxLL+1] = {
-                     0,    1,    2,     3,     4,     5,     6,      7,
-                     8,    9,   10,    11,    12,    13,    14,     15,
-                    16,   18,   20,    22,    24,    28,    32,     40,
-                    48,   64, 0x80, 0x100, 0x200, 0x400, 0x800, 0x1000,
-                    0x2000, 0x4000, 0x8000, 0x10000 };
-
-    static const U32 ML_base[MaxML+1] = {
-                     3,  4,  5,    6,     7,     8,     9,    10,
-                    11, 12, 13,   14,    15,    16,    17,    18,
-                    19, 20, 21,   22,    23,    24,    25,    26,
-                    27, 28, 29,   30,    31,    32,    33,    34,
-                    35, 37, 39,   41,    43,    47,    51,    59,
-                    67, 83, 99, 0x83, 0x103, 0x203, 0x403, 0x803,
-                    0x1003, 0x2003, 0x4003, 0x8003, 0x10003 };
-
-    static const U32 OF_base[MaxOff+1] = {
-                     0,        1,       1,       5,     0xD,     0x1D,     0x3D,     0x7D,
-                     0xFD,   0x1FD,   0x3FD,   0x7FD,   0xFFD,   0x1FFD,   0x3FFD,   0x7FFD,
-                     0xFFFD, 0x1FFFD, 0x3FFFD, 0x7FFFD, 0xFFFFD, 0x1FFFFD, 0x3FFFFD, 0x7FFFFD,
-                     0xFFFFFD, 0x1FFFFFD, 0x3FFFFFD, 0x7FFFFFD, 0xFFFFFFD, 0x1FFFFFFD, 0x3FFFFFFD, 0x7FFFFFFD };
-
-    /* sequence */
-    {   size_t offset;
-        if (!ofCode)
-            offset = 0;
-        else {
-            ZSTD_STATIC_ASSERT(ZSTD_lo_isLongOffset == 1);
-            ZSTD_STATIC_ASSERT(LONG_OFFSETS_MAX_EXTRA_BITS_32 == 5);
-            assert(ofBits <= MaxOff);
-            if (MEM_32bits() && longOffsets) {
-                U32 const extraBits = ofBits - MIN(ofBits, STREAM_ACCUMULATOR_MIN_32-1);
-                offset = OF_base[ofCode] + (BIT_readBitsFast(&seqState->DStream, ofBits - extraBits) << extraBits);
-                if (MEM_32bits() || extraBits) BIT_reloadDStream(&seqState->DStream);
-                if (extraBits) offset += BIT_readBitsFast(&seqState->DStream, extraBits);
-            } else {
-                offset = OF_base[ofCode] + BIT_readBitsFast(&seqState->DStream, ofBits);   /* <= (ZSTD_WINDOWLOG_MAX-1) bits */
-                if (MEM_32bits()) BIT_reloadDStream(&seqState->DStream);
-            }
-        }
-
-        if (ofCode <= 1) {
-            offset += (llCode==0);
-            if (offset) {
-                size_t temp = (offset==3) ? seqState->prevOffset[0] - 1 : seqState->prevOffset[offset];
-                temp += !temp;   /* 0 is not valid; input is corrupted; force offset to 1 */
-                if (offset != 1) seqState->prevOffset[2] = seqState->prevOffset[1];
-                seqState->prevOffset[1] = seqState->prevOffset[0];
-                seqState->prevOffset[0] = offset = temp;
-            } else {
-                offset = seqState->prevOffset[0];
-            }
-        } else {
-            seqState->prevOffset[2] = seqState->prevOffset[1];
-            seqState->prevOffset[1] = seqState->prevOffset[0];
-            seqState->prevOffset[0] = offset;
-        }
-        seq.offset = offset;
-    }
-
-    seq.matchLength = ML_base[mlCode] + ((mlCode>31) ? BIT_readBitsFast(&seqState->DStream, mlBits) : 0);   /* <= 16 bits */
-    if (MEM_32bits() && (mlBits+llBits >= STREAM_ACCUMULATOR_MIN_32-LONG_OFFSETS_MAX_EXTRA_BITS_32))
-        BIT_reloadDStream(&seqState->DStream);
-    if (MEM_64bits() && (totalBits >= STREAM_ACCUMULATOR_MIN_64-(LLFSELog+MLFSELog+OffFSELog)))
-        BIT_reloadDStream(&seqState->DStream);
-    /* Verify that there is enough bits to read the rest of the data in 64-bit mode. */
-    ZSTD_STATIC_ASSERT(16+LLFSELog+MLFSELog+OffFSELog < STREAM_ACCUMULATOR_MIN_64);
-
-    seq.litLength = LL_base[llCode] + ((llCode>15) ? BIT_readBitsFast(&seqState->DStream, llBits) : 0);   /* <= 16 bits */
-    if (MEM_32bits())
-        BIT_reloadDStream(&seqState->DStream);
-
-    {   size_t const pos = seqState->pos + seq.litLength;
-        const BYTE* const matchBase = (seq.offset > pos) ? seqState->dictEnd : seqState->prefixStart;
-        seq.match = matchBase + pos - seq.offset;   /* note : this operation can overflow when seq.offset is really too large, which can only happen when input is corrupted.
-                                                     * No consequence though : no memory access will occur, overly large offset will be detected in ZSTD_execSequenceLong() */
-        seqState->pos = pos + seq.matchLength;
-    }
-
-    /* ANS state update */
-    FSE_updateState(&seqState->stateLL, &seqState->DStream);    /* <= 9 bits */
-    FSE_updateState(&seqState->stateML, &seqState->DStream);    /* <= 9 bits */
-    if (MEM_32bits()) BIT_reloadDStream(&seqState->DStream);    /* <= 18 bits */
-    FSE_updateState(&seqState->stateOffb, &seqState->DStream);  /* <= 8 bits */
-
-    return seq;
-}
-
 
 HINT_INLINE
 size_t ZSTD_execSequenceLong(BYTE* op,
                              BYTE* const oend, seq_t sequence,
@@ -1320,81 +1229,51 @@ size_t ZSTD_execSequenceLong(BYTE* op,
     return sequenceLength;
 }
 
-static size_t ZSTD_decompressSequencesLong(
-                               ZSTD_DCtx* dctx,
-                               void* dst, size_t maxDstSize,
-                               const void* seqStart, size_t seqSize, int nbSeq,
-                               const ZSTD_longOffset_e isLongOffset)
-{
-    const BYTE* ip = (const BYTE*)seqStart;
-    const BYTE* const iend = ip + seqSize;
-    BYTE* const ostart = (BYTE* const)dst;
-    BYTE* const oend = ostart + maxDstSize;
-    BYTE* op = ostart;
-    const BYTE* litPtr = dctx->litPtr;
-    const BYTE* const litEnd = litPtr + dctx->litSize;
-    const BYTE* const prefixStart = (const BYTE*) (dctx->base);
-    const BYTE* const dictStart = (const BYTE*) (dctx->vBase);
-    const BYTE* const dictEnd = (const BYTE*) (dctx->dictEnd);
-
-    /* Regen sequences */
-    if (nbSeq) {
-#define STORED_SEQS 4
-#define STOSEQ_MASK (STORED_SEQS-1)
-#define ADVANCED_SEQS 4
-        seq_t sequences[STORED_SEQS];
-        int const seqAdvance = MIN(nbSeq, ADVANCED_SEQS);
-        seqState_t seqState;
-        int seqNb;
-        dctx->fseEntropy = 1;
-        { U32 i; for (i=0; i<ZSTD_REP_NUM; i++) seqState.prevOffset[i] = dctx->entropy.rep[i]; }
-        seqState.prefixStart = prefixStart;
-        seqState.pos = (size_t)(op-prefixStart);
-        seqState.dictEnd = dictEnd;
-        CHECK_E(BIT_initDStream(&seqState.DStream, ip, iend-ip), corruption_detected);
-        FSE_initDState(&seqState.stateLL, &seqState.DStream, dctx->LLTptr);
-        FSE_initDState(&seqState.stateOffb, &seqState.DStream, dctx->OFTptr);
-        FSE_initDState(&seqState.stateML, &seqState.DStream, dctx->MLTptr);
-
-        /* prepare in advance */
-        for (seqNb=0; (BIT_reloadDStream(&seqState.DStream) <= BIT_DStream_completed) && (seqNb<seqAdvance); seqNb++) {
-            sequences[seqNb] = ZSTD_decodeSequenceLong(&seqState, isLongOffset);
-        }
-        if (seqNb<seqAdvance) return ERROR(corruption_detected);
-
-        /* decode and decompress */
-        for ( ; (BIT_reloadDStream(&(seqState.DStream)) <= BIT_DStream_completed) && (seqNb<nbSeq) ; seqNb++) {
-            seq_t const sequence = ZSTD_decodeSequenceLong(&seqState, isLongOffset);
-            size_t const oneSeqSize = ZSTD_execSequenceLong(op, oend, sequences[(seqNb-ADVANCED_SEQS) & STOSEQ_MASK], &litPtr, litEnd, prefixStart, dictStart, dictEnd);
-            if (ZSTD_isError(oneSeqSize)) return oneSeqSize;
-            PREFETCH(sequence.match);   /* note : it's safe to invoke PREFETCH() on any memory address, including invalid ones */
-            sequences[seqNb&STOSEQ_MASK] = sequence;
-            op += oneSeqSize;
-        }
-        if (seqNb<nbSeq) return ERROR(corruption_detected);
-
-        /* finish queue */
-        seqNb -= seqAdvance;
-        for ( ; seqNb<nbSeq ; seqNb++) {
-            size_t const oneSeqSize = ZSTD_execSequenceLong(op, oend, sequences[seqNb&STOSEQ_MASK], &litPtr, litEnd, prefixStart, dictStart, dictEnd);
-            if (ZSTD_isError(oneSeqSize)) return oneSeqSize;
-            op += oneSeqSize;
-        }
-
-        /* save reps for next block */
-        { U32 i; for (i=0; i<ZSTD_REP_NUM; i++) dctx->entropy.rep[i] = (U32)(seqState.prevOffset[i]); }
-    }
-
-    /* last literal segment */
-    {   size_t const lastLLSize = litEnd - litPtr;
-        if (lastLLSize > (size_t)(oend-op)) return ERROR(dstSize_tooSmall);
-        memcpy(op, litPtr, lastLLSize);
-        op += lastLLSize;
-    }
-
-    return op-ostart;
-}
+typedef enum { ZSTD_lo_isRegularOffset, ZSTD_lo_isLongOffset=1 } ZSTD_longOffset_e;
+
+#define FUNCTION(fn) fn##_default
+#define TARGET
+#include "zstd_decompress_impl.h"
+#undef TARGET
+#undef FUNCTION
+
+#if DYNAMIC_BMI2
+
+#define FUNCTION(fn) fn##_bmi2
+#define TARGET TARGET_ATTRIBUTE("bmi2")
+#include "zstd_decompress_impl.h"
+#undef TARGET
+#undef FUNCTION
+
+#endif
+
+typedef size_t (*ZSTD_decompressSequences_t)(
+    ZSTD_DCtx *dctx, void *dst, size_t maxDstSize, const void *seqStart,
+    size_t seqSize, const ZSTD_longOffset_e isLongOffset);
+
+static size_t ZSTD_decompressSequences(ZSTD_DCtx* dctx, void* dst, size_t maxDstSize,
+                                const void* seqStart, size_t seqSize,
+                                const ZSTD_longOffset_e isLongOffset)
+{
+#if DYNAMIC_BMI2
+    if (dctx->bmi2) {
+        return ZSTD_decompressSequences_bmi2(dctx, dst, maxDstSize, seqStart, seqSize, isLongOffset);
+    }
+#endif
+    return ZSTD_decompressSequences_default(dctx, dst, maxDstSize, seqStart, seqSize, isLongOffset);
+}
+
+static size_t ZSTD_decompressSequencesLong(ZSTD_DCtx* dctx, void* dst, size_t maxDstSize,
+                                const void* seqStart, size_t seqSize,
+                                const ZSTD_longOffset_e isLongOffset)
+{
+#if DYNAMIC_BMI2
+    if (dctx->bmi2) {
+        return ZSTD_decompressSequencesLong_bmi2(dctx, dst, maxDstSize, seqStart, seqSize, isLongOffset);
+    }
+#endif
+    return ZSTD_decompressSequencesLong_default(dctx, dst, maxDstSize, seqStart, seqSize, isLongOffset);
+}
 
 static unsigned
 ZSTD_getLongOffsetsShare(const FSE_DTable* offTable)
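
The `FUNCTION`/`TARGET` pair above is a compile-time multi-versioning trick: the same implementation file is textually included twice, once plain and once with the bmi2 target attribute, and a run-time flag picks a variant. A self-contained illustration with hypothetical names (`impl.inc`, `sum`), assuming a GCC/Clang toolchain where `__attribute__((target("bmi2")))` is available:

/* ---- impl.inc : included once per target, renamed via token pasting ---- */
static TARGET int FUNCTION(sum)(const int* v, int n)
{
    int s = 0, i;
    for (i = 0; i < n; i++) s += v[i];   /* hot variant may be compiled with BMI2 enabled */
    return s;
}

/* ---- dispatch.c : instantiate both variants, then select at run time ---- */
#define FUNCTION(fn) fn##_default
#define TARGET
#include "impl.inc"
#undef TARGET
#undef FUNCTION

#define FUNCTION(fn) fn##_bmi2
#define TARGET __attribute__((target("bmi2")))
#include "impl.inc"
#undef TARGET
#undef FUNCTION

static int sum(const int* v, int n, int bmi2)
{
    return bmi2 ? sum_bmi2(v, n) : sum_default(v, n);
}
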
@@ -1438,22 +1317,12 @@ static size_t ZSTD_decompressBlock_internal(ZSTD_DCtx* dctx,
         srcSize -= litCSize;
     }
 
-    /* Build Decoding Tables */
-    {   int nbSeq;
-        size_t const seqHSize = ZSTD_decodeSeqHeaders(dctx, &nbSeq, ip, srcSize);
-        if (ZSTD_isError(seqHSize)) return seqHSize;
-        ip += seqHSize;
-        srcSize -= seqHSize;
-
-        if (dctx->fParams.windowSize > (1<<24)) {
-            U32 const shareLongOffsets = ZSTD_getLongOffsetsShare(dctx->OFTptr);
-            U32 const minShare = MEM_64bits() ? 5 : 13;
-            if (shareLongOffsets >= minShare)
-                return ZSTD_decompressSequencesLong(dctx, dst, dstCapacity, ip, srcSize, nbSeq, isLongOffset);
-        }
-
-        return ZSTD_decompressSequences(dctx, dst, dstCapacity, ip, srcSize, nbSeq, isLongOffset);
-    }
+    if ( frame /* windowSize exists */
+      && (dctx->fParams.windowSize > (1<<24))
+      && MEM_64bits() /* x86 benefits less from long mode than x64 */ )
+        return ZSTD_decompressSequencesLong(dctx, dst, dstCapacity, ip, srcSize, isLongOffset);
+
+    return ZSTD_decompressSequences(dctx, dst, dstCapacity, ip, srcSize, isLongOffset);
 }
@@ -1948,8 +1817,12 @@ static size_t ZSTD_loadEntropy(ZSTD_entropyDTables_t* entropy, const void* const
         U32 offcodeMaxValue = MaxOff, offcodeLog;
         size_t const offcodeHeaderSize = FSE_readNCount(offcodeNCount, &offcodeMaxValue, &offcodeLog, dictPtr, dictEnd-dictPtr);
         if (FSE_isError(offcodeHeaderSize)) return ERROR(dictionary_corrupted);
+        if (offcodeMaxValue > MaxOff) return ERROR(dictionary_corrupted);
         if (offcodeLog > OffFSELog) return ERROR(dictionary_corrupted);
-        CHECK_E(FSE_buildDTable(entropy->OFTable, offcodeNCount, offcodeMaxValue, offcodeLog), dictionary_corrupted);
+        ZSTD_buildFSETable(entropy->OFTable,
+                           offcodeNCount, offcodeMaxValue,
+                           OF_base, OF_bits,
+                           offcodeLog);
         dictPtr += offcodeHeaderSize;
     }
@@ -1957,8 +1830,12 @@ static size_t ZSTD_loadEntropy(ZSTD_entropyDTables_t* entropy, const void* const
         unsigned matchlengthMaxValue = MaxML, matchlengthLog;
         size_t const matchlengthHeaderSize = FSE_readNCount(matchlengthNCount, &matchlengthMaxValue, &matchlengthLog, dictPtr, dictEnd-dictPtr);
         if (FSE_isError(matchlengthHeaderSize)) return ERROR(dictionary_corrupted);
+        if (matchlengthMaxValue > MaxML) return ERROR(dictionary_corrupted);
         if (matchlengthLog > MLFSELog) return ERROR(dictionary_corrupted);
-        CHECK_E(FSE_buildDTable(entropy->MLTable, matchlengthNCount, matchlengthMaxValue, matchlengthLog), dictionary_corrupted);
+        ZSTD_buildFSETable(entropy->MLTable,
+                           matchlengthNCount, matchlengthMaxValue,
+                           ML_base, ML_bits,
+                           matchlengthLog);
         dictPtr += matchlengthHeaderSize;
     }
@@ -1966,8 +1843,12 @@ static size_t ZSTD_loadEntropy(ZSTD_entropyDTables_t* entropy, const void* const
         unsigned litlengthMaxValue = MaxLL, litlengthLog;
         size_t const litlengthHeaderSize = FSE_readNCount(litlengthNCount, &litlengthMaxValue, &litlengthLog, dictPtr, dictEnd-dictPtr);
         if (FSE_isError(litlengthHeaderSize)) return ERROR(dictionary_corrupted);
+        if (litlengthMaxValue > MaxLL) return ERROR(dictionary_corrupted);
        if (litlengthLog > LLFSELog) return ERROR(dictionary_corrupted);
-        CHECK_E(FSE_buildDTable(entropy->LLTable, litlengthNCount, litlengthMaxValue, litlengthLog), dictionary_corrupted);
+        ZSTD_buildFSETable(entropy->LLTable,
+                           litlengthNCount, litlengthMaxValue,
+                           LL_base, LL_bits,
+                           litlengthLog);
         dictPtr += litlengthHeaderSize;
     }
 
lib/decompress/zstd_decompress_impl.h  (new file, 356 lines)
@@ -0,0 +1,356 @@
+/*
+ * Copyright (c) 2018-present, Facebook, Inc.
+ * All rights reserved.
+ *
+ * This source code is licensed under both the BSD-style license (found in the
+ * LICENSE file in the root directory of this source tree) and the GPLv2 (found
+ * in the COPYING file in the root directory of this source tree).
+ * You may select, at your option, one of the above-listed licenses.
+ */
+
+#ifndef FUNCTION
+#  error "FUNCTION(name) must be defined"
+#endif
+
+#ifndef TARGET
+#  error "TARGET must be defined"
+#endif
+
+static TARGET void
+FUNCTION(ZSTD_updateFseState)(ZSTD_fseState* DStatePtr, BIT_DStream_t* bitD)
+{
+    ZSTD_seqSymbol const DInfo = DStatePtr->table[DStatePtr->state];
+    U32 const nbBits = DInfo.nbBits;
+    size_t const lowBits = BIT_readBits(bitD, nbBits);
+    DStatePtr->state = DInfo.nextState + lowBits;
+}
+
+/* We need to add at most (ZSTD_WINDOWLOG_MAX_32 - 1) bits to read the maximum
+ * offset bits. But we can only read at most (STREAM_ACCUMULATOR_MIN_32 - 1)
+ * bits before reloading. This value is the maximum number of bytes we read
+ * after reloading when we are decoding long offsets.
+ */
+#define LONG_OFFSETS_MAX_EXTRA_BITS_32                       \
+    (ZSTD_WINDOWLOG_MAX_32 > STREAM_ACCUMULATOR_MIN_32       \
+        ? ZSTD_WINDOWLOG_MAX_32 - STREAM_ACCUMULATOR_MIN_32  \
+        : 0)
+
+static TARGET seq_t
+FUNCTION(ZSTD_decodeSequence)(seqState_t* seqState, const ZSTD_longOffset_e longOffsets)
+{
+    seq_t seq;
+    U32 const llBits = seqState->stateLL.table[seqState->stateLL.state].nbAdditionalBits;
+    U32 const mlBits = seqState->stateML.table[seqState->stateML.state].nbAdditionalBits;
+    U32 const ofBits = seqState->stateOffb.table[seqState->stateOffb.state].nbAdditionalBits;
+    U32 const totalBits = llBits+mlBits+ofBits;
+    U32 const llBase = seqState->stateLL.table[seqState->stateLL.state].baseValue;
+    U32 const mlBase = seqState->stateML.table[seqState->stateML.state].baseValue;
+    U32 const ofBase = seqState->stateOffb.table[seqState->stateOffb.state].baseValue;
+
+    /* sequence */
+    {   size_t offset;
+        if (!ofBits)
+            offset = 0;
+        else {
+            ZSTD_STATIC_ASSERT(ZSTD_lo_isLongOffset == 1);
+            ZSTD_STATIC_ASSERT(LONG_OFFSETS_MAX_EXTRA_BITS_32 == 5);
+            assert(ofBits <= MaxOff);
+            if (MEM_32bits() && longOffsets && (ofBits >= STREAM_ACCUMULATOR_MIN_32)) {
+                U32 const extraBits = ofBits - MIN(ofBits, 32 - seqState->DStream.bitsConsumed);
+                offset = ofBase + (BIT_readBitsFast(&seqState->DStream, ofBits - extraBits) << extraBits);
+                BIT_reloadDStream(&seqState->DStream);
+                if (extraBits) offset += BIT_readBitsFast(&seqState->DStream, extraBits);
+                assert(extraBits <= LONG_OFFSETS_MAX_EXTRA_BITS_32);   /* to avoid another reload */
+            } else {
+                offset = ofBase + BIT_readBitsFast(&seqState->DStream, ofBits/*>0*/);   /* <= (ZSTD_WINDOWLOG_MAX-1) bits */
+                if (MEM_32bits()) BIT_reloadDStream(&seqState->DStream);
+            }
+        }
+
+        if (ofBits <= 1) {
+            offset += (llBase==0);
+            if (offset) {
+                size_t temp = (offset==3) ? seqState->prevOffset[0] - 1 : seqState->prevOffset[offset];
+                temp += !temp;   /* 0 is not valid; input is corrupted; force offset to 1 */
+                if (offset != 1) seqState->prevOffset[2] = seqState->prevOffset[1];
+                seqState->prevOffset[1] = seqState->prevOffset[0];
+                seqState->prevOffset[0] = offset = temp;
+            } else {   /* offset == 0 */
+                offset = seqState->prevOffset[0];
+            }
+        } else {
+            seqState->prevOffset[2] = seqState->prevOffset[1];
+            seqState->prevOffset[1] = seqState->prevOffset[0];
+            seqState->prevOffset[0] = offset;
+        }
+        seq.offset = offset;
+    }
+
+    seq.matchLength = mlBase
+                    + ((mlBits>0) ? BIT_readBitsFast(&seqState->DStream, mlBits/*>0*/) : 0);   /* <= 16 bits */
+    if (MEM_32bits() && (mlBits+llBits >= STREAM_ACCUMULATOR_MIN_32-LONG_OFFSETS_MAX_EXTRA_BITS_32))
+        BIT_reloadDStream(&seqState->DStream);
+    if (MEM_64bits() && (totalBits >= STREAM_ACCUMULATOR_MIN_64-(LLFSELog+MLFSELog+OffFSELog)))
+        BIT_reloadDStream(&seqState->DStream);
+    /* Ensure there are enough bits to read the rest of data in 64-bit mode. */
+    ZSTD_STATIC_ASSERT(16+LLFSELog+MLFSELog+OffFSELog < STREAM_ACCUMULATOR_MIN_64);
+
+    seq.litLength = llBase
+                  + ((llBits>0) ? BIT_readBitsFast(&seqState->DStream, llBits/*>0*/) : 0);   /* <= 16 bits */
+    if (MEM_32bits())
+        BIT_reloadDStream(&seqState->DStream);
+
+    DEBUGLOG(6, "seq: litL=%u, matchL=%u, offset=%u",
+                (U32)seq.litLength, (U32)seq.matchLength, (U32)seq.offset);
+
+    /* ANS state update */
+    FUNCTION(ZSTD_updateFseState)(&seqState->stateLL, &seqState->DStream);    /* <= 9 bits */
+    FUNCTION(ZSTD_updateFseState)(&seqState->stateML, &seqState->DStream);    /* <= 9 bits */
+    if (MEM_32bits()) BIT_reloadDStream(&seqState->DStream);    /* <= 18 bits */
+    FUNCTION(ZSTD_updateFseState)(&seqState->stateOffb, &seqState->DStream);  /* <= 8 bits */
+
+    return seq;
+}
+
+
+HINT_INLINE seq_t
+FUNCTION(ZSTD_decodeSequenceLong)(seqState_t* seqState, ZSTD_longOffset_e const longOffsets)
+{
+    seq_t seq;
+    U32 const llBits = seqState->stateLL.table[seqState->stateLL.state].nbAdditionalBits;
+    U32 const mlBits = seqState->stateML.table[seqState->stateML.state].nbAdditionalBits;
+    U32 const ofBits = seqState->stateOffb.table[seqState->stateOffb.state].nbAdditionalBits;
+    U32 const totalBits = llBits+mlBits+ofBits;
+    U32 const llBase = seqState->stateLL.table[seqState->stateLL.state].baseValue;
+    U32 const mlBase = seqState->stateML.table[seqState->stateML.state].baseValue;
+    U32 const ofBase = seqState->stateOffb.table[seqState->stateOffb.state].baseValue;
+
+    /* sequence */
+    {   size_t offset;
+        if (!ofBits)
+            offset = 0;
+        else {
+            ZSTD_STATIC_ASSERT(ZSTD_lo_isLongOffset == 1);
+            ZSTD_STATIC_ASSERT(LONG_OFFSETS_MAX_EXTRA_BITS_32 == 5);
+            assert(ofBits <= MaxOff);
+            if (MEM_32bits() && longOffsets) {
+                U32 const extraBits = ofBits - MIN(ofBits, STREAM_ACCUMULATOR_MIN_32-1);
+                offset = ofBase + (BIT_readBitsFast(&seqState->DStream, ofBits - extraBits) << extraBits);
+                if (MEM_32bits() || extraBits) BIT_reloadDStream(&seqState->DStream);
+                if (extraBits) offset += BIT_readBitsFast(&seqState->DStream, extraBits);
+            } else {
+                offset = ofBase + BIT_readBitsFast(&seqState->DStream, ofBits);   /* <= (ZSTD_WINDOWLOG_MAX-1) bits */
+                if (MEM_32bits()) BIT_reloadDStream(&seqState->DStream);
+            }
+        }
+
+        if (ofBits <= 1) {
+            offset += (llBase==0);
+            if (offset) {
+                size_t temp = (offset==3) ? seqState->prevOffset[0] - 1 : seqState->prevOffset[offset];
+                temp += !temp;   /* 0 is not valid; input is corrupted; force offset to 1 */
+                if (offset != 1) seqState->prevOffset[2] = seqState->prevOffset[1];
+                seqState->prevOffset[1] = seqState->prevOffset[0];
+                seqState->prevOffset[0] = offset = temp;
+            } else {
+                offset = seqState->prevOffset[0];
+            }
+        } else {
+            seqState->prevOffset[2] = seqState->prevOffset[1];
+            seqState->prevOffset[1] = seqState->prevOffset[0];
+            seqState->prevOffset[0] = offset;
+        }
+        seq.offset = offset;
+    }
+
+    seq.matchLength = mlBase + ((mlBits>0) ? BIT_readBitsFast(&seqState->DStream, mlBits) : 0);   /* <= 16 bits */
+    if (MEM_32bits() && (mlBits+llBits >= STREAM_ACCUMULATOR_MIN_32-LONG_OFFSETS_MAX_EXTRA_BITS_32))
+        BIT_reloadDStream(&seqState->DStream);
+    if (MEM_64bits() && (totalBits >= STREAM_ACCUMULATOR_MIN_64-(LLFSELog+MLFSELog+OffFSELog)))
+        BIT_reloadDStream(&seqState->DStream);
+    /* Verify that there are enough bits to read the rest of the data in 64-bit mode. */
+    ZSTD_STATIC_ASSERT(16+LLFSELog+MLFSELog+OffFSELog < STREAM_ACCUMULATOR_MIN_64);
+
+    seq.litLength = llBase + ((llBits>0) ? BIT_readBitsFast(&seqState->DStream, llBits) : 0);   /* <= 16 bits */
+    if (MEM_32bits())
+        BIT_reloadDStream(&seqState->DStream);
+
+    {   size_t const pos = seqState->pos + seq.litLength;
+        const BYTE* const matchBase = (seq.offset > pos) ? seqState->dictEnd : seqState->prefixStart;
+        seq.match = matchBase + pos - seq.offset;   /* note : this operation can overflow when seq.offset is really too large, which can only happen when input is corrupted.
+                                                     * No consequence though : no memory access will occur, overly large offset will be detected in ZSTD_execSequenceLong() */
+        seqState->pos = pos + seq.matchLength;
+    }
+
+    /* ANS state update */
+    FUNCTION(ZSTD_updateFseState)(&seqState->stateLL, &seqState->DStream);    /* <= 9 bits */
+    FUNCTION(ZSTD_updateFseState)(&seqState->stateML, &seqState->DStream);    /* <= 9 bits */
+    if (MEM_32bits()) BIT_reloadDStream(&seqState->DStream);    /* <= 18 bits */
+    FUNCTION(ZSTD_updateFseState)(&seqState->stateOffb, &seqState->DStream);  /* <= 8 bits */
+
+    return seq;
+}
+
+
+static TARGET void
+FUNCTION(ZSTD_initFseState)(ZSTD_fseState* DStatePtr, BIT_DStream_t* bitD, const ZSTD_seqSymbol* dt)
+{
+    const void* ptr = dt;
+    const ZSTD_seqSymbol_header* const DTableH = (const ZSTD_seqSymbol_header*)ptr;
+    DStatePtr->state = BIT_readBits(bitD, DTableH->tableLog);
+    DEBUGLOG(6, "ZSTD_initFseState : val=%u using %u bits",
+                (U32)DStatePtr->state, DTableH->tableLog);
+    BIT_reloadDStream(bitD);
+    DStatePtr->table = dt + 1;
+}
+
+static TARGET
+size_t FUNCTION(ZSTD_decompressSequences)(
+                               ZSTD_DCtx* dctx,
+                               void* dst, size_t maxDstSize,
+                               const void* seqStart, size_t seqSize,
+                               const ZSTD_longOffset_e isLongOffset)
+{
+    const BYTE* ip = (const BYTE*)seqStart;
+    const BYTE* const iend = ip + seqSize;
+    BYTE* const ostart = (BYTE* const)dst;
+    BYTE* const oend = ostart + maxDstSize;
+    BYTE* op = ostart;
+    const BYTE* litPtr = dctx->litPtr;
+    const BYTE* const litEnd = litPtr + dctx->litSize;
+    const BYTE* const base = (const BYTE*) (dctx->base);
+    const BYTE* const vBase = (const BYTE*) (dctx->vBase);
+    const BYTE* const dictEnd = (const BYTE*) (dctx->dictEnd);
+    int nbSeq;
+    DEBUGLOG(5, "ZSTD_decompressSequences");
+
+    /* Build Decoding Tables */
+    {   size_t const seqHSize = ZSTD_decodeSeqHeaders(dctx, &nbSeq, ip, seqSize);
+        DEBUGLOG(5, "ZSTD_decodeSeqHeaders: size=%u, nbSeq=%i",
+                    (U32)seqHSize, nbSeq);
+        if (ZSTD_isError(seqHSize)) return seqHSize;
+        ip += seqHSize;
+    }
+
+    /* Regen sequences */
+    if (nbSeq) {
+        seqState_t seqState;
+        dctx->fseEntropy = 1;
+        { U32 i; for (i=0; i<ZSTD_REP_NUM; i++) seqState.prevOffset[i] = dctx->entropy.rep[i]; }
|
||||||
|
CHECK_E(BIT_initDStream(&seqState.DStream, ip, iend-ip), corruption_detected);
|
||||||
|
FUNCTION(ZSTD_initFseState)(&seqState.stateLL, &seqState.DStream, dctx->LLTptr);
|
||||||
|
FUNCTION(ZSTD_initFseState)(&seqState.stateOffb, &seqState.DStream, dctx->OFTptr);
|
||||||
|
FUNCTION(ZSTD_initFseState)(&seqState.stateML, &seqState.DStream, dctx->MLTptr);
|
||||||
|
|
||||||
|
for ( ; (BIT_reloadDStream(&(seqState.DStream)) <= BIT_DStream_completed) && nbSeq ; ) {
|
||||||
|
nbSeq--;
|
||||||
|
{ seq_t const sequence = FUNCTION(ZSTD_decodeSequence)(&seqState, isLongOffset);
|
||||||
|
size_t const oneSeqSize = ZSTD_execSequence(op, oend, sequence, &litPtr, litEnd, base, vBase, dictEnd);
|
||||||
|
DEBUGLOG(6, "regenerated sequence size : %u", (U32)oneSeqSize);
|
||||||
|
if (ZSTD_isError(oneSeqSize)) return oneSeqSize;
|
||||||
|
op += oneSeqSize;
|
||||||
|
} }
|
||||||
|
|
||||||
|
/* check if reached exact end */
|
||||||
|
DEBUGLOG(5, "ZSTD_decompressSequences: after decode loop, remaining nbSeq : %i", nbSeq);
|
||||||
|
if (nbSeq) return ERROR(corruption_detected);
|
||||||
|
/* save reps for next block */
|
||||||
|
{ U32 i; for (i=0; i<ZSTD_REP_NUM; i++) dctx->entropy.rep[i] = (U32)(seqState.prevOffset[i]); }
|
||||||
|
}
|
||||||
|
|
||||||
|
/* last literal segment */
|
||||||
|
{ size_t const lastLLSize = litEnd - litPtr;
|
||||||
|
if (lastLLSize > (size_t)(oend-op)) return ERROR(dstSize_tooSmall);
|
||||||
|
memcpy(op, litPtr, lastLLSize);
|
||||||
|
op += lastLLSize;
|
||||||
|
}
|
||||||
|
|
||||||
|
return op-ostart;
|
||||||
|
}
|
||||||
|
|
||||||
|
static TARGET
|
||||||
|
size_t FUNCTION(ZSTD_decompressSequencesLong)(
|
||||||
|
ZSTD_DCtx* dctx,
|
||||||
|
void* dst, size_t maxDstSize,
|
||||||
|
const void* seqStart, size_t seqSize,
|
||||||
|
const ZSTD_longOffset_e isLongOffset)
|
||||||
|
{
|
||||||
|
const BYTE* ip = (const BYTE*)seqStart;
|
||||||
|
const BYTE* const iend = ip + seqSize;
|
||||||
|
BYTE* const ostart = (BYTE* const)dst;
|
||||||
|
BYTE* const oend = ostart + maxDstSize;
|
||||||
|
BYTE* op = ostart;
|
||||||
|
const BYTE* litPtr = dctx->litPtr;
|
||||||
|
const BYTE* const litEnd = litPtr + dctx->litSize;
|
||||||
|
const BYTE* const prefixStart = (const BYTE*) (dctx->base);
|
||||||
|
const BYTE* const dictStart = (const BYTE*) (dctx->vBase);
|
||||||
|
const BYTE* const dictEnd = (const BYTE*) (dctx->dictEnd);
|
||||||
|
int nbSeq;
|
||||||
|
|
||||||
|
/* Build Decoding Tables */
|
||||||
|
{ size_t const seqHSize = ZSTD_decodeSeqHeaders(dctx, &nbSeq, ip, seqSize);
|
||||||
|
if (ZSTD_isError(seqHSize)) return seqHSize;
|
||||||
|
ip += seqHSize;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Regen sequences */
|
||||||
|
if (nbSeq) {
|
||||||
|
#define STORED_SEQS 4
|
||||||
|
#define STOSEQ_MASK (STORED_SEQS-1)
|
||||||
|
#define ADVANCED_SEQS 4
|
||||||
|
seq_t sequences[STORED_SEQS];
|
||||||
|
int const seqAdvance = MIN(nbSeq, ADVANCED_SEQS);
|
||||||
|
seqState_t seqState;
|
||||||
|
int seqNb;
|
||||||
|
dctx->fseEntropy = 1;
|
||||||
|
{ U32 i; for (i=0; i<ZSTD_REP_NUM; i++) seqState.prevOffset[i] = dctx->entropy.rep[i]; }
|
||||||
|
seqState.prefixStart = prefixStart;
|
||||||
|
seqState.pos = (size_t)(op-prefixStart);
|
||||||
|
seqState.dictEnd = dictEnd;
|
||||||
|
CHECK_E(BIT_initDStream(&seqState.DStream, ip, iend-ip), corruption_detected);
|
||||||
|
FUNCTION(ZSTD_initFseState)(&seqState.stateLL, &seqState.DStream, dctx->LLTptr);
|
||||||
|
FUNCTION(ZSTD_initFseState)(&seqState.stateOffb, &seqState.DStream, dctx->OFTptr);
|
||||||
|
FUNCTION(ZSTD_initFseState)(&seqState.stateML, &seqState.DStream, dctx->MLTptr);
|
||||||
|
|
||||||
|
/* prepare in advance */
|
||||||
|
for (seqNb=0; (BIT_reloadDStream(&seqState.DStream) <= BIT_DStream_completed) && (seqNb<seqAdvance); seqNb++) {
|
||||||
|
sequences[seqNb] = FUNCTION(ZSTD_decodeSequenceLong)(&seqState, isLongOffset);
|
||||||
|
}
|
||||||
|
if (seqNb<seqAdvance) return ERROR(corruption_detected);
|
||||||
|
|
||||||
|
/* decode and decompress */
|
||||||
|
for ( ; (BIT_reloadDStream(&(seqState.DStream)) <= BIT_DStream_completed) && (seqNb<nbSeq) ; seqNb++) {
|
||||||
|
seq_t const sequence = FUNCTION(ZSTD_decodeSequenceLong)(&seqState, isLongOffset);
|
||||||
|
size_t const oneSeqSize = ZSTD_execSequenceLong(op, oend, sequences[(seqNb-ADVANCED_SEQS) & STOSEQ_MASK], &litPtr, litEnd, prefixStart, dictStart, dictEnd);
|
||||||
|
if (ZSTD_isError(oneSeqSize)) return oneSeqSize;
|
||||||
|
PREFETCH(sequence.match); /* note : it's safe to invoke PREFETCH() on any memory address, including invalid ones */
|
||||||
|
sequences[seqNb&STOSEQ_MASK] = sequence;
|
||||||
|
op += oneSeqSize;
|
||||||
|
}
|
||||||
|
if (seqNb<nbSeq) return ERROR(corruption_detected);
|
||||||
|
|
||||||
|
/* finish queue */
|
||||||
|
seqNb -= seqAdvance;
|
||||||
|
for ( ; seqNb<nbSeq ; seqNb++) {
|
||||||
|
size_t const oneSeqSize = ZSTD_execSequenceLong(op, oend, sequences[seqNb&STOSEQ_MASK], &litPtr, litEnd, prefixStart, dictStart, dictEnd);
|
||||||
|
if (ZSTD_isError(oneSeqSize)) return oneSeqSize;
|
||||||
|
op += oneSeqSize;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* save reps for next block */
|
||||||
|
{ U32 i; for (i=0; i<ZSTD_REP_NUM; i++) dctx->entropy.rep[i] = (U32)(seqState.prevOffset[i]); }
|
||||||
|
#undef STORED_SEQS
|
||||||
|
#undef STOSEQ_MASK
|
||||||
|
#undef ADVANCED_SEQS
|
||||||
|
}
|
||||||
|
|
||||||
|
/* last literal segment */
|
||||||
|
{ size_t const lastLLSize = litEnd - litPtr;
|
||||||
|
if (lastLLSize > (size_t)(oend-op)) return ERROR(dstSize_tooSmall);
|
||||||
|
memcpy(op, litPtr, lastLLSize);
|
||||||
|
op += lastLLSize;
|
||||||
|
}
|
||||||
|
|
||||||
|
return op-ostart;
|
||||||
|
}
|
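The long variant above pipelines its work: sequences are decoded ADVANCED_SEQS steps ahead of execution, the match pointer of each freshly decoded sequence is prefetched immediately, and a sequence is only executed once its match bytes have had time to reach cache. A stand-alone sketch of that decode-ahead pattern, with hypothetical decodeNext/execSeq stand-ins for the zstd internals:

/* decode-ahead sketch (illustrative only, not the zstd implementation) */
#include <stdio.h>

#define LOOKAHEAD       4              /* mirrors STORED_SEQS / ADVANCED_SEQS */
#define LOOKAHEAD_MASK (LOOKAHEAD-1)   /* valid because LOOKAHEAD is a power of 2 */

typedef struct { int id; const int* match; } Seq;   /* hypothetical sequence */

static const int g_history[64];   /* stand-in for the already-decompressed prefix */
static int g_cursor = 0;

static Seq decodeNext(void)       /* hypothetical : produce the next sequence */
{
    Seq s; s.id = g_cursor; s.match = g_history + (g_cursor & 63); g_cursor++;
    return s;
}

static void execSeq(Seq s)        /* hypothetical : would copy match bytes; here it just logs */
{
    printf("executing sequence %d\n", s.id);
}

static void decodeAhead(int nbSeq)
{
    Seq ring[LOOKAHEAD];
    int n;
    /* prime the pipeline : decode LOOKAHEAD sequences before executing any */
    for (n = 0; n < LOOKAHEAD && n < nbSeq; n++)
        ring[n] = decodeNext();
    /* steady state : decode + prefetch sequence n, execute sequence n-LOOKAHEAD */
    for (; n < nbSeq; n++) {
        Seq const s = decodeNext();
#if defined(__GNUC__)
        __builtin_prefetch(s.match);   /* plays the role of PREFETCH(sequence.match) */
#endif
        execSeq(ring[n & LOOKAHEAD_MASK]);
        ring[n & LOOKAHEAD_MASK] = s;
    }
    /* drain : the last min(nbSeq, LOOKAHEAD) sequences are still buffered */
    {   int const buffered = (nbSeq < LOOKAHEAD) ? nbSeq : LOOKAHEAD;
        int k;
        for (k = n - buffered; k < nbSeq; k++)
            execSeq(ring[k & LOOKAHEAD_MASK]);
    }
}

int main(void) { decodeAhead(10); return 0; }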
lib/zstd.h
@@ -381,7 +381,9 @@ ZSTDLIB_API size_t ZSTD_DStreamOutSize(void);   /*!< recommended size for output
 #define ZSTD_WINDOWLOG_MIN        10
 #define ZSTD_HASHLOG_MAX  ((ZSTD_WINDOWLOG_MAX < 30) ? ZSTD_WINDOWLOG_MAX : 30)
 #define ZSTD_HASHLOG_MIN           6
-#define ZSTD_CHAINLOG_MAX ((ZSTD_WINDOWLOG_MAX < 29) ? ZSTD_WINDOWLOG_MAX+1 : 30)
+#define ZSTD_CHAINLOG_MAX_32      29
+#define ZSTD_CHAINLOG_MAX_64      30
+#define ZSTD_CHAINLOG_MAX ((unsigned)(sizeof(size_t) == 4 ? ZSTD_CHAINLOG_MAX_32 : ZSTD_CHAINLOG_MAX_64))
 #define ZSTD_CHAINLOG_MIN  ZSTD_HASHLOG_MIN
 #define ZSTD_HASHLOG3_MAX         17
 #define ZSTD_SEARCHLOG_MAX (ZSTD_WINDOWLOG_MAX-1)

@@ -506,7 +508,7 @@ ZSTDLIB_API size_t ZSTD_sizeof_DDict(const ZSTD_DDict* ddict);
 *   It will also consider src size to be arbitrarily "large", which is worst case.
 *   If srcSize is known to always be small, ZSTD_estimateCCtxSize_usingCParams() can provide a tighter estimation.
 *   ZSTD_estimateCCtxSize_usingCParams() can be used in tandem with ZSTD_getCParams() to create cParams from compressionLevel.
-*   ZSTD_estimateCCtxSize_usingCCtxParams() can be used in tandem with ZSTD_CCtxParam_setParameter(). Only single-threaded compression is supported. This function will return an error code if ZSTD_p_nbThreads is > 1.
+*   ZSTD_estimateCCtxSize_usingCCtxParams() can be used in tandem with ZSTD_CCtxParam_setParameter(). Only single-threaded compression is supported. This function will return an error code if ZSTD_p_nbWorkers is >= 1.
 *   Note : CCtx size estimation is only correct for single-threaded compression. */
 ZSTDLIB_API size_t ZSTD_estimateCCtxSize(int compressionLevel);
 ZSTDLIB_API size_t ZSTD_estimateCCtxSize_usingCParams(ZSTD_compressionParameters cParams);

@@ -518,7 +520,7 @@ ZSTDLIB_API size_t ZSTD_estimateDCtxSize(void);
 *   It will also consider src size to be arbitrarily "large", which is worst case.
 *   If srcSize is known to always be small, ZSTD_estimateCStreamSize_usingCParams() can provide a tighter estimation.
 *   ZSTD_estimateCStreamSize_usingCParams() can be used in tandem with ZSTD_getCParams() to create cParams from compressionLevel.
-*   ZSTD_estimateCStreamSize_usingCCtxParams() can be used in tandem with ZSTD_CCtxParam_setParameter(). Only single-threaded compression is supported. This function will return an error code if ZSTD_p_nbThreads is set to a value > 1.
+*   ZSTD_estimateCStreamSize_usingCCtxParams() can be used in tandem with ZSTD_CCtxParam_setParameter(). Only single-threaded compression is supported. This function will return an error code if ZSTD_p_nbWorkers is >= 1.
 *   Note : CStream size estimation is only correct for single-threaded compression.
 *   ZSTD_DStream memory budget depends on window Size.
 *   This information can be passed manually, using ZSTD_estimateDStreamSize,

@@ -992,18 +994,13 @@ typedef enum {
     /* multi-threading parameters */
     /* These parameters are only useful if multi-threading is enabled (ZSTD_MULTITHREAD).
      * They return an error otherwise. */
-    ZSTD_p_nbThreads=400,    /* Select how many threads a compression job can spawn (default:1)
-                              * More threads improve speed, but also increase memory usage.
-                              * Can only receive a value > 1 if ZSTD_MULTITHREAD is enabled.
-                              * Special: value 0 means "do not change nbThreads" */
-    ZSTD_p_nonBlockingMode,  /* Single thread mode is by default "blocking" :
-                              * it finishes its job as much as possible, and only then gives back control to caller.
-                              * In contrast, multi-thread is by default "non-blocking" :
-                              * it takes some input, flush some output if available, and immediately gives back control to caller.
-                              * Compression work is performed in parallel, within worker threads.
-                              * (note : a strong exception to this rule is when first job is called with ZSTD_e_end : it becomes blocking)
-                              * Setting this parameter to 1 will enforce non-blocking mode even when only 1 thread is selected.
-                              * It allows the caller to do other tasks while the worker thread compresses in parallel. */
+    ZSTD_p_nbWorkers=400,    /* Select how many threads will be spawned to compress in parallel.
+                              * When nbWorkers >= 1, triggers asynchronous mode :
+                              * ZSTD_compress_generic() consumes some input, flush some output if possible, and immediately gives back control to caller,
+                              * while compression work is performed in parallel, within worker threads.
+                              * (note : a strong exception to this rule is when first invocation sets ZSTD_e_end : it becomes a blocking call).
+                              * More workers improve speed, but also increase memory usage.
+                              * Default value is `0`, aka "single-threaded mode" : no worker is spawned, compression is performed inside Caller's thread, all invocations are blocking */
     ZSTD_p_jobSize,          /* Size of a compression job. This value is only enforced in streaming (non-blocking) mode.
                               * Each compression job is completed in parallel, so indirectly controls the nb of active threads.
                               * 0 means default, which is dynamically determined based on compression parameters.

@@ -1015,7 +1012,7 @@ typedef enum {
     /* advanced parameters - may not remain available after API update */
     ZSTD_p_forceMaxWindow=1100,  /* Force back-reference distances to remain < windowSize,
                                   * even when referencing into Dictionary content (default:0) */
     ZSTD_p_enableLongDistanceMatching=1200,  /* Enable long distance matching.
                                   * This parameter is designed to improve the compression
                                   * ratio for large inputs with long distance matches.
                                   * This increases the memory usage as well as window size.

@@ -1025,33 +1022,39 @@ typedef enum {
                                   * other LDM parameters. Setting the compression level
                                   * after this parameter overrides the window log, though LDM
                                   * will remain enabled until explicitly disabled. */
     ZSTD_p_ldmHashLog,       /* Size of the table for long distance matching, as a power of 2.
                               * Larger values increase memory usage and compression ratio, but decrease
                               * compression speed.
                               * Must be clamped between ZSTD_HASHLOG_MIN and ZSTD_HASHLOG_MAX
-                              * (default: windowlog - 7). */
+                              * (default: windowlog - 7).
+                              * Special: value 0 means "do not change ldmHashLog". */
     ZSTD_p_ldmMinMatch,      /* Minimum size of searched matches for long distance matcher.
                               * Larger/too small values usually decrease compression ratio.
                               * Must be clamped between ZSTD_LDM_MINMATCH_MIN
-                              * and ZSTD_LDM_MINMATCH_MAX (default: 64). */
+                              * and ZSTD_LDM_MINMATCH_MAX (default: 64).
+                              * Special: value 0 means "do not change ldmMinMatch". */
     ZSTD_p_ldmBucketSizeLog, /* Log size of each bucket in the LDM hash table for collision resolution.
                               * Larger values usually improve collision resolution but may decrease
                               * compression speed.
-                              * The maximum value is ZSTD_LDM_BUCKETSIZELOG_MAX (default: 3). */
+                              * The maximum value is ZSTD_LDM_BUCKETSIZELOG_MAX (default: 3).
+                              * note : 0 is a valid value */
     ZSTD_p_ldmHashEveryLog,  /* Frequency of inserting/looking up entries in the LDM hash table.
                               * The default is MAX(0, (windowLog - ldmHashLog)) to
                               * optimize hash table usage.
                               * Larger values improve compression speed. Deviating far from the
                               * default value will likely result in a decrease in compression ratio.
-                              * Must be clamped between 0 and ZSTD_WINDOWLOG_MAX - ZSTD_HASHLOG_MIN. */
+                              * Must be clamped between 0 and ZSTD_WINDOWLOG_MAX - ZSTD_HASHLOG_MIN.
+                              * note : 0 is a valid value */

 } ZSTD_cParameter;


 /*! ZSTD_CCtx_setParameter() :
  *  Set one compression parameter, selected by enum ZSTD_cParameter.
+ *  Setting a parameter is generally only possible during frame initialization (before starting compression),
+ *  except for a few exceptions which can be updated during compression: compressionLevel, hashLog, chainLog, searchLog, minMatch, targetLength and strategy.
  *  Note : when `value` is an enum, cast it to unsigned for proper type checking.
- *  @result : informational value (typically, the one being set, possibly corrected),
+ *  @result : informational value (typically, value being set clamped correctly),
  *            or an error code (which can be tested with ZSTD_isError()). */
 ZSTDLIB_API size_t ZSTD_CCtx_setParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, unsigned value);

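Taken together, the new contract reads: set ZSTD_p_nbWorkers before starting a frame, then drive ZSTD_compress_generic() until it returns 0. A minimal sketch of that usage against the staged API above (so ZSTD_STATIC_LINKING_ONLY is required; error handling reduced to the essentials):

/* async compression sketch ; staged API of this version, simplified error handling */
#define ZSTD_STATIC_LINKING_ONLY
#include <stddef.h>
#include "zstd.h"

static size_t compressWithWorkers(void* dst, size_t dstCapacity,
                                  const void* src, size_t srcSize)
{
    ZSTD_CCtx* const cctx = ZSTD_createCCtx();
    ZSTD_inBuffer  in  = { src, srcSize, 0 };
    ZSTD_outBuffer out = { dst, dstCapacity, 0 };
    size_t remaining;
    if (cctx == NULL) return 0;
    ZSTD_CCtx_setParameter(cctx, ZSTD_p_compressionLevel, 3);
    ZSTD_CCtx_setParameter(cctx, ZSTD_p_nbWorkers, 2);   /* 0 would keep the blocking single-threaded mode */
    /* note : per the documentation above, a first invocation setting ZSTD_e_end
     * is a blocking call ; chunked ZSTD_e_continue calls would stay asynchronous */
    do {
        remaining = ZSTD_compress_generic(cctx, &out, &in, ZSTD_e_end);
        if (ZSTD_isError(remaining)) { ZSTD_freeCCtx(cctx); return 0; }
    } while (remaining != 0);
    ZSTD_freeCCtx(cctx);
    return out.pos;   /* compressed size */
}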
@@ -1198,7 +1201,7 @@ ZSTDLIB_API size_t ZSTD_compress_generic_simpleArgs (
 ZSTDLIB_API ZSTD_CCtx_params* ZSTD_createCCtxParams(void);

 /*! ZSTD_resetCCtxParams() :
- *  Reset params to default, with the default compression level.
+ *  Reset params to default values.
  */
 ZSTDLIB_API size_t ZSTD_resetCCtxParams(ZSTD_CCtx_params* params);

@@ -1227,9 +1230,10 @@ ZSTDLIB_API size_t ZSTD_CCtxParam_setParameter(ZSTD_CCtx_params* params, ZSTD_cP

 /*! ZSTD_CCtx_setParametersUsingCCtxParams() :
  *  Apply a set of ZSTD_CCtx_params to the compression context.
- *  This must be done before the dictionary is loaded.
- *  The pledgedSrcSize is treated as unknown.
- *  Multithreading parameters are applied only if nbThreads > 1.
+ *  This can be done even after compression is started,
+ *  if nbWorkers==0, this will have no impact until a new compression is started.
+ *  if nbWorkers>=1, new parameters will be picked up at next job,
+ *     with a few restrictions (windowLog, pledgedSrcSize, nbWorkers, jobSize, and overlapLog are not updated).
  */
 ZSTDLIB_API size_t ZSTD_CCtx_setParametersUsingCCtxParams(
        ZSTD_CCtx* cctx, const ZSTD_CCtx_params* params);

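The CCtxParams route described just above composes like this (a sketch under the same staged-API assumption; ZSTD_freeCCtxParams is assumed to be the release counterpart of ZSTD_createCCtxParams):

/* CCtxParams sketch ; same staged-API assumption */
#define ZSTD_STATIC_LINKING_ONLY
#include "zstd.h"

/* Build one parameter set, apply it to a context ; returns 0 on success. */
static int applySharedParams(ZSTD_CCtx* cctx)
{
    size_t err;
    ZSTD_CCtx_params* const params = ZSTD_createCCtxParams();
    if (params == NULL) return 1;
    ZSTD_resetCCtxParams(params);   /* back to default values, as documented above */
    ZSTD_CCtxParam_setParameter(params, ZSTD_p_compressionLevel, 19);
    ZSTD_CCtxParam_setParameter(params, ZSTD_p_enableLongDistanceMatching, 1);
    err = ZSTD_CCtx_setParametersUsingCCtxParams(cctx, params);
    /* with nbWorkers >= 1, these take effect at the next job, minus the listed restrictions */
    ZSTD_freeCCtxParams(params);
    return ZSTD_isError(err) ? 1 : 0;
}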
programs/bench.c
@@ -22,7 +22,7 @@
 *  Compiler Warnings
 ****************************************/
 #ifdef _MSC_VER
 #  pragma warning(disable : 4127)   /* disable: C4127: conditional expression is constant */
 #endif


@@ -34,6 +34,7 @@
 #include <stdlib.h>       /* malloc, free */
 #include <string.h>       /* memset */
 #include <stdio.h>        /* fprintf, fopen */
+#include <assert.h>       /* assert */

 #include "mem.h"
 #define ZSTD_STATIC_LINKING_ONLY
@@ -51,8 +52,9 @@
 #  define ZSTD_GIT_COMMIT_STRING ZSTD_EXPAND_AND_QUOTE(ZSTD_GIT_COMMIT)
 #endif

-#define TIMELOOP_MICROSEC       1*1000000ULL      /* 1 second */
-#define ACTIVEPERIOD_MICROSEC  70*1000000ULL      /* 70 seconds */
+#define TIMELOOP_MICROSEC     (1*1000000ULL)      /* 1 second */
+#define TIMELOOP_NANOSEC      (1*1000000000ULL)   /* 1 second */
+#define ACTIVEPERIOD_MICROSEC (70*TIMELOOP_MICROSEC)   /* 70 seconds */
 #define COOLPERIOD_SEC        10

 #define KB *(1 <<10)
@@ -122,12 +124,12 @@ void BMK_setBlockSize(size_t blockSize)

 void BMK_setDecodeOnlyMode(unsigned decodeFlag) { g_decodeOnly = (decodeFlag>0); }

-static U32 g_nbThreads = 1;
-void BMK_setNbThreads(unsigned nbThreads) {
+static U32 g_nbWorkers = 0;
+void BMK_setNbWorkers(unsigned nbWorkers) {
 #ifndef ZSTD_MULTITHREAD
-    if (nbThreads > 1) DISPLAYLEVEL(2, "Note : multi-threading is disabled \n");
+    if (nbWorkers > 0) DISPLAYLEVEL(2, "Note : multi-threading is disabled \n");
 #endif
-    g_nbThreads = nbThreads;
+    g_nbWorkers = nbWorkers;
 }

 static U32 g_realTime = 0;
@@ -264,7 +266,9 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
     {   U64 fastestC = (U64)(-1LL), fastestD = (U64)(-1LL);
         U64 const crcOrig = g_decodeOnly ? 0 : XXH64(srcBuffer, srcSize, 0);
         UTIL_time_t coolTime;
-        U64 const maxTime = (g_nbSeconds * TIMELOOP_MICROSEC) + 1;
+        U64 const maxTime = (g_nbSeconds * TIMELOOP_NANOSEC) + 1;
+        U32 nbDecodeLoops = (U32)((100 MB) / (srcSize+1)) + 1;      /* initial conservative speed estimate */
+        U32 nbCompressionLoops = (U32)((2 MB) / (srcSize+1)) + 1;   /* initial conservative speed estimate */
         U64 totalCTime=0, totalDTime=0;
         U32 cCompleted=g_decodeOnly, dCompleted=0;
 #       define NB_MARKS 4
@@ -283,19 +287,17 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
         }

         if (!g_decodeOnly) {
-            UTIL_time_t clockStart;
             /* Compression */
             DISPLAYLEVEL(2, "%2s-%-17.17s :%10u ->\r", marks[markNb], displayName, (U32)srcSize);
             if (!cCompleted) memset(compressedBuffer, 0xE5, maxCompressedSize);  /* warm up and erase result buffer */

-            UTIL_sleepMilli(1);  /* give processor time to other processes */
+            UTIL_sleepMilli(5);  /* give processor time to other processes */
             UTIL_waitForNextTick();
-            clockStart = UTIL_getTime();

             if (!cCompleted) {   /* still some time to do compression tests */
-                U64 const clockLoop = g_nbSeconds ? TIMELOOP_MICROSEC : 1;
                 U32 nbLoops = 0;
-                ZSTD_CCtx_setParameter(ctx, ZSTD_p_nbThreads, g_nbThreads);
+                UTIL_time_t const clockStart = UTIL_getTime();
+                ZSTD_CCtx_setParameter(ctx, ZSTD_p_nbWorkers, g_nbWorkers);
                 ZSTD_CCtx_setParameter(ctx, ZSTD_p_compressionLevel, cLevel);
                 ZSTD_CCtx_setParameter(ctx, ZSTD_p_enableLongDistanceMatching, g_ldmFlag);
                 ZSTD_CCtx_setParameter(ctx, ZSTD_p_ldmMinMatch, g_ldmMinMatch);
@@ -314,7 +316,9 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
                 ZSTD_CCtx_setParameter(ctx, ZSTD_p_targetLength, comprParams->targetLength);
                 ZSTD_CCtx_setParameter(ctx, ZSTD_p_compressionStrategy, comprParams->strategy);
                 ZSTD_CCtx_loadDictionary(ctx, dictBuffer, dictBufferSize);
-                do {
+                if (!g_nbSeconds) nbCompressionLoops=1;
+                for (nbLoops=0; nbLoops<nbCompressionLoops; nbLoops++) {
                     U32 blockNb;
                     for (blockNb=0; blockNb<nbBlocks; blockNb++) {
 #if 0   /* direct compression function, for occasional comparison */
@@ -343,12 +347,16 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
                         }
                         blockTable[blockNb].cSize = out.pos;
 #endif
+                }   }
+                {   U64 const loopDuration = UTIL_clockSpanNano(clockStart);
+                    if (loopDuration > 0) {
+                        if (loopDuration < fastestC * nbCompressionLoops)
+                            fastestC = loopDuration / nbCompressionLoops;
+                        nbCompressionLoops = (U32)(TIMELOOP_NANOSEC / fastestC) + 1;
+                    } else {
+                        assert(nbCompressionLoops < 40000000);   /* avoid overflow */
+                        nbCompressionLoops *= 100;
                     }
-                    nbLoops++;
-                } while (UTIL_clockSpanMicro(clockStart) < clockLoop);
-                {   U64 const loopDuration = UTIL_clockSpanMicro(clockStart);
-                    if (loopDuration < fastestC*nbLoops)
-                        fastestC = loopDuration / nbLoops;
                     totalCTime += loopDuration;
                     cCompleted = (totalCTime >= maxTime);   /* end compression tests */
             }   }
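The benchmark change above swaps a fixed-duration `do { } while (elapsed < clockLoop)` for a self-calibrating batch: time nbLoops iterations at nanosecond resolution, keep the best per-iteration cost, and resize the batch so the next measurement lasts roughly TIMELOOP_NANOSEC. A generic, self-contained sketch of that scheme (the workload and clock are stand-ins, not bench.c code):

/* self-calibrating timing loop, generic sketch (POSIX clock as a stand-in) */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define TARGET_NSEC 1000000000ULL   /* aim each measured batch at ~1 second, like TIMELOOP_NANOSEC */

static volatile uint64_t g_sink;
static void workload(void) { g_sink += 1; }   /* trivial stand-in for compression */

static uint64_t nowNano(void)                 /* stand-in for UTIL_clockSpanNano()'s clock */
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Returns the best observed cost of one workload() call, in nanoseconds. */
static uint64_t measureBest(unsigned rounds)
{
    uint64_t fastest = (uint64_t)-1;
    uint32_t nbLoops = 1;                     /* conservative initial estimate */
    unsigned r;
    for (r = 0; r < rounds; r++) {
        uint64_t const start = nowNano();
        uint32_t i;
        for (i = 0; i < nbLoops; i++) workload();
        {   uint64_t const duration = nowNano() - start;
            if (duration > 0) {
                if (duration < fastest * nbLoops)
                    fastest = duration / nbLoops;
                nbLoops = (uint32_t)(TARGET_NSEC / fastest) + 1;   /* recalibrate batch size */
            } else {
                assert(nbLoops < 40000000);   /* avoid overflow, as in the patch */
                nbLoops *= 100;               /* clock too coarse : enlarge the batch */
            }
        }
    }
    return fastest;
}

int main(void)
{
    printf("best cost : %llu ns/call\n", (unsigned long long)measureBest(3));
    return 0;
}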
@@ -358,7 +366,7 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
             ratio = (double)srcSize / (double)cSize;
             markNb = (markNb+1) % NB_MARKS;
             {   int const ratioAccuracy = (ratio < 10.) ? 3 : 2;
-                double const compressionSpeed = (double)srcSize / fastestC;
+                double const compressionSpeed = ((double)srcSize / fastestC) * 1000;
                 int const cSpeedAccuracy = (compressionSpeed < 10.) ? 2 : 1;
                 DISPLAYLEVEL(2, "%2s-%-17.17s :%10u ->%10u (%5.*f),%6.*f MB/s\r",
                         marks[markNb], displayName, (U32)srcSize, (U32)cSize,
@@ -376,16 +384,16 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
             /* Decompression */
             if (!dCompleted) memset(resultBuffer, 0xD6, srcSize);   /* warm result buffer */

-            UTIL_sleepMilli(1);  /* give processor time to other processes */
+            UTIL_sleepMilli(5);  /* give processor time to other processes */
             UTIL_waitForNextTick();

             if (!dCompleted) {
-                U64 clockLoop = g_nbSeconds ? TIMELOOP_MICROSEC : 1;
                 U32 nbLoops = 0;
                 ZSTD_DDict* const ddict = ZSTD_createDDict(dictBuffer, dictBufferSize);
                 UTIL_time_t const clockStart = UTIL_getTime();
                 if (!ddict) EXM_THROW(2, "ZSTD_createDDict() allocation failure");
-                do {
+                if (!g_nbSeconds) nbDecodeLoops = 1;
+                for (nbLoops=0; nbLoops < nbDecodeLoops; nbLoops++) {
                     U32 blockNb;
                     for (blockNb=0; blockNb<nbBlocks; blockNb++) {
                         size_t const regenSize = ZSTD_decompress_usingDDict(dctx,
@@ -397,22 +405,26 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
                                 blockNb, (U32)blockTable[blockNb].cSize, ZSTD_getErrorName(regenSize));
                         }
                         blockTable[blockNb].resSize = regenSize;
-                    }
-                    nbLoops++;
-                } while (UTIL_clockSpanMicro(clockStart) < clockLoop);
+                }   }
                 ZSTD_freeDDict(ddict);
-                {   U64 const loopDuration = UTIL_clockSpanMicro(clockStart);
-                    if (loopDuration < fastestD*nbLoops)
-                        fastestD = loopDuration / nbLoops;
+                {   U64 const loopDuration = UTIL_clockSpanNano(clockStart);
+                    if (loopDuration > 0) {
+                        if (loopDuration < fastestD * nbDecodeLoops)
+                            fastestD = loopDuration / nbDecodeLoops;
+                        nbDecodeLoops = (U32)(TIMELOOP_NANOSEC / fastestD) + 1;
+                    } else {
+                        assert(nbDecodeLoops < 40000000);   /* avoid overflow */
+                        nbDecodeLoops *= 100;
+                    }
                     totalDTime += loopDuration;
                     dCompleted = (totalDTime >= maxTime);
             }   }

             markNb = (markNb+1) % NB_MARKS;
             {   int const ratioAccuracy = (ratio < 10.) ? 3 : 2;
-                double const compressionSpeed = (double)srcSize / fastestC;
+                double const compressionSpeed = ((double)srcSize / fastestC) * 1000;
                 int const cSpeedAccuracy = (compressionSpeed < 10.) ? 2 : 1;
-                double const decompressionSpeed = (double)srcSize / fastestD;
+                double const decompressionSpeed = ((double)srcSize / fastestD) * 1000;
                 DISPLAYLEVEL(2, "%2s-%-17.17s :%10u ->%10u (%5.*f),%6.*f MB/s ,%6.1f MB/s \r",
                         marks[markNb], displayName, (U32)srcSize, (U32)cSize,
                         ratioAccuracy, ratio,
@@ -461,8 +473,8 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
     }   /* for (testNb = 1; testNb <= (g_nbSeconds + !g_nbSeconds); testNb++) */

     if (g_displayLevel == 1) {   /* hidden display mode -q, used by python speed benchmark */
-        double const cSpeed = (double)srcSize / fastestC;
-        double const dSpeed = (double)srcSize / fastestD;
+        double const cSpeed = ((double)srcSize / fastestC) * 1000;
+        double const dSpeed = ((double)srcSize / fastestD) * 1000;
         if (g_additionalParam)
             DISPLAY("-%-3i%11i (%5.3f) %6.2f MB/s %6.1f MB/s  %s (param=%d)\n", cLevel, (int)cSize, ratio, cSpeed, dSpeed, displayName, g_additionalParam);
         else
@@ -634,7 +646,8 @@ static void BMK_benchFileTable(const char* const * const fileNamesTable, unsigne
 }


-static void BMK_syntheticTest(int cLevel, int cLevelLast, double compressibility, const ZSTD_compressionParameters* compressionParams)
+static void BMK_syntheticTest(int cLevel, int cLevelLast, double compressibility,
+                              const ZSTD_compressionParameters* compressionParams)
 {
     char name[20] = {0};
     size_t benchedSize = 10000000;

programs/bench.h
@@ -22,7 +22,7 @@ int BMK_benchFiles(const char** fileNamesTable, unsigned nbFiles, const char* di
 /* Set Parameters */
 void BMK_setNbSeconds(unsigned nbLoops);
 void BMK_setBlockSize(size_t blockSize);
-void BMK_setNbThreads(unsigned nbThreads);
+void BMK_setNbWorkers(unsigned nbWorkers);
 void BMK_setRealTime(unsigned priority);
 void BMK_setNotificationLevel(unsigned level);
 void BMK_setSeparateFiles(unsigned separate);

programs/fileio.c
@@ -36,18 +36,21 @@
 #  include <io.h>
 #endif

-#include "bitstream.h"
 #include "mem.h"
 #include "fileio.h"
 #include "util.h"

 #define ZSTD_STATIC_LINKING_ONLY   /* ZSTD_magicNumber, ZSTD_frameHeaderSize_max */
 #include "zstd.h"
+#include "zstd_errors.h"           /* ZSTD_error_frameParameter_windowTooLarge */

 #if defined(ZSTD_GZCOMPRESS) || defined(ZSTD_GZDECOMPRESS)
 #  include <zlib.h>
 #  if !defined(z_const)
 #    define z_const
 #  endif
 #endif

 #if defined(ZSTD_LZMACOMPRESS) || defined(ZSTD_LZMADECOMPRESS)
 #  include <lzma.h>
 #endif
@@ -215,23 +218,23 @@ static U32 g_removeSrcFile = 0;
 void FIO_setRemoveSrcFile(unsigned flag) { g_removeSrcFile = (flag>0); }
 static U32 g_memLimit = 0;
 void FIO_setMemLimit(unsigned memLimit) { g_memLimit = memLimit; }
-static U32 g_nbThreads = 1;
-void FIO_setNbThreads(unsigned nbThreads) {
+static U32 g_nbWorkers = 1;
+void FIO_setNbWorkers(unsigned nbWorkers) {
 #ifndef ZSTD_MULTITHREAD
-    if (nbThreads > 1) DISPLAYLEVEL(2, "Note : multi-threading is disabled \n");
+    if (nbWorkers > 0) DISPLAYLEVEL(2, "Note : multi-threading is disabled \n");
 #endif
-    g_nbThreads = nbThreads;
+    g_nbWorkers = nbWorkers;
 }
 static U32 g_blockSize = 0;
 void FIO_setBlockSize(unsigned blockSize) {
-    if (blockSize && g_nbThreads==1)
+    if (blockSize && g_nbWorkers==0)
         DISPLAYLEVEL(2, "Setting block size is useless in single-thread mode \n");
     g_blockSize = blockSize;
 }
 #define FIO_OVERLAP_LOG_NOTSET 9999
 static U32 g_overlapLog = FIO_OVERLAP_LOG_NOTSET;
 void FIO_setOverlapLog(unsigned overlapLog){
-    if (overlapLog && g_nbThreads==1)
+    if (overlapLog && g_nbWorkers==0)
         DISPLAYLEVEL(2, "Setting overlapLog is useless in single-thread mode \n");
     g_overlapLog = overlapLog;
 }
@@ -427,7 +430,7 @@ static cRess_t FIO_createCResources(const char* dictFileName, int cLevel,
     if (!ress.srcBuffer || !ress.dstBuffer)
         EXM_THROW(31, "allocation error : not enough memory");

-    /* Advances parameters, including dictionary */
+    /* Advanced parameters, including dictionary */
     {   void* dictBuffer;
         size_t const dictBuffSize = FIO_createDictBuffer(&dictBuffer, dictFileName);   /* works with dictFileName==NULL */
         if (dictFileName && (dictBuffer==NULL))
@@ -439,8 +442,7 @@ static cRess_t FIO_createCResources(const char* dictFileName, int cLevel,
         /* compression level */
         CHECK( ZSTD_CCtx_setParameter(ress.cctx, ZSTD_p_compressionLevel, cLevel) );
         /* long distance matching */
-        CHECK( ZSTD_CCtx_setParameter(
-                  ress.cctx, ZSTD_p_enableLongDistanceMatching, g_ldmFlag) );
+        CHECK( ZSTD_CCtx_setParameter(ress.cctx, ZSTD_p_enableLongDistanceMatching, g_ldmFlag) );
         CHECK( ZSTD_CCtx_setParameter(ress.cctx, ZSTD_p_ldmHashLog, g_ldmHashLog) );
         CHECK( ZSTD_CCtx_setParameter(ress.cctx, ZSTD_p_ldmMinMatch, g_ldmMinMatch) );
         if (g_ldmBucketSizeLog != FIO_LDM_PARAM_NOTSET) {
@@ -458,13 +460,12 @@ static cRess_t FIO_createCResources(const char* dictFileName, int cLevel,
         CHECK( ZSTD_CCtx_setParameter(ress.cctx, ZSTD_p_targetLength, comprParams->targetLength) );
         CHECK( ZSTD_CCtx_setParameter(ress.cctx, ZSTD_p_compressionStrategy, (U32)comprParams->strategy) );
         /* multi-threading */
-        DISPLAYLEVEL(5,"set nb threads = %u \n", g_nbThreads);
-        CHECK( ZSTD_CCtx_setParameter(ress.cctx, ZSTD_p_nbThreads, g_nbThreads) );
 #ifdef ZSTD_MULTITHREAD
-        CHECK( ZSTD_CCtx_setParameter(ress.cctx, ZSTD_p_nonBlockingMode, 1) );
+        DISPLAYLEVEL(5,"set nb workers = %u \n", g_nbWorkers);
+        CHECK( ZSTD_CCtx_setParameter(ress.cctx, ZSTD_p_nbWorkers, g_nbWorkers) );
 #endif
         /* dictionary */
-        CHECK( ZSTD_CCtx_setPledgedSrcSize(ress.cctx, srcSize) );   /* just for dictionary loading, for compression parameters adaptation */
+        CHECK( ZSTD_CCtx_setPledgedSrcSize(ress.cctx, srcSize) );   /* set the value temporarily for dictionary loading, to adapt compression parameters */
         CHECK( ZSTD_CCtx_loadDictionary(ress.cctx, dictBuffer, dictBuffSize) );
         CHECK( ZSTD_CCtx_setPledgedSrcSize(ress.cctx, ZSTD_CONTENTSIZE_UNKNOWN) );   /* reset */

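The pledged-size sequence at the end of this hunk is deliberate: declaring the source size before ZSTD_CCtx_loadDictionary() lets parameter adaptation see the real input size, and resetting it to ZSTD_CONTENTSIZE_UNKNOWN afterwards keeps later streaming honest. Reduced to its skeleton (a sketch; error checks elided):

/* pledged-size skeleton (sketch ; error checks elided) */
#define ZSTD_STATIC_LINKING_ONLY
#include "zstd.h"

static void setupDictionary(ZSTD_CCtx* cctx,
                            const void* dictBuffer, size_t dictSize,
                            unsigned long long srcSize)
{
    ZSTD_CCtx_setPledgedSrcSize(cctx, srcSize);   /* temporary : guides parameter adaptation */
    ZSTD_CCtx_loadDictionary(cctx, dictBuffer, dictSize);
    ZSTD_CCtx_setPledgedSrcSize(cctx, ZSTD_CONTENTSIZE_UNKNOWN);   /* reset before streaming */
}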
@@ -735,56 +736,22 @@ static unsigned long long FIO_compressLz4Frame(cRess_t* ress,
  * @return : 0 : compression completed correctly,
  *           1 : missing or pb opening srcFileName
  */
-static int FIO_compressFilename_internal(cRess_t ress,
-                                         const char* dstFileName, const char* srcFileName, int compressionLevel)
+static unsigned long long
+FIO_compressZstdFrame(const cRess_t* ressPtr,
+                      const char* srcFileName, U64 fileSize,
+                      int compressionLevel, U64* readsize)
 {
+    cRess_t const ress = *ressPtr;
     FILE* const srcFile = ress.srcFile;
     FILE* const dstFile = ress.dstFile;
-    U64 readsize = 0;
     U64 compressedfilesize = 0;
-    U64 const fileSize = UTIL_getFileSize(srcFileName);
     ZSTD_EndDirective directive = ZSTD_e_continue;
-    DISPLAYLEVEL(5, "%s: %u bytes \n", srcFileName, (U32)fileSize);
+    DISPLAYLEVEL(6, "compression using zstd format \n");

-    switch (g_compressionType) {
-        case FIO_zstdCompression:
-            break;
-
-        case FIO_gzipCompression:
-#ifdef ZSTD_GZCOMPRESS
-            compressedfilesize = FIO_compressGzFrame(&ress, srcFileName, fileSize, compressionLevel, &readsize);
-#else
-            (void)compressionLevel;
-            EXM_THROW(20, "zstd: %s: file cannot be compressed as gzip (zstd compiled without ZSTD_GZCOMPRESS) -- ignored \n",
-                            srcFileName);
-#endif
-            goto finish;
-
-        case FIO_xzCompression:
-        case FIO_lzmaCompression:
-#ifdef ZSTD_LZMACOMPRESS
-            compressedfilesize = FIO_compressLzmaFrame(&ress, srcFileName, fileSize, compressionLevel, &readsize, g_compressionType==FIO_lzmaCompression);
-#else
-            (void)compressionLevel;
-            EXM_THROW(20, "zstd: %s: file cannot be compressed as xz/lzma (zstd compiled without ZSTD_LZMACOMPRESS) -- ignored \n",
-                            srcFileName);
-#endif
-            goto finish;
-
-        case FIO_lz4Compression:
-#ifdef ZSTD_LZ4COMPRESS
-            compressedfilesize = FIO_compressLz4Frame(&ress, srcFileName, fileSize, compressionLevel, &readsize);
-#else
-            (void)compressionLevel;
-            EXM_THROW(20, "zstd: %s: file cannot be compressed as lz4 (zstd compiled without ZSTD_LZ4COMPRESS) -- ignored \n",
-                            srcFileName);
-#endif
-            goto finish;
-    }

     /* init */
     if (fileSize != UTIL_FILESIZE_UNKNOWN)
         ZSTD_CCtx_setPledgedSrcSize(ress.cctx, fileSize);
+    (void)compressionLevel; (void)srcFileName;

     /* Main compression loop */
     do {
@@ -793,9 +760,9 @@ static int FIO_compressFilename_internal(cRess_t ress,
         size_t const inSize = fread(ress.srcBuffer, (size_t)1, ress.srcBufferSize, srcFile);
         ZSTD_inBuffer inBuff = { ress.srcBuffer, inSize, 0 };
         DISPLAYLEVEL(6, "fread %u bytes from source \n", (U32)inSize);
-        readsize += inSize;
+        *readsize += inSize;

-        if (inSize == 0 || (fileSize != UTIL_FILESIZE_UNKNOWN && readsize == fileSize))
+        if ((inSize == 0) || (*readsize == fileSize))
             directive = ZSTD_e_end;

         result = 1;
@@ -809,12 +776,13 @@ static int FIO_compressFilename_internal(cRess_t ress,
         if (outBuff.pos) {
             size_t const sizeCheck = fwrite(ress.dstBuffer, 1, outBuff.pos, dstFile);
             if (sizeCheck!=outBuff.pos)
-                EXM_THROW(25, "Write error : cannot write compressed block into %s", dstFileName);
+                EXM_THROW(25, "Write error : cannot write compressed block");
             compressedfilesize += outBuff.pos;
         }
         if (READY_FOR_UPDATE()) {
             ZSTD_frameProgression const zfp = ZSTD_getFrameProgression(ress.cctx);
-            DISPLAYUPDATE(2, "\rRead :%6u MB - Consumed :%6u MB - Compressed :%6u MB => %.2f%%",
+            DISPLAYUPDATE(2, "\r(%i) Read :%6u MB - Consumed :%6u MB - Compressed :%6u MB => %.2f%%",
+                            compressionLevel,
                             (U32)(zfp.ingested >> 20),
                             (U32)(zfp.consumed >> 20),
                             (U32)(zfp.produced >> 20),
@@ -823,10 +791,67 @@ static int FIO_compressFilename_internal(cRess_t ress,
         }
     } while (directive != ZSTD_e_end);

-finish:
+    return compressedfilesize;
+}
+
+/*! FIO_compressFilename_internal() :
+ *  same as FIO_compressFilename_extRess(), with `ress.dstFile` already opened.
+ *  @return : 0 : compression completed correctly,
+ *            1 : missing or pb opening srcFileName
+ */
+static int
+FIO_compressFilename_internal(cRess_t ress,
+                              const char* dstFileName, const char* srcFileName,
+                              int compressionLevel)
+{
+    U64 readsize = 0;
+    U64 compressedfilesize = 0;
+    U64 const fileSize = UTIL_getFileSize(srcFileName);
+    DISPLAYLEVEL(5, "%s: %u bytes \n", srcFileName, (U32)fileSize);
+
+    /* compression format selection */
+    switch (g_compressionType) {
+        default:
+        case FIO_zstdCompression:
+            compressedfilesize = FIO_compressZstdFrame(&ress, srcFileName, fileSize, compressionLevel, &readsize);
+            break;
+
+        case FIO_gzipCompression:
+#ifdef ZSTD_GZCOMPRESS
+            compressedfilesize = FIO_compressGzFrame(&ress, srcFileName, fileSize, compressionLevel, &readsize);
+#else
+            (void)compressionLevel;
+            EXM_THROW(20, "zstd: %s: file cannot be compressed as gzip (zstd compiled without ZSTD_GZCOMPRESS) -- ignored \n",
+                            srcFileName);
+#endif
+            break;
+
+        case FIO_xzCompression:
+        case FIO_lzmaCompression:
+#ifdef ZSTD_LZMACOMPRESS
+            compressedfilesize = FIO_compressLzmaFrame(&ress, srcFileName, fileSize, compressionLevel, &readsize, g_compressionType==FIO_lzmaCompression);
+#else
+            (void)compressionLevel;
+            EXM_THROW(20, "zstd: %s: file cannot be compressed as xz/lzma (zstd compiled without ZSTD_LZMACOMPRESS) -- ignored \n",
+                            srcFileName);
+#endif
+            break;
+
+        case FIO_lz4Compression:
+#ifdef ZSTD_LZ4COMPRESS
+            compressedfilesize = FIO_compressLz4Frame(&ress, srcFileName, fileSize, compressionLevel, &readsize);
+#else
+            (void)compressionLevel;
+            EXM_THROW(20, "zstd: %s: file cannot be compressed as lz4 (zstd compiled without ZSTD_LZ4COMPRESS) -- ignored \n",
+                            srcFileName);
+#endif
+            break;
+    }
+
     /* Status */
     DISPLAYLEVEL(2, "\r%79s\r", "");
-    DISPLAYLEVEL(2,"%-20s :%6.2f%%   (%6llu => %6llu bytes, %s) \n", srcFileName,
+    DISPLAYLEVEL(2,"%-20s :%6.2f%%   (%6llu => %6llu bytes, %s) \n",
+        srcFileName,
         (double)compressedfilesize / (readsize+(!readsize)/*avoid div by zero*/) * 100,
         (unsigned long long)readsize, (unsigned long long) compressedfilesize,
         dstFileName);
@@ -1142,33 +1167,46 @@ static unsigned FIO_passThrough(FILE* foutput, FILE* finput, void* buffer, size_
     return 0;
 }

-static void FIO_zstdErrorHelp(dRess_t* ress, size_t ret, char const* srcFileName)
+/* FIO_highbit64() :
+ * gives position of highest bit.
+ * note : only works for v > 0 !
+ */
+static unsigned FIO_highbit64(unsigned long long v)
+{
+    unsigned count = 0;
+    assert(v != 0);
+    v >>= 1;
+    while (v) { v >>= 1; count++; }
+    return count;
+}
+
+/* FIO_zstdErrorHelp() :
+ * detailed error message when requested window size is too large */
+static void FIO_zstdErrorHelp(dRess_t* ress, size_t err, char const* srcFileName)
 {
     ZSTD_frameHeader header;
-    /* No special help for these errors */
-    if (ZSTD_getErrorCode(ret) != ZSTD_error_frameParameter_windowTooLarge)
+
+    /* Help message only for one specific error */
+    if (ZSTD_getErrorCode(err) != ZSTD_error_frameParameter_windowTooLarge)
         return;

     /* Try to decode the frame header */
-    ret = ZSTD_getFrameHeader(&header, ress->srcBuffer, ress->srcBufferLoaded);
-    if (ret == 0) {
-        U32 const windowSize = (U32)header.windowSize;
-        U32 const windowLog = BIT_highbit32(windowSize) + ((windowSize & (windowSize - 1)) != 0);
-        U32 const windowMB = (windowSize >> 20) + ((windowSize & ((1 MB) - 1)) != 0);
-        assert(header.windowSize <= (U64)((U32)-1));
+    err = ZSTD_getFrameHeader(&header, ress->srcBuffer, ress->srcBufferLoaded);
+    if (err == 0) {
+        unsigned long long const windowSize = header.windowSize;
+        U32 const windowLog = FIO_highbit64(windowSize) + ((windowSize & (windowSize - 1)) != 0);
+        U32 const windowMB = (U32)((windowSize >> 20) + ((windowSize & ((1 MB) - 1)) != 0));
+        assert(windowSize < (U64)(1ULL << 52));
         assert(g_memLimit > 0);
         DISPLAYLEVEL(1, "%s : Window size larger than maximum : %llu > %u\n",
-                        srcFileName, header.windowSize, g_memLimit);
+                        srcFileName, windowSize, g_memLimit);
         if (windowLog <= ZSTD_WINDOWLOG_MAX) {
             DISPLAYLEVEL(1, "%s : Use --long=%u or --memory=%uMB\n",
                             srcFileName, windowLog, windowMB);
             return;
         }
-    } else if (ZSTD_getErrorCode(ret) != ZSTD_error_frameParameter_windowTooLarge) {
-        DISPLAYLEVEL(1, "%s : Error decoding frame header to read window size : %s\n",
-                        srcFileName, ZSTD_getErrorName(ret));
-        return;
     }
-    DISPLAYLEVEL(1, "%s : Window log larger than ZSTD_WINDOWLOG_MAX=%u not supported\n",
+    DISPLAYLEVEL(1, "%s : Window log larger than ZSTD_WINDOWLOG_MAX=%u; not supported\n",
                     srcFileName, ZSTD_WINDOWLOG_MAX);
 }

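FIO_highbit64() returns floor(log2(v)); the call site adds 1 whenever windowSize is not a power of two, which rounds the log up, so a 3 MiB window reports windowLog 22. A self-contained check of that arithmetic, using a local copy of the helper:

/* windowLog arithmetic check (self-contained ; local copy of the helper) */
#include <assert.h>
#include <stdio.h>

static unsigned highbit64(unsigned long long v)   /* same body as FIO_highbit64() */
{
    unsigned count = 0;
    assert(v != 0);
    v >>= 1;
    while (v) { v >>= 1; count++; }
    return count;
}

int main(void)
{
    unsigned long long const windowSize = 3ULL << 20;   /* 3 MiB : not a power of 2 */
    unsigned const windowLog = highbit64(windowSize)
                             + ((windowSize & (windowSize - 1)) != 0);    /* round up */
    printf("windowSize=%llu -> windowLog=%u\n", windowSize, windowLog);   /* prints 22 */
    return 0;
}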
@@ -54,7 +54,7 @@ void FIO_setDictIDFlag(unsigned dictIDFlag);
 void FIO_setChecksumFlag(unsigned checksumFlag);
 void FIO_setRemoveSrcFile(unsigned flag);
 void FIO_setMemLimit(unsigned memLimit);
-void FIO_setNbThreads(unsigned nbThreads);
+void FIO_setNbWorkers(unsigned nbWorkers);
 void FIO_setBlockSize(unsigned blockSize);
 void FIO_setOverlapLog(unsigned overlapLog);
 void FIO_setLdmFlag(unsigned ldmFlag);
@@ -142,7 +142,9 @@ static int g_utilDisplayLevel;
     }
     return 1000000000ULL*(clockEnd.QuadPart - clockStart.QuadPart)/ticksPerSecond.QuadPart;
 }
 
 #elif defined(__APPLE__) && defined(__MACH__)
 
 #include <mach/mach_time.h>
 #define UTIL_TIME_INITIALIZER 0
 typedef U64 UTIL_time_t;
@@ -167,7 +169,9 @@ static int g_utilDisplayLevel;
     }
     return ((clockEnd - clockStart) * (U64)rate.numer) / ((U64)rate.denom);
 }
 
 #elif (PLATFORM_POSIX_VERSION >= 200112L) && (defined __UCLIBC__ || ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 17) || __GLIBC__ > 2))
 
 #define UTIL_TIME_INITIALIZER { 0, 0 }
 typedef struct timespec UTIL_freq_t;
 typedef struct timespec UTIL_time_t;
@@ -217,12 +221,18 @@ static int g_utilDisplayLevel;
 #define SEC_TO_MICRO 1000000
 
 /* returns time span in microseconds */
-UTIL_STATIC U64 UTIL_clockSpanMicro( UTIL_time_t clockStart )
+UTIL_STATIC U64 UTIL_clockSpanMicro(UTIL_time_t clockStart )
 {
     UTIL_time_t const clockEnd = UTIL_getTime();
     return UTIL_getSpanTimeMicro(clockStart, clockEnd);
 }
 
+/* returns time span in nanoseconds */
+UTIL_STATIC U64 UTIL_clockSpanNano(UTIL_time_t clockStart )
+{
+    UTIL_time_t const clockEnd = UTIL_getTime();
+    return UTIL_getSpanTimeNano(clockStart, clockEnd);
+}
+
 UTIL_STATIC void UTIL_waitForNextTick(void)
 {
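UTIL_clockSpanNano() follows the same shape as UTIL_clockSpanMicro(), deferring to a per-platform UTIL_getSpanTimeNano(). A hedged sketch of the timespec backend, assuming POSIX clock_gettime; the real util.h selects QueryPerformanceCounter, mach_absolute_time, or timespec at compile time:

#include <stdio.h>
#include <time.h>

typedef unsigned long long U64;

/* nanosecond span between two timespec readings */
static U64 spanTimeNano(struct timespec begin, struct timespec end)
{
    U64 span = (U64)(end.tv_sec - begin.tv_sec) * 1000000000ULL;
    span += (U64)end.tv_nsec;
    span -= (U64)begin.tv_nsec;   /* unsigned wraparound keeps this exact */
    return span;
}

int main(void)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    /* ... code under measurement ... */
    clock_gettime(CLOCK_MONOTONIC, &t1);
    printf("elapsed : %llu ns\n", spanTimeNano(t0, t1));
    return 0;
}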
@@ -116,10 +116,16 @@ the last one takes effect.
     Note: If `windowLog` is set to larger than 27, `--long=windowLog` or
     `--memory=windowSize` needs to be passed to the decompressor.
 * `-T#`, `--threads=#`:
-    Compress using `#` threads (default: 1).
+    Compress using `#` working threads (default: 1).
     If `#` is 0, attempt to detect and use the number of physical CPU cores.
-    In all cases, the nb of threads is capped to ZSTDMT_NBTHREADS_MAX==256.
+    In all cases, the nb of threads is capped to ZSTDMT_NBTHREADS_MAX==200.
     This modifier does nothing if `zstd` is compiled without multithread support.
+* `--single-thread`:
+    Does not spawn a thread for compression, use caller thread instead.
+    This is the only available mode when multithread support is disabled.
+    In this mode, compression is serialized with I/O.
+    (This is different from `-T1`, which spawns 1 compression thread in parallel of I/O).
+    Single-thread mode also features lower memory usage.
 * `-D file`:
     use `file` as Dictionary to compress or decompress FILE(s)
 * `--nodictID`:
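The distinction the new man-page entry draws between caller-thread and one-worker operation maps onto the library's nbWorkers parameter. A sketch using the experimental API names that appear elsewhere in this diff; the mapping is an interpretation of the man-page text, not quoted from the patch:

#define ZSTD_STATIC_LINKING_ONLY   /* experimental API of this era */
#include <zstd.h>

/* nbWorkers == 0 : compress in the calling thread (--single-thread);
 * nbWorkers == 1 : one worker thread, overlapping compression with I/O (-T1). */
static void setThreadingMode(ZSTD_CCtx* cctx, int singleThread)
{
    ZSTD_CCtx_setParameter(cctx, ZSTD_p_nbWorkers, singleThread ? 0 : 1);
}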
@@ -135,7 +135,7 @@ static int usage_advanced(const char* programName)
     DISPLAY( "--ultra : enable levels beyond %i, up to %i (requires more memory)\n", ZSTDCLI_CLEVEL_MAX, ZSTD_maxCLevel());
     DISPLAY( "--long[=#] : enable long distance matching with given window log (default: %u)\n", g_defaultMaxWindowLog);
 #ifdef ZSTD_MULTITHREAD
-    DISPLAY( " -T# : use # threads for compression (default: 1) \n");
+    DISPLAY( " -T# : spawns # compression threads (default: 1) \n");
     DISPLAY( " -B# : select size of each job (default: 0==automatic) \n");
 #endif
     DISPLAY( "--no-dictID : don't write dictID into header (dictionary compression)\n");
@@ -366,21 +366,22 @@ typedef enum { zom_compress, zom_decompress, zom_test, zom_bench, zom_train, zom
 int main(int argCount, const char* argv[])
 {
     int argNb,
-        forceStdout=0,
-        followLinks=0,
-        main_pause=0,
-        nextEntryIsDictionary=0,
-        operationResult=0,
-        nextArgumentIsOutFileName=0,
-        nextArgumentIsMaxDict=0,
-        nextArgumentIsDictID=0,
-        nextArgumentsAreFiles=0,
-        ultra=0,
+        followLinks = 0,
+        forceStdout = 0,
         lastCommand = 0,
-        nbThreads = 1,
-        setRealTimePrio = 0,
+        ldmFlag = 0,
+        main_pause = 0,
+        nbWorkers = 0,
+        nextArgumentIsOutFileName = 0,
+        nextArgumentIsMaxDict = 0,
+        nextArgumentIsDictID = 0,
+        nextArgumentsAreFiles = 0,
+        nextEntryIsDictionary = 0,
+        operationResult = 0,
         separateFiles = 0,
-        ldmFlag = 0;
+        setRealTimePrio = 0,
+        singleThread = 0,
+        ultra=0;
     unsigned bench_nbSeconds = 3;   /* would be better if this value was synchronized from bench */
     size_t blockSize = 0;
     zstd_operation_mode operation = zom_compress;
@@ -418,11 +419,13 @@ int main(int argCount, const char* argv[])
     if (filenameTable==NULL) { DISPLAY("zstd: %s \n", strerror(errno)); exit(1); }
     filenameTable[0] = stdinmark;
     g_displayOut = stderr;
 
     programName = lastNameFromPath(programName);
+#ifdef ZSTD_MULTITHREAD
+    nbWorkers = 1;
+#endif
 
     /* preset behaviors */
-    if (exeNameMatch(programName, ZSTD_ZSTDMT)) nbThreads=0;
+    if (exeNameMatch(programName, ZSTD_ZSTDMT)) nbWorkers=0;
     if (exeNameMatch(programName, ZSTD_UNZSTD)) operation=zom_decompress;
     if (exeNameMatch(programName, ZSTD_CAT)) { operation=zom_decompress; forceStdout=1; FIO_overwriteMode(); outFileName=stdoutmark; g_displayLevel=1; }   /* supports multiple formats */
     if (exeNameMatch(programName, ZSTD_ZCAT)) { operation=zom_decompress; forceStdout=1; FIO_overwriteMode(); outFileName=stdoutmark; g_displayLevel=1; }   /* behave like zcat, also supports multiple formats */
@@ -481,6 +484,7 @@ int main(int argCount, const char* argv[])
     if (!strcmp(argument, "--keep")) { FIO_setRemoveSrcFile(0); continue; }
     if (!strcmp(argument, "--rm")) { FIO_setRemoveSrcFile(1); continue; }
     if (!strcmp(argument, "--priority=rt")) { setRealTimePrio = 1; continue; }
+    if (!strcmp(argument, "--single-thread")) { nbWorkers = 0; singleThread = 1; continue; }
 #ifdef ZSTD_GZCOMPRESS
     if (!strcmp(argument, "--format=gzip")) { suffix = GZ_EXTENSION; FIO_setCompressionType(FIO_gzipCompression); continue; }
 #endif
@@ -515,7 +519,7 @@ int main(int argCount, const char* argv[])
     continue;
     }
 #endif
-    if (longCommandWArg(&argument, "--threads=")) { nbThreads = readU32FromChar(&argument); continue; }
+    if (longCommandWArg(&argument, "--threads=")) { nbWorkers = readU32FromChar(&argument); continue; }
     if (longCommandWArg(&argument, "--memlimit=")) { memLimit = readU32FromChar(&argument); continue; }
     if (longCommandWArg(&argument, "--memory=")) { memLimit = readU32FromChar(&argument); continue; }
     if (longCommandWArg(&argument, "--memlimit-decompress=")) { memLimit = readU32FromChar(&argument); continue; }
@@ -648,7 +652,7 @@ int main(int argCount, const char* argv[])
     /* nb of threads (hidden option) */
     case 'T':
         argument++;
-        nbThreads = readU32FromChar(&argument);
+        nbWorkers = readU32FromChar(&argument);
         break;
 
     /* Dictionary Selection level */
@@ -716,11 +720,13 @@ int main(int argCount, const char* argv[])
     /* Welcome message (if verbose) */
     DISPLAYLEVEL(3, WELCOME_MESSAGE);
 
-    if (nbThreads == 0) {
-        /* try to guess */
-        nbThreads = UTIL_countPhysicalCores();
-        DISPLAYLEVEL(3, "Note: %d physical core(s) detected \n", nbThreads);
+#ifdef ZSTD_MULTITHREAD
+    if ((nbWorkers==0) && (!singleThread)) {
+        /* automatically set # workers based on # of reported cpus */
+        nbWorkers = UTIL_countPhysicalCores();
+        DISPLAYLEVEL(3, "Note: %d physical core(s) detected \n", nbWorkers);
     }
+#endif
 
     g_utilDisplayLevel = g_displayLevel;
     if (!followLinks) {
@@ -763,7 +769,7 @@ int main(int argCount, const char* argv[])
         BMK_setNotificationLevel(g_displayLevel);
         BMK_setSeparateFiles(separateFiles);
         BMK_setBlockSize(blockSize);
-        BMK_setNbThreads(nbThreads);
+        BMK_setNbWorkers(nbWorkers);
         BMK_setRealTime(setRealTimePrio);
         BMK_setNbSeconds(bench_nbSeconds);
         BMK_setLdmFlag(ldmFlag);
@@ -791,7 +797,7 @@ int main(int argCount, const char* argv[])
         zParams.dictID = dictID;
         if (cover) {
             int const optimize = !coverParams.k || !coverParams.d;
-            coverParams.nbThreads = nbThreads;
+            coverParams.nbThreads = nbWorkers;
             coverParams.zParams = zParams;
             operationResult = DiB_trainFromFiles(outFileName, maxDictSize, filenameTable, filenameIdx, blockSize, NULL, &coverParams, optimize);
         } else {
@@ -835,7 +841,7 @@ int main(int argCount, const char* argv[])
     FIO_setNotificationLevel(g_displayLevel);
     if (operation==zom_compress) {
 #ifndef ZSTD_NOCOMPRESS
-        FIO_setNbThreads(nbThreads);
+        FIO_setNbWorkers(nbWorkers);
         FIO_setBlockSize((U32)blockSize);
         FIO_setLdmFlag(ldmFlag);
         FIO_setLdmHashLog(g_ldmHashLog);
|
@ -15,6 +15,7 @@
|
|||||||
#include "util.h" /* Compiler options, UTIL_GetFileSize */
|
#include "util.h" /* Compiler options, UTIL_GetFileSize */
|
||||||
#include <stdlib.h> /* malloc */
|
#include <stdlib.h> /* malloc */
|
||||||
#include <stdio.h> /* fprintf, fopen, ftello64 */
|
#include <stdio.h> /* fprintf, fopen, ftello64 */
|
||||||
|
#include <assert.h> /* assert */
|
||||||
|
|
||||||
#include "mem.h" /* U32 */
|
#include "mem.h" /* U32 */
|
||||||
#ifndef ZSTD_DLL_IMPORT
|
#ifndef ZSTD_DLL_IMPORT
|
||||||
@@ -181,7 +182,7 @@ static size_t local_ZSTD_compress_generic_T2_end(void* dst, size_t dstCapacity,
     ZSTD_inBuffer buffIn;
     (void)buff2;
     ZSTD_CCtx_setParameter(g_cstream, ZSTD_p_compressionLevel, 1);
-    ZSTD_CCtx_setParameter(g_cstream, ZSTD_p_nbThreads, 2);
+    ZSTD_CCtx_setParameter(g_cstream, ZSTD_p_nbWorkers, 2);
     buffOut.dst = dst;
     buffOut.size = dstCapacity;
     buffOut.pos = 0;
@@ -198,7 +199,7 @@ static size_t local_ZSTD_compress_generic_T2_continue(void* dst, size_t dstCapac
     ZSTD_inBuffer buffIn;
     (void)buff2;
     ZSTD_CCtx_setParameter(g_cstream, ZSTD_p_compressionLevel, 1);
-    ZSTD_CCtx_setParameter(g_cstream, ZSTD_p_nbThreads, 2);
+    ZSTD_CCtx_setParameter(g_cstream, ZSTD_p_nbWorkers, 2);
     buffOut.dst = dst;
     buffOut.size = dstCapacity;
     buffOut.pos = 0;
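Both benchmark functions above rely on the same streaming pattern once nbWorkers is set: drain ZSTD_compress_generic(..., ZSTD_e_end) until it returns 0. A condensed sketch with error handling omitted, so treat it as illustrative rather than production code:

#define ZSTD_STATIC_LINKING_ONLY
#include <zstd.h>

static size_t compressWith2Workers(void* dst, size_t dstCapacity,
                                   const void* src, size_t srcSize,
                                   ZSTD_CCtx* cctx)
{
    ZSTD_outBuffer out = { dst, dstCapacity, 0 };
    ZSTD_inBuffer  in  = { src, srcSize, 0 };
    ZSTD_CCtx_setParameter(cctx, ZSTD_p_compressionLevel, 1);
    ZSTD_CCtx_setParameter(cctx, ZSTD_p_nbWorkers, 2);
    /* non-zero return == more to flush; 0 == frame completed */
    while (ZSTD_compress_generic(cctx, &out, &in, ZSTD_e_end)) {}
    return out.pos;   /* compressed size */
}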
@@ -413,33 +414,48 @@ static size_t benchMem(const void* src, size_t srcSize, U32 benchNb)
         break;
 
     /* test functions */
-    /* by convention, test functions can be added > 100 */
+    /* convention: test functions have ID > 100 */
 
     default : ;
     }
 
-    { size_t i; for (i=0; i<dstBuffSize; i++) dstBuff[i]=(BYTE)i; }   /* warming up memory */
+    /* warming up memory */
+    { size_t i; for (i=0; i<dstBuffSize; i++) dstBuff[i]=(BYTE)i; }
 
+    /* benchmark loop */
     {   U32 loopNb;
-#       define TIME_SEC_MICROSEC (1*1000000ULL) /* 1 second */
-        U64 const clockLoop = TIMELOOP_S * TIME_SEC_MICROSEC;
+        U32 nbRounds = (U32)((50 MB) / (srcSize+1)) + 1;   /* initial conservative speed estimate */
+#       define TIME_SEC_MICROSEC (1*1000000ULL)    /* 1 second */
+#       define TIME_SEC_NANOSEC (1*1000000000ULL)  /* 1 second */
         DISPLAY("%2i- %-30.30s : \r", benchNb, benchName);
         for (loopNb = 1; loopNb <= g_nbIterations; loopNb++) {
             UTIL_time_t clockStart;
             size_t benchResult=0;
-            U32 nbRounds;
+            U32 roundNb;
 
-            UTIL_sleepMilli(1);  /* give processor time to other processes */
+            UTIL_sleepMilli(5);  /* give processor time to other processes */
             UTIL_waitForNextTick();
             clockStart = UTIL_getTime();
-            for (nbRounds=0; UTIL_clockSpanMicro(clockStart) < clockLoop; nbRounds++) {
+            for (roundNb=0; roundNb < nbRounds; roundNb++) {
                 benchResult = benchFunction(dstBuff, dstBuffSize, buff2, src, srcSize);
-                if (ZSTD_isError(benchResult)) { DISPLAY("ERROR ! %s() => %s !! \n", benchName, ZSTD_getErrorName(benchResult)); exit(1); }
-            }
-            { U64 const clockSpanMicro = UTIL_clockSpanMicro(clockStart);
-              double const averageTime = (double)clockSpanMicro / TIME_SEC_MICROSEC / nbRounds;
-              if (averageTime < bestTime) bestTime = averageTime;
-              DISPLAY("%2i- %-30.30s : %7.1f MB/s (%9u)\r", loopNb, benchName, (double)srcSize / (1 MB) / bestTime, (U32)benchResult);
+                if (ZSTD_isError(benchResult)) {
+                    DISPLAY("ERROR ! %s() => %s !! \n", benchName, ZSTD_getErrorName(benchResult));
+                    exit(1);
+            }   }
+            {   U64 const clockSpanNano = UTIL_clockSpanNano(clockStart);
+                double const averageTime = (double)clockSpanNano / TIME_SEC_NANOSEC / nbRounds;
+                if (clockSpanNano > 0) {
+                    if (averageTime < bestTime) bestTime = averageTime;
+                    assert(bestTime > (1./2000000000));
+                    nbRounds = (U32)(1. / bestTime);   /* aim for 1 sec */
+                    DISPLAY("%2i- %-30.30s : %7.1f MB/s (%9u)\r",
+                            loopNb, benchName,
+                            (double)srcSize / (1 MB) / bestTime,
+                            (U32)benchResult);
+                } else {
+                    assert(nbRounds < 40000000);   /* avoid overflow */
+                    nbRounds *= 100;
+                }
     }   }   }
     DISPLAY("%2u\n", benchNb);
 
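The rewritten loop replaces a fixed-duration wall-clock test with batch calibration: run nbRounds iterations, time the batch, then retarget nbRounds so the next batch lasts about one second, multiplying by 100 whenever the clock resolution is too coarse to measure the batch at all. The control logic, distilled into a self-contained sketch with hypothetical names:

/* returns the nbRounds to use for the next batch */
static unsigned calibrateRounds(unsigned nbRounds, unsigned long long batchNano,
                                double* bestTimePerRound /* seconds, in/out */)
{
    if (batchNano == 0) {
        return nbRounds * 100;   /* batch was unmeasurably fast: enlarge it */
    }
    {   double const averageTime = (double)batchNano / 1000000000. / nbRounds;
        if (averageTime < *bestTimePerRound) *bestTimePerRound = averageTime;
        return (unsigned)(1. / *bestTimePerRound);   /* aim for ~1 second */
    }
}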
@@ -573,7 +589,7 @@ int main(int argc, const char** argv)
 
     for(i=1; i<argc; i++) {
         const char* argument = argv[i];
-        if(!argument) continue;   /* Protection if argument empty */
+        assert(argument != NULL);
 
         /* Commands (note : aggregated commands are allowed) */
         if (argument[0]=='-') {
@@ -53,7 +53,7 @@ static const U32 nbTestsDefault = 30000;
 /*-************************************
 * Display Macros
 **************************************/
-#define DISPLAY(...)         fprintf(stdout, __VA_ARGS__)
+#define DISPLAY(...)         fprintf(stderr, __VA_ARGS__)
 #define DISPLAYLEVEL(l, ...) if (g_displayLevel>=l) { DISPLAY(__VA_ARGS__); }
 static U32 g_displayLevel = 2;
 
@@ -63,7 +63,7 @@ static UTIL_time_t g_displayClock = UTIL_TIME_INITIALIZER;
 #define DISPLAYUPDATE(l, ...) if (g_displayLevel>=l) { \
             if ((UTIL_clockSpanMicro(g_displayClock) > g_refreshRate) || (g_displayLevel>=4)) \
             { g_displayClock = UTIL_getTime(); DISPLAY(__VA_ARGS__); \
-            if (g_displayLevel>=4) fflush(stdout); } }
+            if (g_displayLevel>=4) fflush(stderr); } }
 
 
 #undef MIN
@@ -226,7 +226,7 @@ static int FUZ_mallocTests(unsigned seed, double compressibility, unsigned part)
             ZSTD_outBuffer out = { outBuffer, outSize, 0 };
             ZSTD_inBuffer in = { inBuffer, inSize, 0 };
             CHECK_Z( ZSTD_CCtx_setParameter(cctx, ZSTD_p_compressionLevel, (U32)compressionLevel) );
-            CHECK_Z( ZSTD_CCtx_setParameter(cctx, ZSTD_p_nbThreads, nbThreads) );
+            CHECK_Z( ZSTD_CCtx_setParameter(cctx, ZSTD_p_nbWorkers, nbThreads) );
             while ( ZSTD_compress_generic(cctx, &out, &in, ZSTD_e_end) ) {}
             ZSTD_freeCCtx(cctx);
             DISPLAYLEVEL(3, "compress_generic,-T%u,end level %i : ",
@@ -246,7 +246,7 @@ static int FUZ_mallocTests(unsigned seed, double compressibility, unsigned part)
             ZSTD_outBuffer out = { outBuffer, outSize, 0 };
             ZSTD_inBuffer in = { inBuffer, inSize, 0 };
             CHECK_Z( ZSTD_CCtx_setParameter(cctx, ZSTD_p_compressionLevel, (U32)compressionLevel) );
-            CHECK_Z( ZSTD_CCtx_setParameter(cctx, ZSTD_p_nbThreads, nbThreads) );
+            CHECK_Z( ZSTD_CCtx_setParameter(cctx, ZSTD_p_nbWorkers, nbThreads) );
             CHECK_Z( ZSTD_compress_generic(cctx, &out, &in, ZSTD_e_continue) );
             while ( ZSTD_compress_generic(cctx, &out, &in, ZSTD_e_end) ) {}
             ZSTD_freeCCtx(cctx);
@@ -1209,7 +1209,7 @@ static int basicUnitTests(U32 seed, double compressibility)
     if (strcmp("No error detected", ZSTD_getErrorName(ZSTD_error_GENERIC)) != 0) goto _output_error;
     DISPLAYLEVEL(3, "OK \n");
 
-    DISPLAYLEVEL(4, "test%3i : testing ZSTD dictionary sizes : ", testNb++);
+    DISPLAYLEVEL(3, "test%3i : testing ZSTD dictionary sizes : ", testNb++);
     RDG_genBuffer(CNBuffer, CNBuffSize, compressibility, 0., seed);
     {
         size_t const size = MIN(128 KB, CNBuffSize);
@@ -1230,6 +1230,7 @@ static int basicUnitTests(U32 seed, double compressibility)
         ZSTD_freeCDict(lgCDict);
         ZSTD_freeCCtx(cctx);
     }
+    DISPLAYLEVEL(3, "OK \n");
 
 _end:
     free(CNBuffer);
@@ -634,6 +634,7 @@ roundTripTest -g518K "19 --long"
 fileRoundTripTest -g5M "3 --long"
 
 
+roundTripTest -g96K "5 --single-thread"
 if [ -n "$hasMT" ]
 then
     $ECHO "\n===> zstdmt round-trip tests "
@@ -94,7 +94,7 @@ static size_t cctxParamRoundTripTest(void* resultBuff, size_t resultBuffCapacity
 
     /* Set parameters */
     CHECK_Z( ZSTD_CCtxParam_setParameter(cctxParams, ZSTD_p_compressionLevel, cLevel) );
-    CHECK_Z( ZSTD_CCtxParam_setParameter(cctxParams, ZSTD_p_nbThreads, 2) );
+    CHECK_Z( ZSTD_CCtxParam_setParameter(cctxParams, ZSTD_p_nbWorkers, 2) );
     CHECK_Z( ZSTD_CCtxParam_setParameter(cctxParams, ZSTD_p_overlapSizeLog, 5) );
 
 
@@ -753,9 +753,9 @@ static int basicUnitTests(U32 seed, double compressibility)
     DISPLAYLEVEL(3, "OK \n");
 
     /* Complex multithreading + dictionary test */
-    {   U32 const nbThreads = 2;
+    {   U32 const nbWorkers = 2;
         size_t const jobSize = 4 * 1 MB;
-        size_t const srcSize = jobSize * nbThreads;  /* we want each job to have predictable size */
+        size_t const srcSize = jobSize * nbWorkers;  /* we want each job to have predictable size */
         size_t const segLength = 2 KB;
         size_t const offset = 600 KB;   /* must be larger than window defined in cdict */
         size_t const start = jobSize + (offset-1);
@@ -763,7 +763,7 @@ static int basicUnitTests(U32 seed, double compressibility)
         BYTE* const dst = (BYTE*)CNBuffer + start - offset;
         DISPLAYLEVEL(3, "test%3i : compress %u bytes with multiple threads + dictionary : ", testNb++, (U32)srcSize);
         CHECK_Z( ZSTD_CCtx_setParameter(zc, ZSTD_p_compressionLevel, 3) );
-        CHECK_Z( ZSTD_CCtx_setParameter(zc, ZSTD_p_nbThreads, 2) );
+        CHECK_Z( ZSTD_CCtx_setParameter(zc, ZSTD_p_nbWorkers, nbWorkers) );
         CHECK_Z( ZSTD_CCtx_setParameter(zc, ZSTD_p_jobSize, jobSize) );
         assert(start > offset);
         assert(start + segLength < COMPRESSIBLE_NOISE_LENGTH);
@@ -1672,7 +1672,7 @@ static int fuzzerTests_newAPI(U32 seed, U32 nbTests, unsigned startTest, double
                 U32 const nbThreadsAdjusted = (windowLogMalus < nbThreadsCandidate) ? nbThreadsCandidate - windowLogMalus : 1;
                 U32 const nbThreads = MIN(nbThreadsAdjusted, nbThreadsMax);
                 DISPLAYLEVEL(5, "t%u: nbThreads : %u \n", testNb, nbThreads);
-                CHECK_Z( setCCtxParameter(zc, cctxParams, ZSTD_p_nbThreads, nbThreads, useOpaqueAPI) );
+                CHECK_Z( setCCtxParameter(zc, cctxParams, ZSTD_p_nbWorkers, nbThreads, useOpaqueAPI) );
                 if (nbThreads > 1) {
                     U32 const jobLog = FUZ_rand(&lseed) % (testLog+1);
                     CHECK_Z( setCCtxParameter(zc, cctxParams, ZSTD_p_overlapSizeLog, FUZ_rand(&lseed) % 10, useOpaqueAPI) );
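The unchanged context lines above show how the fuzzer picks a worker count: large windows cost memory per worker, so a window-log "malus" is docked from the candidate before clamping to the maximum. The same arithmetic as a standalone sketch with a hypothetical name:

static unsigned adjustNbWorkers(unsigned candidate, unsigned windowLogMalus,
                                unsigned nbWorkersMax)
{
    unsigned const adjusted = (windowLogMalus < candidate)
                            ? candidate - windowLogMalus : 1;   /* never below 1 */
    return (adjusted < nbWorkersMax) ? adjusted : nbWorkersMax; /* MIN() */
}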