commit a37a8df532

.github/workflows/dev-short-tests.yml (vendored)
@@ -335,7 +335,7 @@ jobs:
 # This test currently fails on Github Actions specifically.
 # Possible reason : TTY emulation.
 # Note that the same test works fine locally and on travisCI.
-# This will have to be fixed before transfering the test to GA.
+# This will have to be fixed before transferring the test to GA.
 # versions-compatibility:
 # runs-on: ubuntu-latest
 # steps:
@@ -47,7 +47,7 @@ Our contribution process works in three main stages:
 * Topic and development:
 * Make a new branch on your fork about the topic you're developing for
 ```
-# branch names should be consise but sufficiently informative
+# branch names should be concise but sufficiently informative
 git checkout -b <branch-name>
 git push origin <branch-name>
 ```
@@ -104,7 +104,7 @@ Our contribution process works in three main stages:
 issue at hand, then please indicate this by requesting that an issue be closed by commenting.
 * Just because your changes have been merged does not mean the topic or larger issue is complete. Remember
 that the change must make it to an official zstd release for it to be meaningful. We recommend
-that contributers track the activity on their pull request and corresponding issue(s) page(s) until
+that contributors track the activity on their pull request and corresponding issue(s) page(s) until
 their change makes it to the next release of zstd. Users will often discover bugs in your code or
 suggest ways to refine and improve your initial changes even after the pull request is merged.

@@ -270,15 +270,15 @@ for level 1 compression on Zstd. Typically this means, you have identified a sec
 code that you think can be made to run faster.

 The first thing you will want to do is make sure that the piece of code is actually taking up
-a notable amount of time to run. It is usually not worth optimzing something which accounts for less than
+a notable amount of time to run. It is usually not worth optimizing something which accounts for less than
 0.0001% of the total running time. Luckily, there are tools to help with this.
 Profilers will let you see how much time your code spends inside a particular function.
-If your target code snippit is only part of a function, it might be worth trying to
-isolate that snippit by moving it to its own function (this is usually not necessary but
+If your target code snippet is only part of a function, it might be worth trying to
+isolate that snippet by moving it to its own function (this is usually not necessary but
 might be).

-Most profilers (including the profilers dicusssed below) will generate a call graph of
-functions for you. Your goal will be to find your function of interest in this call grapch
+Most profilers (including the profilers discussed below) will generate a call graph of
+functions for you. Your goal will be to find your function of interest in this call graph
 and then inspect the time spent inside of it. You might also want to to look at the
 annotated assembly which most profilers will provide you with.

@@ -301,16 +301,16 @@ $ zstd -b1 -i5 <my-data> # this will run for 5 seconds
 5. Once you run your benchmarking script, switch back over to instruments and attach your
 process to the time profiler. You can do this by:
 * Clicking on the `All Processes` drop down in the top left of the toolbar.
-* Selecting your process from the dropdown. In my case, it is just going to be labled
+* Selecting your process from the dropdown. In my case, it is just going to be labeled
 `zstd`
 * Hitting the bright red record circle button on the top left of the toolbar
-6. You profiler will now start collecting metrics from your bencharking script. Once
+6. You profiler will now start collecting metrics from your benchmarking script. Once
 you think you have collected enough samples (usually this is the case after 3 seconds of
 recording), stop your profiler.
 7. Make sure that in toolbar of the bottom window, `profile` is selected.
 8. You should be able to see your call graph.
 * If you don't see the call graph or an incomplete call graph, make sure you have compiled
-zstd and your benchmarking scripg using debug flags. On mac and linux, this just means
+zstd and your benchmarking script using debug flags. On mac and linux, this just means
 you will have to supply the `-g` flag alone with your build script. You might also
 have to provide the `-fno-omit-frame-pointer` flag
 9. Dig down the graph to find your function call and then inspect it by double clicking
@@ -329,7 +329,7 @@ Some general notes on perf:
 counter statistics. Perf uses a high resolution timer and this is likely one
 of the first things your team will run when assessing your PR.
 * Perf has a long list of hardware counters that can be viewed with `perf --list`.
-When measuring optimizations, something worth trying is to make sure the handware
+When measuring optimizations, something worth trying is to make sure the hardware
 counters you expect to be impacted by your change are in fact being so. For example,
 if you expect the L1 cache misses to decrease with your change, you can look at the
 counter `L1-dcache-load-misses`
@@ -368,7 +368,7 @@ Follow these steps to link travis-ci with your github fork of zstd
 TODO

 ### appveyor
-Follow these steps to link circle-ci with your girhub fork of zstd
+Follow these steps to link circle-ci with your github fork of zstd

 1. Make sure you are logged into your github account
 2. Go to https://www.appveyor.com/
@@ -25,7 +25,7 @@
 * Note: MEM_MODULE stops xxhash redefining BYTE, U16, etc., which are also
 * defined in mem.h (breaking C99 compatibility).
 *
-* Note: the undefs for xxHash allow Zstd's implementation to coinside with with
+* Note: the undefs for xxHash allow Zstd's implementation to coincide with with
 * standalone xxHash usage (with global defines).
 *
 * Note: multithreading is enabled for all platforms apart from Emscripten.
@@ -25,7 +25,7 @@
 * Note: MEM_MODULE stops xxhash redefining BYTE, U16, etc., which are also
 * defined in mem.h (breaking C99 compatibility).
 *
-* Note: the undefs for xxHash allow Zstd's implementation to coinside with with
+* Note: the undefs for xxHash allow Zstd's implementation to coincide with with
 * standalone xxHash usage (with global defines).
 */
 #define DEBUGLEVEL 0
@@ -2145,7 +2145,7 @@ static void FSE_init_dtable(FSE_dtable *const dtable,

 // "All remaining symbols are sorted in their natural order. Starting from
 // symbol 0 and table position 0, each symbol gets attributed as many cells
-// as its probability. Cell allocation is spreaded, not linear."
+// as its probability. Cell allocation is spread, not linear."
 // Place the rest in the table
 const u16 step = (size >> 1) + (size >> 3) + 3;
 const u16 mask = size - 1;
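The spread rule this hunk touches is simple to sketch: the table size is a power of two and `step = (size >> 1) + (size >> 3) + 3` is odd (for any realistic table log), so repeatedly adding `step` modulo `size` visits every cell exactly once. Below is a minimal illustrative Python model of that scatter; the real decoder also special-cases low-probability symbols at the end of the table, which this omits.

```python
def fse_spread(norm_freqs, table_log):
    """Scatter symbols over an FSE table: each symbol occupies as many
    cells as its (normalized) probability, and cells are assigned in a
    spread, non-linear order."""
    size = 1 << table_log
    step = (size >> 1) + (size >> 3) + 3  # odd, hence coprime with size
    mask = size - 1
    table = [None] * size
    pos = 0
    for symbol, freq in enumerate(norm_freqs):
        for _ in range(freq):
            table[pos] = symbol
            pos = (pos + step) & mask
    # an odd step walks one full cycle over a power-of-2 table,
    # so after `size` placements we are back at position 0
    assert pos == 0
    return table

# 3 symbols with probabilities 8/16, 6/16, 2/16 over a 16-cell table
spread = fse_spread([8, 6, 2], table_log=4)
```

Because the step is coprime with the table size, the symbols end up interleaved rather than in contiguous runs, which is what gives FSE states their spread distribution.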
@@ -1124,7 +1124,7 @@ These symbols define a full state reset, reading `Accuracy_Log` bits.
 Then, all remaining symbols, sorted in natural order, are allocated cells.
 Starting from symbol `0` (if it exists), and table position `0`,
 each symbol gets allocated as many cells as its probability.
-Cell allocation is spreaded, not linear :
+Cell allocation is spread, not linear :
 each successor position follows this rule :

 ```
@@ -125,7 +125,7 @@ The file structure is designed to make this selection manually achievable for an
 `ZSTD_getErrorName` (implied by `ZSTD_LIB_MINIFY`).

 Finally, when integrating into your application, make sure you're doing link-
-time optimation and unused symbol garbage collection (via some combination of,
+time optimization and unused symbol garbage collection (via some combination of,
 e.g., `-flto`, `-ffat-lto-objects`, `-fuse-linker-plugin`,
 `-ffunction-sections`, `-fdata-sections`, `-fmerge-all-constants`,
 `-Wl,--gc-sections`, `-Wl,-z,norelro`, and an archiver that understands
@@ -40,7 +40,7 @@

 /**
 On MSVC qsort requires that functions passed into it use the __cdecl calling conversion(CC).
-This explictly marks such functions as __cdecl so that the code will still compile
+This explicitly marks such functions as __cdecl so that the code will still compile
 if a CC other than __cdecl has been made the default.
 */
 #if defined(_MSC_VER)
@@ -760,7 +760,7 @@ typedef struct {
 } HUF_CStream_t;

 /**! HUF_initCStream():
-* Initializes the bistream.
+* Initializes the bitstream.
 * @returns 0 or an error code.
 */
 static size_t HUF_initCStream(HUF_CStream_t* bitC,
@@ -779,7 +779,7 @@ static size_t HUF_initCStream(HUF_CStream_t* bitC,
 *
 * @param elt The element we're adding. This is a (nbBits, value) pair.
 * See the HUF_CStream_t docs for the format.
-* @param idx Insert into the bistream at this idx.
+* @param idx Insert into the bitstream at this idx.
 * @param kFast This is a template parameter. If the bitstream is guaranteed
 * to have at least 4 unused bits after this call it may be 1,
 * otherwise it must be 0. HUF_addBits() is faster when fast is set.
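The general shape of appending a `(nbBits, value)` pair to a bitstream, as the `HUF_addBits()` docs above describe, can be modeled in a few lines. This is a toy sketch only: zstd's real version packs bits into a 64-bit accumulator, flushes by whole words, and uses the `kFast` template trick to skip bounds checks.

```python
class BitStream:
    """Toy bit container: append (nbBits, value) pairs LSB-first."""

    def __init__(self):
        self.acc = 0        # accumulated bits so far
        self.bit_count = 0  # number of valid bits in acc

    def add_bits(self, value, nb_bits):
        # keep only nb_bits of value, then place them above existing bits
        self.acc |= (value & ((1 << nb_bits) - 1)) << self.bit_count
        self.bit_count += nb_bits

bs = BitStream()
bs.add_bits(0b101, 3)  # 3-bit code
bs.add_bits(0b11, 2)   # 2-bit code appended above it
# stream now holds 5 bits: 0b11101
```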
@@ -1333,7 +1333,7 @@ ZSTD_adjustCParams_internal(ZSTD_compressionParameters cPar,
 break;
 case ZSTD_cpm_createCDict:
 /* Assume a small source size when creating a dictionary
-* with an unkown source size.
+* with an unknown source size.
 */
 if (dictSize && srcSize == ZSTD_CONTENTSIZE_UNKNOWN)
 srcSize = minSrcSize;
@@ -392,7 +392,7 @@ struct ZSTD_CCtx_s {
 ZSTD_blockState_t blockState;
 U32* entropyWorkspace; /* entropy workspace of ENTROPY_WORKSPACE_SIZE bytes */

-/* Wether we are streaming or not */
+/* Whether we are streaming or not */
 ZSTD_buffered_policy_e bufferedPolicy;

 /* streaming */
@@ -219,7 +219,7 @@ MEM_STATIC size_t ZSTD_cwksp_aligned_alloc_size(size_t size) {
 MEM_STATIC size_t ZSTD_cwksp_slack_space_required(void) {
 /* For alignment, the wksp will always allocate an additional n_1=[1, 64] bytes
 * to align the beginning of tables section, as well as another n_2=[0, 63] bytes
-* to align the beginning of the aligned secion.
+* to align the beginning of the aligned section.
 *
 * n_1 + n_2 == 64 bytes if the cwksp is freshly allocated, due to tables and
 * aligneds being sized in multiples of 64 bytes.
@@ -478,7 +478,7 @@ static size_t ZSTD_ldm_generateSequences_internal(
 */
 if (anchor > ip + hashed) {
 ZSTD_ldm_gear_reset(&hashState, anchor - minMatchLength, minMatchLength);
-/* Continue the outter loop at anchor (ip + hashed == anchor). */
+/* Continue the outer loop at anchor (ip + hashed == anchor). */
 ip = anchor - hashed;
 break;
 }
@@ -429,7 +429,7 @@ size_t HUF_readDTableX1_wksp_bmi2(HUF_DTable* DTable, const void* src, size_t sr

 /* fill DTable
 * We fill all entries of each weight in order.
-* That way length is a constant for each iteration of the outter loop.
+* That way length is a constant for each iteration of the outer loop.
 * We can switch based on the length to a different inner loop which is
 * optimized for that particular case.
 */
@@ -10,7 +10,7 @@
 /* Calling convention:
 *
 * %rdi contains the first argument: HUF_DecompressAsmArgs*.
-* %rbp is'nt maintained (no frame pointer).
+* %rbp isn't maintained (no frame pointer).
 * %rsp contains the stack pointer that grows down.
 * No red-zone is assumed, only addresses >= %rsp are used.
 * All register contents are preserved.
@@ -130,7 +130,7 @@ HUF_decompress4X1_usingDTable_internal_bmi2_asm_loop:
 subq $24, %rsp

 .L_4X1_compute_olimit:
-/* Computes how many iterations we can do savely
+/* Computes how many iterations we can do safely
 * %r15, %rax may be clobbered
 * rbx, rdx must be saved
 * op3 & ip0 mustn't be clobbered
@@ -396,7 +396,7 @@ HUF_decompress4X2_usingDTable_internal_bmi2_asm_loop:
 subq $8, %rsp

 .L_4X2_compute_olimit:
-/* Computes how many iterations we can do savely
+/* Computes how many iterations we can do safely
 * %r15, %rax may be clobbered
 * rdx must be saved
 * op[1,2,3,4] & ip0 mustn't be clobbered
@@ -46,7 +46,7 @@ extern "C" {
 *
 * Zstd can use dictionaries to improve compression ratio of small data.
 * Traditionally small files don't compress well because there is very little
-* repetion in a single sample, since it is small. But, if you are compressing
+* repetition in a single sample, since it is small. But, if you are compressing
 * many similar files, like a bunch of JSON records that share the same
 * structure, you can train a dictionary on ahead of time on some samples of
 * these files. Then, zstd can use the dictionary to find repetitions that are
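The principle in that comment can be demonstrated with Python's standard-library zlib, whose `zdict` parameter plays the same role as a zstd dictionary: seed the compressor with the shared structure so even a single tiny record finds long matches. This is an illustration of the idea only, not zstd's API, and the record/dictionary bytes are made up for the example.

```python
import zlib

# A small JSON-ish record compresses poorly on its own: it is too short
# to contain internal repeats.
record = b'{"user_id": 12345, "status": "active", "plan": "pro"}'

# A "dictionary" built from representative samples holds the structure
# shared by many such records.
dictionary = b'{"user_id": , "status": "active", "plan": "pro"}'

plain = zlib.compress(record)

c = zlib.compressobj(zdict=dictionary)
with_dict = c.compress(record) + c.flush()

# The dictionary-seeded stream is smaller for this tiny input, because
# most of the record is encoded as matches against the dictionary.
print(len(plain), len(with_dict))
```

With zstd itself, the equivalent workflow is training a dictionary over many samples (`zstd --train`) and passing it at (de)compression time.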
@@ -132,7 +132,7 @@ extern "C" {
 *
 * # Benchmark levels 1-3 without a dictionary
 * zstd -b1e3 -r /path/to/my/files
-* # Benchmark levels 1-3 with a dictioanry
+* # Benchmark levels 1-3 with a dictionary
 * zstd -b1e3 -r /path/to/my/files -D /path/to/my/dictionary
 *
 * When should I retrain a dictionary?
@@ -247,7 +247,7 @@ ZSTDLIB_API size_t ZSTD_decompressDCtx(ZSTD_DCtx* dctx,
 *
 * It's possible to reset all parameters to "default" using ZSTD_CCtx_reset().
 *
-* This API supercedes all other "advanced" API entry points in the experimental section.
+* This API supersedes all other "advanced" API entry points in the experimental section.
 * In the future, we expect to remove from experimental API entry points which are redundant with this API.
 */

@@ -1804,7 +1804,7 @@ ZSTDLIB_API size_t ZSTD_CCtx_refPrefix_advanced(ZSTD_CCtx* cctx, const void* pre
 *
 * Note that this means that the CDict tables can no longer be copied into the
 * CCtx, so the dict attachment mode ZSTD_dictForceCopy will no longer be
-* useable. The dictionary can only be attached or reloaded.
+* usable. The dictionary can only be attached or reloaded.
 *
 * In general, you should expect compression to be faster--sometimes very much
 * so--and CDict creation to be slightly slower. Eventually, we will probably
@@ -270,7 +270,7 @@ static fileStats DiB_fileStats(const char** fileNamesTable, int nbFiles, size_t
 int n;
 memset(&fs, 0, sizeof(fs));

-// We assume that if chunking is requsted, the chunk size is < SAMPLESIZE_MAX
+// We assume that if chunking is requested, the chunk size is < SAMPLESIZE_MAX
 assert( chunkSize <= SAMPLESIZE_MAX );

 for (n=0; n<nbFiles; n++) {
@@ -339,7 +339,7 @@ int DiB_trainFromFiles(const char* dictFileName, size_t maxDictSize,
 size_t const maxMem = DiB_findMaxMem(fs.totalSizeToLoad * memMult) / memMult;
 /* Limit the size of the training data to the free memory */
 /* Limit the size of the training data to 2GB */
-/* TODO: there is oportunity to stop DiB_fileStats() early when the data limit is reached */
+/* TODO: there is opportunity to stop DiB_fileStats() early when the data limit is reached */
 loadedSize = (size_t)MIN( MIN((S64)maxMem, fs.totalSizeToLoad), MAX_SAMPLES_SIZE );
 srcBuffer = malloc(loadedSize+NOISELENGTH);
 sampleSizes = (size_t*)malloc(fs.nbSamples * sizeof(size_t));
@@ -992,7 +992,7 @@ makeUniqueMirroredDestDirs(char** srcDirNames, unsigned nbFile, const char* outD
 char* prevDirName = srcDirNames[i - 1];
 char* currDirName = srcDirNames[i];

-/* note: we alwasy compare trimmed path, i.e.:
+/* note: we always compare trimmed path, i.e.:
 * src dir of "./foo" and "/foo" will be both saved into:
 * "outDirName/foo/" */
 if (!firstIsParentOrSameDirOfSecond(trimPath(prevDirName),
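The trimming rule in that comment — "./foo" and "/foo" both map to "outDirName/foo/" — can be sketched in a few lines. This is an illustrative model of the described behavior, not the actual `trimPath()` from zstd's util code.

```python
def trim_path(p):
    """Drop leading "./" and "/" so source dirs compare by their
    relative shape (illustrative version of the comment's rule)."""
    while p.startswith("./"):
        p = p[2:]
    return p.lstrip("/")

def mirrored_dest(out_dir, src_dir):
    """Compute the mirrored output directory for a source directory."""
    return out_dir.rstrip("/") + "/" + trim_path(src_dir) + "/"

# "./foo" and "/foo" land in the same mirrored destination
assert mirrored_dest("outDirName", "./foo") == "outDirName/foo/"
assert mirrored_dest("outDirName", "/foo") == "outDirName/foo/"
```

Note that, as the next hunk's comment says, the untrimmed name is still kept around so the original directory's `mode_t` can be retrieved later.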
@@ -1000,7 +1000,7 @@ makeUniqueMirroredDestDirs(char** srcDirNames, unsigned nbFile, const char* outD
 uniqueDirNr++;

 /* we need maintain original src dir name instead of trimmed
-* dir, so we can retrive the original src dir's mode_t */
+* dir, so we can retrieve the original src dir's mode_t */
 uniqueDirNames[uniqueDirNr - 1] = currDirName;
 }

@@ -64,7 +64,7 @@ extern "C" {
 # define SET_REALTIME_PRIORITY /* disabled */
 # endif

-#else /* unknown non-unix operating systen */
+#else /* unknown non-unix operating system */
 # define UTIL_sleep(s) /* disabled */
 # define UTIL_sleepMilli(milli) /* disabled */
 # define SET_REALTIME_PRIORITY /* disabled */
@@ -91,7 +91,7 @@ Note: If \fBwindowLog\fR is set to larger than 27, \fB\-\-long=windowLog\fR or \
 .IP
 Note: cannot use both this and \-D together Note: \fB\-\-long\fR mode will be automatically activated if chainLog < fileLog (fileLog being the windowLog required to cover the whole file)\. You can also manually force it\. Node: for all levels, you can use \-\-patch\-from in \-\-single\-thread mode to improve compression ratio at the cost of speed Note: for level 19, you can get increased compression ratio at the cost of speed by specifying \fB\-\-zstd=targetLength=\fR to be something large (i\.e 4096), and by setting a large \fB\-\-zstd=chainLog=\fR
 .IP "\[ci]" 4
-\fB\-\-rsyncable\fR : \fBzstd\fR will periodically synchronize the compression state to make the compressed file more rsync\-friendly\. There is a negligible impact to compression ratio, and the faster compression levels will see a small compression speed hit\. This feature does not work with \fB\-\-single\-thread\fR\. You probably don\'t want to use it with long range mode, since it will decrease the effectiveness of the synchronization points, but your milage may vary\.
+\fB\-\-rsyncable\fR : \fBzstd\fR will periodically synchronize the compression state to make the compressed file more rsync\-friendly\. There is a negligible impact to compression ratio, and the faster compression levels will see a small compression speed hit\. This feature does not work with \fB\-\-single\-thread\fR\. You probably don\'t want to use it with long range mode, since it will decrease the effectiveness of the synchronization points, but your mileage may vary\.
 .IP "\[ci]" 4
 \fB\-C\fR, \fB\-\-[no\-]check\fR: add integrity check computed from uncompressed data (default: enabled)
 .IP "\[ci]" 4
@@ -171,7 +171,7 @@ the last one takes effect.
 compression speed hit.
 This feature does not work with `--single-thread`. You probably don't want
 to use it with long range mode, since it will decrease the effectiveness of
-the synchronization points, but your milage may vary.
+the synchronization points, but your mileage may vary.
 * `-C`, `--[no-]check`:
 add integrity check computed from uncompressed data (default: enabled)
 * `--[no-]content-size`:
@@ -56,7 +56,7 @@ optional arguments:
 --mode MODE 'fastmode', 'onetime', 'current', or 'continuous' (see
 README.md for details)
 --dict DICT filename of dictionary to use (when set, this
-dictioanry will be used to compress the files provided
+dictionary will be used to compress the files provided
 inside --directory)
 ```

@ -296,7 +296,7 @@ if __name__ == "__main__":
parser.add_argument("--emails", help="email addresses of people who will be alerted upon regression. Only for continuous mode", default=None)
parser.add_argument("--emails", help="email addresses of people who will be alerted upon regression. Only for continuous mode", default=None)
parser.add_argument("--frequency", help="specifies the number of seconds to wait before each successive check for new PRs in continuous mode", default=DEFAULT_MAX_API_CALL_FREQUENCY_SEC)
parser.add_argument("--frequency", help="specifies the number of seconds to wait before each successive check for new PRs in continuous mode", default=DEFAULT_MAX_API_CALL_FREQUENCY_SEC)
parser.add_argument("--mode", help="'fastmode', 'onetime', 'current', or 'continuous' (see README.md for details)", default="current")
parser.add_argument("--mode", help="'fastmode', 'onetime', 'current', or 'continuous' (see README.md for details)", default="current")
parser.add_argument("--dict", help="filename of dictionary to use (when set, this dictioanry will be used to compress the files provided inside --directory)", default=None)
parser.add_argument("--dict", help="filename of dictionary to use (when set, this dictionary will be used to compress the files provided inside --directory)", default=None)


args = parser.parse_args()
args = parser.parse_args()
filenames = args.directory
filenames = args.directory
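The hunk above only fixes a typo in the `--dict` help string, but the surrounding argparse setup is worth seeing end to end. A minimal, self-contained sketch of that CLI surface (the defaults and the parsed values shown here are illustrative, not taken from the real script):

```python
import argparse

# Sketch of the benchmarking script's CLI surface as shown in the hunk above.
# Only --mode and --dict are reproduced; other options are omitted.
parser = argparse.ArgumentParser()
parser.add_argument("--mode",
                    help="'fastmode', 'onetime', 'current', or 'continuous' (see README.md for details)",
                    default="current")
parser.add_argument("--dict",
                    help="filename of dictionary to use (when set, this dictionary "
                         "will be used to compress the files provided inside --directory)",
                    default=None)

# Parse an example command line instead of sys.argv, so this runs standalone.
args = parser.parse_args(["--mode", "onetime"])
print(args.mode)   # "onetime"
print(args.dict)   # None, since no dictionary was given
```

Note that `--dict` defaults to `None`, so the caller can distinguish "no dictionary" from an empty filename.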
@ -2538,7 +2538,7 @@ static int basicUnitTests(U32 const seed, double compressibility)
ZSTD_DCtx_reset(dctx, ZSTD_reset_session_and_parameters);
ZSTD_DCtx_reset(dctx, ZSTD_reset_session_and_parameters);
CHECK_Z( ZSTD_DCtx_loadDictionary(dctx, dictBuffer, dictSize) );
CHECK_Z( ZSTD_DCtx_loadDictionary(dctx, dictBuffer, dictSize) );
CHECK_Z( ZSTD_decompressDCtx(dctx, decodedBuffer, CNBuffSize, compressedBuffer, cSize) );
CHECK_Z( ZSTD_decompressDCtx(dctx, decodedBuffer, CNBuffSize, compressedBuffer, cSize) );
/* The dictionary should presist across calls. */
/* The dictionary should persist across calls. */
CHECK_Z( ZSTD_decompressDCtx(dctx, decodedBuffer, CNBuffSize, compressedBuffer, cSize) );
CHECK_Z( ZSTD_decompressDCtx(dctx, decodedBuffer, CNBuffSize, compressedBuffer, cSize) );
/* When we reset the context the dictionary is cleared. */
/* When we reset the context the dictionary is cleared. */
ZSTD_DCtx_reset(dctx, ZSTD_reset_session_and_parameters);
ZSTD_DCtx_reset(dctx, ZSTD_reset_session_and_parameters);
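The test above exercises zstd's rule that a dictionary loaded into a DCtx persists across decompression calls until the context is reset. As a loose stdlib analogue (this is zlib's preset-dictionary mechanism, not the zstd DCtx API itself, and the dictionary and sample data are made up for illustration):

```python
import zlib

# Preset dictionary shared by compressor and decompressor. Text that repeats
# dictionary content compresses better, and decompression requires the same
# dictionary.
zdict = b"the quick brown fox jumped over the lazy dog"
data = b"the quick brown fox jumped over the lazy dog " * 10

comp = zlib.compressobj(zdict=zdict)
compressed = comp.compress(data) + comp.flush()

# Decompressing with the same preset dictionary recovers the input.
decomp = zlib.decompressobj(zdict=zdict)
restored = decomp.decompress(compressed)
assert restored == data
print("round trip ok")
```

The zstd API differs in that one loaded dictionary serves many successive frames on the same context, which is exactly what the repeated `ZSTD_decompressDCtx` calls in the hunk verify.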
@ -2557,7 +2557,7 @@ static int basicUnitTests(U32 const seed, double compressibility)
ZSTD_DCtx_reset(dctx, ZSTD_reset_session_and_parameters);
ZSTD_DCtx_reset(dctx, ZSTD_reset_session_and_parameters);
CHECK_Z( ZSTD_DCtx_refDDict(dctx, ddict) );
CHECK_Z( ZSTD_DCtx_refDDict(dctx, ddict) );
CHECK_Z( ZSTD_decompressDCtx(dctx, decodedBuffer, CNBuffSize, compressedBuffer, cSize) );
CHECK_Z( ZSTD_decompressDCtx(dctx, decodedBuffer, CNBuffSize, compressedBuffer, cSize) );
/* The ddict should presist across calls. */
/* The ddict should persist across calls. */
CHECK_Z( ZSTD_decompressDCtx(dctx, decodedBuffer, CNBuffSize, compressedBuffer, cSize) );
CHECK_Z( ZSTD_decompressDCtx(dctx, decodedBuffer, CNBuffSize, compressedBuffer, cSize) );
/* When we reset the context the ddict is cleared. */
/* When we reset the context the ddict is cleared. */
ZSTD_DCtx_reset(dctx, ZSTD_reset_session_and_parameters);
ZSTD_DCtx_reset(dctx, ZSTD_reset_session_and_parameters);
@ -2652,7 +2652,7 @@ static int usage_advanced(void)
(unsigned)g_timeLimit_s, (double)g_timeLimit_s / 3600);
(unsigned)g_timeLimit_s, (double)g_timeLimit_s / 3600);
DISPLAY( " -v : Prints Benchmarking output\n");
DISPLAY( " -v : Prints Benchmarking output\n");
DISPLAY( " -D : Next argument dictionary file\n");
DISPLAY( " -D : Next argument dictionary file\n");
DISPLAY( " -s : Seperate Files\n");
DISPLAY( " -s : Separate Files\n");
return 0;
return 0;
}
}


@ -2707,7 +2707,7 @@ int main(int argc, const char** argv)
const char* dictFileName = NULL;
const char* dictFileName = NULL;
U32 main_pause = 0;
U32 main_pause = 0;
int cLevelOpt = 0, cLevelRun = 0;
int cLevelOpt = 0, cLevelRun = 0;
int seperateFiles = 0;
int separateFiles = 0;
double compressibility = COMPRESSIBILITY_DEFAULT;
double compressibility = COMPRESSIBILITY_DEFAULT;
U32 memoTableLog = PARAM_UNSET;
U32 memoTableLog = PARAM_UNSET;
constraint_t target = { 0, 0, (U32)-1 };
constraint_t target = { 0, 0, (U32)-1 };
@ -2895,7 +2895,7 @@ int main(int argc, const char** argv)


case 's':
case 's':
argument++;
argument++;
seperateFiles = 1;
separateFiles = 1;
break;
break;


case 'q':
case 'q':
@ -2940,7 +2940,7 @@ int main(int argc, const char** argv)
result = benchSample(compressibility, cLevelRun);
result = benchSample(compressibility, cLevelRun);
}
}
} else {
} else {
if(seperateFiles) {
if(separateFiles) {
for(i = 0; i < argc - filenamesStart; i++) {
for(i = 0; i < argc - filenamesStart; i++) {
if (g_optimizer) {
if (g_optimizer) {
result = optimizeForSize(argv+filenamesStart + i, 1, dictFileName, target, paramTarget, cLevelOpt, cLevelRun, memoTableLog);
result = optimizeForSize(argv+filenamesStart + i, 1, dictFileName, target, paramTarget, cLevelOpt, cLevelRun, memoTableLog);
@ -170,7 +170,7 @@ fi
# ZSTD_BIN="$EXE_PREFIX$ZSTD_BIN"
# ZSTD_BIN="$EXE_PREFIX$ZSTD_BIN"


# assertions
# assertions
[ -n "$ZSTD_BIN" ] || die "zstd not found at $ZSTD_BIN! \n Please define ZSTD_BIN pointing to the zstd binary. You might also consider rebuilding zstd follwing the instructions in README.md"
[ -n "$ZSTD_BIN" ] || die "zstd not found at $ZSTD_BIN! \n Please define ZSTD_BIN pointing to the zstd binary. You might also consider rebuilding zstd following the instructions in README.md"
[ -n "$DATAGEN_BIN" ] || die "datagen not found at $DATAGEN_BIN! \n Please define DATAGEN_BIN pointing to the datagen binary. You might also consider rebuilding zstd tests following the instructions in README.md. "
[ -n "$DATAGEN_BIN" ] || die "datagen not found at $DATAGEN_BIN! \n Please define DATAGEN_BIN pointing to the datagen binary. You might also consider rebuilding zstd tests following the instructions in README.md. "
println "\nStarting playTests.sh isWindows=$isWindows EXE_PREFIX='$EXE_PREFIX' ZSTD_BIN='$ZSTD_BIN' DATAGEN_BIN='$DATAGEN_BIN'"
println "\nStarting playTests.sh isWindows=$isWindows EXE_PREFIX='$EXE_PREFIX' ZSTD_BIN='$ZSTD_BIN' DATAGEN_BIN='$DATAGEN_BIN'"


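The `[ -n "$VAR" ] || die "..."` idiom in the hunk above is a common shell assertion pattern: abort with a message when a required variable is empty or unset. A minimal standalone sketch (the `die` helper and the example path are illustrative, not the real playTests.sh values):

```shell
#!/bin/sh
# die: print a message to stderr and exit non-zero.
die() {
    printf '%s\n' "$*" >&2
    exit 1
}

# Hypothetical value standing in for the real ZSTD_BIN detection logic.
ZSTD_BIN="/usr/local/bin/zstd"

# The assertion: fail fast if ZSTD_BIN is empty or unset.
[ -n "$ZSTD_BIN" ] || die "zstd not found! Please define ZSTD_BIN pointing to the zstd binary."
echo "ZSTD_BIN is set to $ZSTD_BIN"
```

Because `die` exits the script, every later line can rely on the variable being non-empty.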
@ -1063,7 +1063,7 @@ static int basicUnitTests(U32 seed, double compressibility)
if (ZSTD_decompressStream(dctx, &out, &in) != 0) goto _output_error;
if (ZSTD_decompressStream(dctx, &out, &in) != 0) goto _output_error;
if (in.pos != in.size) goto _output_error;
if (in.pos != in.size) goto _output_error;
}
}
/* The dictionary should presist across calls. */
/* The dictionary should persist across calls. */
{ ZSTD_outBuffer out = {decodedBuffer, decodedBufferSize, 0};
{ ZSTD_outBuffer out = {decodedBuffer, decodedBufferSize, 0};
ZSTD_inBuffer in = {compressedBuffer, cSize, 0};
ZSTD_inBuffer in = {compressedBuffer, cSize, 0};
if (ZSTD_decompressStream(dctx, &out, &in) != 0) goto _output_error;
if (ZSTD_decompressStream(dctx, &out, &in) != 0) goto _output_error;
@ -1128,7 +1128,7 @@ static int basicUnitTests(U32 seed, double compressibility)
if (ZSTD_decompressStream(dctx, &out, &in) != 0) goto _output_error;
if (ZSTD_decompressStream(dctx, &out, &in) != 0) goto _output_error;
if (in.pos != in.size) goto _output_error;
if (in.pos != in.size) goto _output_error;
}
}
/* The ddict should presist across calls. */
/* The ddict should persist across calls. */
{ ZSTD_outBuffer out = {decodedBuffer, decodedBufferSize, 0};
{ ZSTD_outBuffer out = {decodedBuffer, decodedBufferSize, 0};
ZSTD_inBuffer in = {compressedBuffer, cSize, 0};
ZSTD_inBuffer in = {compressedBuffer, cSize, 0};
if (ZSTD_decompressStream(dctx, &out, &in) != 0) goto _output_error;
if (ZSTD_decompressStream(dctx, &out, &in) != 0) goto _output_error;
@ -1175,12 +1175,12 @@ static int basicUnitTests(U32 seed, double compressibility)
/* We should succeed to decompress with the dictionary. */
/* We should succeed to decompress with the dictionary. */
CHECK_Z( ZSTD_initDStream_usingDict(dctx, dictionary.start, dictionary.filled) );
CHECK_Z( ZSTD_initDStream_usingDict(dctx, dictionary.start, dictionary.filled) );
CHECK_Z( ZSTD_decompressDCtx(dctx, decodedBuffer, decodedBufferSize, compressedBuffer, cSize) );
CHECK_Z( ZSTD_decompressDCtx(dctx, decodedBuffer, decodedBufferSize, compressedBuffer, cSize) );
/* The dictionary should presist across calls. */
/* The dictionary should persist across calls. */
CHECK_Z( ZSTD_decompressDCtx(dctx, decodedBuffer, decodedBufferSize, compressedBuffer, cSize) );
CHECK_Z( ZSTD_decompressDCtx(dctx, decodedBuffer, decodedBufferSize, compressedBuffer, cSize) );
/* We should succeed to decompress with the ddict. */
/* We should succeed to decompress with the ddict. */
CHECK_Z( ZSTD_initDStream_usingDDict(dctx, ddict) );
CHECK_Z( ZSTD_initDStream_usingDDict(dctx, ddict) );
CHECK_Z( ZSTD_decompressDCtx(dctx, decodedBuffer, decodedBufferSize, compressedBuffer, cSize) );
CHECK_Z( ZSTD_decompressDCtx(dctx, decodedBuffer, decodedBufferSize, compressedBuffer, cSize) );
/* The ddict should presist across calls. */
/* The ddict should persist across calls. */
CHECK_Z( ZSTD_decompressDCtx(dctx, decodedBuffer, decodedBufferSize, compressedBuffer, cSize) );
CHECK_Z( ZSTD_decompressDCtx(dctx, decodedBuffer, decodedBufferSize, compressedBuffer, cSize) );
/* When we reset the context the ddict is cleared. */
/* When we reset the context the ddict is cleared. */
CHECK_Z( ZSTD_initDStream(dctx) );
CHECK_Z( ZSTD_initDStream(dctx) );
@ -2277,7 +2277,7 @@ static int fuzzerTests_newAPI(U32 seed, int nbTests, int startTest,
CHECK_Z( ZSTD_CCtx_refPrefix(zc, dict, dictSize) );
CHECK_Z( ZSTD_CCtx_refPrefix(zc, dict, dictSize) );
}
}


/* Adjust number of workers occassionally - result must be deterministic independent of nbWorkers */
/* Adjust number of workers occasionally - result must be deterministic independent of nbWorkers */
CHECK_Z(ZSTD_CCtx_getParameter(zc, ZSTD_c_nbWorkers, &nbWorkers));
CHECK_Z(ZSTD_CCtx_getParameter(zc, ZSTD_c_nbWorkers, &nbWorkers));
if (nbWorkers > 0 && (FUZ_rand(&lseed) & 7) == 0) {
if (nbWorkers > 0 && (FUZ_rand(&lseed) & 7) == 0) {
DISPLAYLEVEL(6, "t%u: Modify nbWorkers: %d -> %d \n", testNb, nbWorkers, nbWorkers + iter);
DISPLAYLEVEL(6, "t%u: Modify nbWorkers: %d -> %d \n", testNb, nbWorkers, nbWorkers + iter);
@ -119,7 +119,7 @@ local int recompress(z_streamp inf, z_streamp def)
if (ret == Z_MEM_ERROR)
if (ret == Z_MEM_ERROR)
return ret;
return ret;


/* compress what was decompresed until done or no room */
/* compress what was decompressed until done or no room */
def->avail_in = RAWLEN - inf->avail_out;
def->avail_in = RAWLEN - inf->avail_out;
def->next_in = raw;
def->next_in = raw;
if (inf->avail_out != 0)
if (inf->avail_out != 0)
@ -109,7 +109,7 @@ local int recompress(z_streamp inf, z_streamp def)
if (ret == Z_MEM_ERROR)
if (ret == Z_MEM_ERROR)
return ret;
return ret;


/* compress what was decompresed until done or no room */
/* compress what was decompressed until done or no room */
def->avail_in = RAWLEN - inf->avail_out;
def->avail_in = RAWLEN - inf->avail_out;
def->next_in = raw;
def->next_in = raw;
if (inf->avail_out != 0)
if (inf->avail_out != 0)
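The two hunks above fix the "compress what was decompressed" comment in zlib's `recompress` helper, which inflates a deflate stream and immediately deflates the result. A rough one-shot Python analogue of that round trip (buffer management and the streaming chunk loop of the C original are elided; the payload is illustrative):

```python
import zlib

# Illustrative payload; repetitive data compresses well.
original = b"example payload " * 64

compressed = zlib.compress(original)

# The "inflate" step: recover the raw bytes.
decompressed = zlib.decompress(compressed)

# Compress what was decompressed, as the fixed comment describes.
recompressed = zlib.compress(decompressed)

# The recompressed stream still decodes to the original data.
assert zlib.decompress(recompressed) == original
print(len(original), "->", len(recompressed))
```

The real `recompress` does this incrementally with `inflate`/`deflate` over fixed-size buffers, which is why it tracks `avail_in`/`avail_out` rather than whole-buffer lengths.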