175 Commits

Author SHA1 Message Date
Yann Collet
58e8d793e1 made debug definitions common within zstd_internal.h 2017-06-02 18:20:48 -07:00
Yann Collet
c35e535002 added support for multithreading parameters 2017-06-01 18:44:06 -07:00
Yann Collet
c4a5a21c5c created ZSTDMT_sizeof_CCtx() and POOL_sizeof()
required by ZSTD_sizeofCCtx() while adding a ZSTDMT_CCtx*
2017-06-01 17:56:14 -07:00
Yann Collet
deee6e523f expose ZSTD_compress_generic_simpleArgs()
which is a binding towards ZSTD_compress_generic()
using only integral types for arguments.
2017-05-30 17:42:00 -07:00
Yann Collet
5bcef1ada2 removed mtctx->cstream
use the first cctx in pool when ZSTDMT is used in single-thread mode
now that cctx and cstream are the same object.
2017-05-30 16:37:19 -07:00
Yann Collet
44e45e8423 added ZSTDMT_createCCtx_advanced()
make it possible to use custom allocators
2017-05-30 16:12:06 -07:00
Yann Collet
b6dec4c3ae fixed minor cast warning 2017-05-27 17:09:06 -07:00
Yann Collet
e071159101 mtctx->jobs allocates its own memory space
to make ZSTDMT_CCtx_s size predictable
so that it can be included in CCtx
2017-05-27 00:21:33 -07:00
Yann Collet
31533bacce Changed ZSTD_createCDict_advanced()
It now only uses compressionParameters as argument.
It produces many changes throughout user code,
though hopefully they tend to be simple :
just provide the cParams part from existing ZSTD_parameters.

Some programs might depend on ZSTD_createCDict_advanced() to pass frame parameters.
This change will force them to revisit this strategy and fix it,
since frame parameters are effectively silently ignored in current version.
2017-04-27 00:29:04 -07:00
Yann Collet
2c5514c759 fixed ZSTDMT_initCStream_advanced()
Must use the new ZSTD_compressBegin_usingCDict_advanced()
to enforce correct frame parameters
2017-04-18 22:52:41 -07:00
Yann Collet
20d5e03893 content size is controlled at bufferless level
so it's active for all entry points

Also : added relevant test (wrong content size) in fuzzer
2017-04-11 18:34:02 -07:00
Yann Collet
30c7698970 optimize ZSTDMT_compress() memory usage
no longer allocates temporary buffers
when there is enough room in dstBuffer to compress directly there.
(previous method would skip that for 1st chunk only).

Also : fix ZSTD_compressBound() for small srcSize
2017-03-31 18:27:03 -07:00
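The "write directly into dstBuffer" decision above can be pictured with a small check. This is only a sketch under assumptions: the function name, the idea of reserving one worst-case slot per chunk, and the `chunkBound` argument (a caller-supplied worst-case compressed size for one chunk) are all hypothetical and are not taken from the zstd sources.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 when chunk number `chunkIndex` can compress
 * straight into the caller's destination buffer, 0 when it would need a
 * temporary buffer instead.
 * Assumption of this sketch: each chunk reserves `chunkBound` bytes,
 * laid out back to back from the start of the destination buffer. */
int chunk_writes_in_place(size_t dstCapacity, size_t chunkBound, size_t chunkIndex)
{
    size_t const pos = chunkIndex * chunkBound;   /* start of this chunk's slot */
    if (pos > dstCapacity) return 0;              /* slot starts past the end */
    return chunkBound <= dstCapacity - pos;       /* slot must fit entirely */
}
```

With a 1000-byte destination and a 300-byte bound, chunks 0 to 2 would write in place and chunk 3 would fall back to a temporary buffer.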
Yann Collet
eea7858e2b fixed minor warnings in debug code 2017-03-30 16:47:19 -07:00
Yann Collet
34cc487d05 overlap at full windowSize for max compression level
as it provides max compression ratio
2017-03-30 16:23:22 -07:00
Yann Collet
458e955c23 improved ZSTDMT_compress()
Use a bit more threads by default.
Uses overlap segments to boost compression ratio (like the streaming variant)
2017-03-30 15:51:58 -07:00
Yann Collet
ca5a8bbe36 re-added patch ... 2017-03-29 17:15:27 -07:00
Yann Collet
4bcc69b761 solves warnings when compiling with global XXH_STATIC_LINKING_ONLY
XXH_STATIC_LINKING_ONLY protection macro is intended to be triggered just before the include.
The main idea is to keep this setting local :
user module shall explicitly understand and accept the static linking restriction
which becomes transparent when triggering the macro at project level.
Global definition also triggers redefinition warnings for user modules which do locally define the macro.

This new version compiles lib and cli without warning when the macro is set globally.
That's not a scenario to be recommended, since it trades a local effect for a global one,
but it was easy enough to provide from zstd side.
2017-03-01 11:33:25 -08:00
Yann Collet
14312d833e zstdmt : fix : loading prefix from previous segments
There used to be a (very small) chance that
loading prefix from previous segment
would be confused with a real zstd dictionary.
For that to happen, the prefix needs to start
with the same value as dictionary magic.
That's 1 chance in 4 billion if all values have equal probability.
But in fact, since some values are more common (0x00000000 for example)
others are less common, and dictionary magic was selected to be one of them,
so probabilities are likely even lower.

Anyway, this risk is now down to zero
by adding a new CCtx parameter : ZSTD_p_forceRawDict

Current parameter policy : the parameter "sticks" to its CCtx,
so any dictionary loading after ZSTD_p_forceRawDict is set
will be loaded in "raw" ("content only") mode,
even if CCtx is re-used multiple times with multiple different dictionaries.
It's up to the user to reset this value if needed.
2017-02-23 23:42:12 -08:00
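The confusion described above can be illustrated with a minimal sketch. The value 0xEC30A437 is the zstd dictionary magic number; everything else (type names, function names, the 4-byte size threshold) is hypothetical and only models the idea of a "force raw" switch, it is not the library's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Magic number that opens a structured zstd dictionary (stored little-endian). */
#define DICT_MAGIC 0xEC30A437U

/* Hypothetical names, for this sketch only. */
typedef enum { LOAD_AUTO = 0, LOAD_FORCE_RAW = 1 } load_mode;
typedef enum { DICT_RAW_CONTENT = 0, DICT_STRUCTURED = 1 } dict_kind;

/* Read 4 bytes as a little-endian 32-bit value, whatever the host endianness. */
uint32_t read_le32(const unsigned char* p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decide how a buffer would be interpreted.
 * In LOAD_AUTO mode, a prefix that happens to start with the magic bytes
 * is mistaken for a structured dictionary: that is the 1-in-2^32 accident.
 * In LOAD_FORCE_RAW mode, the first bytes are never inspected. */
dict_kind classify_dict(const void* buf, size_t size, load_mode mode)
{
    if (mode == LOAD_FORCE_RAW) return DICT_RAW_CONTENT;
    if (size < 4) return DICT_RAW_CONTENT;   /* simplified threshold */
    return (read_le32((const unsigned char*)buf) == DICT_MAGIC)
         ? DICT_STRUCTURED : DICT_RAW_CONTENT;
}
```

A prefix beginning with the bytes 37 A4 30 EC is classified as structured in automatic mode and as raw content once the force-raw mode is selected.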
Yann Collet
831b4890ce minor tests/Makefile refactoring
and update of zstd_manual.html
2017-02-23 23:09:10 -08:00
Yann Collet
48bed91606 Merge pull request #527 from facebook/zstdmt
zstdmt refinements
2017-01-31 16:36:46 -08:00
Yann Collet
b2e1b3d670 fixed overlapLog==0 => no overlap 2017-01-30 14:54:46 -08:00
Yann Collet
3672d06d06 zstdmt : section size is set to be a minimum of overlapSize
the minimum size condition is applied transparently (no warning, no error)
like previous minimum section size condition (1 KB) which still applies.
2017-01-30 13:35:45 -08:00
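The two floors mentioned above (1 KB, and the overlap size) amount to a silent clamp. A minimal sketch, with a hypothetical function name and without modelling how a default section size is chosen:

```c
#include <stddef.h>

/* The 1 KB floor mentioned in the commit message. */
#define SECTION_SIZE_MIN ((size_t)1 << 10)

/* Hypothetical helper: raises a requested section size to the larger of
 * the 1 KB floor and the overlap size, without reporting anything. */
size_t effective_section_size(size_t requested, size_t overlapSize)
{
    size_t s = requested;
    if (s < SECTION_SIZE_MIN) s = SECTION_SIZE_MIN;
    if (s < overlapSize) s = overlapSize;
    return s;
}
```

A request of 100 bytes becomes 1024 bytes with no overlap, and 4096 bytes when the overlap size is 4096.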
Yann Collet
88df1aed61 changed advanced parameter overlapLog
Follows a positive logic (increasing value => increasing overlap)
which is easier to use
2017-01-30 11:00:00 -08:00
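One way to picture a "positive logic" overlapLog is a mapping where each step up doubles the overlap. The sketch below assumes a 0..9 scale in which 0 disables overlap and 9 means a full window; that scale, the macro and the function name are assumptions made for illustration, not values confirmed by this log.

```c
#include <stddef.h>

/* Assumed scale for this sketch: 0 = no overlap, 9 (or more) = full window. */
#define OVERLAP_LOG_MAX 9U

/* Hypothetical helper: overlap size in bytes for a given window and overlapLog.
 * windowLog is expected to be small enough for 1 << windowLog to fit in size_t. */
size_t overlap_size(unsigned windowLog, unsigned overlapLog)
{
    unsigned reduction;
    if (overlapLog == 0) return 0;                       /* overlap disabled */
    if (overlapLog > OVERLAP_LOG_MAX) overlapLog = OVERLAP_LOG_MAX;
    reduction = OVERLAP_LOG_MAX - overlapLog;            /* number of halvings */
    if (reduction > windowLog) reduction = windowLog;    /* never shift below 1 byte */
    return (size_t)1 << (windowLog - reduction);
}
```

Under these assumptions, a 1 MB window (windowLog 20) gives no overlap at 0, 128 KB at 6, and the full 1 MB at 9.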
Nick Terrell
b42dd27ef5 Add include guards and extern C 2017-01-27 16:00:19 -08:00
Yann Collet
f6d4a786fc reduced zstdmt latency when using small custom section sizes with high compression levels
Previous version was requiring a fairly large initial amount of input data
before starting to create compression jobs.
This new version starts the process much sooner.
2017-01-27 15:55:30 -08:00
Yann Collet
8dafb1acf5 CLI : automatically set overlap size to max (windowSize) for max compression level 2017-01-25 17:01:13 -08:00
Yann Collet
bb0027405a fixed zstdmt corruption issue when enabling overlapped sections
see Asana board for detailed explanation on why and how to fix it
2017-01-25 16:25:38 -08:00
Yann Collet
943cff9c37 fixed zstdmt cli freeze issue with large nb of threads
fileio.c was continually pushing more content without giving a chance to flush compressed one.
It would block the job queue when input data was accumulated too fast (requiring to define many threads).
Fixed : fileio flushes whatever it can after each input attempt.
2017-01-25 12:35:19 -08:00
Yann Collet
dc8dae596a overlapped section, for improved compression
Sections 2+ read a bit of data from previous section
in order to improve compression ratio.
This also costs some CPU, to reference read data.

Read data is currently fixed to window>>3 size
2017-01-24 22:32:12 -08:00
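The layout described above can be sketched as a range computation: each section compresses its own bytes, and sections after the first also read the tail of the preceding data as reference. The struct and function names are hypothetical; only the window>>3 overlap comes from the message.

```c
#include <stddef.h>

/* Hypothetical description of what one job reads and compresses. */
typedef struct {
    size_t prefixStart;  /* first byte read as reference only */
    size_t start;        /* first byte actually compressed */
    size_t end;          /* one past the last byte compressed */
} section_range;

/* Compute the range for section number `index` (0-based).
 * sectionSize must be non-zero; overflow of index * sectionSize is not handled. */
section_range section_at(size_t srcSize, size_t sectionSize,
                         unsigned windowLog, size_t index)
{
    section_range r;
    size_t const overlap = ((size_t)1 << windowLog) >> 3;   /* window>>3 */
    size_t start = index * sectionSize;
    if (start > srcSize) start = srcSize;
    r.start = start;
    r.end = (srcSize - start > sectionSize) ? start + sectionSize : srcSize;
    /* The first section starts at 0, so it naturally has no prefix. */
    r.prefixStart = (start > overlap) ? start - overlap : 0;
    return r;
}
```

For a 10 MB input, 4 MB sections and a 1 MB window, the second section compresses bytes from 4 MB to 8 MB and additionally references the 128 KB just before its start.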
Yann Collet
f14a669054 refactor job creation
code shared across ZSTDMT_{compress,flush,end}Stream(),
for easier maintenance
2017-01-24 17:41:49 -08:00
Yann Collet
512cbe8c10 zstdmt cli and API allow selection of section sizes
By default, section sizes are 4x window size.
This new setting allows manual selection of section sizes.
The larger they are, the (slightly) better the compression ratio,
but also the higher the memory allocation cost,
and ultimately the lower the nb of possible threads,
since each section is compressed by a single thread.

It also introduces a prototype to set generic parameters,
ZSTDMT_setMTCtxParameter()

The idea is that it's possible to add enums
to extend the list of parameters that can be set this way.
This is more long-term oriented than a fixed-size struct.
Consider it as a test.
2017-01-24 17:08:53 -08:00
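The enum-based setter idea can be sketched independently of the library. All names below are hypothetical and do not reproduce the ZSTDMT_setMTCtxParameter() prototype; the second enum entry exists only to show how the list could grow without changing the setter's signature.

```c
#include <stddef.h>

/* Hypothetical parameter identifiers: new entries can be appended later. */
typedef enum {
    PARAM_SECTION_SIZE = 0,
    PARAM_OVERLAP_LOG  = 1    /* illustrative extension only */
} mt_param;

/* Hypothetical context holding the settings. */
typedef struct {
    unsigned windowLog;
    size_t   sectionSize;     /* 0 means "use the default" */
    unsigned overlapLog;
} mt_ctx;

/* Generic setter: returns 0 on success, -1 for an unknown parameter. */
int mt_set_param(mt_ctx* ctx, mt_param p, unsigned value)
{
    switch (p) {
    case PARAM_SECTION_SIZE: ctx->sectionSize = value; return 0;
    case PARAM_OVERLAP_LOG:  ctx->overlapLog  = value; return 0;
    default: return -1;
    }
}

/* Default described in the message: 4x the window size. */
size_t mt_section_size(const mt_ctx* ctx)
{
    if (ctx->sectionSize != 0) return ctx->sectionSize;
    return (size_t)4 << ctx->windowLog;
}
```

Compared with a fixed-size struct, callers compiled against an older list of identifiers keep working when new ones are added, since the function signature and existing enum values stay the same.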
Yann Collet
3488a4a473 ZSTDMT now supports frame checksum 2017-01-24 11:48:40 -08:00
Yann Collet
94364bf87a refactor ZSTDMT streaming flush code
now shared by both ZSTDMT_compressStream() and ZSTDMT_flushStream()
2017-01-23 11:50:44 -08:00
Yann Collet
1cbf251e43 ZSTDMT streaming : fall back to (regular) single thread mode
when nbThreads==1
2017-01-23 01:43:58 -08:00
Yann Collet
84581ff8d7 ZSTDMT_compressCCtx : fallback to single-thread mode when nbChunks==1 2017-01-23 01:20:27 -08:00
Yann Collet
1a2547f654 ZSTDMT_compressStream() becomes blocking when required to ensure forward progress
In some (rare) cases, job list could be blocked by a first job still being processed,
while all following ones are completed, waiting to be flushed.
In such a case, the current job-table implementation is unable to accept a new job.
As a consequence, a call to ZSTDMT_compressStream() can be useless (nothing read, nothing flushed),
with the risk to trigger a busy-wait on the caller side
(needlessly loop over ZSTDMT_compressStream() ).

In such a case, ZSTDMT_compressStream() will block until the first job is completed and ready to flush.
It ensures some forward progress by guaranteeing it will flush at least a part of the completed job.
Energy-wasting busy-wait is avoided.
2017-01-22 23:49:52 -08:00
Yann Collet
c593348722 ZSTDMT_initCStream_usingDict() can outlive dict
Like ZSTD_initCStream_usingDict(),
ZSTDMT_initCStream_usingDict() now keeps a copy of dict internally.
This way, dict can be released :
it no longer has to outlive all future compression sessions.
2017-01-22 16:44:15 -08:00
Yann Collet
9d6f7637ec protected (mutex) read to jobCompleted, as suggested by @terrelln 2017-01-21 22:14:08 -08:00
Yann Collet
0cf74fa957 optimized pool allocation by 1 slot 2017-01-21 22:06:49 -08:00
Yann Collet
6ed29a8f44 minor : tab to spaces 2017-01-21 21:56:36 -08:00
cyan4973
2e3b659ae1 fixed minor warnings (Visual, conversion, doxygen) 2017-01-20 14:43:09 -08:00
cyan4973
5fba09fa41 updated util's time for Windows compatibility
Correctly measures time on Posix systems when running with
Multi-threading

Todo : check Windows measurement under multi-threading
2017-01-20 12:57:31 -08:00
Yann Collet
500014af49 zstd cli can now compress using multi-threading
added : command -T#
added : ZSTD_resetCStream() (zstdmt_compress)
added : FIO_setNbThreads()  (fileio)
2017-01-19 17:04:28 -08:00
Yann Collet
19d670ba9d Added ZSTDMT_initCStream_advanced() variant
Correctly compress with custom params and dictionary
Added relevant fuzzer test in zstreamtest

Also :
new macro ZSTDMT_SECTION_LOGSIZE_MIN, which sets a minimum size for a full job
(note : a flush() command can still generate a partial job anytime)
2017-01-19 15:32:07 -08:00
Yann Collet
736788f8e8 added streaming fuzzer tests for MT API
Also : fixed corner case, where nb of jobs completed becomes > jobQueueSize
which is possible when many flushes are issued
while there is not enough dst buffer to flush completed ones.
2017-01-19 12:15:29 -08:00
Yann Collet
32dfae6f98 fixed Multi-threaded compression
MT compression generates a single frame.
Multi-threading operates by breaking the frames into independent sections.
But from a decoder perspective, there is no difference :
it's just a sequence of blocks.

Problem is, decoder preserves repCodes from previous block to start decoding next block.
This is also valid between sections, since they are no different than changing block.

Previous version would incorrectly initialize repcodes to their default value at the beginning of each section.
When using them, there was a mismatch between encoder (default values) and decoder (values from previous block).

This change ensures that repcodes won't be used at the beginning of a new section.
It works by setting them to 0.
This only works with regular (single segment) variants : extDict variants will fail !
Fortunately, sections beyond the 1st one belong to this category.

To be checked : btopt strategy.
This change was only validated from fast to btlazy2 strategies.
2017-01-19 10:32:55 -08:00
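The mismatch described above can be reproduced with a toy model of the repeat-offset history. It is deliberately simplified: the update rule, the names, and the "0 means unusable" test are assumptions of this sketch and do not reproduce zstd's actual repcode handling; only the start values 1, 4, 8 and the idea of zeroing the history come from zstd and the message.

```c
#include <stdint.h>

#define REP_NUM 3

/* Toy history of recently used offsets, most recent first. */
typedef struct { uint32_t rep[REP_NUM]; } rep_state;

/* Start-of-frame history (zstd starts a frame with 1, 4, 8). */
void rep_init_default(rep_state* s)
{
    s->rep[0] = 1; s->rep[1] = 4; s->rep[2] = 8;
}

/* Section start after the fix: 0 marks "no usable history". */
void rep_init_disabled(rep_state* s)
{
    s->rep[0] = 0; s->rep[1] = 0; s->rep[2] = 0;
}

/* Record a newly used offset (simplified update rule). */
void rep_push(rep_state* s, uint32_t offset)
{
    s->rep[2] = s->rep[1];
    s->rep[1] = s->rep[0];
    s->rep[0] = offset;
}

/* Encoder side: may "repeat slot `slot`" be emitted for this offset? */
int rep_usable(const rep_state* enc, unsigned slot, uint32_t offset)
{
    return enc->rep[slot] != 0 && enc->rep[slot] == offset;
}
```

If the decoder finished the first section with offset 100 as its most recent entry, an encoder that restarts the second section from the defaults would accept "slot 0" for offset 1, while the decoder would resolve slot 0 to 100. Starting the section from the zeroed history makes `rep_usable()` return 0 for every slot, so the encoder has to emit explicit offsets until its history is rebuilt.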
Yann Collet
37226c1e9f Simplified compressChunk job
minor refactoring : compression done in a single call on first chunk
Avoid a mutable hSize variable and eventual recombination to cSize at the end
2017-01-19 10:18:17 -08:00
Yann Collet
6073b3e6b8 ZSTDMT_endStream : nullify input buffer after flush
There will be no more input after ZSTDMT_endStream invocation :
only flush/end is allowed (to fully collect compressed result).
2017-01-18 15:32:38 -08:00
Yann Collet
3a01c46b26 ZSTDMT_initCStream() supports restart from invalid state
ZSTDMT_initCStream() will correctly scrub for resources
when it detects that previous compression was not properly finished.
2017-01-18 15:18:17 -08:00
Yann Collet
4885f591b3 trap compression errors, collect back resources from workers 2017-01-18 14:11:37 -08:00