Commit Graph

1891 Commits (fb4451664125919e0870b256797f826bad576efc)

Author SHA1 Message Date
Nick Terrell 5b7fd7c422 [zdict] Make COVER the default algorithm 2017-06-26 21:09:22 -07:00
Yann Collet c7fb884eea fixed minor conversion warning 2017-06-26 18:02:23 -07:00
Yann Collet dde10b23fe refactored ZSTD_estimateDStreamSize()
now uses windowSize as argument.
Also : created ZSTD_estimateDStreamSize_fromFrame()
2017-06-26 17:44:26 -07:00
Yann Collet 09ae03a570 ZSTD_estimateCDictSize_advanced()
ZSTD_estimateCDictSize() now uses same arguments as ZSTD_createCDict()
ZSTD_estimateCDictSize_advanced() uses same arguments as ZSTD_createCDict_advanced()
2017-06-26 16:47:32 -07:00
Yann Collet 0c9a915a28 ZSTD_estimateCStreamSize_advanced() 2017-06-26 16:02:25 -07:00
Yann Collet 31af8290d1 ZSTD_estimateCCtx_advanced()
ZSTD_estimateCCtx() is now a "simple" function,
taking int compressionLevel as single argument.

ZSTD_estimateCCtx_advanced() takes a CParams argument,
which is both more complete and more complex to generate.
2017-06-26 15:52:39 -07:00
Yann Collet ef269c1b68 Merge pull request #725 from facebook/advancedAPI2
New Advanced API
2017-06-23 09:50:47 -07:00
Yann Collet ecb0f46866 add controls over streaming buffers 2017-06-21 17:25:01 -07:00
Yann Collet dce789281b fixed : decompression of skippable frames in streaming mode 2017-06-21 15:53:42 -07:00
Yann Collet 204b6b7ef6 fixed streaming buffered allocation with CDict compression 2017-06-21 15:13:00 -07:00
Yann Collet 1e4129b27b fixed dangling pointer risk, detected by @terrelln 2017-06-21 13:26:10 -07:00
Yann Collet 83095970e6 free cdictLocal faster, suggested by @terrelln 2017-06-21 12:26:40 -07:00
Yann Collet 7bd1a2900e added ZSTD_dictMode_e to control dictionary loading mode 2017-06-21 11:50:33 -07:00
Yann Collet 9c56b12938 Merge pull request #723 from paulcruz74/dev
Adding zstd -l
2017-06-21 09:41:55 -07:00
Yann Collet e51d51bdf7 fixed memcpy() overlap 2017-06-20 17:44:55 -07:00
Yann Collet 466f92eaa6 removed one useless streaming compression stage, detected by @terrelln 2017-06-20 16:25:29 -07:00
Yann Collet c3bce24ef4 fixed potential dangling pointer, detected by @terrelln 2017-06-20 16:09:11 -07:00
Yann Collet 78b8234554 fixed comments, following suggestion by @terrelln 2017-06-20 14:26:48 -07:00
Yann Collet b44ab82f7a ensure new ZSTD_strategy starts at value 1 2017-06-20 14:11:49 -07:00
Yann Collet c08e649e95 first implementation of bench.c with new API ZSTD_compress_generic()
Doesn't speed optimize this buffer-to-buffer scenario yet.
Still internally defers to streaming implementation.

Also : fixed a long standing bug in ZSTDMT streaming API.
2017-06-19 18:25:35 -07:00
Yann Collet 695a0a3449 fixed IA64 compilation error, by @mcmilk 2017-06-19 15:27:30 -07:00
Yann Collet fe234bf48b fix attempts : fullbench for VS2008 2017-06-19 15:23:19 -07:00
Nick Terrell 55f9cd4942 [libzstd] Fix UBSAN failure 2017-06-19 15:12:28 -07:00
Yann Collet bf99150be3 update new api presentation in zstd.h and manual 2017-06-19 12:56:25 -07:00
Yann Collet d7a3bffba9 new api : setting compression parameters is refused if a dictionary is already loaded 2017-06-19 11:53:01 -07:00
Yuri 92bafda406 INSTALL_DATA instead of INSTALL_LIB for libzstd.a
INSTALL_LIB can be passed -s flag to strip symbols. Static libraries should not be stripped, only dynamic ones should be stripped.
2017-06-17 00:23:41 -07:00
Yann Collet 381e66cfbd added ZSTD_clampCParams()
now ZSTD_adjustCParams() is always successful,
it always produces a valid CParams
2017-06-16 17:34:54 -07:00
Yann Collet aee916e37c fixed +/-1 error for pledgedSrcSizePlusOne 2017-06-16 17:02:35 -07:00
Yann Collet d3de3d51a3 fix attempt 2 : Visual sign conversion warning 2017-06-16 16:51:33 -07:00
Yann Collet 944be54774 fixed attempt : minor Visual sign conversion warning 2017-06-16 14:05:01 -07:00
Yann Collet b26728c9c8 added ZSTD_startNewCompression() 2017-06-16 14:00:46 -07:00
Yann Collet a0ba849fe6 changed frameContentSize field to pledgedSrcSizePlusOne
pledgedSrcSize is proper : it's a promise, not yet fulfilled.
It will be controlled at the end.

PlusOne is meant to have 0 (default) == unknown
2017-06-16 13:29:17 -07:00
Yann Collet 2cf7755da7 fix : pledgedSrcSize correctly reset to unknown in "continue" mode 2017-06-16 12:34:41 -07:00
Yann Collet 9e73f2f320 fix : correctly reset pledgedSrcSize to unknown status
when starting a new compression with an existing context
2017-06-16 12:24:01 -07:00
Yann Collet 33873f0e74 fixed : new advanced AIP : setting nbThreads to the same value > 1 2017-06-16 12:04:21 -07:00
Yann Collet 559ee82e90 fixed : calling ZSTD_compress_generic() to end-flush a stream in multiple steps 2017-06-16 11:58:21 -07:00
Yann Collet bd18c885a3 added ZSTD_CCtx_reset 2017-06-16 10:17:50 -07:00
Yann Collet cc9f9b7f4c protection : ZSTD_CONTENTSIZE_UNKNOWN automatically disables contentSizeFlag 2017-06-15 18:17:34 -07:00
Yann Collet 05ae4b2190 added protection : MT incompatible with Static allocation 2017-06-15 18:03:34 -07:00
Paul Cruz a9b77c83e5 cleaning up code for analyzing frames 2017-06-15 14:13:28 -07:00
Yann Collet f129fd3970 disabled MT code path when ZSTD_MULTITHREAD is not defined 2017-06-11 18:46:09 -07:00
Yann Collet 23aace9778 added control stage to MT mode 2017-06-11 18:32:36 -07:00
Yann Collet f35e2de61c linked newAPI to ZSTDMT 2017-06-05 18:32:48 -07:00
cyan4973 c59162e053 minor fix for -Wdocumentation 2017-06-05 00:12:13 -07:00
cyan4973 8bcbf42617 fixed g++ prototype mismatch 2017-06-04 23:52:00 -07:00
Yann Collet 8c910d2097 updated ZSTDMT streaming API
ZSTDMT streaming API is now similar
and has same capabilites as single-thread streaming API.
It makes it easier to blend them together.
2017-06-03 01:15:02 -07:00
Yann Collet 58e8d793e1 made debug definitions common within zstd_internal.h 2017-06-02 18:20:48 -07:00
Yann Collet 8ddf4c22d5 fixed missing initialization 2017-06-02 17:16:49 -07:00
Yann Collet 33a7e679e5 significant zlib wrapper code refactoring
code indentation
variable scope and names
constify

Only coding style changes.
The logic should remain the same.
2017-06-02 17:10:49 -07:00
Yann Collet 4effccbf56 zlib_wrapper's uncompress() uses ZSTD_isFrame() for routing
more generic and safer than using own routing for magic number comparison
2017-06-02 14:27:11 -07:00
Yann Collet dcb7535352 ensure zlibwrapper uses ZSTD_malloc() and ZSTD_free()
which is compatible with { NULL, NULL, NULL }
2017-06-02 14:01:21 -07:00
Yann Collet b877e834b1 minor indent 2017-06-02 13:47:11 -07:00
Yann Collet 6056e4c3eb added POOL_sizeof() for single-thread 2017-06-02 11:36:47 -07:00
Yann Collet c35e535002 added support for multithreading parameters 2017-06-01 18:44:06 -07:00
Yann Collet c4a5a21c5c created ZSTDMT_sizeof_CCtx() and POOL_sizeof()
required by ZSTD_sizeofCCtx() while adding a ZSTDMT_CCtx*
2017-06-01 17:56:14 -07:00
Yann Collet cd2892fd1e protected impossible switch(){default:} with assert(0)
can be converted into assume(0) in some future
2017-06-01 09:44:54 -07:00
Yann Collet 06589fe516 Merge branch 'advancedAPI2' of github.com:facebook/zstd into advancedAPI2 2017-05-31 10:03:20 -07:00
Yann Collet 18ab5affa5 fixed visual warning 2017-05-31 09:59:22 -07:00
Yann Collet 9a691e0f55 fixed visual warnings 2017-05-31 01:17:44 -07:00
Yann Collet 01b1549f83 finally converted ZSTD_compressStream_generic() to use {in,ou}Buffer
replacing the older read/write variables from ZBUFF_* era.
Mostly to help code readability.

Fixed relevant callers.
2017-05-30 18:10:26 -07:00
Yann Collet c4f46b94ce ZSTD_createCCtx_advanced() now uses ZSTD_calloc()
initially uses calloc() instead of memset().

Performance improvement is unlikely measurable,
since ZSTD_CCtx is now very small,
with all tables transferred into workSpace.
2017-05-30 17:45:37 -07:00
Yann Collet deee6e523f expose ZSTD_compress_generic_simpleArgs()
which is a binding towards ZSTD_compress_generic()
using only integral types for arguments.
2017-05-30 17:42:00 -07:00
Yann Collet ae728a43b8 removed defaultCustomMem
now ZSTD_customCMem is promoted as new default.

Advantages : ZSTD_customCMem = { NULL, NULL, NULL},
so it's natural default after a memset.

ZSTD_customCMem is public constant
(defaultCustomMem was private only).

Also : makes it possible to introduce ZSTD_calloc(),
which can now default to stdlib's calloc()
when it detects system default.

Fixed zlibwrapper which depended on defaultCustomMem.
2017-05-30 17:11:39 -07:00
Yann Collet 5bcef1ada2 removed mtctx->cstream
use the first cctx in pool when ZSTDMT is used in single-thread mode
now that cctx and cstream are the same object.
2017-05-30 16:37:19 -07:00
Yann Collet beb62b15a8 Merge branch 'dev' into advancedAPI2
Fixed conflic in zstd_decompress.c
2017-05-30 16:18:57 -07:00
Yann Collet 44e45e8423 added ZSTDMT_createCCtx_advanced()
make it possible to use custom allocators
2017-05-30 16:12:06 -07:00
Yann Collet f45ca527a1 Merge branch 'advancedAPI2' of github.com:facebook/zstd into advancedAPI2 2017-05-30 10:02:03 -07:00
Yann Collet b6dec4c3ae fixed minor cast warning 2017-05-27 17:09:06 -07:00
Yann Collet e071159101 mtctx->jobs allocate its own memory space
to make ZSTDMT_CCtx_s size predictable
so that it can be included in CCtx
2017-05-27 00:21:33 -07:00
Yann Collet b8136f019a static dctx is incompatible with legacy support
documented, and runtime tested
2017-05-27 00:03:08 -07:00
Yann Collet 7028cbd7fd fixed a few code comments : ZSTD_getFrameParams => ZSTD_getFrameHeader 2017-05-25 18:29:08 -07:00
Yann Collet cdf7e82222 Added ZSTD_initStaticCDict() 2017-05-25 18:05:49 -07:00
Dmitry V. Levin 1ea655c765 Fix typo in libzstd.a-mt make rules
The macro name is ZSTD_MULTITHREAD, not ZSTD_MULTHREAD.

Fixes: ca6fae7808 ("Add MT enabled targets for libzstd")
2017-05-25 23:43:05 +00:00
Yann Collet 57827f906f added ZSTD_initStaticDDict() 2017-05-25 15:44:06 -07:00
Yann Collet 25989e361c updated ZSTD_estimate?DictSize() to pass parameter byReference
resulting ?Dict object is smaller when created byReference.
Seems better than a documentation note.
2017-05-25 15:07:37 -07:00
Yann Collet 0fdc71c3dc added ZSTD_initStaticDCtx() 2017-05-24 17:41:41 -07:00
Yann Collet ba183005d3 merged DStream's inBuff and outBuff into a single buffer
Saves one malloc().
Also : makes it easier to implement static allocation
2017-05-24 15:42:24 -07:00
Nick Terrell 55fc1f91fd [zstd] Fix up formatting edge cases for clang-format 2017-05-24 13:50:10 -07:00
Yann Collet 2e4db3e531 fixed performance regression with ZSTD_decompress() on small files
memset() was a quick fix to initialization problems,
but initialize too much space (tables, buffers)
which show up in decompression speed of ZSTD_decompress()
since it needs to recreate DCtx at each invocation.

Fixed by only initialization relevant pointers and size fields.
2017-05-24 13:15:19 -07:00
Yann Collet 11ea2f7fda Merged ZSTD_DCtx and ZSTD_DStream objects
They are now the same object.
It's recommended to keep both types in source code
as previous versions of library (<v1.3.0)
still need this differentiation.
2017-05-23 16:19:43 -07:00
Yann Collet b81f19ffce move MEM_readMINMATCH() into zstd_opt.h
which is its only user.
Use case too narrow to belong to mem.h.
renamed to ZSTD_readMINMATCH()
2017-05-23 15:41:55 -07:00
Yann Collet c7fe262dc9 added ZSTD_initStaticCCtx()
makes it possible to statically or externally allocate CCtx.
static CCtx will only use provided memory area,
it will never resize nor malloc.
2017-05-23 13:20:41 -07:00
Yann Collet 5ac72b417c Buffered are now allocated inside workSpace 2017-05-23 11:18:24 -07:00
Yann Collet 1880337c30 Simplifier compression call graph
Everything converge towards ZSTD_compressBegin_internal
which delegated to ZSTD_copyCCtx_internal if cdict!=NULL.

This simplifies routing which was previously depending on cdict.
2017-05-22 18:21:51 -07:00
Yann Collet b0739bcf8f simplified reset by removing full-reset policy
this was meant to be applied prior to dictionary loading.
But effectively, it seems redundant with later loading stage,
so it can be skipped safely.
2017-05-22 17:45:15 -07:00
Yann Collet 1ad7c82eb5 Implemented separation between requested and applied parameters
first version to pass cli tests with -DZSTD_NEWAPI
2017-05-22 17:06:04 -07:00
Yann Collet 24de7b0346 Implemented ZSTD_CCtx_refCDict() 2017-05-22 13:05:45 -07:00
Yann Collet ee970398b2 Merge branch 'dev' into advancedAPI2 2017-05-22 12:33:56 -07:00
Yann Collet 8b21ec42a9 ZSTD_compress_generic() can handle dictionary compression 2017-05-19 19:46:15 -07:00
Nick Terrell a1280406b0 [libzstd] Allow users to define custom visibility 2017-05-19 18:01:59 -07:00
Yann Collet 334a288d0d ZSTD_CCtx_setParameter() only works during initialization stage
and generate a stage_wrong error otherwise.
2017-05-19 11:04:41 -07:00
Yann Collet 48855fa0d2 fixed declaration-after-statement warning 2017-05-19 10:56:11 -07:00
Yann Collet fa3671eac7 changed ZSTD_BLOCKSIZE_ABSOLUTEMAX into ZSTD_BLOCKSIZE_MAX
Also :
change ZSTD_getBlockSizeMax() into ZSTD_getBlockSize()
created ZSTD_BLOCKSIZELOG_MAX
2017-05-19 10:51:30 -07:00
Yann Collet 009d604e00 ZSTD_compress_generic() supports multiple successive frames
also : clarified streaming API implementation
2017-05-19 10:17:59 -07:00
Yann Collet 6d4fef36de Added ZSTD_compress_generic()
Used in fileio.c (zstd cli).
Need to set macro ZSTD_NEWAPI to trigger it.
2017-05-17 18:36:15 -07:00
Yann Collet 23c256e44b removed useless variable from CCtx
CStream's pledgedSrcSize is no longer necessary
srcSize control is realized within bufferless interface.
2017-05-16 18:10:11 -07:00
Yann Collet 9f95e445ab minor comment clarifications 2017-05-16 17:26:43 -07:00
Yann Collet 0bdb575c31 Merge branch 'dev' into advancedAPI2 2017-05-16 16:32:29 -07:00
Yann Collet 7101434ec9 pedantic : added one error check
on a function which (today) never fails.
But who knows, maybe tomorrow ...
2017-05-16 16:28:24 -07:00
Yann Collet bfff8999c5 added prototype ZSTD_versionString() 2017-05-16 16:12:23 -07:00
Yann Collet 4eff8136aa added prototype ZSTD_decompressBegin_usingDDict (#700) 2017-05-16 16:05:27 -07:00
Yann Collet 2d4d31c18a removed gcc compilation flag -Wbad-function-cast
It makes it more difficult to directly cast the result of a function,
requiring to store the result in an intermediate variable.
It does not necessarily help readability,
and this restriction can be difficult to overcome in some constructions,
like some macros.

also : fixed minor Visual conversion warnings in datagencli.c
2017-05-16 11:34:38 -07:00
Yann Collet 133f0aee54 fixed redundant declarations in legacy v0.5 and v0.7 decoders
triggered by new flag -Wredundant-decls
2017-05-15 17:44:04 -07:00
Yann Collet 83d0c764dc added several compilation flags 2017-05-15 17:15:46 -07:00
Yann Collet a5ffe3d370 pushed enum values for strategy by one (ZSTD_fast==1)
this makes it possible to use `0` to mean:
"do not change strategy"
2017-05-12 16:29:19 -07:00
Yann Collet add66f816d changed macro LOADCPARAMS by static function ZSTD_cLevelToCParams()
for improved compiler checks.
Also : ensure most parameters can receive value "0"
to mean "do not change".
2017-05-12 16:01:15 -07:00
Yann Collet b0edb7fb0e added ZSTD_CCtx_setParameter() 2017-05-12 15:31:53 -07:00
Yann Collet ef738c1b23 better error code when compressing using NULL CDict
which is not allowed (but detected, and generates an error).
2017-05-12 13:55:25 -07:00
Yann Collet db8e21d5a0 made ZSTD_compress_generic() definition accessible
note that the implementation is not done yet.
2017-05-12 13:46:49 -07:00
Yann Collet 33eb7ac6b6 updated Advanced API proposal
only declarations in zstd.h
2017-05-12 12:36:11 -07:00
Yann Collet bd1964a988 Merge pull request #696 from joscollin/wip-lib-legacy-fallthrough-warn
lib/legacy: warning: this statement may fall through
2017-05-11 10:45:01 -07:00
Yann Collet 4c1cfc0bb6 Merge pull request #695 from joscollin/wip-lib-compress-fallthrough-warn
lib/compress: warning: this statement may fall through
2017-05-11 10:44:27 -07:00
Jos Collin 280510f2d5 lib/legacy: warning: this statement may fall through
The following warning appears during build at sevaral places.

../lib/legacy/zstd_v04.c:819:40: warning: this statement may fall through [-Wimplicit-fallthrough=]
             case 7: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[6]) << (sizeof(size_t)*8 - 16);

../lib/legacy/zstd_v05.c:821:40: warning: this statement may fall through [-Wimplicit-fallthrough=]
             case 7: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[6]) << (sizeof(size_t)*8 - 16);

../lib/legacy/zstd_v06.c:913:40: warning: this statement may fall through [-Wimplicit-fallthrough=]
             case 7: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[6]) << (sizeof(bitD->bitContainer)*8 - 16);

../lib/legacy/zstd_v07.c:583:40: warning: this statement may fall through [-Wimplicit-fallthrough=]
             case 7: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[6]) <<
             (sizeof(bitD->bitContainer)*8 - 16);

Signed-off-by: Jos Collin <jcollin@redhat.com>
2017-05-11 14:27:40 +05:30
Jos Collin 7cd7a7564b lib/compress: warning: this statement may fall through
The following warning appears during build.

../lib/compress/huf_compress.c: In function ‘HUF_compress1X_usingCTable’:
../lib/compress/huf_compress.c:444:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
     if (sizeof((stream)->bitContainer)*8 < HUF_TABLELOG_MAX*4+7) HUF_FLUSHBITS(stream)
        ^
../lib/compress/huf_compress.c:465:18: note: in expansion of macro ‘HUF_FLUSHBITS_2’
                  HUF_FLUSHBITS_2(&bitC);
                  ^~~~~~~~~~~~~~~
../lib/compress/huf_compress.c:466:9: note: here
         case 2 : HUF_encodeSymbol(&bitC, ip[n+ 1], CTable);

../lib/compress/zstd_compress.c: In function ‘ZSTD_compressStream_generic’:
../lib/compress/zstd_compress.c:3366:34: warning: this statement may fall through [-Wimplicit-fallthrough=]
                 zcs->streamStage = zcss_flush;   /* pass-through to flush stage */
                 ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~
../lib/compress/zstd_compress.c:3369:9: note: here
         case zcss_flush:

Signed-off-by: Jos Collin <jcollin@redhat.com>
2017-05-11 13:17:26 +05:30
Jos Collin 05286fdd5a lib/common: warning: this statement may fall through
The following warning appears during the build. Fixed the review comments too.

zstd/lib/common/bitstream.h: In function ‘BIT_initDStream’:
zstd/lib/common/bitstream.h:277:33: warning: this statement may fall through [-Wimplicit-fallthrough=]
      case 7: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[6]) <<
      (sizeof(bitD->bitContainer)*8 - 16);

Signed-off-by: Jos Collin <jcollin@redhat.com>
2017-05-11 09:10:02 +05:30
Nick Terrell 374f868354 Update whitespace 2017-05-10 17:48:42 -07:00
Nick Terrell 5f2c7213c7 Merge remote-tracking branch 'upstream/dev' into btopt
* upstream/dev: (305 commits)
  added test for ZSTD_estimateCStreamSize()
  changed variable name, for clarity
  fixed ZSTD_estimateCStreamSize()
  shortened ZSTD_createCStream_Advanced()
  fixed symbols test
  added ZSTD_estimateDStreamSize()
  changed name frameParams into frameHeader
  regroup memory usage function declarations
  separated ZSTD_estimateCStreamSize() from ZSTD_estimateCCtxSize()
  bumped version number
  added ZSTD_estimateCDictSize() and ZSTD_estimateDDictSize()
  Updated ZSTD_freeCCtx()
  updated ZSTD_estimateCCtxSize()
  Updated ZSTD_sizeof_CCtx()
  merged CCtx and CStream as a single same object
  cli : -d and -t do not stop after a failed decompression
  added dev branch CircleCI badge
  added dev branch Appveyor badge
  keep dev branch status only
  creates a binary archive without the `programs` directory
  ...
2017-05-10 16:49:58 -07:00
Yann Collet ba41b26405 Merge pull request #689 from facebook/cctxMerge
Cctx merge
2017-05-10 14:53:54 -07:00
Yann Collet cef02d9317 changed variable name, for clarity
fhiPtr -> zfhPtr
https://github.com/facebook/zstd/pull/689#discussion_r115638676
2017-05-10 11:14:08 -07:00
Yann Collet 669346fe8b fixed ZSTD_estimateCStreamSize()
https://github.com/facebook/zstd/pull/689#discussion_r115637721
2017-05-10 11:08:00 -07:00
Yann Collet 6fb2f24132 shortened ZSTD_createCStream_Advanced()
https://github.com/facebook/zstd/pull/689#discussion_r115637613
2017-05-10 11:06:06 -07:00
Yann Collet f16f4497ca added ZSTD_estimateDStreamSize() 2017-05-09 16:18:17 -07:00
Yann Collet 542c9dfcf8 changed name frameParams into frameHeader
ZSTD_frameParams => ZSTD_frameHeader
ZSTD_getFrameParams() -> ZSTD_getFrameHeader()

The new naming is more distinctive from ZSTD_frameParameters,
which is used during compression.

ZSTD_frameHeader is clearer in its intention to described frame header content.
It also implies we are decoding a ZSTD frame, hence we are at decoding stage.
2017-05-09 15:46:07 -07:00
Yann Collet 5a36c069e7 regroup memory usage function declarations
in a single paragraph in zstd.h, for clarity
2017-05-09 15:11:30 -07:00
Yann Collet fa8dadb294 separated ZSTD_estimateCStreamSize() from ZSTD_estimateCCtxSize()
for clarity
2017-05-08 18:24:16 -07:00
Yann Collet 51652522a2 bumped version number 2017-05-08 17:52:46 -07:00
Yann Collet a1d6704d7f added ZSTD_estimateCDictSize() and ZSTD_estimateDDictSize()
it complements ZSTD_estimateCCtxSize()
for the special case of ZSTD_initCStream_usingDict()
2017-05-08 17:51:49 -07:00
Yann Collet 7855366598 Updated ZSTD_freeCCtx()
which can also contain streaming buffers now.
Redirected ZSTD_freeCStream() towards it.
2017-05-08 17:15:00 -07:00
Yann Collet fc5145955a updated ZSTD_estimateCCtxSize()
added a parameter streaming,
to estimate memory allocation size
when the CCtx is used for streaming (CStream).

Note : this function is not able to estimate
memory cost of a potential internal CDict
which can only happen when starting with ZSTD_initCStream_usingDict()
2017-05-08 17:07:59 -07:00
Yann Collet 791d744279 Updated ZSTD_sizeof_CCtx()
can now contain buffers if object used as CStream.
ZSTD_sizeof_CStream() is now just a thin wrapper of ZSTD_sizeof_CCtx().
2017-05-08 16:17:30 -07:00
Yann Collet 0be6fd3429 merged CCtx and CStream as a single same object
To be changed : ZSTD_sizeof_CCtx(), ZSTD_estimateCCtxSize()
2017-05-08 16:08:01 -07:00
Yann Collet d47709b6ea Merge pull request #654 from iburinoc/splittable
[RFC] Splittable Format and API
2017-05-08 13:41:56 -07:00
Yann Collet a00e9599f1 removed -g from DEBUGFLAGS
It inflates binary sizes, which is negative for the Windows build.
It also makes it impossible to check if 2 different source codes
get nonetheless compiled to the same binary,
since checksum will be different, due to integrated source code.
2017-05-04 17:24:29 -07:00
Yann Collet 606c04c228 Merge branch 'dev' of github.com:facebook/zstd into dev 2017-05-02 12:13:52 -07:00
Yann Collet 072484a3bf Merge pull request #683 from terrelln/odev
[CLI] Make cover the default dictionary builder
2017-05-02 12:13:23 -07:00
Nick Terrell f376d47c11 [CLI] Switch dictionary builder on CLI to cover 2017-05-02 11:18:27 -07:00
Nick Terrell 020b960e13 [cover] Make optimization faster 2017-05-02 11:02:48 -07:00
Nick Terrell f2d9ef1dc0 [cover] Optimize case where d <= 8 2017-05-02 11:02:43 -07:00
Nick Terrell 865918dd04 Fix typo in zdict.h 2017-05-02 11:02:37 -07:00
Yann Collet b184589c4c minor code refactoring for clarity 2017-05-01 11:35:47 -07:00
Yann Collet 33c38b0925 fixed const in prototype, that Visual doesn't accept 2017-05-01 11:12:30 -07:00
Yann Collet f39a6731ec sync bitstream.h from fse library 2017-05-01 09:56:03 -07:00
Yann Collet 202082f285 sync bitstream from FSE project
add assert into unsafe *_fast() variants
2017-04-28 17:00:31 -07:00
Yann Collet 89f50deec7 minor code refactoring
clearer tables
2017-04-28 16:52:36 -07:00
Yann Collet 68a7d3d49a added HUF_PUBLIC_API macro to huf.h
to make it possible to control symbol visibility.
Also : better separation and comments between "public" and "static" sections
2017-04-28 12:46:48 -07:00
Yann Collet a51cab6e68 Merge pull request #678 from facebook/apiChange
Breaking API Change around CDict
2017-04-28 10:02:45 -07:00
Yann Collet 29297c6751 Changed default level 18 (large input)
Previous -18 : 4.7 MB/s, R:3.833
New -18 : 5.1 MB/s. R:3.825

It's a better fit within -17 (6.8 MB/s) and -19 (4.0 MB/s)
The new level 18 also uses significantly less memory.
And, it makes a good transition between level 17 (mml5)
and level 19 (mml3).
Up to now, there was no level with mml4.

(note : minmatch setting can have a large impact on some (specific) datasets)
2017-04-27 17:44:01 -07:00
Yann Collet a92cbb7004 Added a secondary test, checking dictID presence after setting noDictIdFLag=1 2017-04-27 15:08:56 -07:00
Yann Collet d3694e6c70 removed C4204 2017-04-27 14:29:35 -07:00
Yann Collet 1c3ab0c77f fixed init error on Visual 2008 2017-04-27 12:57:11 -07:00
Yann Collet 8b669535f8 bumped version number to v1.2.0 2017-04-27 12:50:20 -07:00
Yann Collet 77bf59ef50 added ZSTD_initCStream_usingCDict_advanced() 2017-04-27 11:43:04 -07:00
Yann Collet f4bd857d81 created ZSTD_compress_usingCDict_advanced() 2017-04-27 11:31:55 -07:00
Yann Collet 69a54d138a fixed compilation warning : declaration-after-statement 2017-04-27 01:11:26 -07:00
Yann Collet 31533bacce Changed ZSTD_createCDict_advanced()
It now only uses compressionParameters as argument.
It produces many changes throughout user code,
though hopefully they tend to be simple :
just provide the cParams part from existing ZSTD_parameters.

Some programs might depend on ZSTD_createCDict_advanced() to pass frame parameters.
This change will force them to revisit this strategy and fix it,
since frame parameters are effectively silently ignored in current version.
2017-04-27 00:29:04 -07:00
Yann Collet 768df129d2 changed ZSTD_compressBegin_usingCDict()
No longer takes `pledgedSrcSize` as argument
this is in line with similar functions ZSTD_compress_usingCDict()
and ZSTD_initCStream_usingCDict().
2017-04-26 15:42:10 -07:00
Yann Collet e42afbc6fa Comply with suggested comments by @terrelln
created FSE_CTABLE_SIZE() and FSE_DTABLE_SIZE()
2017-04-26 11:39:35 -07:00
Sean Purcell 7d37ca1d5b Merge remote-tracking branch 'origin/dev' into splittable 2017-04-21 14:18:39 -07:00
Yann Collet 7271203bdb transferred entropy scratch space from CCtx into workSpace
Saved 6 KB
2017-04-20 23:21:19 -07:00
Yann Collet a408645f50 made some room for entropy scratch space 2017-04-20 23:09:39 -07:00
Yann Collet 71aaa32c3c transferred FSE tables from CCtx into workspace
Saved 5 KB from CCtx
2017-04-20 23:03:38 -07:00
Yann Collet 71ddeb67b1 made room in workspace for FSE tables
still need to be transferred from CCtx into workspace
2017-04-20 22:54:54 -07:00
Yann Collet a34a39c183 changed size evaluation of entropy tables
so that memcpy() does no longer depends on fse pointer being a static table
2017-04-20 18:26:25 -07:00
Yann Collet 7bb60b17d8 init entropy table pointers only once
per workSpace resize
2017-04-20 17:38:56 -07:00
Yann Collet e6fa70a0a1 reorganized ZSTD_resetCCtx_internal()
clearer separation between variables and buffers
clearer buffers category
kept static buffers at the beginning, favoring cache locality
(it will be easier to add FSE tables there later)

This break a few assumptions that hashTable was always at the beginning.
This is fixed.
And remaining assumptions (namely that tables stand next to each other in memory)
are now tested with assert.
2017-04-20 17:28:31 -07:00
Yann Collet c17e020c9a disable assert when compiling paramgrill
paramgrill is a benchmark calibration function.
Speed accuracy is critical, it cannot be altered by assert.
2017-04-20 12:50:02 -07:00
Yann Collet 16f9c572fc Merge branch 'dev' into compressionFlow 2017-04-20 11:16:40 -07:00
Yann Collet e348dad305 minor long line reformatting 2017-04-20 11:14:13 -07:00
Yann Collet e847730452 slightly refined README comments on lib-mt 2017-04-18 23:15:28 -07:00
Yann Collet 2c5514c759 fixed ZSTDMT_initCStream_advanced()
Must use the new ZSTD_compressBegin_usingCDict_advanced()
to enforce correct frame parameters
2017-04-18 22:52:41 -07:00
Sean Purcell 98cf7fcb2a Update README 2017-04-18 17:03:37 -07:00
Sean Purcell 0f7bd772e6 Update seekable API to simplify IO 2017-04-18 16:48:30 -07:00
Yann Collet a4cab80183 added ZSTD_copyCCtx_internal()
which respects provided fParams.
2017-04-18 14:54:54 -07:00
Sean Purcell ca6fae7808 Add MT enabled targets for libzstd 2017-04-18 14:13:01 -07:00
Yann Collet 30fb499208 Changed ZSTD_resetCCtx_advanced() into ZSTD_resetCCtx_internal()
for naming consistency :
_advanced() can be invoked
while _internal() are strictly static
2017-04-18 14:08:50 -07:00
Yann Collet 715b9aa113 created ZSTD_compressBegin_usingCDict_advanced() 2017-04-18 13:55:53 -07:00
Yann Collet af4f45b682 Improved code comments for block functions 2017-04-18 03:17:44 -07:00
Yann Collet 4f818182b8 clarified frame parameters for ZSTD_compress*_usingCDict()
created ZSTD_compressBegin_usingCDict_internal(),
which gives direct control to frame Parameters.
ZSTD_resetCStream_internal() now points into it.
2017-04-17 18:29:06 -07:00
Yann Collet c47c68f6ca proper evaluation of Huffman CTable size 2017-04-17 16:14:21 -07:00
Sean Purcell 5ee1135f30 s/chunk/frame/ 2017-04-12 11:15:50 -07:00
Yann Collet 88009a8ba2 removed srcSize control from CStream
since it's already done from lower bufferless API level
2017-04-12 00:51:24 -07:00
Yann Collet 20d5e03893 content size is controlled at bufferless level
so it's active for all entry points

Also : added relevant test (wrong content size) in fuzzer
2017-04-11 18:34:02 -07:00
Sean Purcell d048fefef7 Move seekable format content to /contrib 2017-04-11 14:38:56 -07:00
Sean Purcell 45f3bc4801 Add format specification 2017-04-11 13:53:09 -07:00
Sean Purcell a3b7c22604 Make seekable streams work w/ small buffers, misc fixes 2017-04-11 13:53:09 -07:00
Sean Purcell c3ba15e48f Seekable compression demo 2017-04-11 13:53:09 -07:00
Yann Collet 4ee6b15dac force contentSizeFlag=0 when using ZSTD_initCStream_usingCDict()
because by definition srcSize is not known when using this prototype.
added relevant test

Note : this use was already working, because at a later stage
(both ZSTD_compressBegin_usingCDict() and ZSTD_copyCCtx())
pledgedSrcSize=0 is translated into "unknown", no matter the frame parameter.
This is not correct, but of little importance,
as the medium term plan is to no longer set fParams within CDict
2017-04-11 11:59:44 -07:00
Yann Collet ab9162ebb4 simplified call graph
by calling ZSTD_compressBegin_internal() instead of ZSTD_compressBegin_advanced()
2017-04-11 10:46:20 -07:00
Yann Collet e88034fe26 simplified ZSTD_initCStream*() flow
all variants converge towards ZSTD_initCStream_stage2()
2017-04-10 22:24:02 -07:00
Yann Collet 4b987ad8ce Introduce ZSTD_initCStream_internal()
This is now the regroup point for ZSTD_initCStream*() functions

ZSTD_initCStream_advanced() now properly checks for parameters validity.

Also : added <assert.h> usage inside zstd_compress.c
Needs ZSTD_DEBUG=1 macro to be triggered.
Will be triggered by default from `tests` directory
2017-04-10 17:50:44 -07:00
Yann Collet 0181fef545 ensure cctx internal buffer is correctly sized in case of memory error 2017-04-06 01:25:26 -07:00
Yann Collet 36c2a03757 updated comments for ZSTD_resetCStream() 2017-04-05 22:06:21 -07:00
Yann Collet 003a244324 DStream : ensure correct size of internal buffers in case of error 2017-04-05 15:28:56 -07:00
Yann Collet 02d37aa1c1 ensure correct size of internal buffers in case of error 2017-04-05 14:53:51 -07:00
Nick Terrell 405d2a1027 Explicitly convert scratchBuffer to unsigned* 2017-04-04 16:35:31 -07:00
Nick Terrell 16a739cab0 Switch call of FSE_count() to FSE_count_wksp() 2017-04-04 16:17:21 -07:00
Yann Collet 7cf78f1be7 Protects ZSTD_compressBegin_usingCDict() vs NULL cdict dereference
Will issue an error (GENERIC) is cdict==NULL
2017-04-04 12:38:14 -07:00
Nick Terrell 26b046a7c4 Remove unnecessary dictID store 2017-04-03 21:46:28 -07:00
Nick Terrell 39a6cc5172 Make ZSTD_compress_usingCDict() respect contentSizeFlag 2017-04-03 21:09:55 -07:00
Nick Terrell 62ecad3819 Fix ZSTD_initCStream_usingCDict() to use dictionary 2017-04-03 21:05:59 -07:00
Yann Collet 30c7698970 optimize ZSTDMT_compress() memory usage
does no longer allocate temporary buffers
when there is enough room in dstBuffer to decompress directly there.
(previous method would skip that for 1st chunk only).

Also : fix ZSTD_compressBound() for small srcSize
2017-03-31 18:27:03 -07:00
Yann Collet 3f75d52527 Changed ZSTD_compressBound()
required so that if Total = A+B
compressBound(Total) <= compressBound(A) + compressBound(B)
under condition of a minimum size for A and B

Will help for ZSTDMT_compress() memory allocation
2017-03-31 17:11:38 -07:00
Yann Collet 7b70a1969e Merge branch 'dev' into zstdmt 2017-03-31 16:22:33 -07:00
Yann Collet 53203e7c38 Merge pull request #640 from facebook/memAccess
Changed memory strategy to __packed for gcc
2017-03-31 15:49:12 -07:00
Yann Collet eea7858e2b fixed minor warnings in debug code 2017-03-30 16:47:19 -07:00
Yann Collet 34cc487d05 overlap at full windowSize for max compression level
as it provides max compression ratio
2017-03-30 16:23:22 -07:00
Yann Collet 458e955c23 improved ZSTDMT_compress()
Use a bit more threads by default.
Uses overlap segments to boost compression ratio (like the streaming variant)
2017-03-30 15:51:58 -07:00
Yann Collet 6476c51b86 Merge pull request #637 from facebook/zstdmt
Zstdmt
2017-03-30 14:18:37 -07:00
Yann Collet 274f59919d Changed memory strategy to __packed for gcc
Method 1 __packed is always as good or better than memcpy().
But it's not portable, as it depends on compiler extension.

For gcc, __pakced directive works fine.
Furthermore, gcc has serious performance issues with memcpy() on ARM 32 bits.
See #620
2017-03-30 12:52:14 -07:00
Nick Terrell 5152fb2cb2 Convert all tabs to spaces 2017-03-29 18:51:58 -07:00
Yann Collet ca5a8bbe36 re-added patch ... 2017-03-29 17:15:27 -07:00
Yann Collet 2e2e78de47 removed unnecessary restriction on minmatchLength
it's now transparently translated to nearest value when unsupported
(7->6) (3->4)
2017-03-29 16:02:47 -07:00
Yann Collet 26769d88bc Merge branch 'dev' of github.com:facebook/zstd into dev 2017-03-29 15:21:30 -07:00
Yann Collet 933ce4a1dd fix : minmatch 7 conversion
minmatch 7 now converted to minmatch 6 for strategies which do not support 7
Used to folded into "default", which applied minmatch 4
2017-03-29 14:35:38 -07:00
Sean Purcell 4708394bdd Remove extra 'F' from skippable magic mask 2017-03-29 11:46:57 -07:00
Yann Collet 4cf0093571 restored bonus rule 2017-03-26 14:51:00 -07:00
Yann Collet 69017bf253 Merge branch 'dev' into LegacyDictBuilder 2017-03-26 14:39:13 -07:00
Yann Collet 582760818f minor refactor
add const
changed if for easier to add new conditions
2017-03-26 03:04:56 -07:00
Yann Collet 858f72eeb8 fixed dictBuilder issue
dictionary loading would fail during entropy analysis
2017-03-26 02:50:00 -07:00
Yann Collet ecee9f2ef8 fixed conversion warnings 2017-03-26 00:59:14 -07:00
Yann Collet 0246d5c531 Merge pull request #630 from facebook/advancedCliCommands
changed advanced commands --maxdict= and --dictID=
2017-03-26 00:13:35 -07:00
Yann Collet 4c41d37fcc changed test for new syntax
--dictID= and --maxdict=
2017-03-24 18:36:56 -07:00
Yann Collet d41f707e88 minor improvement : remove duplicates with 1 char prefix difference 2017-03-24 17:56:45 -07:00
Yann Collet b364caf455 Merge pull request #628 from facebook/dictBuilder_limits
Ensure all limits derived from same constants
2017-03-24 17:54:42 -07:00
Yann Collet 2238870eb6 Merge pull request #625 from facebook/loadCDict
limited CDict acceptation criteria to be the same as DDict
2017-03-24 16:06:20 -07:00
Yann Collet 96aa3019b2 changed advanced commands --maxdict= and --dictID=
now works with the `=` variant, which is the recommended one.
Old variant `--dictID #` still works, for compatibility with existing scripts.
Long term objective is to remove the old variant..
2017-03-24 16:04:29 -07:00
Yann Collet 9da3b215ec Ensure all limits derived from same constants
Now uses ZDICT_DICTSIZE_MIN and ZDICT_CONTENTSIZE_MIN
from zdict.h.

Also : reduced values to 256 and 128 respectively
2017-03-24 15:02:09 -07:00
Yann Collet ebe9963cf6 Merge pull request #626 from facebook/stricterDictBuilder
dictBuilder fails to create dictionary on certain input
2017-03-24 14:27:28 -07:00
Yann Collet 16a0b10781 fixed ZSTD_loadZstdDictionary()
forgot to add the dictionary content
(tests were not failing, just compressing less).

Also : added size protections when adding dict content
since hc/bt table filling would fail if size < 8
2017-03-24 12:46:46 -07:00
Yann Collet 23776ce290 fixed ERROR_GENERIC on dstSize_tooSmall
required by users which depends on this error code to size dest buffer
2017-03-23 17:59:50 -07:00
Yann Collet f332ece468 dictBuilder fails to create dictionary on certain input
Properly expressed with an error code (see zstd_errors.h)
and a cli return code != 0
2017-03-23 16:24:02 -07:00
Yann Collet bea78e8fc2 limited CDict acceptation criteria to be the same as DDict 2017-03-23 15:46:06 -07:00
Sean Purcell 042ba122ae Change g_displayLevel to int and fix DISPLAYUPDATE flush 2017-03-23 11:21:59 -07:00
Nick Terrell eaf69b07f0 Zero pointers after freeing 2017-03-21 13:20:59 -07:00
Yann Collet f3dfcdccd1 bump version number 2017-03-21 12:18:28 -07:00
Przemyslaw Skibinski 8086d623ca updated build of Windows packages 2017-03-18 11:19:09 +01:00
Yann Collet 7e35b352c6 Merge pull request #602 from iburinoc/doc
Add functions missing from manual, and fix parameter alignment
2017-03-14 14:08:41 -07:00
Sean Purcell dec2b96536 Add functions missing from manual, and fix parameter alignment 2017-03-14 11:24:09 -07:00
Sean Purcell 9830aeeea6 Fix legacy support=0 case and accidental double include of version headers 2017-03-13 17:19:37 -07:00
Sean Purcell 120df494e9 Update builds to not support legacy v01-v03 2017-03-13 14:44:08 -07:00
Sean Purcell 334cb34edb ZSTD_LEGACY_SUPPORT defines lowest supported version 2017-03-13 14:32:30 -07:00
Sean Purcell 784082f49c Change gotoDict type to uPtrDiff 2017-03-10 10:34:45 -08:00
Sean Purcell 8fe5c6862c Fix undefined behaviour in decompressor 2017-03-10 10:17:42 -08:00
Nick Terrell f35ef5c8e9 Whitespace only: tabs to spaces 2017-03-09 12:51:33 -08:00
Nick Terrell eeb31eed39 s/ZSTD_btopt2/ZSTD_btultra/g 2017-03-09 11:44:25 -08:00
Nick Terrell e65aab8e0f Remove 'mem.h' dependency from ZSTD_WINDOWLOG_MAX 2017-03-08 15:40:13 -08:00
Yann Collet a41a4ed39a Merge pull request #594 from terrelln/bugs
Small fixes
2017-03-08 14:56:07 -08:00
Nick Terrell 81512e9ebe Avoid '#define inline /* ... */'
Take definition of `FORCE_INLINE` from `zstd_internal.h`.
2017-03-08 14:00:21 -08:00
Nick Terrell e06c303475 Fix ZSTD_sizeof_CStream() 2017-03-08 13:45:10 -08:00
Sean Purcell 881abe44f1 Reduce point at which we reduce offsets to protect against UB 2017-03-07 16:58:08 -08:00
Sean Purcell 3437bf2feb Add build targets to the Makefile, and update CircleCI tests 2017-03-06 15:05:02 -08:00
Yann Collet 8b1d004031 added -Wformat-security flag, as recommended by @pixelb 2017-03-05 21:17:32 -08:00
Yann Collet 1f2c95c5f3 minor code refactor in HUF module 2017-03-05 21:07:20 -08:00
Yann Collet 5d801278dc Merge pull request #586 from terrelln/repeat-heuristic
Always check Huffman tables for ZSTD_lazy+
2017-03-03 19:38:56 -08:00
Nick Terrell 54c4babd8f Always check Huffman tables for ZSTD_lazy+
The compressor always reuses the existing Huffman table if the literals
size is at most 1 KiB. If the compression strategy is `ZSTD_lazy` or
stronger always check to see if reusing the previous table or creating
a new table is better.

This doesn't yet weigh in decompression speed. I don't want to add any
heuristics there until I have real data to work with to ensure that the
heuristic works for at least one use case, preferably more.
2017-03-03 16:49:38 -08:00
Yann Collet 1af570bd05 Merge pull request #585 from terrelln/cover-leak
Fix COVER_optimizeTrainFromBuffer() resource leaks
2017-03-02 20:46:35 -08:00
Yann Collet f44b55c18d Merge pull request #584 from terrelln/huff-repeat
Allow compressor to repeat Huffman tables
2017-03-02 17:20:11 -08:00
Yann Collet fe5d27062e disable prefetch-decode for 32-bits target
This decoder variant is detrimental to x86 architecture
likely due to register pressure.

Note that the variant is disabled for all 32-bits targets.
It's unclear if it would help for different architectures,
such as ARM, MIPS or PowerPC.
2017-03-02 17:09:21 -08:00
Nick Terrell d051cd5b43 Use workspace for count and CTable 2017-03-02 16:38:07 -08:00
Nick Terrell 976e325b2e Fix COVER_optimizeTrainFromBuffer() resource leaks
Thanks to @nemequ for reporting the resource leaks.
2017-03-02 15:54:39 -08:00
Sean Purcell 553f67e0c1 Remove 'generic' inline strategy
Seems to avoid performance loss for compression.
Same strategy tested on decompression side, did not appear to improve
speed.
2017-03-02 15:18:13 -08:00
Sean Purcell 3d95925a59 Merge remote-tracking branch 'origin/dev' into m32 2017-03-02 15:17:56 -08:00
Nick Terrell a419777eb1 Allow compressor to repeat Huffman tables
* Compressor saves most recently used Huffman table and reuses it
  if it produces better results.
* I attempted to preserve CPU usage profile.
  I intentionally left all of the existing heuristics in place.
  There is only a speed difference on the second block and later.
  When compressing large enough blocks (say >= 4 KiB) there is
  no significant difference in compression speed.
  Dictionary compression of one block is the same speed for blocks
  with literals <= 1 KiB, and after that the difference is not
  very significant.
* In the synthetic data, with blocks 10 KB or smaller, most blocks
  can't use repeated tables because the previous block did not
  contain a symbol that the current block contains.
  Once blocks are about 12 KB or more, most previous blocks have
  valid Huffman tables for the current block, and the compression
  ratio and decompression speed jumped.
* In silesia blocks as small as 4KB can frequently reuse the
  previous Huffman table (85%), but it isn't as profitable, and
  the previous Huffman table only gets used about 3% of the time.
* Microbenchmarks show that `HUF_validateCTable()` takes ~55 ns
  and `HUF_estimateCompressedSize()` takes ~35 ns.
  They are decently well optimized, the first versions took 90 ns
  and 120 ns respectively. `HUF_validateCTable()` could be twice as
  fast, if we cast the `HUF_CElt*` to a `U32*` and compare to 0.
  However, `U32` has an alignment of 4 instead of 2, so I think that
  might be undefined behavior.
* I've ran `zstreamtest` compiled normally, with UASAN and with MSAN
  for 4 hours each.

The worst case for the speed difference is a bunch of small blocks
in the same frame. I modified `bench.c` to compress the input in a
single frame but with blocks of the given block size, set by `-B`.
Benchmarks on level 1:

|  Program  | Block size |   Corpus  | Ratio | Compression MB/s | Decompression MB/s |
|-----------|------------|-----------|-------|------------------|--------------------|
| zstd.base |        256 | synthetic | 2.364 |            110.0 |              297.0 |
|      zstd |        256 | synthetic | 2.367 |            108.9 |              297.0 |
| zstd.base |        256 | silesia   | 2.204 |             93.8 |              415.7 |
|      zstd |        256 | silesia   | 2.204 |             93.4 |              415.7 |
| zstd.base |        512 | synthetic | 2.594 |            144.2 |              420.0 |
|      zstd |        512 | synthetic | 2.599 |            141.5 |              425.7 |
| zstd.base |        512 | silesia   | 2.358 |            118.4 |              432.6 |
|      zstd |        512 | silesia   | 2.358 |            119.8 |              432.6 |
| zstd.base |       1024 | synthetic | 2.790 |            192.3 |              594.1 |
|      zstd |       1024 | synthetic | 2.794 |            192.3 |              600.0 |
| zstd.base |       1024 | silesia   | 2.524 |            148.2 |              464.2 |
|      zstd |       1024 | silesia   | 2.525 |            148.2 |              467.6 |
| zstd.base |       4096 | synthetic | 3.023 |            300.0 |             1000.0 |
|      zstd |       4096 | synthetic | 3.024 |            300.0 |             1010.1 |
| zstd.base |       4096 | silesia   | 2.779 |            223.1 |              623.5 |
|      zstd |       4096 | silesia   | 2.779 |            223.1 |              636.0 |
| zstd.base |      16384 | synthetic | 3.131 |            350.0 |             1150.1 |
|      zstd |      16384 | synthetic | 3.152 |            350.0 |             1630.3 |
| zstd.base |      16384 | silesia   | 2.871 |            296.5 |              883.3 |
|      zstd |      16384 | silesia   | 2.872 |            294.4 |              898.3 |
2017-03-02 13:27:52 -08:00
Yann Collet fdb0fd34b3 Merge pull request #583 from terrelln/set-dictid
Set dictID to 0 for content only dictionaries
2017-03-02 13:15:31 -08:00
Nick Terrell 3475b9b431 Set dictID to 0 for content only dictionaries 2017-03-02 12:33:02 -08:00
Sean Purcell d44703d145 Offsets >= 32MB in 32-bits mode 2017-03-01 16:27:56 -08:00
Yann Collet 76f0494089 xxhash can be included twice in any order
Previously,

followed by :

would fail to include the static definitions,
because the second include was simply skipped by guard macro.

Now it works as intended :
the missing static part is included during the second include.
2017-03-01 13:29:29 -08:00
Yann Collet 4bcc69b761 solves warnings when compiling with global XXH_STATIC_LINKING_ONLY
XXH_STATIC_LINKING_ONLY protection macro is intended to be triggered just before the include.
The main idea is to keep this setting local :
user module shall explicitly understand and accept the static linking restriction
which becomes transparent when triggering the macro at project level.
Global definition also triggers redefinition warnings for user modules which do locally define the macro.

This new version compiles lib and cli without warning when the macro is set globally.
That's not a scenario to be recommended, since it trades a local effect for a global one,
but it was easy enough to provide from zstd side.
2017-03-01 11:33:25 -08:00
Yann Collet 31432cc57d Merge pull request #579 from iburinoc/multiframe
Check to ensure ddict isn't null before dereference
2017-03-01 11:02:04 -08:00
Sean Purcell a81d4fee58 Check to ensure ddict isn't null before dereference 2017-02-28 15:28:29 -08:00
Yann Collet 22d79762ef fixed multi frames 2017-02-28 02:12:42 -08:00
Yann Collet a33ae64204 fixed decoding skippable frames 2017-02-28 01:15:28 -08:00
Yann Collet d1760113ec Improved speed of ZSTD_decompressStream()
When ZSTD_decompressStream() detects
that there is enough space in dst
to complete decompression in a single pass,
delegates to ZSTD_decompress(),
for an extra ~5% speed boost
2017-02-28 00:14:28 -08:00
Yann Collet a81c2e7e44 Merge pull request #573 from facebook/ddict
Improved DDict memory usage
2017-02-27 20:54:42 -08:00
Yann Collet dccd6b6f65 cli : fix : --rm is silent when input is stdin
previously, app would produce an error message, and stop.
2017-02-27 15:57:50 -08:00
Yann Collet 0b9b894b2d reduced ZSTD_DDict memory usage
saved 128 KB
2017-02-27 00:27:30 -08:00
Yann Collet bd7fa21deb added ZSTD_refDDict()
Now DDict does no longer depends on DCtx duplication
2017-02-26 14:43:07 -08:00
Yann Collet d73eebc00f loadEntropy works on new ZSTD_entropy_t type 2017-02-26 10:16:42 -08:00
Yann Collet 8629f0e41f created entropy structure type 2017-02-25 18:33:31 -08:00
Yann Collet 8dff956dbf Added DDict unit test in fuzzer
also : slightly modified loadEntropy :
know src must points at start of dictionary
2017-02-25 10:11:15 -08:00
Yann Collet 14312d833e zstdmt : fix : loading prefix from previous segments
There used to be a (very small) chance that
loading prefix from previous segment
would be confused with a real zstd dictionary.
For that to happen, the prefix needs to start
with the same value as dictionary magic.
That's 1 chance in 4 billions if all values have equal probability.
But in fact, since some values are more common (0x00000000 for example)
others are less common, and dictionary magic was selected to be one of them,
so probabilities are likely even lower.

Anyway, this risk is no down to zero
by adding a new CCtx parameter : ZSTD_p_forceRawDict

Current parameter policy : the parameter "stick" to its CCtx,
so any dictionary loading after ZSTD_p_forceRawDict is set
will be loaded in "raw" ("content only") mode,
even if CCtx is re-used multiple times with multiple different dictionary.
It's up to the user to reset this value differently if it needs so.
2017-02-23 23:42:12 -08:00
Yann Collet 831b4890ce minor tests/Makefile refactoring
and update of zstd_manual,html
2017-02-23 23:09:10 -08:00
Yann Collet cce8d8ba2b Merge pull request #560 from iburinoc/findcompressedsize
Change name to to findFrameCompressedSize and add skippable support
2017-02-23 13:39:23 -08:00
Sean Purcell 83038d236a Fix bug in FSE distribution normalization 2017-02-22 13:52:48 -08:00
Sean Purcell 64417cd2ff Describe ambiguity around skippable frames 2017-02-22 13:29:01 -08:00
Sean Purcell 9757cc811b Update comment 2017-02-22 12:28:21 -08:00
Sean Purcell 9050e1925e Change name to to findFrameCompressedSize and add skippable support 2017-02-22 12:12:34 -08:00
Przemyslaw Skibinski d8114e5802 zstd_compress.c: fix memory leaks 2017-02-21 18:59:56 +01:00
Anders Oleson 517577bf53 spelling fixes in comments
i.e. occurred labeled Huffman
2017-02-20 12:08:59 -08:00
Sean Purcell 6b010dec80 execSequence copies up to 2*WILDCOPY_OVERLENGTH extra 2017-02-16 12:05:40 -08:00
Sean Purcell 887eaa9e21 Fix wildcopy overwriting data still in window 2017-02-15 16:43:45 -08:00
Yann Collet 2252d29a5a Merge branch 'dev' of github.com:facebook/zstd into dev 2017-02-15 12:00:50 -08:00
Yann Collet 4596037042 updated fse version
feature minor refactoring (removing FSE_abs())
also : fix a few minor issues recently introduced in examples
2017-02-15 12:00:03 -08:00
Yann Collet 44f82d781f Merge pull request #545 from terrelln/force-window
[zstdmt] Fix MSAN failure with ZSTD_p_forceWindow
2017-02-15 10:20:15 -08:00
Yann Collet f0b9a8dddb Merge pull request #547 from inikep/dev11
Avoid fseek()'s 2GiB barrier with MacOS and *BSD
2017-02-14 12:29:00 -08:00
Yann Collet 9696bfc2ad Merge pull request #544 from ds77/avoid-empty
Portable way to avoid empty unit warning in threading.c
2017-02-14 00:54:55 -08:00
Przemyslaw Skibinski b876b96ce1 Merge remote-tracking branch 'refs/remotes/facebook/dev' into dev11 2017-02-14 09:26:03 +01:00
Nick Terrell ecf90ca24b [zstdmt] Fix MSAN failure with ZSTD_p_forceWindow
Reproduction steps:

```
make zstreamtest CC=clang CFLAGS="-O3 -g -fsanitize=memory -fsanitize-memory-track-origins"
./zstreamtest -vv -t4178 -i4178 -s4531
```

How to get to the error in gdb (may be a more efficient way):

* 2 breaks at zstd_compress.c:2418  -- in ZSTD_compressContinue_internal()
* 2 breaks at zstd_compress.c:2276  -- in ZSTD_compressBlock_internal()
* 1 break at zstd_compress.c:1547

Why the error occurred:

When `zc->forceWindow == 1`, after calling `ZSTD_loadDictionaryContent()` we
have `zc->loadedDictEnd == zc->nextToUpdate == 0`. But, we've really loaded up
to `iend` into the dictionary. Then in `ZSTD_compressBlock_internal()` we see
that `current > zc->nextToUpdate + 384`, so we load the last 192 bytes a second
time. In this case the bytes we are loading are a block of all 0s, starting in
the previous block. So when we are loading the last 192 bytes, we find a `match`
in the future, 183 bytes beyond `ip`. Since the block is all 0s, the match
extends to the end of the block. But in `ZSTD_count()` we only check that
`pIn < pInLoopLimit`, but since `pMatch > pIn`, `pMatch` eventually points past
the end of the buffer, causing the MSAN failure.

The fix:

The line changed sets sets `zc->nextToUpdate` to the end of the dictionary.
This is the behavior that existed before `ZSTD_p_forceWindow` was introduced.
This fixes the exposing test case. Since the code doesn't fail without
`zc->forceWindow`, it makes sense that this works. I've run the command
`./zstreamtest -T2mn` 64 times without failures. CI should also verify nothing
obvious broke.
2017-02-13 19:11:22 -08:00
Yann Collet 58af614ef2 push version and NEWS to v1.1.4 2017-02-13 18:32:44 -08:00
ds77 08e6a88a97 avoid empty translation unit warning without #pragma 2017-02-14 00:46:47 +01:00
Przemyslaw Skibinski 09c8e5390d __builtin_bswap requires gcc 4.3+ 2017-02-13 12:45:53 +01:00
Sean Purcell d7bfcac18a Expose frameSrcSize to experimental API 2017-02-10 11:55:44 -08:00
Sean Purcell 5069b6c2c3 Merge branch 'dev' into multiframe 2017-02-10 10:08:55 -08:00
Yann Collet bbba42acd1 Merge pull request #537 from terrelln/small-bugs
Fix small bugs
2017-02-10 04:35:43 -08:00
Yann Collet a28c34cb7a Merge pull request #538 from iburinoc/errorstring
Fix ZSTD_getErrorString and add tests
2017-02-10 03:59:56 -08:00
Sean Purcell 269b2cd3d8 Documentation updates 2017-02-09 13:25:30 -08:00
Sean Purcell 2db7249265 Make pledgedSrcSize meaning clear for other functions
- Added tests
- Moved new size functions to static link only
2017-02-09 11:49:58 -08:00
Nick Terrell 545987996a Fix deprecation warnings for clang with C++14 2017-02-08 17:38:17 -08:00
Sean Purcell e0b3265e87 Fix ZSTD_getErrorString and add tests 2017-02-08 17:28:49 -08:00
Sean Purcell 0f5c95af44 Disambiguate pledgedSrcSize == 0
- Modify ZSTD CLI to only set contentSizeFlag if it _knows_ the size
- Change pzstd to stop setting contentSizeFlag without accurate pledgedSrcSize
2017-02-08 15:12:46 -08:00
Sean Purcell ba2ad9f25c ZSTD_decompress now handles multiple frames 2017-02-08 14:50:10 -08:00
Sean Purcell 4e709712e1 Decompressed size functions now handle multiframes and distinguish cases
- Add ZSTD_findDecompressedSize
    - Traverses multiple frames to find total output size
- Add ZSTD_getFrameContentSize
    - Gets the decompressed size of a single frame by reading header
- Deprecate ZSTD_getDecompressedSize
2017-02-08 14:50:10 -08:00
Przemyslaw Skibinski cdf5a7bd9f Merge remote-tracking branch 'refs/remotes/facebook/dev' into dev11 2017-02-08 13:49:35 +01:00
Nick Terrell 71c5263c00 Attribute cover dictionary code 2017-02-07 11:35:07 -08:00
Przemyslaw Skibinski 7060aee8c2 platform.h added to build_package.bat 2017-02-06 19:43:13 +01:00
Yann Collet b54e235bf3 fixed Mac OS-X specific directory in $(RM) list
these directories are now removed with -r command
2017-02-05 10:22:58 -08:00
Yann Collet c2a4632789 release builds use less debug symbols and warnings
release build are triggered through either `make`,
or their specific target `make zstd-release` and `make lib-release`.
2017-02-02 20:54:41 -08:00
Yann Collet 48bed91606 Merge pull request #527 from facebook/zstdmt
zstdmt refinements
2017-01-31 16:36:46 -08:00
Yann Collet b2e1b3d670 fixed overlapLog==0 => no overlap 2017-01-30 14:54:46 -08:00
Yann Collet 3672d06d06 zstdmt : section size is set to be a minimum of overlapSize
the minimum size condition size is applied transparently (no warning, no error)
like previous minimum section size condition (1 KB) which still applies.
2017-01-30 13:35:45 -08:00
Yann Collet 88df1aed61 changed advanced parameter overlapLog
Follows a positive logic (increasing value => increasing overlap)
which is easier to use
2017-01-30 11:00:00 -08:00
Yann Collet b5fd15ccb2 fixed : legacy decoders v04 and v05 2017-01-30 10:45:58 -08:00
Yann Collet cc3d1bc262 Merge pull request #525 from terrelln/covermt
Multithreaded COVER dictionary training
2017-01-30 10:15:33 -08:00
Nick Terrell 43474313f8 Fix documentation about memory usage 2017-01-27 18:43:05 -08:00
Nick Terrell b42dd27ef5 Add include guards and extern C 2017-01-27 16:00:19 -08:00
Yann Collet f6d4a786fc reduced zstdmt latency when using small custom section sizes with high compression levels
Previous version was requiring a fairly large initial amount of input data
before starting to create compression jobs.
This new version starts the process much sooner.
2017-01-27 15:55:30 -08:00
Nick Terrell c43c27127f Merge branch 'dev' into buck
* dev:
  updated NEWS
  fixed MSAN warnings in legacy decoders
  Fix cmake build
  updated NEWS
  Edits as per comments, and change wildcard 'X' to '?'
  Fix Visual Studios project
  Fix pool.c threading.h import
  Fix zstdmt_compress.h include
  Fixed commented issues
  Updated format specification to be easier to understand
  improved #232 fix
  Fixed https://github.com/facebook/zstd/issues/232
  .travis.yml: different tests for "master" branch
  .travis.yml: optimized order of short tests
  .travis.yml: test jobs 12-15
  JOB_NUMBER -eq 9
  improved ZSTD_compressBlock_opt_extDict_generic
2017-01-27 12:05:48 -08:00
Nick Terrell 2fe9126591 Add multithread support to COVER 2017-01-27 11:56:02 -08:00
Yann Collet 609c123a01 Merge pull request #522 from terrelln/benchmt
Fix some includes
2017-01-27 11:40:25 -08:00
Yann Collet cafdd31a38 fixed MSAN warnings in legacy decoders
In some extraordinary circumstances,
*Length field can be generated from reading a partially uninitialized memory segment.
Data is correctly identified as corrupted later on,
but the read taints some later pointer arithmetic operation.
2017-01-27 10:44:03 -08:00
Nick Terrell 9c018cc140 Add BUCK files for Nuclide support 2017-01-27 10:43:12 -08:00
Przemyslaw Skibinski 29157320fb improved ZSTD_compressBlock_opt_extDict_generic 2017-01-27 10:43:02 -08:00
Nick Terrell e628eaf87a Fix pool.c threading.h import 2017-01-26 15:29:10 -08:00
Yann Collet 717c65d690 Merge pull request #519 from inikep/dev11
Dev11
2017-01-26 14:23:44 -08:00
Yann Collet ef33d00532 fixed : ZSTD_setCCtxParameter() properly exposed in DLL 2017-01-26 12:24:21 -08:00
Yann Collet 4a62f79ec9 fixed clang documentation warning 2017-01-26 09:16:56 -08:00
Yann Collet 8dafb1acf5 CLI : automatically set overlap size to max (windowSize) for max compression level 2017-01-25 17:01:13 -08:00
Yann Collet 06e7697f96 added test of new parameter ZSTD_p_forceWindow 2017-01-25 16:39:03 -08:00
Yann Collet bb0027405a fixed zstdmt corruption issue when enabling overlapped sections
see Asana board for detailed explanation on why and how to fix it
2017-01-25 16:25:38 -08:00
Yann Collet 943cff9c37 fixed zstdmt cli freeze issue with large nb of threads
fileio.c was continually pushing more content without giving a chance to flush compressed one.
It would block the job queue when input data was accumulated too fast (requiring to define many threads).
Fixed : fileio flushes whatever it can after each input attempt.
2017-01-25 12:35:19 -08:00
Yann Collet dc8dae596a overlapped section, for improved compression
Sections 2+ read a bit of data from previous section
in order to improve compression ratio.
This also costs some CPU, to reference read data.

Read data is currently fixed to window>>3 size
2017-01-24 22:32:12 -08:00
Yann Collet f14a669054 refactor job creation
code shared accross ZSTDMT_{compress,flush,end}Stream(),
for easier maintenance
2017-01-24 17:41:49 -08:00
Yann Collet 512cbe8c10 zstdmt cli and API allow selection of section sizes
By default, section sizes are 4x window size.
This new setting allow manual selection of section sizes.
The larger they are, the (slightly) better the compression ratio,
but also the higher the memory allocation cost,
and eventually the lesser the nb of possible threads,
since each section is compressed by a single thread.

It also introduces a prototype to set generic parameters,
ZSTDMT_setMTCtxParameter()

The idea is that it's possible to add enums
to extend the list of parameters that can be set this way.
This is more long-term oriented than a fixed-size struct.
Consider it as a test.
2017-01-24 17:08:53 -08:00
Yann Collet 3488a4a473 ZSTDMT now supports frame checksum 2017-01-24 11:48:40 -08:00
Przemyslaw Skibinski 96f152f708 improved ZSTD_compressBlock_opt_extDict_generic 2017-01-24 13:18:50 +01:00
Yann Collet 94364bf87a refactor ZSTDMT streaming flush code
now shared by both ZSTDMT_compressStream() and ZSTDMT_flushStream()
2017-01-23 11:50:44 -08:00
Yann Collet 1cbf251e43 ZSTDMT streaming : fall back to (regular) single thread mode
when nbThreads==1
2017-01-23 01:43:58 -08:00
Yann Collet 84581ff8d7 ZSTDMT_compressCCtx : fallback to single-thread mode when nbChunks==1 2017-01-23 01:20:27 -08:00
Yann Collet 1a2547f654 ZSTDMT_compressStream() becomes blocking when required to ensure forward progresses
In some (rare) cases, job list could be blocked by a first job still being processed,
while all following ones are completed, waiting to be flushed.
In such case, the current job-table implementation is unable to accept new job.
As a consequence, a call to ZSTDMT_compressStream() can be useless (nothing read, nothing flushed),
with the risk to trigger a busy-wait on the caller side
(needlessly loop over ZSTDMT_compressStream() ).

In such a case, ZSTDMT_compressStream() will block until the first job is completed and ready to flush.
It ensures some forward progress by guaranteeing it will flush at least a part of the completed job.
Energy-wasting busy-wait is avoided.
2017-01-22 23:49:52 -08:00
Yann Collet c593348722 ZSTDMT_initCStream_usingDict() can outlive dict
Like ZSTD_initCStream_usingDict(),
ZSTDMT_initCStream_usingDict() now keep a copy of dict internally.
This way, dict can be released :
it does not longer have to outlive all future compression sessions.
2017-01-22 16:44:15 -08:00