updated documentation of streaming compression api

2018-04-24 14:44:27 -07:00
commit ace856a835
parent 2c3ad05812
2 changed files with 107 additions and 60 deletions
--- a/doc/zstd_manual.html
+++ b/doc/zstd_manual.html
@@ -20,15 +20,17 @@
 <li><a href="#Chapter10">START OF ADVANCED AND EXPERIMENTAL FUNCTIONS</a></li>
 <li><a href="#Chapter11">Advanced types</a></li>
 <li><a href="#Chapter12">Frame size functions</a></li>
-<li><a href="#Chapter13">Memory management</a></li>
-<li><a href="#Chapter14">Advanced compression functions</a></li>
-<li><a href="#Chapter15">Advanced decompression functions</a></li>
-<li><a href="#Chapter16">Advanced streaming functions</a></li>
-<li><a href="#Chapter17">Buffer-less and synchronous inner streaming functions</a></li>
-<li><a href="#Chapter18">Buffer-less streaming compression (synchronous mode)</a></li>
-<li><a href="#Chapter19">Buffer-less streaming decompression (synchronous mode)</a></li>
-<li><a href="#Chapter20">New advanced API (experimental)</a></li>
-<li><a href="#Chapter21">Block level API</a></li>
+<li><a href="#Chapter13">ZSTD_frameHeaderSize() :</a></li>
+<li><a href="#Chapter14">Memory management</a></li>
+<li><a href="#Chapter15">Advanced compression functions</a></li>
+<li><a href="#Chapter16">Advanced decompression functions</a></li>
+<li><a href="#Chapter17">Advanced streaming functions</a></li>
+<li><a href="#Chapter18">Buffer-less and synchronous inner streaming functions</a></li>
+<li><a href="#Chapter19">Buffer-less streaming compression (synchronous mode)</a></li>
+<li><a href="#Chapter20">Buffer-less streaming decompression (synchronous mode)</a></li>
+<li><a href="#Chapter21">New advanced API (experimental)</a></li>
+<li><a href="#Chapter22">ZSTD_getFrameHeader_advanced() :</a></li>
+<li><a href="#Chapter23">Block level API</a></li>
 </ol>
 <hr>
 <a name="Chapter1"></a><h2>Introduction</h2><pre>
@@ -232,33 +234,38 @@ size_t     ZSTD_freeDCtx(ZSTD_DCtx* dctx);
  since it will play nicer with system's memory, by re-using already allocated memory.
  Use one separate ZSTD_CStream per thread for parallel execution.

-  Start a new compression by initializing ZSTD_CStream.
+  Start a new compression by initializing ZSTD_CStream context.
  Use ZSTD_initCStream() to start a new compression operation.
-  Use ZSTD_initCStream_usingDict() or ZSTD_initCStream_usingCDict() for a compression which requires a dictionary (experimental section)
+  Use variants ZSTD_initCStream_usingDict() or ZSTD_initCStream_usingCDict() for streaming with dictionary (experimental section)

-  Use ZSTD_compressStream() repetitively to consume input stream.
-  The function will automatically update both `pos` fields.
-  Note that it may not consume the entire input, in which case `pos < size`,
-  and it's up to the caller to present again remaining data.
+  Use ZSTD_compressStream() as many times as necessary to consume input stream.
+  The function will automatically update both `pos` fields within `input` and `output`.
+  Note that the function may not consume the entire input,
+  for example, because the output buffer is already full,
+  in which case `input.pos < input.size`.
+  The caller must check if input has been entirely consumed.
+  If not, the caller must make some room to receive more compressed data,
+  typically by emptying output buffer, or allocating a new output buffer,
+  and then present again remaining input data.
  @return : a size hint, preferred nb of bytes to use as input for next function call
            or an error code, which can be tested using ZSTD_isError().
            Note 1 : it's just a hint, to help latency a little, any other value will work fine.
            Note 2 : size hint is guaranteed to be <= ZSTD_CStreamInSize()

-  At any moment, it's possible to flush whatever data remains within internal buffer, using ZSTD_flushStream().
-  `output->pos` will be updated.
-  Note that some content might still be left within internal buffer if `output->size` is too small.
-  @return : nb of bytes still present within internal buffer (0 if it's empty)
+  At any moment, it's possible to flush whatever data might remain stuck within internal buffer,
+  using ZSTD_flushStream(). `output->pos` will be updated.
+  Note that, if `output->size` is too small, a single invocation of ZSTD_flushStream() might not be enough (return code > 0).
+  In which case, make some room to receive more compressed data, and call again ZSTD_flushStream().
+  @return : 0 if internal buffers are entirely flushed,
+            >0 if some data still present within internal buffer (the value is minimal estimation of remaining size),
            or an error code, which can be tested using ZSTD_isError().

  ZSTD_endStream() instructs to finish a frame.
  It will perform a flush and write frame epilogue.
  The epilogue is required for decoders to consider a frame completed.
-  ZSTD_endStream() may not be able to flush full data if `output->size` is too small.
-  In which case, call again ZSTD_endStream() to complete the flush.
+  flush() operation is the same, and follows same rules as ZSTD_flushStream().
  @return : 0 if frame fully completed and fully flushed,
-             or >0 if some data is still present within internal buffer
-                  (value is minimum size estimation for remaining data to flush, but it could be more)
+            >0 if some data still present within internal buffer (the value is minimal estimation of remaining size),
            or an error code, which can be tested using ZSTD_isError().

 
@@ -388,13 +395,12 @@ size_t ZSTD_decompressStream(ZSTD_DStream* zds, ZSTD_outBuffer* output, ZSTD_inB
            however it does mean that all frame data must be present and valid. 
 </p></pre><BR>

-<pre><b>size_t ZSTD_frameHeaderSize(const void* src, size_t srcSize);
-</b><p>   `src` should point to the start of a ZSTD frame
-   `srcSize` must be >= ZSTD_frameHeaderSize_prefix.
-   @return : size of the Frame Header 
-</p></pre><BR>
+<a name="Chapter13"></a><h2>ZSTD_frameHeaderSize() :</h2><pre>  srcSize must be >= ZSTD_frameHeaderSize_prefix.
+ @return : size of the Frame Header,
+           or an error code (if srcSize is too small) 
+<BR></pre>

-<a name="Chapter13"></a><h2>Memory management</h2><pre></pre>
+<a name="Chapter14"></a><h2>Memory management</h2><pre></pre>

 <pre><b>size_t ZSTD_sizeof_CCtx(const ZSTD_CCtx* cctx);
 size_t ZSTD_sizeof_DCtx(const ZSTD_DCtx* dctx);
@@ -484,7 +490,7 @@ static ZSTD_customMem const ZSTD_defaultCMem = { NULL, NULL, NULL };  </b>/**< t
 
 </p></pre><BR>

-<a name="Chapter14"></a><h2>Advanced compression functions</h2><pre></pre>
+<a name="Chapter15"></a><h2>Advanced compression functions</h2><pre></pre>

 <pre><b>ZSTD_CDict* ZSTD_createCDict_byReference(const void* dictBuffer, size_t dictSize, int compressionLevel);
 </b><p>  Create a digested dictionary for compression
@@ -526,7 +532,7 @@ static ZSTD_customMem const ZSTD_defaultCMem = { NULL, NULL, NULL };  </b>/**< t
 </b><p>   Same as ZSTD_compress_usingCDict(), with fine-tune control over frame parameters 
 </p></pre><BR>

-<a name="Chapter15"></a><h2>Advanced decompression functions</h2><pre></pre>
+<a name="Chapter16"></a><h2>Advanced decompression functions</h2><pre></pre>

 <pre><b>unsigned ZSTD_isFrame(const void* buffer, size_t size);
 </b><p>  Tells if the content of `buffer` starts with a valid Frame Identifier.
@@ -566,7 +572,7 @@ static ZSTD_customMem const ZSTD_defaultCMem = { NULL, NULL, NULL };  </b>/**< t
  When identifying the exact failure cause, it's possible to use ZSTD_getFrameHeader(), which will provide a more precise error code. 
 </p></pre><BR>

-<a name="Chapter16"></a><h2>Advanced streaming functions</h2><pre></pre>
+<a name="Chapter17"></a><h2>Advanced streaming functions</h2><pre></pre>

 <h3>Advanced Streaming compression functions</h3><pre></pre><b><pre>size_t ZSTD_initCStream_srcSize(ZSTD_CStream* zcs, int compressionLevel, unsigned long long pledgedSrcSize);   </b>/**< pledgedSrcSize must be correct. If it is not known at init time, use ZSTD_CONTENTSIZE_UNKNOWN. Note that, for compatibility with older programs, "0" also disables frame content size field. It may be enabled in the future. */<b>
 size_t ZSTD_initCStream_usingDict(ZSTD_CStream* zcs, const void* dict, size_t dictSize, int compressionLevel); </b>/**< creates of an internal CDict (incompatible with static CCtx), except if dict == NULL or dictSize < 8, in which case no dict is used. Note: dict is loaded with ZSTD_dm_auto (treated as a full zstd dictionary if it begins with ZSTD_MAGIC_DICTIONARY, else as raw content) and ZSTD_dlm_byCopy.*/<b>
@@ -598,14 +604,14 @@ size_t ZSTD_initDStream_usingDict(ZSTD_DStream* zds, const void* dict, size_t di
 size_t ZSTD_initDStream_usingDDict(ZSTD_DStream* zds, const ZSTD_DDict* ddict);  </b>/**< note : ddict is referenced, it must outlive decompression session */<b>
 size_t ZSTD_resetDStream(ZSTD_DStream* zds);  </b>/**< re-use decompression parameters from previous init; saves dictionary loading */<b>
 </pre></b><BR>
-<a name="Chapter17"></a><h2>Buffer-less and synchronous inner streaming functions</h2><pre>
+<a name="Chapter18"></a><h2>Buffer-less and synchronous inner streaming functions</h2><pre>
  This is an advanced API, giving full control over buffer management, for users which need direct control over memory.
  But it's also a complex one, with several restrictions, documented below.
  Prefer normal streaming API for an easier experience.
 
 <BR></pre>

-<a name="Chapter18"></a><h2>Buffer-less streaming compression (synchronous mode)</h2><pre>
+<a name="Chapter19"></a><h2>Buffer-less streaming compression (synchronous mode)</h2><pre>
  A ZSTD_CCtx object is required to track streaming operations.
  Use ZSTD_createCCtx() / ZSTD_freeCCtx() to manage resource.
  ZSTD_CCtx object can be re-used multiple times within successive compression operations.
@@ -641,7 +647,7 @@ size_t ZSTD_compressBegin_usingCDict(ZSTD_CCtx* cctx, const ZSTD_CDict* cdict);
 size_t ZSTD_compressBegin_usingCDict_advanced(ZSTD_CCtx* const cctx, const ZSTD_CDict* const cdict, ZSTD_frameParameters const fParams, unsigned long long const pledgedSrcSize);   </b>/* compression parameters are already set within cdict. pledgedSrcSize must be correct. If srcSize is not known, use macro ZSTD_CONTENTSIZE_UNKNOWN */<b>
 size_t ZSTD_copyCCtx(ZSTD_CCtx* cctx, const ZSTD_CCtx* preparedCCtx, unsigned long long pledgedSrcSize); </b>/**<  note: if pledgedSrcSize is not known, use ZSTD_CONTENTSIZE_UNKNOWN */<b>
 </pre></b><BR>
-<a name="Chapter19"></a><h2>Buffer-less streaming decompression (synchronous mode)</h2><pre>
+<a name="Chapter20"></a><h2>Buffer-less streaming decompression (synchronous mode)</h2><pre>
  A ZSTD_DCtx object is required to track streaming operations.
  Use ZSTD_createDCtx() / ZSTD_freeDCtx() to manage it.
  A ZSTD_DCtx object can be re-used multiple times.
@@ -722,12 +728,17 @@ typedef struct {
    unsigned dictID;
    unsigned checksumFlag;
 } ZSTD_frameHeader;
+</b>/** ZSTD_getFrameHeader() :<b>
+ *  decode Frame Header, or requires larger `srcSize`.
+ * @return : 0, `zfhPtr` is correctly filled,
+ *          >0, `srcSize` is too small, value is wanted `srcSize` amount,
+ *           or an error code, which can be tested using ZSTD_isError() */
 size_t ZSTD_getFrameHeader(ZSTD_frameHeader* zfhPtr, const void* src, size_t srcSize);   </b>/**< doesn't consume input */<b>
 size_t ZSTD_decodingBufferSize_min(unsigned long long windowSize, unsigned long long frameContentSize);  </b>/**< when frame content size is not known, pass in frameContentSize == ZSTD_CONTENTSIZE_UNKNOWN */<b>
 </pre></b><BR>
 <pre><b>typedef enum { ZSTDnit_frameHeader, ZSTDnit_blockHeader, ZSTDnit_block, ZSTDnit_lastBlock, ZSTDnit_checksum, ZSTDnit_skippableFrame } ZSTD_nextInputType_e;
 </b></pre><BR>
-<a name="Chapter20"></a><h2>New advanced API (experimental)</h2><pre></pre>
+<a name="Chapter21"></a><h2>New advanced API (experimental)</h2><pre></pre>

 <pre><b>typedef enum {
    </b>/* Opened question : should we have a format ZSTD_f_auto ?<b>
@@ -762,16 +773,19 @@ size_t ZSTD_decodingBufferSize_min(unsigned long long windowSize, unsigned long
                              * Special: value 0 means "use default windowLog".
                              * Note: Using a window size greater than ZSTD_MAXWINDOWSIZE_DEFAULT (default: 2^27)
                              *       requires explicitly allowing such window size during decompression stage. */
-    ZSTD_p_hashLog,          </b>/* Size of the probe table, as a power of 2.<b>
+    ZSTD_p_hashLog,          </b>/* Size of the initial probe table, as a power of 2.<b>
                              * Resulting table size is (1 << (hashLog+2)).
                              * Must be clamped between ZSTD_HASHLOG_MIN and ZSTD_HASHLOG_MAX.
                              * Larger tables improve compression ratio of strategies <= dFast,
                              * and improve speed of strategies > dFast.
                              * Special: value 0 means "use default hashLog". */
-    ZSTD_p_chainLog,         </b>/* Size of the full-search table, as a power of 2.<b>
+    ZSTD_p_chainLog,         </b>/* Size of the multi-probe search table, as a power of 2.<b>
                              * Resulting table size is (1 << (chainLog+2)).
+                              * Must be clamped between ZSTD_CHAINLOG_MIN and ZSTD_CHAINLOG_MAX.
                              * Larger tables result in better and slower compression.
                              * This parameter is useless when using "fast" strategy.
+                              * Note it's still useful when using "dfast" strategy,
+                              * in which case it defines a secondary probe table.
                              * Special: value 0 means "use default chainLog". */
    ZSTD_p_searchLog,        </b>/* Number of search attempts, as a power of 2.<b>
                              * More attempts result in better and slower compression.
@@ -866,13 +880,22 @@ size_t ZSTD_decodingBufferSize_min(unsigned long long windowSize, unsigned long
 </b></pre><BR>
 <pre><b>size_t ZSTD_CCtx_setParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, unsigned value);
 </b><p>  Set one compression parameter, selected by enum ZSTD_cParameter.
-  Setting a parameter is generally only possible during frame initialization (before starting compression),
-  except for a few exceptions which can be updated during compression: compressionLevel, hashLog, chainLog, searchLog, minMatch, targetLength and strategy.
-  Note : when `value` is an enum, cast it to unsigned for proper type checking.
-  @result : informational value (typically, value being set clamped correctly),
+  Setting a parameter is generally only possible during frame initialization (before starting compression).
+  Exception : when using multi-threading mode (nbThreads >= 1),
+              following parameters can be updated _during_ compression (within same frame):
+              => compressionLevel, hashLog, chainLog, searchLog, minMatch, targetLength and strategy.
+              new parameters will be active on next job, or after a flush().
+  Note : when `value` type is not unsigned (int, or enum), cast it to unsigned for proper type checking.
+  @result : informational value (typically, value being set, correctly clamped),
            or an error code (which can be tested with ZSTD_isError()). 
 </p></pre><BR>

+<pre><b>size_t ZSTD_CCtx_getParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, unsigned* value);
+</b><p> Get the requested value of one compression parameter, selected by enum ZSTD_cParameter.
+ @result : 0, or an error code (which can be tested with ZSTD_isError()).
+ 
+</p></pre><BR>
+
 <pre><b>size_t ZSTD_CCtx_setPledgedSrcSize(ZSTD_CCtx* cctx, unsigned long long pledgedSrcSize);
 </b><p>  Total input data size to be compressed as a single frame.
  This value will be controlled at the end, and result in error if not respected.
@@ -936,9 +959,16 @@ size_t ZSTD_CCtx_refPrefix_advanced(ZSTD_CCtx* cctx, const void* prefix, size_t
 </b><p>  Return a CCtx to clean state.
  Useful after an error, or to interrupt an ongoing compression job and start a new one.
  Any internal data not yet flushed is cancelled.
+  The parameters and dictionary are kept unchanged, to reset them use ZSTD_CCtx_resetParameters().
+ 
+</p></pre><BR>
+
+<pre><b>size_t ZSTD_CCtx_resetParameters(ZSTD_CCtx* cctx);
+</b><p>  All parameters are back to default values (compression level is ZSTD_CLEVEL_DEFAULT).
  Dictionary (if any) is dropped.
-  All parameters are back to default values.
-  It's possible to modify compression parameters after a reset.
+  Resetting parameters is only possible during frame initialization (before starting compression).
+  To reset the context use ZSTD_CCtx_reset().
+  @return 0 or an error code (which can be checked with ZSTD_isError()).
 
 </p></pre><BR>

@@ -1033,6 +1063,13 @@ size_t ZSTD_freeCCtxParams(ZSTD_CCtx_params* params);
 
 </p></pre><BR>

+<pre><b>size_t ZSTD_CCtxParam_getParameter(ZSTD_CCtx_params* params, ZSTD_cParameter param, unsigned* value);
+</b><p> Similar to ZSTD_CCtx_getParameter.
+ Get the requested value of one compression parameter, selected by enum ZSTD_cParameter.
+ @result : 0, or an error code (which can be tested with ZSTD_isError()).
+ 
+</p></pre><BR>
+
 <pre><b>size_t ZSTD_CCtx_setParametersUsingCCtxParams(
        ZSTD_CCtx* cctx, const ZSTD_CCtx_params* params);
 </b><p>  Apply a set of ZSTD_CCtx_params to the compression context.
@@ -1043,7 +1080,8 @@ size_t ZSTD_freeCCtxParams(ZSTD_CCtx_params* params);
 
 </p></pre><BR>

-<h3>Advanced parameters for decompression API</h3><pre></pre><b><pre></pre></b><BR>
+<h3>Advanced decompression API</h3><pre></pre><b><pre></b>/* ==================================== */<b>
+</pre></b><BR>
 <pre><b>size_t ZSTD_DCtx_loadDictionary(ZSTD_DCtx* dctx, const void* dict, size_t dictSize);
 size_t ZSTD_DCtx_loadDictionary_byReference(ZSTD_DCtx* dctx, const void* dict, size_t dictSize);
 size_t ZSTD_DCtx_loadDictionary_advanced(ZSTD_DCtx* dctx, const void* dict, size_t dictSize, ZSTD_dictLoadMethod_e dictLoadMethod, ZSTD_dictContentType_e dictContentType);
@@ -1105,6 +1143,10 @@ size_t ZSTD_DCtx_refPrefix_advanced(ZSTD_DCtx* dctx, const void* prefix, size_t
 
 </p></pre><BR>

+<a name="Chapter22"></a><h2>ZSTD_getFrameHeader_advanced() :</h2><pre>  same as ZSTD_getFrameHeader(),
+  with added capability to select a format (like ZSTD_f_zstd1_magicless) 
+<BR></pre>
+
 <pre><b>size_t ZSTD_decompress_generic(ZSTD_DCtx* dctx,
                               ZSTD_outBuffer* output,
                               ZSTD_inBuffer* input);
@@ -1137,7 +1179,7 @@ size_t ZSTD_DCtx_refPrefix_advanced(ZSTD_DCtx* dctx, const void* prefix, size_t
 
 </p></pre><BR>

-<a name="Chapter21"></a><h2>Block level API</h2><pre></pre>
+<a name="Chapter23"></a><h2>Block level API</h2><pre></pre>

 <pre><b></b><p>    Frame metadata cost is typically ~18 bytes, which can be non-negligible for very small blocks (< 100 bytes).
    User will have to take in charge required information to regenerate data, such as compressed and content sizes.
--- a/lib/zstd.h
+++ b/lib/zstd.h
@@ -272,33 +272,38 @@ typedef struct ZSTD_outBuffer_s {
 *  since it will play nicer with system's memory, by re-using already allocated memory.
 *  Use one separate ZSTD_CStream per thread for parallel execution.
 *
-*  Start a new compression by initializing ZSTD_CStream.
+*  Start a new compression by initializing ZSTD_CStream context.
 *  Use ZSTD_initCStream() to start a new compression operation.
-*  Use ZSTD_initCStream_usingDict() or ZSTD_initCStream_usingCDict() for a compression which requires a dictionary (experimental section)
+*  Use variants ZSTD_initCStream_usingDict() or ZSTD_initCStream_usingCDict() for streaming with dictionary (experimental section)
 *
-*  Use ZSTD_compressStream() repetitively to consume input stream.
-*  The function will automatically update both `pos` fields.
-*  Note that it may not consume the entire input, in which case `pos < size`,
-*  and it's up to the caller to present again remaining data.
+*  Use ZSTD_compressStream() as many times as necessary to consume input stream.
+*  The function will automatically update both `pos` fields within `input` and `output`.
+*  Note that the function may not consume the entire input,
+*  for example, because the output buffer is already full,
+*  in which case `input.pos < input.size`.
+*  The caller must check if input has been entirely consumed.
+*  If not, the caller must make some room to receive more compressed data,
+*  typically by emptying output buffer, or allocating a new output buffer,
+*  and then present again remaining input data.
 *  @return : a size hint, preferred nb of bytes to use as input for next function call
 *            or an error code, which can be tested using ZSTD_isError().
 *            Note 1 : it's just a hint, to help latency a little, any other value will work fine.
 *            Note 2 : size hint is guaranteed to be <= ZSTD_CStreamInSize()
 *
-*  At any moment, it's possible to flush whatever data remains within internal buffer, using ZSTD_flushStream().
-*  `output->pos` will be updated.
-*  Note that some content might still be left within internal buffer if `output->size` is too small.
-*  @return : nb of bytes still present within internal buffer (0 if it's empty)
+*  At any moment, it's possible to flush whatever data might remain stuck within internal buffer,
+*  using ZSTD_flushStream(). `output->pos` will be updated.
+*  Note that, if `output->size` is too small, a single invocation of ZSTD_flushStream() might not be enough (return code > 0).
+*  In which case, make some room to receive more compressed data, and call again ZSTD_flushStream().
+*  @return : 0 if internal buffers are entirely flushed,
+*            >0 if some data still present within internal buffer (the value is minimal estimation of remaining size),
 *            or an error code, which can be tested using ZSTD_isError().
 *
 *  ZSTD_endStream() instructs to finish a frame.
 *  It will perform a flush and write frame epilogue.
 *  The epilogue is required for decoders to consider a frame completed.
-*  ZSTD_endStream() may not be able to flush full data if `output->size` is too small.
-*  In which case, call again ZSTD_endStream() to complete the flush.
+*  flush() operation is the same, and follows same rules as ZSTD_flushStream().
 *  @return : 0 if frame fully completed and fully flushed,
-             or >0 if some data is still present within internal buffer
-                  (value is minimum size estimation for remaining data to flush, but it could be more)
+*            >0 if some data still present within internal buffer (the value is minimal estimation of remaining size),
 *            or an error code, which can be tested using ZSTD_isError().
 *
 * *******************************************************************/