Merge remote-tracking branch 'refs/remotes/Cyan4973/dev' into dev

2016-06-17 15:17:37 +02:00 · 2016-06-17 15:17:37 +02:00 · dba8b44370
commit dba8b44370
parent e16f65675b 19cab46f2f
25 changed files with 276 additions and 130 deletions
--- a/.gitattributes
+++ b/.gitattributes
@ -18,3 +18,4 @@

 # Windows
 *.bat text eol=crlf
+*.cmd text eol=crlf
--- a/.gitignore
+++ b/.gitignore
@ -13,6 +13,7 @@
 *.dylib

 # Executables
+zstd
 *.exe
 *.out
 *.app
--- a/1
+++ b/1
@ -51,6 +51,7 @@ all:

 zstdprogram:
 	$(MAKE) -C $(PRGDIR)
+	mv $(PRGDIR)/zstd .

 zlibwrapper:
 	$(MAKE) -C $(ZSTDDIR) all
--- a/5
+++ b/5
@ -5,8 +5,9 @@ New : Visual build scripts, by Christophe Chevalier
 New : Support for Sparse File-systems (do not use space for zero-filled sectors)
 New : Frame checksum support
 New : Support pass-through mode (when using `-df`)
-New : API : dictionary files from custom content, by Giuseppe Ottaviano
-New : API support for custom malloc/free functions
+API : more efficient Dictionary API : `ZSTD_compress_usingCDict()`, `ZSTD_decompress_usingDDict()`
+API : create dictionary files from custom content, by Giuseppe Ottaviano
+API : support for custom malloc/free functions
 New : controllable Dictionary ID
 New : Support for skippable frames

--- a/README.md
+++ b/README.md
@ -1,4 +1,4 @@
- **Zstd**, short for Zstandard, is a fast lossless compression algorithm, targeting real-time compression scenarios at zlib-level compression ratio.
+ **Zstd**, short for Zstandard, is a fast lossless compression algorithm, targeting real-time compression scenarios at zlib-level and better compression ratios.

 It is provided as a BSD-license package, hosted on Github.

@ -7,7 +7,7 @@ It is provided as a BSD-license package, hosted on Github.
 |master      | [![Build Status](https://travis-ci.org/Cyan4973/zstd.svg?branch=master)](https://travis-ci.org/Cyan4973/zstd) |
 |dev         | [![Build Status](https://travis-ci.org/Cyan4973/zstd.svg?branch=dev)](https://travis-ci.org/Cyan4973/zstd) |

-As a reference, several fast compression algorithms were tested and compared to [zlib] on a Core i7-3930K CPU @ 4.5GHz, using [lzbench], an open-source in-memory benchmark by @inikep compiled with gcc 5.2.1, on the [Silesia compression corpus].
+As a reference, several fast compression algorithms were tested and compared on a Core i7-3930K CPU @ 4.5GHz, using [lzbench], an open-source in-memory benchmark by @inikep compiled with gcc 5.2.1, with the [Silesia compression corpus].

 [lzbench]: https://github.com/inikep/lzbench
 [Silesia compression corpus]: http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
@ -16,7 +16,7 @@ As a reference, several fast compression algorithms were tested and compared to
 |Name             | Ratio | C.speed | D.speed |
 |-----------------|-------|--------:|--------:|
 |                 |       |   MB/s  |  MB/s   |
-|**zstd 0.6.0 -1**|**2.877**|**330**| **915** |
+|**zstd 0.7.0 -1**|**2.877**|**325**| **930** |
 | [zlib] 1.2.8 -1 | 2.730 |    95   |   360   |
 | brotli -0       | 2.708 |   220   |   430   |
 | QuickLZ 1.5     | 2.237 |   510   |   605   |
@ -28,16 +28,16 @@ As a reference, several fast compression algorithms were tested and compared to
 [zlib]:http://www.zlib.net/
 [LZ4]: http://www.lz4.org/

-Zstd can also offer stronger compression ratio at the cost of compression speed. 
-Speed vs Compression trade-off is configurable by small increment. Decompression speed is preserved and remain roughly the same at all settings, a property shared by most LZ compression algorithms, such as [zlib].
+Zstd can also offer stronger compression ratios at the cost of compression speed.
+Speed vs Compression trade-off is configurable by small increment. Decompression speed is preserved and remain roughly the same at all settings, a property shared by most LZ compression algorithms, such as [zlib] or lzma.

-The following test is run on a Core i7-3930K CPU @ 4.5GHz, using [lzbench], an open-source in-memory benchmark by @inikep compiled with gcc 5.2.1, on the [Silesia compression corpus].
+The following tests were run on a Core i7-3930K CPU @ 4.5GHz, using [lzbench], an open-source in-memory benchmark by @inikep compiled with gcc 5.2.1, on the [Silesia compression corpus].

 Compression Speed vs Ratio | Decompression Speed
 ---------------------------|--------------------
 ![Compression Speed vs Ratio](images/Cspeed4.png "Compression Speed vs Ratio") | ![Decompression Speed](images/Dspeed4.png "Decompression Speed")

-Several algorithms can produce higher compression ratio at slower speed, falling outside of the graph.
+Several algorithms can produce higher compression ratio but at slower speed, falling outside of the graph.
 For a larger picture including very slow modes, [click on this link](images/DCspeed5.png) .


@ -74,8 +74,10 @@ Hence, deploying one dictionary per type of data will provide the greater benefi

 ### Status

-Zstd is in development. The internal format evolves to reach better performance. "Final Format" is projected H1 2016, and will be tagged `v1.0`. Zstd offers legacy support, meaning any data compressed by any version >= 0.1 (therefore including current one) remain decodable in the future.
-The library is also quite robust, able to withstand hazards situations, including invalid inputs. Library reliability has been tested using [Fuzz Testing](https://en.wikipedia.org/wiki/Fuzz_testing), with both [internal tools](programs/fuzzer.c) and [external ones](http://lcamtuf.coredump.cx/afl). Therefore, Zstandard is considered safe for production environments.
+Zstd compression format has reached "Final status". It means it is planned to become the official stable zstd format and be tagged `v1.0`. The reason it's not yet tagged `v1.0` is that it currently performs its "validation period", making sure the format holds all its promises and nothing was missed.
+Zstd library also offers legacy decoder support. Any data compressed by any version >= `v0.1` (hence including current one) remains decodable now and in the future.
+The library has been validated using strong [fuzzer tests](https://en.wikipedia.org/wiki/Fuzz_testing), including both [internal tools](programs/fuzzer.c) and [external ones](http://lcamtuf.coredump.cx/afl). It's able to withstand hazard situations, including invalid inputs.
+As a consequence, Zstandard is considered safe for, and is currently used in, production environments.

 ### Branch Policy

--- a/lib/common/fse.h
+++ b/lib/common/fse.h
@ -132,8 +132,8 @@ size_t FSE_count(unsigned* count, unsigned* maxSymbolValuePtr, const void* src,
 /*! FSE_optimalTableLog():
    dynamically downsize 'tableLog' when conditions are met.
    It saves CPU time, by using smaller tables, while preserving or even improving compression ratio.
-    @return : recommended tableLog (necessarily <= initial 'tableLog') */
-unsigned FSE_optimalTableLog(unsigned tableLog, size_t srcSize, unsigned maxSymbolValue);
+    @return : recommended tableLog (necessarily <= 'maxTableLog') */
+unsigned FSE_optimalTableLog(unsigned maxTableLog, size_t srcSize, unsigned maxSymbolValue);

 /*! FSE_normalizeCount():
    normalize counts so that sum(count[]) == Power_of_2 (2^tableLog)
--- a/lib/common/zstd_internal.h
+++ b/lib/common/zstd_internal.h
@ -64,7 +64,7 @@
 #endif

 #define ZSTD_OPT_NUM    (1<<12)
-#define ZSTD_DICT_MAGIC  0xEC30A437
+#define ZSTD_DICT_MAGIC  0xEC30A437   /* v0.7 */

 #define ZSTD_REP_NUM    3
 #define ZSTD_REP_INIT   ZSTD_REP_NUM
--- a/lib/compress/zstd_compress.c
+++ b/lib/compress/zstd_compress.c
@ -1128,7 +1128,6 @@ void ZSTD_compressBlock_fast_generic(ZSTD_CCtx* cctx,
    size_t offset_1=cctx->rep[0], offset_2=cctx->rep[1];

    /* init */
-    ZSTD_resetSeqStore(seqStorePtr);
    ip += (ip==lowest);
    {   U32 const maxRep = (U32)(ip-lowest);
        if (offset_1 > maxRep) offset_1 = 0;
@ -1239,7 +1238,6 @@ static void ZSTD_compressBlock_fast_extDict_generic(ZSTD_CCtx* ctx,
    U32 offset_1=ctx->rep[0], offset_2=ctx->rep[1];

    /* init */
-    ZSTD_resetSeqStore(seqStorePtr);
    /* skip first position to avoid read overflow during repcode match check */
    hashTable[ZSTD_hashPtr(ip, hBits, mls)] = (U32)(ip-base);
    ip++;
@ -1743,7 +1741,6 @@ void ZSTD_compressBlock_lazy_generic(ZSTD_CCtx* ctx,
    /* init */
    ip += (ip==base);
    ctx->nextToUpdate3 = ctx->nextToUpdate;
-    ZSTD_resetSeqStore(seqStorePtr);
    {   U32 i;
        U32 const maxRep = (U32)(ip-base);
        for (i=0; i<ZSTD_REP_INIT; i++) {
@ -1847,7 +1844,7 @@ _storeSequence:
    /* Save reps for next block */
    {   int i;
        for (i=0; i<ZSTD_REP_NUM; i++) {
-            if (!rep[i]) rep[i] = (U32)(iend-base);   /* in case some zero are left */
+            if (!rep[i]) rep[i] = (U32)(iend - ctx->base);   /* in case some zero are left */
            ctx->savedRep[i] = rep[i];
    }   }

@ -1913,7 +1910,6 @@ void ZSTD_compressBlock_lazy_extDict_generic(ZSTD_CCtx* ctx,
    { U32 i; for (i=0; i<ZSTD_REP_INIT; i++) rep[i]=ctx->rep[i]; }

    ctx->nextToUpdate3 = ctx->nextToUpdate;
-    ZSTD_resetSeqStore(seqStorePtr);
    ip += (ip == prefixStart);

    /* Match Loop */
@ -2097,11 +2093,7 @@ typedef void (*ZSTD_blockCompressor) (ZSTD_CCtx* ctx, const void* src, size_t sr
 static ZSTD_blockCompressor ZSTD_selectBlockCompressor(ZSTD_strategy strat, int extDict)
 {
    static const ZSTD_blockCompressor blockCompressor[2][6] = {
-#if 1
        { ZSTD_compressBlock_fast, ZSTD_compressBlock_greedy, ZSTD_compressBlock_lazy, ZSTD_compressBlock_lazy2, ZSTD_compressBlock_btlazy2, ZSTD_compressBlock_btopt },
-#else
-        { ZSTD_compressBlock_fast_extDict, ZSTD_compressBlock_greedy_extDict, ZSTD_compressBlock_lazy_extDict,ZSTD_compressBlock_lazy2_extDict, ZSTD_compressBlock_btlazy2_extDict, ZSTD_compressBlock_btopt_extDict },
-#endif
        { ZSTD_compressBlock_fast_extDict, ZSTD_compressBlock_greedy_extDict, ZSTD_compressBlock_lazy_extDict,ZSTD_compressBlock_lazy2_extDict, ZSTD_compressBlock_btlazy2_extDict, ZSTD_compressBlock_btopt_extDict }
    };

@ -2111,8 +2103,9 @@ static ZSTD_blockCompressor ZSTD_selectBlockCompressor(ZSTD_strategy strat, int

 static size_t ZSTD_compressBlock_internal(ZSTD_CCtx* zc, void* dst, size_t dstCapacity, const void* src, size_t srcSize)
 {
-    ZSTD_blockCompressor blockCompressor = ZSTD_selectBlockCompressor(zc->params.cParams.strategy, zc->lowLimit < zc->dictLimit);
+    ZSTD_blockCompressor const blockCompressor = ZSTD_selectBlockCompressor(zc->params.cParams.strategy, zc->lowLimit < zc->dictLimit);
    if (srcSize < MIN_CBLOCK_SIZE+ZSTD_blockHeaderSize+1) return 0;   /* don't even attempt compression below a certain srcSize */
+    ZSTD_resetSeqStore(&(zc->seqStore));
    blockCompressor(zc, src, srcSize);
    return ZSTD_compressSequences(zc, dst, dstCapacity, srcSize);
 }
@ -2335,52 +2328,60 @@ static size_t ZSTD_loadDictionaryContent(ZSTD_CCtx* zc, const void* src, size_t
 /* Dictionary format :
     Magic == ZSTD_DICT_MAGIC (4 bytes)
     HUF_writeCTable(256)
+     FSE_writeNCount(ml)
+     FSE_writeNCount(off)
+     FSE_writeNCount(ll)
+     RepOffsets
     Dictionary content
 */
 /*! ZSTD_loadDictEntropyStats() :
-    @return : size read from dictionary */
-static size_t ZSTD_loadDictEntropyStats(ZSTD_CCtx* zc, const void* dict, size_t dictSize)
+    @return : size read from dictionary
+    note : magic number supposed already checked */
+static size_t ZSTD_loadDictEntropyStats(ZSTD_CCtx* cctx, const void* dict, size_t dictSize)
 {
-    /* note : magic number already checked */
-    size_t const dictSizeStart = dictSize;
+    const BYTE* dictPtr = (const BYTE*)dict;
+    const BYTE* const dictEnd = dictPtr + dictSize;

-    {   size_t const hufHeaderSize = HUF_readCTable(zc->hufTable, 255, dict, dictSize);
+    {   size_t const hufHeaderSize = HUF_readCTable(cctx->hufTable, 255, dict, dictSize);
        if (HUF_isError(hufHeaderSize)) return ERROR(dictionary_corrupted);
-        dict = (const char*)dict + hufHeaderSize;
-        dictSize -= hufHeaderSize;
+        dictPtr += hufHeaderSize;
    }

    {   short offcodeNCount[MaxOff+1];
        unsigned offcodeMaxValue = MaxOff, offcodeLog = OffFSELog;
-        size_t const offcodeHeaderSize = FSE_readNCount(offcodeNCount, &offcodeMaxValue, &offcodeLog, dict, dictSize);
+        size_t const offcodeHeaderSize = FSE_readNCount(offcodeNCount, &offcodeMaxValue, &offcodeLog, dictPtr, dictEnd-dictPtr);
        if (FSE_isError(offcodeHeaderSize)) return ERROR(dictionary_corrupted);
-        { size_t const errorCode = FSE_buildCTable(zc->offcodeCTable, offcodeNCount, offcodeMaxValue, offcodeLog);
+        { size_t const errorCode = FSE_buildCTable(cctx->offcodeCTable, offcodeNCount, offcodeMaxValue, offcodeLog);
          if (FSE_isError(errorCode)) return ERROR(dictionary_corrupted); }
-        dict = (const char*)dict + offcodeHeaderSize;
-        dictSize -= offcodeHeaderSize;
+        dictPtr += offcodeHeaderSize;
    }

    {   short matchlengthNCount[MaxML+1];
        unsigned matchlengthMaxValue = MaxML, matchlengthLog = MLFSELog;
-        size_t const matchlengthHeaderSize = FSE_readNCount(matchlengthNCount, &matchlengthMaxValue, &matchlengthLog, dict, dictSize);
+        size_t const matchlengthHeaderSize = FSE_readNCount(matchlengthNCount, &matchlengthMaxValue, &matchlengthLog, dictPtr, dictEnd-dictPtr);
        if (FSE_isError(matchlengthHeaderSize)) return ERROR(dictionary_corrupted);
-        { size_t const errorCode = FSE_buildCTable(zc->matchlengthCTable, matchlengthNCount, matchlengthMaxValue, matchlengthLog);
+        { size_t const errorCode = FSE_buildCTable(cctx->matchlengthCTable, matchlengthNCount, matchlengthMaxValue, matchlengthLog);
          if (FSE_isError(errorCode)) return ERROR(dictionary_corrupted); }
-        dict = (const char*)dict + matchlengthHeaderSize;
-        dictSize -= matchlengthHeaderSize;
+        dictPtr += matchlengthHeaderSize;
    }

    {   short litlengthNCount[MaxLL+1];
        unsigned litlengthMaxValue = MaxLL, litlengthLog = LLFSELog;
-        size_t const litlengthHeaderSize = FSE_readNCount(litlengthNCount, &litlengthMaxValue, &litlengthLog, dict, dictSize);
+        size_t const litlengthHeaderSize = FSE_readNCount(litlengthNCount, &litlengthMaxValue, &litlengthLog, dictPtr, dictEnd-dictPtr);
        if (FSE_isError(litlengthHeaderSize)) return ERROR(dictionary_corrupted);
-        { size_t const errorCode = FSE_buildCTable(zc->litlengthCTable, litlengthNCount, litlengthMaxValue, litlengthLog);
+        { size_t const errorCode = FSE_buildCTable(cctx->litlengthCTable, litlengthNCount, litlengthMaxValue, litlengthLog);
          if (FSE_isError(errorCode)) return ERROR(dictionary_corrupted); }
-        dictSize -= litlengthHeaderSize;
+        dictPtr += litlengthHeaderSize;
    }

-    zc->flagStaticTables = 1;
-    return (dictSizeStart-dictSize);
+    if (dictPtr+12 > dictEnd) return ERROR(dictionary_corrupted);
+    cctx->rep[0] = MEM_readLE32(dictPtr+0); if (cctx->rep[0] >= dictSize) return ERROR(dictionary_corrupted);
+    cctx->rep[1] = MEM_readLE32(dictPtr+4); if (cctx->rep[1] >= dictSize) return ERROR(dictionary_corrupted);
+    cctx->rep[2] = MEM_readLE32(dictPtr+8); if (cctx->rep[2] >= dictSize) return ERROR(dictionary_corrupted);
+    dictPtr += 12;
+
+    cctx->flagStaticTables = 1;
+    return dictPtr - (const BYTE*)dict;
 }

 /** ZSTD_compress_insertDictionary() :
--- a/lib/compress/zstd_opt.h
+++ b/lib/compress/zstd_opt.h
@ -465,7 +465,6 @@ void ZSTD_compressBlock_opt_generic(ZSTD_CCtx* ctx,

    /* init */
    ctx->nextToUpdate3 = ctx->nextToUpdate;
-    ZSTD_resetSeqStore(seqStorePtr);
    ZSTD_rescaleFreqs(seqStorePtr);
    ip += (ip==prefixStart);
    { U32 i; for (i=0; i<ZSTD_REP_INIT; i++) rep[i]=ctx->rep[i]; }
@ -574,7 +573,7 @@ void ZSTD_compressBlock_opt_generic(ZSTD_CCtx* ctx,
           best_mlen = minMatch;
           {   U32 i;
               for (i=0; i<ZSTD_REP_NUM; i++) {
-                   if ((rep[i]<(U32)(inr-prefixStart))
+                   if ((opt[cur].rep[i]<(U32)(inr-prefixStart))
                       && (MEM_readMINMATCH(inr, minMatch) == MEM_readMINMATCH(inr - opt[cur].rep[i], minMatch))) {  /* check rep */
                       mlen = (U32)ZSTD_count(inr+minMatch, inr+minMatch - opt[cur].rep[i], iend) + minMatch;
                       ZSTD_LOG_PARSER("%d: Found REP %d/%d mlen=%d off=%d rep=%d opt[%d].off=%d\n", (int)(inr-base), i, ZSTD_REP_NUM, mlen, i, opt[cur].rep[i], cur, opt[cur].off);
@ -757,7 +756,6 @@ void ZSTD_compressBlock_opt_extDict_generic(ZSTD_CCtx* ctx,
    { U32 i; for (i=0; i<ZSTD_REP_INIT; i++) rep[i]=ctx->rep[i]; }

    ctx->nextToUpdate3 = ctx->nextToUpdate;
-    ZSTD_resetSeqStore(seqStorePtr);
    ZSTD_rescaleFreqs(seqStorePtr);
    ip += (ip==prefixStart);

--- a/lib/decompress/huf_decompress.c
+++ b/lib/decompress/huf_decompress.c
@ -625,7 +625,7 @@ size_t HUF_decompress1X4_usingDTable(
    const HUF_DTable* DTable)
 {
    DTableDesc dtd = HUF_getDTableDesc(DTable);
-    if (dtd.tableType != 0) return ERROR(GENERIC);
+    if (dtd.tableType != 1) return ERROR(GENERIC);
    return HUF_decompress1X4_usingDTable_internal(dst, dstSize, cSrc, cSrcSize, DTable);
 }

--- a/lib/decompress/zstd_decompress.c
+++ b/lib/decompress/zstd_decompress.c
@ -223,8 +223,10 @@ void ZSTD_copyDCtx(ZSTD_DCtx* dstDCtx, const ZSTD_DCtx* srcDCtx)
    // new
   1 byte - FrameHeaderDescription :
   bit 0-1 : dictID (0, 1, 2 or 4 bytes)
-   bit 2-4 : reserved (must be zero)
-   bit 5   : SkippedWindowLog (if 1, WindowLog byte is not present)
+   bit 2   : checksumFlag
+   bit 3   : reserved (must be zero)
+   bit 4   : reserved (unused, can be any value)
+   bit 5   : Single Segment (if 1, WindowLog byte is not present)
   bit 6-7 : FrameContentFieldSize (0, 2, 4, or 8)
             if (SkippedWindowLog && !FrameContentFieldsize) FrameContentFieldsize=1;

@ -365,7 +367,7 @@ size_t ZSTD_getFrameParams(ZSTD_frameParams* fparamsPtr, const void* src, size_t
        U32 windowSize = 0;
        U32 dictID = 0;
        U64 frameContentSize = 0;
-        if ((fhdByte & 0x18) != 0) return ERROR(frameParameter_unsupported);   /* reserved bits */
+        if ((fhdByte & 0x08) != 0) return ERROR(frameParameter_unsupported);   /* reserved bits, which must be zero */
        if (!directMode) {
            BYTE const wlByte = ip[pos++];
            U32 const windowLog = (wlByte >> 3) + ZSTD_WINDOWLOG_ABSOLUTEMIN;
@ -1186,47 +1188,51 @@ static void ZSTD_refDictContent(ZSTD_DCtx* dctx, const void* dict, size_t dictSi
    dctx->previousDstEnd = (const char*)dict + dictSize;
 }

-static size_t ZSTD_loadEntropy(ZSTD_DCtx* dctx, const void* dict, size_t const dictSizeStart)
+static size_t ZSTD_loadEntropy(ZSTD_DCtx* dctx, const void* const dict, size_t const dictSize)
 {
-    size_t dictSize = dictSizeStart;
+    const BYTE* dictPtr = (const BYTE*)dict;
+    const BYTE* const dictEnd = dictPtr + dictSize;

    {   size_t const hSize = HUF_readDTableX4(dctx->hufTable, dict, dictSize);
        if (HUF_isError(hSize)) return ERROR(dictionary_corrupted);
-        dict = (const char*)dict + hSize;
-        dictSize -= hSize;
+        dictPtr += hSize;
    }

    {   short offcodeNCount[MaxOff+1];
        U32 offcodeMaxValue=MaxOff, offcodeLog=OffFSELog;
-        size_t const offcodeHeaderSize = FSE_readNCount(offcodeNCount, &offcodeMaxValue, &offcodeLog, dict, dictSize);
+        size_t const offcodeHeaderSize = FSE_readNCount(offcodeNCount, &offcodeMaxValue, &offcodeLog, dictPtr, dictEnd-dictPtr);
        if (FSE_isError(offcodeHeaderSize)) return ERROR(dictionary_corrupted);
        { size_t const errorCode = FSE_buildDTable(dctx->OffTable, offcodeNCount, offcodeMaxValue, offcodeLog);
          if (FSE_isError(errorCode)) return ERROR(dictionary_corrupted); }
-        dict = (const char*)dict + offcodeHeaderSize;
-        dictSize -= offcodeHeaderSize;
+        dictPtr += offcodeHeaderSize;
    }

    {   short matchlengthNCount[MaxML+1];
        unsigned matchlengthMaxValue = MaxML, matchlengthLog = MLFSELog;
-        size_t const matchlengthHeaderSize = FSE_readNCount(matchlengthNCount, &matchlengthMaxValue, &matchlengthLog, dict, dictSize);
+        size_t const matchlengthHeaderSize = FSE_readNCount(matchlengthNCount, &matchlengthMaxValue, &matchlengthLog, dictPtr, dictEnd-dictPtr);
        if (FSE_isError(matchlengthHeaderSize)) return ERROR(dictionary_corrupted);
        { size_t const errorCode = FSE_buildDTable(dctx->MLTable, matchlengthNCount, matchlengthMaxValue, matchlengthLog);
          if (FSE_isError(errorCode)) return ERROR(dictionary_corrupted); }
-        dict = (const char*)dict + matchlengthHeaderSize;
-        dictSize -= matchlengthHeaderSize;
+        dictPtr += matchlengthHeaderSize;
    }

    {   short litlengthNCount[MaxLL+1];
        unsigned litlengthMaxValue = MaxLL, litlengthLog = LLFSELog;
-        size_t const litlengthHeaderSize = FSE_readNCount(litlengthNCount, &litlengthMaxValue, &litlengthLog, dict, dictSize);
+        size_t const litlengthHeaderSize = FSE_readNCount(litlengthNCount, &litlengthMaxValue, &litlengthLog, dictPtr, dictEnd-dictPtr);
        if (FSE_isError(litlengthHeaderSize)) return ERROR(dictionary_corrupted);
        { size_t const errorCode = FSE_buildDTable(dctx->LLTable, litlengthNCount, litlengthMaxValue, litlengthLog);
          if (FSE_isError(errorCode)) return ERROR(dictionary_corrupted); }
-        dictSize -= litlengthHeaderSize;
+        dictPtr += litlengthHeaderSize;
    }

+    if (dictPtr+12 > dictEnd) return ERROR(dictionary_corrupted);
+    dctx->rep[0] = MEM_readLE32(dictPtr+0); if (dctx->rep[0] >= dictSize) return ERROR(dictionary_corrupted);
+    dctx->rep[1] = MEM_readLE32(dictPtr+4); if (dctx->rep[1] >= dictSize) return ERROR(dictionary_corrupted);
+    dctx->rep[2] = MEM_readLE32(dictPtr+8); if (dctx->rep[2] >= dictSize) return ERROR(dictionary_corrupted);
+    dictPtr += 12;
+
    dctx->litEntropy = dctx->fseEntropy = 1;
-    return dictSizeStart - dictSize;
+    return dictPtr - (const BYTE*)dict;
 }

 static size_t ZSTD_decompress_insertDictionary(ZSTD_DCtx* dctx, const void* dict, size_t dictSize)
--- a/lib/dictBuilder/zdict.c
+++ b/lib/dictBuilder/zdict.c
@ -578,9 +578,10 @@ typedef struct
    void* workPlace;   /* must be ZSTD_BLOCKSIZE_MAX allocated */
 } EStats_ress_t;

+#define MAXREPOFFSET 1024

 static void ZDICT_countEStats(EStats_ress_t esr,
-                            U32* countLit, U32* offsetcodeCount, U32* matchlengthCount, U32* litlengthCount,
+                            U32* countLit, U32* offsetcodeCount, U32* matchlengthCount, U32* litlengthCount, U32* repOffsets,
                            const void* src, size_t srcSize)
 {
    const seqStore_t* seqStorePtr;
@ -614,6 +615,17 @@ static void ZDICT_countEStats(EStats_ress_t esr,
            size_t u;
            for (u=0; u<nbSeq; u++) litlengthCount[codePtr[u]]++;
    }   }
+
+    /* rep offsets */
+    {   const U32* const offsetPtr = seqStorePtr->offsetStart;
+        U32 offset1 = offsetPtr[0] - 3;
+        U32 offset2 = offsetPtr[1] - 3;
+        if (offset1 >= MAXREPOFFSET) offset1 = 0;
+        if (offset2 >= MAXREPOFFSET) offset2 = 0;
+        repOffsets[offset1] += 3;
+        repOffsets[offset2] += 1;
+    }
+
 }

 /*
@ -629,12 +641,29 @@ static size_t ZDICT_maxSampleSize(const size_t* fileSizes, unsigned nbFiles)

 static size_t ZDICT_totalSampleSize(const size_t* fileSizes, unsigned nbFiles)
 {
-    size_t total;
+    size_t total=0;
    unsigned u;
-    for (u=0, total=0; u<nbFiles; u++) total += fileSizes[u];
+    for (u=0; u<nbFiles; u++) total += fileSizes[u];
    return total;
 }

+typedef struct { U32 offset; U32 count; } offsetCount_t;
+
+static void ZDICT_insertSortCount(offsetCount_t table[ZSTD_REP_NUM+1], U32 val, U32 count)
+{
+    U32 u;
+    table[ZSTD_REP_NUM].offset = val;
+    table[ZSTD_REP_NUM].count = count;
+    for (u=ZSTD_REP_NUM; u>0; u--) {
+        offsetCount_t tmp;
+        if (table[u-1].count >= table[u].count) break;
+        tmp = table[u-1];
+        table[u-1] = table[u];
+        table[u] = tmp;
+    }
+}
+
+
 #define OFFCODE_MAX 18  /* only applicable to first block */
 static size_t ZDICT_analyzeEntropy(void*  dstBuffer, size_t maxDstSize,
                                 unsigned compressionLevel,
@ -649,6 +678,8 @@ static size_t ZDICT_analyzeEntropy(void*  dstBuffer, size_t maxDstSize,
    short matchLengthNCount[MaxML+1];
    U32 litLengthCount[MaxLL+1];
    short litLengthNCount[MaxLL+1];
+    U32 repOffset[MAXREPOFFSET] = { 0 };
+    offsetCount_t bestRepOffset[ZSTD_REP_NUM+1];
    EStats_ress_t esr;
    ZSTD_parameters params;
    U32 u, huffLog = 12, Offlog = OffFSELog, mlLog = MLFSELog, llLog = LLFSELog, total;
@ -656,12 +687,15 @@ static size_t ZDICT_analyzeEntropy(void*  dstBuffer, size_t maxDstSize,
    size_t eSize = 0;
    size_t const totalSrcSize = ZDICT_totalSampleSize(fileSizes, nbFiles);
    size_t const averageSampleSize = totalSrcSize / nbFiles;
+    BYTE* dstPtr = (BYTE*)dstBuffer;

    /* init */
    for (u=0; u<256; u++) countLit[u]=1;   /* any character must be described */
    for (u=0; u<=OFFCODE_MAX; u++) offcodeCount[u]=1;
    for (u=0; u<=MaxML; u++) matchLengthCount[u]=1;
    for (u=0; u<=MaxLL; u++) litLengthCount[u]=1;
+    repOffset[1] = repOffset[4] = repOffset[8] = 1;
+    memset(bestRepOffset, 0, sizeof(bestRepOffset));
    esr.ref = ZSTD_createCCtx();
    esr.zc = ZSTD_createCCtx();
    esr.workPlace = malloc(ZSTD_BLOCKSIZE_MAX);
@ -679,7 +713,7 @@ static size_t ZDICT_analyzeEntropy(void*  dstBuffer, size_t maxDstSize,
    /* collect stats on all files */
    for (u=0; u<nbFiles; u++) {
        ZDICT_countEStats(esr,
-                        countLit, offcodeCount, matchLengthCount, litLengthCount,
+                        countLit, offcodeCount, matchLengthCount, litLengthCount, repOffset,
           (const char*)srcBuffer + pos, fileSizes[u]);
        pos += fileSizes[u];
    }
@ -693,6 +727,13 @@ static size_t ZDICT_analyzeEntropy(void*  dstBuffer, size_t maxDstSize,
    }
    huffLog = (U32)errorCode;

+    /* looking for most common first offsets */
+    {   U32 offset;
+        for (offset=1; offset<MAXREPOFFSET; offset++)
+            ZDICT_insertSortCount(bestRepOffset, offset, repOffset[offset]);
+    }
+    /* note : the result of this phase should be used to better appreciate the impact on statistics */
+
    total=0; for (u=0; u<=OFFCODE_MAX; u++) total+=offcodeCount[u];
    errorCode = FSE_normalizeCount(offcodeNCount, Offlog, offcodeCount, total, OFFCODE_MAX);
    if (FSE_isError(errorCode)) {
@ -720,46 +761,70 @@ static size_t ZDICT_analyzeEntropy(void*  dstBuffer, size_t maxDstSize,
    }
    llLog = (U32)errorCode;

+
    /* write result to buffer */
-    errorCode = HUF_writeCTable(dstBuffer, maxDstSize, hufTable, 255, huffLog);
-    if (HUF_isError(errorCode)) {
+    {   size_t const hhSize = HUF_writeCTable(dstPtr, maxDstSize, hufTable, 255, huffLog);
+        if (HUF_isError(hhSize)) {
            eSize = ERROR(GENERIC);
            DISPLAYLEVEL(1, "HUF_writeCTable error");
            goto _cleanup;
        }
-    dstBuffer = (char*)dstBuffer + errorCode;
-    maxDstSize -= errorCode;
-    eSize += errorCode;
+        dstPtr += hhSize;
+        maxDstSize -= hhSize;
+        eSize += hhSize;
+    }

-    errorCode = FSE_writeNCount(dstBuffer, maxDstSize, offcodeNCount, OFFCODE_MAX, Offlog);
-    if (FSE_isError(errorCode)) {
+    {   size_t const ohSize = FSE_writeNCount(dstPtr, maxDstSize, offcodeNCount, OFFCODE_MAX, Offlog);
+        if (FSE_isError(ohSize)) {
            eSize = ERROR(GENERIC);
            DISPLAYLEVEL(1, "FSE_writeNCount error with offcodeNCount");
            goto _cleanup;
        }
-    dstBuffer = (char*)dstBuffer + errorCode;
-    maxDstSize -= errorCode;
-    eSize += errorCode;
+        dstPtr += ohSize;
+        maxDstSize -= ohSize;
+        eSize += ohSize;
+    }

-    errorCode = FSE_writeNCount(dstBuffer, maxDstSize, matchLengthNCount, MaxML, mlLog);
-    if (FSE_isError(errorCode)) {
+    {   size_t const mhSize = FSE_writeNCount(dstPtr, maxDstSize, matchLengthNCount, MaxML, mlLog);
+        if (FSE_isError(mhSize)) {
            eSize = ERROR(GENERIC);
            DISPLAYLEVEL(1, "FSE_writeNCount error with matchLengthNCount");
            goto _cleanup;
        }
-    dstBuffer = (char*)dstBuffer + errorCode;
-    maxDstSize -= errorCode;
-    eSize += errorCode;
+        dstPtr += mhSize;
+        maxDstSize -= mhSize;
+        eSize += mhSize;
+    }

-    errorCode = FSE_writeNCount(dstBuffer, maxDstSize, litLengthNCount, MaxLL, llLog);
-    if (FSE_isError(errorCode)) {
+    {   size_t const lhSize = FSE_writeNCount(dstPtr, maxDstSize, litLengthNCount, MaxLL, llLog);
+        if (FSE_isError(lhSize)) {
            eSize = ERROR(GENERIC);
            DISPLAYLEVEL(1, "FSE_writeNCount error with litlengthNCount");
            goto _cleanup;
        }
-    dstBuffer = (char*)dstBuffer + errorCode;
-    maxDstSize -= errorCode;
-    eSize += errorCode;
+        dstPtr += lhSize;
+        maxDstSize -= lhSize;
+        eSize += lhSize;
+    }
+
+    if (maxDstSize<12) {
+        eSize = ERROR(GENERIC);
+        DISPLAYLEVEL(1, "not enough space to write RepOffsets");
+        goto _cleanup;
+    }
+# if 0
+    MEM_writeLE32(dstPtr+0, bestRepOffset[0].offset);
+    MEM_writeLE32(dstPtr+4, bestRepOffset[1].offset);
+    MEM_writeLE32(dstPtr+8, bestRepOffset[2].offset);
+#else
+    /* at this stage, we don't use the result of "most common first offset",
+       as the impact of statistics is not properly evaluated */
+    MEM_writeLE32(dstPtr+0, repStartValue[0]);
+    MEM_writeLE32(dstPtr+4, repStartValue[1]);
+    MEM_writeLE32(dstPtr+8, repStartValue[2]);
+#endif
+    dstPtr += 12;
+    eSize += 12;

 _cleanup:
    ZSTD_freeCCtx(esr.ref);
--- a/lib/dictBuilder/zdict.h
+++ b/lib/dictBuilder/zdict.h
@ -79,7 +79,7 @@ const char* ZDICT_getErrorName(size_t errorCode);

 /* ====================================================================================
 * The definitions in this section are considered experimental.
- * They should never be used in association with a dynamic library, as they may change in the future.
+ * They should never be used with a dynamic library, as they may change in the future.
 * They are provided for advanced usages.
 * Use them only in association with static linking.
 * ==================================================================================== */
--- a/programs/.gitignore
+++ b/programs/.gitignore
@ -50,3 +50,4 @@ afl
 # Misc files
 *.bat
 fileTests.sh
+dirTest*
--- a/programs/fuzzer.c
+++ b/programs/fuzzer.c
@ -847,10 +847,17 @@ int main(int argc, const char** argv)
    /* Get Seed */
    DISPLAY("Starting zstd tester (%i-bits, %s)\n", (int)(sizeof(size_t)*8), ZSTD_VERSION_STRING);

-    if (!seedset) seed = (U32)(clock() % 10000);
+    if (!seedset) {
+        time_t const t = time(NULL);
+        U32 const h = XXH32(&t, sizeof(t), 1);
+        seed = h % 10000;
+    }
+
    DISPLAY("Seed = %u\n", seed);
    if (proba!=FUZ_compressibility_default) DISPLAY("Compressibility : %u%%\n", proba);

+    if (nbTests < testNb) nbTests = testNb;
+
    if (testNb==0)
        result = basicUnitTests(0, ((double)proba) / 100);  /* constant seed for predictability */
    if (!result)
--- a/programs/playTests.sh
+++ b/programs/playTests.sh
@ -133,31 +133,6 @@ diff tmpSparse2M tmpSparseRegenerated
 rm tmpSparse*


-$ECHO "\n**** dictionary tests **** "
-
-./datagen > tmpDict
-./datagen -g1M | $MD5SUM > tmp1
-./datagen -g1M | $ZSTD -D tmpDict | $ZSTD -D tmpDict -dvq | $MD5SUM > tmp2
-diff -q tmp1 tmp2
-$ECHO "Create first dictionary"
-$ZSTD --train *.c -o tmpDict
-cp zstdcli.c tmp
-$ZSTD -f tmp -D tmpDict
-$ZSTD -d tmp.zst -D tmpDict -of result
-diff zstdcli.c result
-$ECHO "Create second (different) dictionary"
-$ZSTD --train *.c *.h -o tmpDictC
-$ZSTD -d tmp.zst -D tmpDictC -of result && die "wrong dictionary not detected!"
-$ECHO "Create dictionary with short dictID"
-$ZSTD --train *.c --dictID 1 -o tmpDict1
-cmp tmpDict tmpDict1 && die "dictionaries should have different ID !"
-$ECHO "Compress without dictID"
-$ZSTD -f tmp -D tmpDict1 --no-dictID
-$ZSTD -d tmp.zst -D tmpDict -of result
-diff zstdcli.c result
-rm tmp*
-
-
 $ECHO "\n**** multiple files tests **** "

 ./datagen -s1        > tmp1 2> /dev/null
@ -181,9 +156,46 @@ $ECHO "compress multiple files including a missing one (notHere) : "
 $ZSTD -f tmp1 notHere tmp2 && die "missing file not detected!"


+$ECHO "\n**** dictionary tests **** "
+
+./datagen > tmpDict
+./datagen -g1M | $MD5SUM > tmp1
+./datagen -g1M | $ZSTD -D tmpDict | $ZSTD -D tmpDict -dvq | $MD5SUM > tmp2
+diff -q tmp1 tmp2
+$ECHO "- Create first dictionary"
+$ZSTD --train *.c -o tmpDict
+cp zstdcli.c tmp
+$ZSTD -f tmp -D tmpDict
+$ZSTD -d tmp.zst -D tmpDict -of result
+diff zstdcli.c result
+$ECHO "- Create second (different) dictionary"
+$ZSTD --train *.c *.h -o tmpDictC
+$ZSTD -d tmp.zst -D tmpDictC -of result && die "wrong dictionary not detected!"
+$ECHO "- Create dictionary with short dictID"
+$ZSTD --train *.c --dictID 1 -o tmpDict1
+cmp tmpDict tmpDict1 && die "dictionaries should have different ID !"
+$ECHO "- Compress without dictID"
+$ZSTD -f tmp -D tmpDict1 --no-dictID
+$ZSTD -d tmp.zst -D tmpDict -of result
+diff zstdcli.c result
+$ECHO "- Compress multiple files with dictionary"
+rm -rf dirTestDict
+mkdir dirTestDict
+cp *.c dirTestDict
+cp *.h dirTestDict
+cat dirTestDict/* | $MD5SUM > tmph1  # note : we expect same file order to generate same hash
+$ZSTD -f dirTestDict/* -D tmpDictC
+$ZSTD -d dirTestDict/*.zst -D tmpDictC -c | $MD5SUM > tmph2
+diff -q tmph1 tmph2
+rm -rf dirTestDict
+rm tmp*
+
+
 $ECHO "\n**** integrity tests **** "

 $ECHO "test one file (tmp1.zst) "
+./datagen > tmp1
+$ZSTD tmp1
 $ZSTD -t tmp1.zst
 $ZSTD --test tmp1.zst
 $ECHO "test multiple files (*.zst) "
--- a/programs/zstd.1
+++ b/programs/zstd.1
@ -43,7 +43,7 @@ It also features a very fast decoder, with speed > 500 MB/s per core.
 .SH OPTIONS
 .TP
 .B \-#
- # compression level [1-21] (default:1)
+ # compression level [1-22] (default:1)
 .TP
 .BR \-d ", " --decompress
 decompression
--- a/projects/build/README.md
+++ b/projects/build/README.md
@ -0,0 +1,51 @@
+Here are a few command lines for reference :
+
+### Build with Visual Studio 2013 for msvcr120.dll
+
+Running the following command will build both the `Release Win32` and `Release x64` versions:
+```batch
+build\build.VS2013.cmd
+```
+The result of each build will be in the corresponding `build\bin\Release\{ARCH}\` folder.
+
+If you want to only need one architecture:
+- Win32: `build\build.generic.cmd VS2013 Win32 Release v120`
+- x64: `build\build.generic.cmd VS2013 x64 Release v120`
+
+If you want a Debug build:
+- Win32: `build\build.generic.cmd VS2013 Win32 Debug v120`
+- x64: `build\build.generic.cmd VS2013 x64 Debug v120`
+
+### Build with Visual Studio 2015 for msvcr140.dll
+
+Running the following command will build both the `Release Win32` and `Release x64` versions:
+```batch
+build\build.VS2015.cmd
+```
+The result of each build will be in the corresponding `build\bin\Release\{ARCH}\` folder.
+
+If you want to only need one architecture:
+- Win32: `build\build.generic.cmd VS2015 Win32 Release v140`
+- x64: `build\build.generic.cmd VS2015 x64 Release v140`
+
+If you want a Debug build:
+- Win32: `build\build.generic.cmd VS2015 Win32 Debug v140`
+- x64: `build\build.generic.cmd VS2015 x64 Debug v140`
+
+### Build with Visual Studio 2015 for msvcr120.dll
+
+You need to invoke `build\build.generic.cmd` with the proper arguments:
+
+**For Win32**
+```batch
+build\build.generic.cmd VS2015 Win32 Release v120
+```
+The result of the build will be in the `build\bin\Release\Win32\` folder.
+
+**For x64**
+```batch
+build\build.generic.cmd VS2015 x64 Release v120
+```
+The result of the build will be in the `build\bin\Release\x64\` folder.
+
+If you want Debug builds, replace `Release` with `Debug`.
--- a/projects/build/build.VS2010.cmd
+++ b/projects/build/build.VS2010.cmd
--- a/projects/build/build.VS2012.cmd
+++ b/projects/build/build.VS2012.cmd
--- a/projects/build/build.VS2013.cmd
+++ b/projects/build/build.VS2013.cmd
--- a/projects/build/build.VS2015.cmd
+++ b/projects/build/build.VS2015.cmd
--- a/projects/build/build.generic.cmd
+++ b/projects/build/build.generic.cmd
@ -33,7 +33,7 @@ IF %msbuild_version% == VS2013 SET msbuild="%programfiles(x86)%\MSBuild\12.0\Bin
 IF %msbuild_version% == VS2015 SET msbuild="%programfiles(x86)%\MSBuild\14.0\Bin\MSBuild.exe"
 rem TODO: Visual Studio "15" (vNext) will use MSBuild 15.0 ?

-SET project="%~p0\..\projects\VS2010\zstd.sln"
+SET project="%~p0\..\VS2010\zstd.sln"

 SET msbuild_params=/verbosity:minimal /nologo /t:Clean,Build /p:Platform=%msbuild_platform% /p:Configuration=%msbuild_configuration%
 IF NOT "%msbuild_toolset%" == "" SET msbuild_params=%msbuild_params% /p:PlatformToolset=%msbuild_toolset%
@ -50,4 +50,3 @@ IF ERRORLEVEL 1 EXIT /B 1
 echo # Success
 echo # OutDir: %output%
 echo #
-