Commit Graph

588 Commits (18b795374a4d1e0b308fdf6e86b1ea08b4489489)

Author SHA1 Message Date
Yann Collet cd3115b284 added control from frame content size at end of decompression
adding check at end of single-pass ZSTD_decompressFrame().
Check within ZSTD_decompressContinue() was already added in a previous patch : b3f33ccfb3
2017-09-21 16:21:10 -07:00
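Note on cd3115b284: the check it describes can be pictured with a minimal Python sketch. This is not zstd's C code; `check_decoded_size` and `CONTENT_SIZE_UNKNOWN` are hypothetical names used only for illustration.

```python
# Hypothetical sentinel meaning "the frame header carries no content size".
CONTENT_SIZE_UNKNOWN = None


def check_decoded_size(declared_size, decoded):
    """Return the decoded data, or raise if it disagrees with the declared size.

    declared_size: content size announced by the frame header, or
                   CONTENT_SIZE_UNKNOWN when the field is absent.
    decoded:       bytes produced by decompressing the whole frame.
    """
    if declared_size is not CONTENT_SIZE_UNKNOWN and len(decoded) != declared_size:
        raise ValueError(
            f"corrupted frame: header declares {declared_size} bytes, "
            f"decoder produced {len(decoded)}"
        )
    return decoded
```

When the header carries no content size, there is nothing to compare against and the data is returned unchanged.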
Nick Terrell 1fe762e236 [zstdcli] Fix LDM advanced options parsing 2017-09-18 14:49:35 -07:00
Yann Collet 31829cb057 Merge pull request #847 from terrelln/fuzzer
[fuzzer] Fuzz long range matching & new API
2017-09-15 12:09:00 -07:00
Nick Terrell 39357c41cb [fuzzer] Fuzz long range matching & new API 2017-09-14 14:48:08 -07:00
Yann Collet 218c09e5b3 Merge pull request #844 from terrelln/fuzzer
Fuzzer
2017-09-14 11:40:25 -07:00
Nick Terrell 9712d5ebe6 [fuzzer] Fix bugs in fuzz.py 2017-09-13 19:08:35 -07:00
Nick Terrell a6f08b4783 [fuzzer] Fix FUZZ_seed() 2017-09-13 18:41:32 -07:00
Nick Terrell 6c6412cef9 [fuzzer] Update README.md 2017-09-13 18:23:52 -07:00
Nick Terrell 6b8236cf7e [fuzz] Add fuzzing helper script 2017-09-13 17:45:21 -07:00
Nick Terrell b7e1522330 Add block fuzzers 2017-09-13 17:44:41 -07:00
Nick Terrell def3214d74 [fuzzer] Handle single empty directory 2017-09-13 17:44:30 -07:00
Yann Collet 739b620814 Merge pull request #842 from stellamplau/decodeCorpus-maxSize
Add flag to limit max decompressed size in decodeCorpus
2017-09-13 17:26:55 -07:00
Nick Terrell 8b6c80ada8 Update fuzzer Makefile 2017-09-13 16:16:57 -07:00
Nick Terrell 677c2cbf89 Update fuzzer sources 2017-09-13 16:16:57 -07:00
Stella Lau 963558a072 Fix implicit conversion error 2017-09-13 16:01:16 -07:00
Stella Lau 40bf0ced7d Add flag to limit max decompressed size in decodeCorpus 2017-09-13 15:16:56 -07:00
Yann Collet f1571dad8f Merge pull request #838 from stellamplau/ldm-mergeDev
Add long distance matcher
2017-09-13 13:24:08 -07:00
Yann Collet be1f2dac5b Merge pull request #841 from facebook/utilTimeAPI
modified util::time API (T19505791)
2017-09-13 11:41:01 -07:00
Yann Collet a1bc08834f Merge pull request #840 from stellamplau/decodeCorpus-blocks
Make decodecorpus generate raw compressed blocks
2017-09-13 09:34:04 -07:00
Yann Collet c95c0c9725 modified util::time API
for easier invocation.
- no longer expose frequency timer :
it's either useless, or stored internally in a static variable (init is only necessary once).
- UTIL_getTime() provides result by function return.
2017-09-12 18:12:46 -07:00
Stella Lau e89065506e Make decodecorpus generate raw compressed blocks 2017-09-12 17:18:45 -07:00
Stella Lau 3d8e313f64 Reduce ldm hash table size in test 2017-09-11 17:21:28 -07:00
Stella Lau eb3327c10a Merge branch 'dev' of https://github.com/facebook/zstd into ldm-mergeDev 2017-09-11 15:00:01 -07:00
Yann Collet b3f33ccfb3 use ZSTD_decodingBufferSize_min() inside ZSTD_decompressStream()
Use same definition as public one
minor : reduce allocated buffer size in some cases
(when frameContentSize is known and == windowSize)
2017-09-09 14:37:28 -07:00
Yann Collet 058ed2ad33 ZSTD_decodingBufferSize_min()
supporting function for bufferless streaming API (ZSTD_decompressContinue())
makes it possible to correctly size a round buffer for decoding using this API.

also : added field blockSizeMax within ZSTD_frameHeader,
as it's necessary information to know when to restart at beginning of decoding buffer.
2017-09-09 01:03:29 -07:00
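Note on 058ed2ad33: the role of blockSizeMax in a round buffer can be sketched as below. This is a toy Python model under stated assumptions, not the library's implementation: the `RoundBuffer` class and its methods are hypothetical, and the model ignores the window history that a real decoder must keep readable.

```python
class RoundBuffer:
    """Toy model of a decoding round buffer (illustration only)."""

    def __init__(self, capacity, block_size_max):
        if capacity < block_size_max:
            raise ValueError("capacity must hold at least one full block")
        self.capacity = capacity
        self.block_size_max = block_size_max
        self.pos = 0  # offset where the next block will be written

    def next_write_offset(self):
        """Offset for the next block, restarting at 0 when a block may not fit."""
        if self.capacity - self.pos < self.block_size_max:
            self.pos = 0  # no room for a worst-case block: wrap around
        return self.pos

    def commit(self, decoded_size):
        """Record that decoded_size bytes were written at the current offset."""
        if decoded_size > self.block_size_max:
            raise ValueError("block larger than block_size_max")
        self.pos += decoded_size
```

With a capacity of 10 and a maximum block size of 4, two 4-byte blocks land at offsets 0 and 4, and the third restarts at offset 0 because only 2 bytes remain.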
Yann Collet 3128e03be6 updated license header
to clarify dual-license meaning as "or"
2017-09-08 00:09:23 -07:00
Stella Lau eeff55dfa8 Merge remote-tracking branch 'upstream/dev' into ldm-mergeDev 2017-09-06 15:56:32 -07:00
Stella Lau 9e4060200b Add tests and fix pointer alignment 2017-09-06 09:14:05 -07:00
Stella Lau af4068a697 Fix function name in tests/fuzz/regression_driver 2017-09-05 22:14:41 -07:00
Stella Lau 67d4a6161c Add ldmBucketSizeLog param 2017-09-02 21:55:29 -07:00
Stella Lau a1f04d518d Move hashEveryLog to cctxParams and update cli 2017-09-01 15:05:47 -07:00
Stella Lau 767a0b3be1 Move ldm hashLog, bucketLog, and mml to cctxParams 2017-09-01 12:24:59 -07:00
Stella Lau 17d8e0bdcc Merge remote-tracking branch 'upstream/longRangeMatcher' into ldm-integrate 2017-09-01 10:19:38 -07:00
Stella Lau 8081becadc Add long distance matching as a CCtxParam 2017-09-01 09:18:58 -07:00
Eiichi Tsukata 7492e7f1c7 tests/fuzz: change ZSTD_BLOCKSIZE_ABSOLUTEMAX into ZSTD_BLOCKSIZE_MAX
ZSTD_BLOCKSIZE_ABSOLUTEMAX is changed at the commit:
fa3671eac7
2017-09-01 16:37:39 +09:00
Eiichi Tsukata 6639395979 tests/fuzz: fix make all target names 2017-09-01 16:32:40 +09:00
Yann Collet d7ad99b2ab Merge branch 'longRangeMatcher' into dev 2017-08-31 18:08:37 -07:00
Yann Collet e0cecd8736 fixed poolTests
needs more dependencies from zstd for custom allocators and error codes
2017-08-31 15:13:31 -07:00
Stella Lau 6a546efb8c Add long distance matcher
Move last literals section to ZSTD_block_internal
2017-08-31 12:53:19 -07:00
Yann Collet b0cb081dc8 last batch of header files changed to reflect new license (#825)
only remains to update contrib/linux-kernel (@terrelln)
2017-08-31 12:20:50 -07:00
Yann Collet e21384fffb fixed more file headers after license change (#825) 2017-08-31 12:11:57 -07:00
Yann Collet e9dc204f42 fixed a bunch of headers after license change (#825) 2017-08-31 11:24:54 -07:00
Stella Lau 90a31bfa16 Pass dictMode to ZSTDMT_initCStream; fix nits
- Return error code in estimate{CCtx,CStream}Size functions
2017-08-30 16:19:07 -07:00
Stella Lau ee65701720 Minor fixes; remove formatting only changes 2017-08-29 20:27:35 -07:00
Stella Lau a6e20e1bd7 Add test for raw content starting with dict header 2017-08-29 18:36:18 -07:00
Stella Lau 82d636b76a Rename applyCCtxParams() 2017-08-29 18:03:06 -07:00
Stella Lau c88fb9267f Replace 'byReference' with enum 2017-08-29 11:55:02 -07:00
Stella Lau b5b9275e67 Rename estimateCCtxSize_advanced() and estimateCStreamSize_advanced() 2017-08-29 10:49:29 -07:00
Stella Lau 18224608ff Remove ZSTD_setCCtxParameter() 2017-08-25 13:58:41 -07:00
Stella Lau 9911153723 Move jobSize and overlapLog in zstdmt to cctxParams 2017-08-25 13:14:51 -07:00
Stella Lau eb7bbab36a Remove ZSTD_p_refDictContent and dictContentByRef 2017-08-25 11:11:45 -07:00
Stella Lau 15fdeb9e41 Enforce nbThreads<=1 for estimateCCtxSize 2017-08-24 16:28:49 -07:00
Stella Lau 1c81f725ff Remove duplicated testing code 2017-08-23 15:47:15 -07:00
Stella Lau 6f1a21c7e9 Remove formatting-only changes 2017-08-23 10:24:19 -07:00
Stella Lau 8fd1636776 Remove unused functions 2017-08-22 13:33:58 -07:00
Stella Lau 73c73bf16a Reduce code duplication in zstreamtest 2017-08-21 12:41:19 -07:00
Nick Terrell 3587556873 [cover] Test small maxdict 2017-08-21 11:16:47 -07:00
Stella Lau 91b30dbe84 Remove test parameter 2017-08-21 10:09:06 -07:00
Stella Lau f181f33bdf Disable tests and refactor 2017-08-21 01:59:08 -07:00
Stella Lau 023b24e6d4 Add cctx param tests 2017-08-20 22:55:07 -07:00
Yann Collet d6394cc4c3 fixed test-zstd-nolegacy 2017-08-20 10:15:44 -07:00
Yann Collet 32fb407c9d updated a bunch of headers
for the new license
2017-08-18 16:52:05 -07:00
Yann Collet 8049556928 Merge pull request #778 from terrelln/bad-huff
[libzstd] Fix bug in Huffman decompresser
2017-08-07 14:05:58 -07:00
Nick Terrell abe12b3399 [libzstd] Fix bug in Huffman decompresser
The zstd format specification doesn't enforce that Huffman compressed
literals (including the table) have to be smaller than the uncompressed
literals. The compressor will never Huffman compress literals if the
compressed size is larger than the uncompressed size. The decompresser
doesn't accept Huffman compressed literals with 4 streams whose compressed
size is at least as large as the uncompressed size.

* Make the decompresser accept Huffman compressed literals whose size
  increases.
* Add a test case that exposes the bug. The compressed file has to be
  statically generated, since the compressor won't normally produce files
  that expose the bug.
2017-08-07 12:37:48 -07:00
Stella Lau e1abc2a367 Switch the sleep function to UTIL_sleepMilli 2017-08-07 11:49:13 -07:00
Stella Lau 1e366f9dea Add test for deadlock 2017-08-02 11:27:50 -07:00
Stella Lau 5adceeed01 Allow queueSize=0 in pool.c and update poolTests 2017-07-31 10:10:16 -07:00
Yann Collet 38ba7002f2 fixed minor warning on unused variable in shell function 2017-07-20 18:39:04 -07:00
Yann Collet 5e6c5203f3 fixed fuzzer test for non OS-X platforms 2017-07-20 15:11:56 -07:00
Yann Collet 1ca1288689 added --memtest=# command to fuzzer
to jump directly to relevant test section
2017-07-19 16:01:16 -07:00
Yann Collet 44b0838253 Merge pull request #770 from terrelln/test-mode
[zstdcli] Fix -t in streaming mode
2017-07-18 15:40:59 -07:00
Nick Terrell d0b27483ae [zstdcli] Fix -t in streaming mode 2017-07-18 14:45:49 -07:00
Nick Terrell cc1522351f [libzstd] Fix bug in Huffman encoding
Summary:
Huffman encoding with a bad dictionary can encode worse than the
HUF_BLOCKBOUND(srcSize), since we don't filter out incompressible
input, and even if we did, the dictionary's Huffman table could be
ill suited to compressing actual data.

The fast optimization doesn't seem to improve compression speed,
even when I hard coded fast = 1, the speed didn't improve over hard coding
it to 0.

Benchmarks:
$ ./zstd.dev -b1e5
Benchmarking levels from 1 to 5
 1#Synthetic 50%     :  10000000 ->   3139163 (3.186), 524.8 MB/s ,1890.0 MB/s
 2#Synthetic 50%     :  10000000 ->   3115138 (3.210), 372.6 MB/s ,1830.2 MB/s
 3#Synthetic 50%     :  10000000 ->   3222672 (3.103), 223.3 MB/s ,1400.2 MB/s
 4#Synthetic 50%     :  10000000 ->   3276678 (3.052), 198.0 MB/s ,1280.1 MB/s
 5#Synthetic 50%     :  10000000 ->   3271570 (3.057), 107.8 MB/s ,1200.0 MB/s
$ ./zstd -b1e5
Benchmarking levels from 1 to 5
 1#Synthetic 50%     :  10000000 ->   3139163 (3.186), 524.8 MB/s ,1870.2 MB/s
 2#Synthetic 50%     :  10000000 ->   3115138 (3.210), 370.0 MB/s ,1810.3 MB/s
 3#Synthetic 50%     :  10000000 ->   3222672 (3.103), 223.3 MB/s ,1380.1 MB/s
 4#Synthetic 50%     :  10000000 ->   3276678 (3.052), 196.1 MB/s ,1270.0 MB/s
 5#Synthetic 50%     :  10000000 ->   3271570 (3.057), 106.8 MB/s ,1180.1 MB/s
$ ./zstd.dev -b1e5 ../silesia.tar
Benchmarking levels from 1 to 5
 1#silesia.tar       : 211988480 ->  73651685 (2.878), 429.7 MB/s ,1096.5 MB/s
 2#silesia.tar       : 211988480 ->  70158785 (3.022), 321.2 MB/s ,1029.1 MB/s
 3#silesia.tar       : 211988480 ->  66993813 (3.164), 243.7 MB/s , 981.4 MB/s
 4#silesia.tar       : 211988480 ->  66306481 (3.197), 226.7 MB/s , 972.4 MB/s
 5#silesia.tar       : 211988480 ->  64757852 (3.274), 150.3 MB/s , 963.6 MB/s
$ ./zstd -b1e5 ../silesia.tar
Benchmarking levels from 1 to 5
 1#silesia.tar       : 211988480 ->  73651685 (2.878), 429.7 MB/s ,1087.1 MB/s
 2#silesia.tar       : 211988480 ->  70158785 (3.022), 318.8 MB/s ,1029.1 MB/s
 3#silesia.tar       : 211988480 ->  66993813 (3.164), 246.5 MB/s , 981.4 MB/s
 4#silesia.tar       : 211988480 ->  66306481 (3.197), 229.2 MB/s , 972.4 MB/s
 5#silesia.tar       : 211988480 ->  64757852 (3.274), 149.3 MB/s , 963.6 MB/s

Test Plan:
I added a test case to the fuzzer which crashed with ASAN before the patch
and succeeded after.
2017-07-18 13:20:40 -07:00
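Note on cc1522351f: the figure in parentheses on each benchmark line is the compression ratio, original size divided by compressed size. A short Python check, using level 1 figures copied from the output above (`compression_ratio` is a name chosen here for illustration):

```python
def compression_ratio(original_size, compressed_size):
    """Return original/compressed, rounded to 3 decimals as in the log above."""
    return round(original_size / compressed_size, 3)


# Level 1 figures copied from the benchmark output above.
synthetic = compression_ratio(10000000, 3139163)
silesia = compression_ratio(211988480, 73651685)
```

Both builds report identical compressed sizes at every level, so their ratios match; only the speed columns differ slightly.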
Yann Collet fa3aa04ccd Merge pull request #761 from paulcruz74/file-rename
renamed pool.c to poolTests.c
2017-07-14 09:09:45 -07:00
Yann Collet 3a60efd3a9 policy change : ZSTDMT automatically caps nbThreads to ZSTDMT_NBTHREADS_MAX (#760)
Previously, ZSTDMT would refuse to create the compressor.
Also : increased ZSTDMT_NBTHREADS_MAX to 256,
updated doc,
and added relevant test
2017-07-13 10:17:23 -07:00
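Note on 3a60efd3a9: the policy change amounts to clamping instead of rejecting. A minimal Python sketch, assuming the value 256 stated in the message; `cap_nb_threads` is a hypothetical helper and the lower bound of 1 is an assumption made here, not something the commit states.

```python
# Value stated in the commit message; the real constant lives in zstd's C sources.
NBTHREADS_MAX = 256


def cap_nb_threads(requested):
    """Clamp a requested thread count into [1, NBTHREADS_MAX] instead of failing."""
    # The lower bound of 1 is an assumption of this sketch.
    return max(1, min(requested, NBTHREADS_MAX))
```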
Yann Collet 052a95f77c fix : ZSTDMT_compress_advanced() correctly generates checksum
when params.fParams.checksumFlag==1.
This use case used to be impossible when only ZSTD_compress() was available
2017-07-11 17:18:26 -07:00
Yann Collet ef0ff7fe7f zstdmt: removed margin for improved memory usage 2017-07-11 08:54:29 -07:00
Yann Collet 4616fad18b improved ZSTDMT_compress() memory usage
does not need the input buffer for streaming operations

also : reduced a few tests time length
2017-07-10 17:16:41 -07:00
Yann Collet 670b1fc547 optimized memory usage for ZSTDMT_compress()
Previously, each job would reserve a CCtx right before being posted.
The CCtx would be "part of the job description",
and only released when the job is completed (aka flushed).
For ZSTDMT_compress(), which creates all jobs first and only joins at the end,
that meant one CCtx per job.
The nb of jobs used to be == nb of threads,
but since latest modification,
which reduces the size of jobs in order to spread the load of difficult areas,
it also increases the nb of jobs for large sources / small compression level.
This resulted in many more CCtx being created.

In this new version, CCtx are reserved within the worker thread.
It guarantees there cannot be more CCtx reserved than workers (<= nb threads).

Doing so required making the CCtx Pool thread-safe :
it can now be called from multiple threads in parallel.
2017-07-10 16:30:55 -07:00
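Note on 670b1fc547: the idea of reserving a context inside the worker rather than at job creation can be sketched in Python. This is an illustration under assumptions, not zstd's code: `ContextPool` and `run_jobs` are hypothetical names, and a plain `object()` stands in for a CCtx.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ContextPool:
    """Thread-safe pool that hands out reusable context objects."""

    def __init__(self):
        self._lock = threading.Lock()
        self._free = []
        self.created = 0  # total contexts ever allocated

    def acquire(self):
        with self._lock:
            if self._free:
                return self._free.pop()
            self.created += 1
            return object()  # stand-in for a compression context

    def release(self, ctx):
        with self._lock:
            self._free.append(ctx)


def run_jobs(nb_jobs, nb_workers):
    """Run nb_jobs on nb_workers threads; return how many contexts were created."""
    pool = ContextPool()

    def job(_):
        ctx = pool.acquire()   # reserved inside the worker, not at job creation
        try:
            pass               # compression work would happen here
        finally:
            pool.release(ctx)  # returned as soon as the job is done

    with ThreadPoolExecutor(max_workers=nb_workers) as executor:
        list(executor.map(job, range(nb_jobs)))
    return pool.created
```

Because a context is only held while a job is running, `run_jobs(64, 4)` never allocates more than 4 contexts, however many jobs are queued.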
Yann Collet 3510efb02d fix : custom allocator correctly propagated to child contexts 2017-07-10 14:21:40 -07:00
Yann Collet ee3423d709 extended fuzzer MT memory tests 2017-07-10 14:09:16 -07:00
Yann Collet 88da8f1816 fix : propagate custom allocator to ZSTDMT through ZSTD_CCtx_setParameter()
also : compile fuzzer with MT enabled
2017-07-10 14:02:33 -07:00
Yann Collet f9524cf366 added --memtest to fuzzer 2017-07-10 13:48:41 -07:00
Yann Collet e32fb0c1fe added ZSTD_sizeof_CCtx() test 2017-07-10 12:29:57 -07:00
Paul Cruz 89190ef07d renamed pool.c to poolTests.c 2017-07-10 11:32:30 -07:00
Yann Collet ed0243a63c removed zbufftest from list of `all` tests 2017-07-07 16:16:14 -07:00
Yann Collet 990449b89d new field : ZSTD_frameHeader.frameType
Makes frame type (zstd,skippable) detection more straightforward.
ZSTD_getFrameHeader sets frameContentSize=ZSTD_CONTENTSIZE_UNKNOWN to mean "field not present"
2017-07-07 15:21:35 -07:00
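Note on 990449b89d: distinguishing the two frame types comes down to the 4-byte little-endian magic number at the start of the frame. A minimal Python sketch; the magic values are those of the zstd format, while `frame_type` and its string return values are hypothetical and do not mirror the C enum.

```python
import struct

ZSTD_MAGIC = 0xFD2FB528             # magic number of a regular zstd frame
SKIPPABLE_MAGIC_START = 0x184D2A50  # skippable frames use 0x184D2A50..0x184D2A5F


def frame_type(data):
    """Classify a frame from its first 4 bytes (little-endian magic number)."""
    if len(data) < 4:
        raise ValueError("need at least 4 bytes to read the magic number")
    (magic,) = struct.unpack_from("<I", data)
    if magic == ZSTD_MAGIC:
        return "zstd"
    if (magic & 0xFFFFFFF0) == SKIPPABLE_MAGIC_START:
        return "skippable"
    raise ValueError(f"unknown magic number 0x{magic:08X}")
```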
Yann Collet 7758ed8458 fixed fullbench, part 2 2017-07-06 02:48:00 -07:00
Yann Collet 9b2c1acfc0 fixed fullbench 2017-07-06 02:22:57 -07:00
Yann Collet 27e883371d fixed wrong assert() condition
A single job created by ZSTDMT_compress() can be < 256 KB
if data to compress is < 256 KB
(in which case it is delegated to single thread mode)
2017-07-04 19:33:16 -07:00
Yann Collet 2cb9774f5e more precise estimation of amount to flush at end of stream (single thread mode)
also : can use DEBUGLEVEL variable in /tests
2017-07-04 12:39:26 -07:00
Yann Collet 5051dd39ca Merge pull request #743 from facebook/fullbench
compress_generic() automatic optimization opportunities
2017-07-03 21:26:38 -07:00
Yann Collet 95c4a6e2c8 Merge pull request #745 from terrelln/libfuzzer
[fuzz] Add libFuzzer targets
2017-07-03 15:15:20 -07:00
Nick Terrell bea0f0cfa0 [fuzz] Move from fuzz/ to tests/fuzz/ 2017-07-03 12:40:12 -07:00
cyan4973 4b26306cb8 blindfix : fullbench's one-time leak, detected by valgrind 2017-07-01 08:03:59 -07:00
cyan4973 c07e43c2b5 added --show-leak-kind=all to valgrind tests 2017-07-01 07:05:11 -07:00
cyan4973 b5bb7c6d95 fixed Visual compilation of fullbench-dll 2017-06-29 19:59:37 -07:00
Yann Collet e7e5a8cef7 made fullbench compatible with multi-threading
fullbench 61/62 measure speed of ZSTD_compress_generic with 2 threads
2017-06-29 18:56:24 -07:00
Yann Collet afb0aca739 zstreamtest : big tests are only enabled in 64-bits mode
to avoid requesting too much memory in 32-bits mode during MT tests
2017-06-29 18:19:09 -07:00
Yann Collet 2e84bec9ac updated fullbench to also measure ZSTD_compress_generic()
will make it possible to visualize
optimization opportunity for ZSTD_e_end
2017-06-29 13:03:10 -07:00