Commit Graph

2898 Commits (2500dcfa5f6f99dc015231d5561c35d9a4016965)

Author SHA1 Message Date
Sean Purcell 2500dcfa5f Add testing description 2017-03-09 16:05:23 -08:00
Sean Purcell 7c8f5d5bc7 Make test times overwritable 2017-03-09 16:05:23 -08:00
Sean Purcell daec40db24 Update .travis.yml and Makefile for medium tests 2017-03-09 16:05:22 -08:00
Nick Terrell e65aab8e0f Remove 'mem.h' dependency from ZSTD_WINDOWLOG_MAX 2017-03-08 15:40:13 -08:00
Yann Collet a41a4ed39a Merge pull request #594 from terrelln/bugs
Small fixes
2017-03-08 14:56:07 -08:00
Nick Terrell 81512e9ebe Avoid '#define inline /* ... */'
Take definition of `FORCE_INLINE` from `zstd_internal.h`.
2017-03-08 14:00:21 -08:00
Nick Terrell e06c303475 Fix ZSTD_sizeof_CStream() 2017-03-08 13:45:10 -08:00
Yann Collet 15c9dd80a8 Merge pull request #593 from iburinoc/undef
Reduce point at which we reduce offsets to protect against UB
2017-03-07 18:31:38 -08:00
Sean Purcell 881abe44f1 Reduce point at which we reduce offsets to protect against UB 2017-03-07 16:58:08 -08:00
Yann Collet baa9b114f8 minor text refactor in readme 2017-03-07 16:24:54 -08:00
Yann Collet 15235ef3c8 Merge pull request #592 from iburinoc/ci
Fix travis test broken by Makefile change
2017-03-07 13:51:33 -08:00
Sean Purcell d66450fd7d Fix travis test broken by Makefile change 2017-03-07 11:36:19 -08:00
Yann Collet 15a7a99653 Merge pull request #590 from iburinoc/ci
Set up "short" tests on CircleCI
2017-03-06 23:29:26 -08:00
Sean Purcell a1a195044f Use test section 2017-03-06 18:21:38 -08:00
Yann Collet 38ab1db3cd fixed lzbench link 2017-03-06 17:24:34 -08:00
Yann Collet eeb9758c39 fix : remove mempcpy line in bench 2017-03-06 17:22:47 -08:00
Yann Collet 764c2fdfed updated benchmark table
zstd v1.1.3, new station i7-6700K
2017-03-06 17:20:44 -08:00
Sean Purcell 3437bf2feb Add build targets to the Makefile, and update CircleCI tests 2017-03-06 15:05:02 -08:00
Yann Collet 3db00373c5 Merge branch 'dev' of github.com:facebook/zstd into dev 2017-03-05 21:18:25 -08:00
Yann Collet 8b1d004031 added -Wformat-security flag, as recommended by @pixelb 2017-03-05 21:17:32 -08:00
Yann Collet 1f2c95c5f3 minor code refactor in HUF module 2017-03-05 21:07:20 -08:00
Yann Collet 9ba81a3c63 Merge pull request #588 from pixelb/fedora-warnings
support -Werror=format-security
2017-03-05 21:06:13 -08:00
Pádraig Brady 38a3428b37 support -Werror=format-security
Fedora now enables this option by default, resulting
in the following build failure:

Logging.h: In instantiation of
'void pzstd::Logger::operator()(int, const char*, Args ...)
Pzstd.cpp:413:48:   required from here
Logging.h:46:17: error: format not a string literal and no format arguments
[-Werror=format-security]
     std::fprintf(out_, fmt, args...);
     ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
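
A minimal C sketch of what the warning is about (the original code is a C++ variadic template; `demo_log_plain` is a hypothetical helper, not part of pzstd). Passing a non-literal string as the format with no further arguments is what `-Wformat-security` rejects; routing the message through a literal `"%s"` format is one common way to avoid it:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, for illustration only.
 * The rejected form would be snprintf(buf, cap, msg): msg is not a string
 * literal and no format arguments follow, so any '%' inside msg would be
 * interpreted as a conversion. Using a literal "%s" copies msg verbatim. */
int demo_log_plain(char *buf, size_t cap, const char *msg)
{
    assert(buf != NULL && msg != NULL);
    return snprintf(buf, cap, "%s", msg);
}
```

With this form, a message such as `"100% done %s"` is written out unchanged instead of being parsed as a format string.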
2017-03-05 19:42:51 -08:00
Yann Collet 5d801278dc Merge pull request #586 from terrelln/repeat-heuristic
Always check Huffman tables for ZSTD_lazy+
2017-03-03 19:38:56 -08:00
Nick Terrell 54c4babd8f Always check Huffman tables for ZSTD_lazy+
The compressor always reuses the existing Huffman table if the literals
size is at most 1 KiB. If the compression strategy is `ZSTD_lazy` or
stronger, always check whether reusing the previous table or creating
a new table is better.

This doesn't yet weigh in decompression speed. I don't want to add any
heuristics there until I have real data to work with to ensure that the
heuristic works for at least one use case, preferably more.
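
The rule described above can be sketched as a small decision function. This is an illustration under assumptions, not the zstd implementation: the enum names and `demo_choose_policy` are hypothetical, and the behavior for weaker strategies with more than 1 KiB of literals (compare both tables) is assumed rather than stated in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical strategy list, ordered weakest to strongest.
 * Only the relative order matters for this sketch. */
typedef enum {
    DEMO_FAST, DEMO_DFAST, DEMO_GREEDY, DEMO_LAZY, DEMO_LAZY2, DEMO_BTLAZY2
} demo_strategy;

typedef enum { DEMO_REUSE_PREVIOUS, DEMO_COMPARE_BOTH } demo_table_policy;

/* Assumes a previous Huffman table exists and is usable for this block. */
demo_table_policy demo_choose_policy(demo_strategy strategy, size_t litSize)
{
    assert(strategy <= DEMO_BTLAZY2);
    if (strategy >= DEMO_LAZY)
        return DEMO_COMPARE_BOTH;    /* lazy or stronger: always check */
    if (litSize <= 1024)
        return DEMO_REUSE_PREVIOUS;  /* small literals: reuse without checking */
    return DEMO_COMPARE_BOTH;        /* assumption: larger literals are compared */
}
```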
2017-03-03 16:49:38 -08:00
Yann Collet 1af570bd05 Merge pull request #585 from terrelln/cover-leak
Fix COVER_optimizeTrainFromBuffer() resource leaks
2017-03-02 20:46:35 -08:00
Yann Collet f44b55c18d Merge pull request #584 from terrelln/huff-repeat
Allow compressor to repeat Huffman tables
2017-03-02 17:20:11 -08:00
Yann Collet e02409fdc3 update NEWS on @iburinoc's 32-bits version improvement 2017-03-02 17:14:57 -08:00
Yann Collet fe5d27062e disable prefetch-decode for 32-bits target
This decoder variant is detrimental to x86 architecture
likely due to register pressure.

Note that the variant is disabled for all 32-bits targets.
It's unclear if it would help for different architectures,
such as ARM, MIPS or PowerPC.
2017-03-02 17:09:21 -08:00
Yann Collet 3a55d8be26 Merge pull request #582 from iburinoc/m32
Encode/decode offsets >= 32MB in 32-bits mode
2017-03-02 16:42:50 -08:00
Nick Terrell d051cd5b43 Use workspace for count and CTable 2017-03-02 16:38:07 -08:00
Nick Terrell 976e325b2e Fix COVER_optimizeTrainFromBuffer() resource leaks
Thanks to @nemequ for reporting the resource leaks.
2017-03-02 15:54:39 -08:00
Sean Purcell 553f67e0c1 Remove 'generic' inline strategy
Seems to avoid performance loss for compression.
Same strategy tested on decompression side, did not appear to improve
speed.
2017-03-02 15:18:13 -08:00
Sean Purcell 3d95925a59 Merge remote-tracking branch 'origin/dev' into m32 2017-03-02 15:17:56 -08:00
Nick Terrell a419777eb1 Allow compressor to repeat Huffman tables
* Compressor saves most recently used Huffman table and reuses it
  if it produces better results.
* I attempted to preserve CPU usage profile.
  I intentionally left all of the existing heuristics in place.
  There is only a speed difference on the second block and later.
  When compressing large enough blocks (say >= 4 KiB) there is
  no significant difference in compression speed.
  Dictionary compression of one block is the same speed for blocks
  with literals <= 1 KiB, and after that the difference is not
  very significant.
* In the synthetic data, with blocks 10 KB or smaller, most blocks
  can't use repeated tables because the previous block did not
  contain a symbol that the current block contains.
  Once blocks are about 12 KB or more, most previous blocks have
  valid Huffman tables for the current block, and the compression
  ratio and decompression speed jumped.
* In silesia, blocks as small as 4 KB can frequently reuse the
  previous Huffman table (85%), but it isn't as profitable, and
  the previous Huffman table only gets used about 3% of the time.
* Microbenchmarks show that `HUF_validateCTable()` takes ~55 ns
  and `HUF_estimateCompressedSize()` takes ~35 ns.
  They are decently well optimized; the first versions took 90 ns
  and 120 ns respectively. `HUF_validateCTable()` could be twice as
  fast if we cast the `HUF_CElt*` to a `U32*` and compare to 0.
  However, `U32` has an alignment of 4 instead of 2, so I think that
  might be undefined behavior.
* I've run `zstreamtest` compiled normally, with UASAN and with MSAN
  for 4 hours each.
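
The two checks named in the list above can be sketched roughly as follows. This is a simplified illustration, not the code of `HUF_validateCTable()` or `HUF_estimateCompressedSize()`: all `demo_` names are hypothetical, a table is reduced to an array of code lengths, and `newHeaderBytes` (the cost of writing a new table description) is an assumed input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SYMBOLS 256

/* A table is valid for a block if every symbol that occurs has a code.
 * nbBits[s] == 0 means the table has no code for symbol s. */
int demo_table_is_valid(const uint8_t *nbBits, const unsigned *count)
{
    assert(nbBits != NULL && count != NULL);
    for (int s = 0; s < DEMO_SYMBOLS; s++)
        if (count[s] != 0 && nbBits[s] == 0) return 0;
    return 1;
}

/* Estimated payload size: sum of (occurrences * code length), in bytes. */
size_t demo_estimate_bytes(const uint8_t *nbBits, const unsigned *count)
{
    size_t bits = 0;
    for (int s = 0; s < DEMO_SYMBOLS; s++)
        bits += (size_t)count[s] * nbBits[s];
    return (bits + 7) / 8;
}

/* Reuse the previous table when it is valid and no larger than the new
 * table's payload plus the bytes needed to describe the new table. */
int demo_should_reuse(const uint8_t *prevBits, const uint8_t *newBits,
                      const unsigned *count, size_t newHeaderBytes)
{
    if (!demo_table_is_valid(prevBits, count)) return 0;
    return demo_estimate_bytes(prevBits, count)
        <= demo_estimate_bytes(newBits, count) + newHeaderBytes;
}
```

The validity check reflects the observation above that small blocks often cannot reuse a table because the previous block lacked a symbol that the current block contains.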

The worst case for the speed difference is a bunch of small blocks
in the same frame. I modified `bench.c` to compress the input in a
single frame but with blocks of the given block size, set by `-B`.
Benchmarks on level 1:

|  Program  | Block size |   Corpus  | Ratio | Compression MB/s | Decompression MB/s |
|-----------|------------|-----------|-------|------------------|--------------------|
| zstd.base |        256 | synthetic | 2.364 |            110.0 |              297.0 |
|      zstd |        256 | synthetic | 2.367 |            108.9 |              297.0 |
| zstd.base |        256 | silesia   | 2.204 |             93.8 |              415.7 |
|      zstd |        256 | silesia   | 2.204 |             93.4 |              415.7 |
| zstd.base |        512 | synthetic | 2.594 |            144.2 |              420.0 |
|      zstd |        512 | synthetic | 2.599 |            141.5 |              425.7 |
| zstd.base |        512 | silesia   | 2.358 |            118.4 |              432.6 |
|      zstd |        512 | silesia   | 2.358 |            119.8 |              432.6 |
| zstd.base |       1024 | synthetic | 2.790 |            192.3 |              594.1 |
|      zstd |       1024 | synthetic | 2.794 |            192.3 |              600.0 |
| zstd.base |       1024 | silesia   | 2.524 |            148.2 |              464.2 |
|      zstd |       1024 | silesia   | 2.525 |            148.2 |              467.6 |
| zstd.base |       4096 | synthetic | 3.023 |            300.0 |             1000.0 |
|      zstd |       4096 | synthetic | 3.024 |            300.0 |             1010.1 |
| zstd.base |       4096 | silesia   | 2.779 |            223.1 |              623.5 |
|      zstd |       4096 | silesia   | 2.779 |            223.1 |              636.0 |
| zstd.base |      16384 | synthetic | 3.131 |            350.0 |             1150.1 |
|      zstd |      16384 | synthetic | 3.152 |            350.0 |             1630.3 |
| zstd.base |      16384 | silesia   | 2.871 |            296.5 |              883.3 |
|      zstd |      16384 | silesia   | 2.872 |            294.4 |              898.3 |
2017-03-02 13:27:52 -08:00
Yann Collet fdb0fd34b3 Merge pull request #583 from terrelln/set-dictid
Set dictID to 0 for content only dictionaries
2017-03-02 13:15:31 -08:00
Nick Terrell 3475b9b431 Set dictID to 0 for content only dictionaries 2017-03-02 12:33:02 -08:00
Yann Collet 78208bd8be fixed : build zstd cli after libzstd 2017-03-01 21:02:06 -08:00
Yann Collet 27526c7201 make : added target shortest
shortest only runs the fast part of playTests.sh.
cc @iburinoc
2017-03-01 17:02:49 -08:00
Yann Collet c1c040eae1 added gzip tests
also : made sure zstd --format=gzip -V
would fail if gzip compatibility is not supported
2017-03-01 16:49:20 -08:00
Sean Purcell d44703d145 Offsets >= 32MB in 32-bits mode 2017-03-01 16:27:56 -08:00
Yann Collet 76f0494089 xxhash can be included twice in any order
Previously,

followed by :

would fail to include the static definitions,
because the second include was simply skipped by guard macro.

Now it works as intended :
the missing static part is included during the second include.
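
One way to get this behavior is to give the static-only section its own guard, separate from the guard on the public section. The single-file sketch below pastes the body of a hypothetical header `demo.h` twice to stand in for two `#include` lines; the `DEMO_` macros and `demo_` names are assumptions for illustration, not the identifiers used in xxhash.

```c
#include <assert.h>

/* ---- first "include" of demo.h: the static-linking macro is not set ---- */
#ifndef DEMO_H_PUBLIC
#define DEMO_H_PUBLIC
int demo_public_value(void);              /* public part */
#endif
#if defined(DEMO_STATIC_LINKING_ONLY) && !defined(DEMO_H_STATIC)
#define DEMO_H_STATIC
struct demo_state { int v; };             /* static-only part */
#endif

#define DEMO_STATIC_LINKING_ONLY

/* ---- second "include" of demo.h: same text, macro now set ---- */
#ifndef DEMO_H_PUBLIC
#define DEMO_H_PUBLIC
int demo_public_value(void);              /* skipped: already seen */
#endif
#if defined(DEMO_STATIC_LINKING_ONLY) && !defined(DEMO_H_STATIC)
#define DEMO_H_STATIC
struct demo_state { int v; };             /* included on this pass */
#endif

int demo_public_value(void) { return 1; }

/* Compiles only because the second pass supplied struct demo_state. */
int demo_state_value(void)
{
    struct demo_state s = { 7 };
    assert(sizeof s >= sizeof(int));
    return s.v;
}
```

With a single guard around the whole header, the second pass would be skipped entirely and `struct demo_state` would remain undefined.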
2017-03-01 13:29:29 -08:00
Yann Collet 4bcc69b761 solves warnings when compiling with global XXH_STATIC_LINKING_ONLY
XXH_STATIC_LINKING_ONLY protection macro is intended to be triggered just before the include.
The main idea is to keep this setting local :
user module shall explicitly understand and accept the static linking restriction
which becomes transparent when triggering the macro at project level.
Global definition also triggers redefinition warnings for user modules which do locally define the macro.

This new version compiles lib and cli without warning when the macro is set globally.
That's not a scenario to be recommended, since it trades a local effect for a global one,
but it was easy enough to provide from zstd side.
2017-03-01 11:33:25 -08:00
Yann Collet 31432cc57d Merge pull request #579 from iburinoc/multiframe
Check to ensure ddict isn't null before dereference
2017-03-01 11:02:04 -08:00
Yann Collet 51598510c0 Merge pull request #580 from facebook/speedStream
Improve streaming decompression speed
2017-03-01 10:59:51 -08:00
Yann Collet 43764cdb1d updated NEWS for 1.1.4
cmake, performance
2017-02-28 17:44:17 -08:00
Yann Collet c896735b8d Merge pull request #575 from Majlen/cmake-improvement
Cmake improvement
2017-02-28 15:32:21 -08:00
Sean Purcell a81d4fee58 Check to ensure ddict isn't null before dereference 2017-02-28 15:28:29 -08:00
Yann Collet a5cbc02ed1 Merge pull request #578 from inikep/dev
decompression: --rm is silent when input is stdin
2017-02-28 15:21:28 -08:00
Przemyslaw Skibinski 5c1c80cbb6 travis.yml: fixed pull_request 2017-02-28 18:34:39 +01:00