Yann Collet 281f06e01f saves 3-bytes on small input with streaming API

zstd streaming API was adding a null-block at end of frame for small input.

Reason is : on small input, a single block is enough.
ZSTD_CStream would size its input buffer to expect a single block of this size,
automatically triggering a flush on reaching this size.

Unfortunately, that last byte was generally received before the "end" directive (at least in `fileio`).
The later "end" directive would force the creation of a 3-byte last block to indicate end of frame.

The solution is to not flush automatically, which is btw the expected behavior.
It happens in this case because blocksize is defined with exactly the same size as input.
Just adding one-byte is enough to stop triggering the automatic flush.
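
The effect described above can be illustrated with a toy model. This is only a sketch, not the zstd implementation: `ToyCStream`, `feed`, `end`, and `body_size` are hypothetical names, blocks are stored uncompressed, and the only fact taken from the format is that each block carries a 3-byte header.

```python
BLOCK_HEADER_SIZE = 3  # each zstd block starts with a 3-byte header


class ToyCStream:
    """Toy model of a buffered streaming compressor (hypothetical, not zstd code)."""

    def __init__(self, in_buf_capacity):
        self.capacity = in_buf_capacity
        self.buffered = 0
        self.blocks = []  # list of (payload_size, is_last_block)

    def feed(self, nbytes):
        """Accept input; a full input buffer triggers an automatic flush."""
        while nbytes > 0:
            take = min(nbytes, self.capacity - self.buffered)
            self.buffered += take
            nbytes -= take
            if self.buffered == self.capacity:
                self.blocks.append((self.buffered, False))
                self.buffered = 0

    def end(self):
        """The "end" directive always emits a last block, even an empty one."""
        self.blocks.append((self.buffered, True))
        self.buffered = 0

    def frame_body_size(self):
        return sum(BLOCK_HEADER_SIZE + size for size, _ in self.blocks)


def body_size(input_size, in_buf_capacity):
    stream = ToyCStream(in_buf_capacity)
    stream.feed(input_size)  # all input arrives before the "end" directive
    stream.end()
    return stream.frame_body_size()


# Buffer sized exactly like the input: auto-flush, then an empty last block.
exact = body_size(100, 100)
# Buffer one byte larger: no auto-flush, a single last block.
enlarged = body_size(100, 101)
print(exact - enlarged)
```

In this model the difference between the two cases is exactly one block header, which matches the 3 bytes the commit title refers to.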

I initially looked at another solution, solving the problem directly in the compression context.
But it felt awkward.
Now, the underlying compression API `ZSTD_compressContinue()` would take the decision to close a frame
on reaching its expected end (`pledgedSrcSize`).
This feels awkward, a responsibility overreach, beyond the definition of this API.
ZSTD_compressContinue() is clearly documented as a guaranteed flush,
with ZSTD_compressEnd() generating a guaranteed end.

I faced similar issue when trying to port a similar mechanism at the higher streaming layer.
Having ZSTD_CStream end a frame automatically on reaching `pledgedSrcSize` can surprise the caller,
since it did not explicitly request an end of frame.
The only sensible action remaining after that is to end the frame with no additional input.
This adds additional logic in the ZSTD_CStream state to check this condition.
Plus some potential confusion on the meaning of ZSTD_endStream() with no additional input (ending confirmation ? new 0-size frame ?)

In the end, just enlarging input buffer by 1 byte feels the least intrusive change.
It's also a contract remaining inside the streaming layer, so the logic is contained in this part of the code.

The patch also introduces a new test checking that the size of a small frame is as expected, without the additional 3-byte null block.

2017-12-14 11:47:02 -08:00

files

[libzstd] Fix bug in Huffman decompresser

2017-08-07 12:37:48 -07:00

fuzz

Merge branch 'dev' into shorterTests

2017-09-28 12:19:28 -07:00

gzip

fixed a bunch of headers after license change (#825 )

2017-08-31 11:24:54 -07:00

.gitignore

fixed poolTests

2017-08-31 15:13:31 -07:00

datagencli.c

updated license header

2017-09-08 00:09:23 -07:00

decodecorpus.c

Combine definitions of SEC_TO_MICRO

2017-11-30 19:40:53 -08:00

fullbench.c

UTIL_getFileSize() returns UTIL_FILESIZE_UNKNOWN on failure

2017-10-17 16:14:25 -07:00

fuzzer.c

Fix cdict compressor repcodes

2017-12-13 11:31:20 -08:00

invalidDictionaries.c

updated license header

2017-09-08 00:09:23 -07:00

legacy.c

updated license header

2017-09-08 00:09:23 -07:00

longmatch.c

updated license header

2017-09-08 00:09:23 -07:00

Makefile

Improved tests

2017-12-13 11:48:30 -08:00

namespaceTest.c

updated license header

2017-09-08 00:09:23 -07:00

paramgrill.c

Combine definitions of SEC_TO_MICRO

2017-11-30 19:40:53 -08:00

playTests.sh

saves 3-bytes on small input with streaming API

2017-12-14 11:47:02 -08:00

poolTests.c

updated license header

2017-09-08 00:09:23 -07:00

README.md

Update tests/README.md

2017-02-23 10:27:00 -08:00

roundTripCrash.c

updated license header

2017-09-08 00:09:23 -07:00

seqgen.c

[test] Exercise all codes in dictionary tables

2017-10-16 18:05:36 -07:00

seqgen.h

[test] Exercise all codes in dictionary tables

2017-10-16 18:05:36 -07:00

symbols.c

updated license header

2017-09-08 00:09:23 -07:00

test-zstd-speed.py

last batch of header files changed to reflect new license (#825 )

2017-08-31 12:20:50 -07:00

test-zstd-versions.py

last batch of header files changed to reflect new license (#825 )

2017-08-31 12:20:50 -07:00

zbufftest.c

Combine definitions of SEC_TO_MICRO

2017-11-30 19:40:53 -08:00

zstreamtest.c

zstreamtest : added missing CHECK_Z()

2017-12-13 15:35:49 -08:00

README.md

Programs and scripts for automated testing of Zstandard

This directory contains the following programs and scripts:

datagen : Synthetic and parameterizable data generator, for tests
fullbench : Precisely measure speed for each zstd inner function
fuzzer : Test tool, to check zstd integrity on target platform
paramgrill : parameter tester for zstd
test-zstd-speed.py : script for testing zstd speed difference between commits
test-zstd-versions.py : compatibility test between zstd versions stored on Github (v0.1+)
zbufftest : Test tool to check ZBUFF (a buffered streaming API) integrity
zstreamtest : Fuzzer test tool for zstd streaming API
legacy : Test tool to test decoding of legacy zstd frames
decodecorpus : Tool to generate valid Zstandard frames, for verifying decoder implementations

`test-zstd-versions.py` - script for testing zstd interoperability between versions

This script creates a versionsTest directory into which the zstd repository is cloned. Then all tagged (released) versions of zstd are compiled. In the following step, interoperability between zstd versions is checked.

`test-zstd-speed.py` - script for testing zstd speed difference between commits

This script creates a speedTest directory into which the zstd repository is cloned. It then compiles all branches of zstd and runs a speed benchmark on a given list of files (the testFileNames parameter). Every sleepTime seconds (an optional parameter, default 300), the script checks the repository for new commits. If a new commit is found, it is compiled and benchmarked, and the results are compared to the previous results. If the compression or decompression speed for any zstd level is lower than lowerLimit (an optional parameter, default 0.98), the speed benchmark is restarted. If the second results are also lower than lowerLimit, a warning e-mail is sent to the recipients in the list (the emails parameter).
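
The check-then-retry decision described above could be sketched as follows. This is not the code of `test-zstd-speed.py`: the function names and the data layout (a dict mapping level to a `(compression_speed, decompression_speed)` pair) are hypothetical, and it assumes lowerLimit is a ratio applied to the previous speed.

```python
def is_slower(previous, current, lower_limit=0.98):
    """Return True if any level's speed fell below lower_limit * previous speed."""
    for level, (prev_comp, prev_decomp) in previous.items():
        cur_comp, cur_decomp = current[level]
        if cur_comp < lower_limit * prev_comp:
            return True
        if cur_decomp < lower_limit * prev_decomp:
            return True
    return False


def should_warn(previous, first_run, rerun, lower_limit=0.98):
    """Warn only if a regression is seen twice in a row.

    `rerun` is a callable that repeats the benchmark and returns new results;
    it is only invoked when the first run already looks slower.
    """
    if not is_slower(previous, first_run, lower_limit):
        return False
    return is_slower(previous, rerun(), lower_limit)


# Speeds in MB/s for level 1: (compression, decompression). Made-up numbers.
baseline = {1: (100.0, 300.0)}
noisy = {1: (90.0, 300.0)}
normal = {1: (99.0, 299.0)}

print(should_warn(baseline, noisy, lambda: normal))  # slow once, fine on retry
print(should_warn(baseline, noisy, lambda: noisy))   # slow twice in a row
```

Repeating the benchmark before sending a warning filters out one-off slowdowns caused by noise on the test machine.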

Additional remarks:

To be sure that speed results are accurate the script should be run on a "stable" target system with no other jobs running in parallel
Using the script with virtual machines can lead to large variations of speed results
The speed benchmark is not performed until the computer's load average is lower than maxLoadAvg (an optional parameter, default 0.75)
The script sends e-mails using mutt; if mutt is not available it sends e-mails without attachments using mail; if both are not available it only prints a warning
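
The load-average gate mentioned in the remarks above might look roughly like this. It is a sketch under assumptions, not the script's actual code: `wait_until_idle`, `poll_seconds`, and the injectable `get_load` and `sleep` parameters are hypothetical, and the default reads the 1-minute load average through `os.getloadavg()`, which is only available on Unix-like systems.

```python
import os
import time


def wait_until_idle(max_load_avg=0.75, poll_seconds=30,
                    get_load=lambda: os.getloadavg()[0],
                    sleep=time.sleep):
    """Block until the load average drops below max_load_avg.

    Returns the number of times the function had to wait.
    `get_load` and `sleep` are injectable so the logic can be tested
    without touching the real system load or the real clock.
    """
    waits = 0
    while get_load() >= max_load_avg:
        sleep(poll_seconds)
        waits += 1
    return waits


# Simulated load readings: busy, busy, then idle enough to benchmark.
readings = iter([2.0, 1.0, 0.5])
waited = wait_until_idle(get_load=lambda: next(readings),
                         sleep=lambda seconds: None)
print(waited)
```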

The example usage with two test files, one e-mail address, and with an additional message:

./test-zstd-speed.py "silesia.tar calgary.tar" "email@gmail.com" --message "tested on my laptop" --sleepTime 60

To run the script in background please use:

nohup ./test-zstd-speed.py testFileNames emails &

The full list of parameters:

positional arguments:
  testFileNames         file names list for speed benchmark
  emails                list of e-mail addresses to send warnings

optional arguments:
  -h, --help            show this help message and exit
  --message MESSAGE     attach an additional message to e-mail
  --lowerLimit LOWERLIMIT
                        send email if speed is lower than given limit
  --maxLoadAvg MAXLOADAVG
                        maximum load average to start testing
  --lastCLevel LASTCLEVEL
                        last compression level for testing
  --sleepTime SLEEPTIME
                        frequency of repository checking in seconds

`decodecorpus` - tool to generate Zstandard frames for decoder testing

Command line tool to generate test .zst files.

This tool will generate .zst files with checksums, as well as optionally output the corresponding correct uncompressed data for extra verification.

Example:

./decodecorpus -ptestfiles -otestfiles -n10000 -s5

will generate 10,000 sample .zst files using a seed of 5 in the testfiles directory, with the zstd checksum field set, as well as the 10,000 original files for more detailed comparison of decompression results.

./decodecorpus -t -T1mn

will choose a random seed, and for 1 minute, generate random test frames and ensure that the zstd library correctly decompresses them in both simple and streaming modes.
