1545 Commits

Author SHA1 Message Date
Jennifer Liu
9d6ed9def3 Merge fastCover into DictBuilder (#1274)
* Minor fix

* Run non-optimize FASTCOVER 5 times in benchmark

* Merge fastCover into dictBuilder

* Fix mixed declaration issue

* Add fastcover to symbol.c

* Add fastCover.c and cover.h to build

* Change fastCover.c to fastcover.c

* Update benchmark to run FASTCOVER in dictBuilder

* Undo spliting fastcover_param into cover_param and f

* Remove convert param functions

* Assign f to parameter

* Add zdict.h to Makefile in lib

* Add cover.h to BUCK

* Cast 1 to U64 before shifting

* Remove trimming of zero freq head and tail in selectSegment and rebenchmark

* Remove f as a separate parameter of tryParam

* Read 8 bytes when d is 6

* Add trimming off zero frequency head and tail

* Use best functions from COVER and remove trimming part(which leads to worse compression ratio after previous bugs were fixed)

* Add finalize= argument to FASTCOVER to specify percentage of training samples passed to ZDICT_finalizeDictionary

* Change nbDmer to always read 8 bytes even when d=6

* Add skip=# argument to allow skipping dmers in computeFrequency in FASTCOVER

* Update comments and benchmarking result

* Change default method of ZDICT_trainFromBuffer to ZDICT_optimizeTrainFromBuffer_fastCover

* Add dictType enum and fix bug about passing zParam when converting to coverParam

* Combine finalize and skip into a single parameter

* Update acceleration parameters and benchmark on 3 sample sets

* Change default splitPoint of FASTCOVER to 0.75 and benchmark first 3 sample sets

* Initialize variables outside of for loop in benchmark.c

* Update benchmark result for hg-manifest

* Remove cover.h from install-includes

* Add explanation of f

* Set default compression level for trainFromBuffer to 3

* Add assertion of fastCoverParams in DiB_trainFromFiles

* Add checkTotalCompressedSize function + some minor fixes

* Add test for multithreading fastCovr

* Initialize segmentFreqs in every FASTCOVER_selectSegment and move mutex_unnlock to end of COVER_best_finish

* Free segmentFreqs

* Initialize segmentFreqs before calling FASTCOVER_buildDictionary instead of in FASTCOVER_selectSegment

* Add FASTCOVER_MEMMULT

* Minor fix

* Update benchmarking result
2018-08-23 12:06:20 -07:00
Yann Collet
77e805e3db bench: changed creation/reset function to timedFnState
for consistency
2018-08-21 18:19:27 -07:00
Yann Collet
801e3bcd97
Merge pull request #1290 from edenzik/ezik/1119-safe-strcpy-in-fileio
Fixed unsafe string copy and concat in `fileio.c`.
2018-08-21 13:18:44 -07:00
Eden Zik
78af534f82 Fixed unsafe string copy and concat in fileio.c.
Per warnings from flawfinder: "Does not check for buffer overflows when
copying to destination [MS-banned] (CWE-120). Consider using snprintf,
strcpy_s, or strlcpy (warning: strncpy easily misused).".

Replaced called to strcpy and strcat in `fileio.c` to calls with a
specified size (`strncpy` and `strncat`).

Tested the changes on OSX, Linux, Windows.
On OSX + Linux, changes were tested with ASAN. The following flags were
used: 'check_initialization_order=1:strict_init_order=1:detect_odr_violation=1:detect_stack_use_after_return=1'

To reproduce warning:
./flawfinder.py ./programs/fileio.c
2018-08-20 22:15:24 -04:00
Yann Collet
105677c6db created ZSTDMT_toFlushNow()
tells in a non-blocking way if there is something ready to flush right now.
only works with multi-threading for the time being.

Useful to know if flush speed will be limited by lack of production.
2018-08-17 18:11:54 -07:00
Yann Collet
09e63c58ac fix : no longer slow down on input saturation
only slows down when all buffers are full
2018-08-17 16:27:43 -07:00
Yann Collet
8b674d7dc7 ensured compression level is maxed at ZSTD_maxCLevel() 2018-08-17 16:01:56 -07:00
Yann Collet
b4e7f71055 Merge branch 'dev' into adapt 2018-08-17 15:54:13 -07:00
Yann Collet
42a02ab745 fixed minor warnings issued by scan-build 2018-08-15 14:36:02 -07:00
George Lu
e89f1fb45c Fix scan-build warnings in bench.c 2018-08-14 14:44:47 -07:00
Yann Collet
3e4617ef54 frameProgression reports nbActiveWorkers and output flushed 2018-08-14 11:49:25 -07:00
Yann Collet
973a8d42c7
Merge pull request #1236 from GeorgeLu97/paramgrillconstraints
ParamgrillConstraints
2018-08-13 15:44:50 -07:00
Yann Collet
0853f86044 adaptive mode uses default window size of 8 MB 2018-08-13 13:13:22 -07:00
Yann Collet
33f7709c71 fileio: changed parameter type from ptr to plain structure
safer : this parameter is read-only,
we don't want original structure to be modified
2018-08-13 13:02:03 -07:00
Yann Collet
f3aa510738 rateLimiter does not "catch up" when input speed is slow 2018-08-13 11:38:55 -07:00
Yann Collet
e7a49c6683 introduced command --adapt 2018-08-11 20:48:06 -07:00
Yann Collet
9d26cb6a75 slow down faster when output speed is limited 2018-08-09 17:44:30 -07:00
Yann Collet
3d7b533f68 Merge branch 'dev' into adapt 2018-08-09 15:57:36 -07:00
Yann Collet
754942cb79 fixed assert() condition 2018-08-09 15:57:19 -07:00
Yann Collet
2dd76037be zstd cli can increase level when input is too slow 2018-08-09 15:51:30 -07:00
Yann Collet
79a35ac20d minor code comments improvements 2018-08-09 15:16:31 -07:00
Yann Collet
51e71a5ec7 added zstdgrep documentation
presenting `zstdgrep` limit regarding dictionary compression
with workaround recommended by @tobwen (#1268)
2018-08-09 12:28:25 -07:00
George Lu
bfe8392e23 Remove ctx from benchMem 2018-08-09 12:07:57 -07:00
George Lu
8278a49cb6 const srcPtrs 2018-08-09 10:42:58 -07:00
George Lu
3d230db853 Change speed representation from floating point to integral 2018-08-09 10:42:58 -07:00
George Lu
dd270b2f75 Renaming / Style fixes 2018-08-09 10:42:58 -07:00
George Lu
e148db366e Separate capacity vs size
Also:
Make suggested fixes
-varInds_t
-reorder some arguments
-remove code duplication
-update README / -h
-Fix memory leaks
2018-08-09 10:42:58 -07:00
George Lu
df026e159f Fix windows implicit casting bugs 2018-08-09 10:42:58 -07:00
George Lu
7b5b3d7ae3 BenchMem with block compressed sizes passed back up 2018-08-09 10:42:58 -07:00
George Lu
3adc217ea4 Total Changes:
Add different constraint types (decompression speed, compression memory, parameter constraints)
Separate search space by strategy + strategy selection
Memoize results
Real random restarts
Support multiple files
Support Dictionary inputs
Debug Macro for extra printing
2018-08-09 10:42:58 -07:00
George Lu
eb21b7f482 Not crashing 2018-08-09 10:42:58 -07:00
George Lu
5f49034520 Working V1 2018-08-09 10:42:58 -07:00
George Lu
cffb6da339 Parses additional parameters
Additional constraint checking

Minor fixes

more param parsing

Add Memory

Change paramVariation

work on feasibility

reformat bench

Changed Paramgrill to use bench.c benchmarking

customlevel macro

Printing Flag

Minor changes

Explicit casting

Makefile fix

casting, type fix

Printing Flag

Minor Changes

comments, helper fn's
2018-08-09 10:42:58 -07:00
Yann Collet
5808027abf Merge branch 'dev' into fix1241 2018-08-03 16:08:33 -07:00
Yann Collet
2fdab1629b fix unused variable warning 2018-08-03 08:30:01 -07:00
Yann Collet
5203f01774 fix : zstd cli can be built with build macro ZSTD_NOBENCH
which disables bench.c module
2018-08-03 07:54:29 -07:00
cyan4973
3f535007e4 fix %zu support under minGW
and relevant test on Appveyor
2018-07-30 16:56:18 +02:00
George Lu
09ccd977c3 no zero 2018-07-26 15:17:58 -07:00
Yann Collet
effa84c8d1
Merge pull request #1230 from terrelln/train-out
zstdcli: Allow -o before --train
2018-07-18 16:34:10 +02:00
Nick Terrell
4e706d7f2c fileio: Error in compression on read errors
We can write a corrupted file if the input file errors during a read.
We should return a non-zero error code in this case.
2018-07-17 15:26:30 -07:00
Nick Terrell
58b8219475 zstdcli: Allow -o before --train
Only set the default value if `outFileName` is unset.

Fixes #1227.
2018-07-16 12:45:34 -07:00
Nick Terrell
45821fac0c
Merge pull request #1225 from jennifermliu/dev
Split samples when building dictionary for COVER
2018-07-13 13:26:15 -07:00
Jennifer Liu
612b346ed5 Add explanation for split=100 2018-07-11 15:50:28 -07:00
Jennifer Liu
5021441d86 Change default splitPoint to 100 2018-07-10 11:19:33 -07:00
Jennifer Liu
bfad1af031 Update doc for split==100 2018-07-05 11:05:31 -07:00
Jennifer Liu
0881184c89 Some edits based on pull request comments 2018-07-03 17:53:27 -07:00
Yann Collet
689bfecd48
Merge pull request #1188 from GeorgeLu97/BenchModule
Bench module
2018-07-02 13:33:27 -07:00
Jennifer Liu
8afcb8eea7 Update documentation 2018-07-01 19:59:37 -07:00
Jennifer Liu
84e8b2a305 Fix another declaration issue 2018-06-29 18:02:02 -07:00
Jennifer Liu
348e5f77a9 Add split=# to cli 2018-06-29 17:54:41 -07:00