
53 Commits (6beb3c0159a60a6c49334f24e32ae92df5dbba4d)

Author SHA1 Message Date
Ed Maste b81d7cc6a0 remove extraneous doubled ;s 2019-08-15 21:17:06 -04:00
Yann Collet 59a7116cc2 benchfn dependencies reduced to only timefn
benchfn used to rely on mem.h, and util,
which in turn relied on platform.h.
Using benchfn outside of zstd required bringing in all these dependencies.

Now, dependency is reduced to timefn only.
This required splitting timefn out of util,
and rewriting benchfn and timefn to no longer need mem.h.

Separating timefn from util has a wide effect across the code base,
as usage of time functions is widespread.
A lot of build scripts had to be updated to also include timefn.
2019-04-10 12:37:03 -07:00
Yann Collet ededcfca57 fix confusion between unsigned <-> U32
as suggested in #1441.

generally U32 and unsigned are the same thing,
except when they are not ...

case : 32-bit compilation for MIPS (uint32_t == unsigned long)

The vast majority of changes consist in transforming U32 into unsigned.
In rare cases, it's the other way around (typically for internal code, such as seeds).

Among the few issues this patch solves :
- some parameters were declared with type `unsigned` in *.h,
  but with type `U32` in their implementation *.c .
- some parameters have type unsigned*,
  but the caller uses a pointer to U32 instead.

These fixes are useful.

However, the bulk of changes is about %u formatting,
which requires unsigned type,
but generally receives U32 values instead,
often just for brevity (U32 is shorter than unsigned).
These changes are generally minor, or even annoying.

As a consequence, the amount of code changed is larger than I would expect for such a patch.

Testing is also a pain :
it requires manually modifying `mem.h`,
in order to lie about `U32`
and force it to be an `unsigned long` typically.
On a 64-bit system, this will break the equivalence unsigned == U32.
Unfortunately, it will also break a few static_assert(), controlling structure sizes.
So it also requires modifying `debug.h` to make `static_assert()` a noop.
And then reverting these changes.

So it's inconvenient, and as a consequence,
this property is currently not checked during CI tests.
Therefore, these problems can emerge again in the future.

I wonder if it is worth ensuring proper distinction of U32 != unsigned in CI tests.
It's another restriction for coding, adding more frustration during merge tests,
since most platforms don't need this distinction (hence contributors will not see it),
and while this can matter in theory, the number of platforms impacted seems minimal.

Thoughts ?
2018-12-21 18:09:41 -08:00
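The two categories of fixes described in the commit above can be illustrated with a minimal sketch. The `U32` typedef below is an assumption standing in for the one in `mem.h`, and `format_count`, `get_stats` and `read_stats_into_u32` are hypothetical helpers, not functions from the code base.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t U32;   /* assumption: stand-in for the U32 typedef in mem.h */

/* Hypothetical helper: prints a U32 value with %u.
 * The explicit cast makes the argument type match %u
 * even on targets where uint32_t is unsigned long. */
static int format_count(char* buf, size_t bufSize, U32 count)
{
    assert(buf != NULL);
    return snprintf(buf, bufSize, "%u", (unsigned)count);
}

/* Hypothetical function declared with `unsigned*`, as a public header would. */
static void get_stats(unsigned* nbSamples)
{
    *nbSamples = 3;
}

/* The caller keeps its own data as U32, but passes a real `unsigned`
 * to the function instead of a pointer to U32. */
static U32 read_stats_into_u32(void)
{
    unsigned tmp;
    get_stats(&tmp);
    return (U32)tmp;
}
```

On most platforms both versions compile to the same code; the casts and the temporary only matter where `uint32_t` and `unsigned` are distinct types.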
Nick Terrell f2d6db45cd [zstd] Add -Wmissing-prototypes 2018-09-27 15:24:48 -07:00
Jennifer Liu 9d6ed9def3 Merge fastCover into DictBuilder (#1274)
* Minor fix

* Run non-optimize FASTCOVER 5 times in benchmark

* Merge fastCover into dictBuilder

* Fix mixed declaration issue

* Add fastcover to symbol.c

* Add fastCover.c and cover.h to build

* Change fastCover.c to fastcover.c

* Update benchmark to run FASTCOVER in dictBuilder

* Undo splitting fastcover_param into cover_param and f

* Remove convert param functions

* Assign f to parameter

* Add zdict.h to Makefile in lib

* Add cover.h to BUCK

* Cast 1 to U64 before shifting

* Remove trimming of zero freq head and tail in selectSegment and rebenchmark

* Remove f as a separate parameter of tryParam

* Read 8 bytes when d is 6

* Add trimming off zero frequency head and tail

* Use best functions from COVER and remove trimming part (which leads to worse compression ratio after previous bugs were fixed)

* Add finalize= argument to FASTCOVER to specify percentage of training samples passed to ZDICT_finalizeDictionary

* Change nbDmer to always read 8 bytes even when d=6

* Add skip=# argument to allow skipping dmers in computeFrequency in FASTCOVER

* Update comments and benchmarking result

* Change default method of ZDICT_trainFromBuffer to ZDICT_optimizeTrainFromBuffer_fastCover

* Add dictType enum and fix bug about passing zParam when converting to coverParam

* Combine finalize and skip into a single parameter

* Update acceleration parameters and benchmark on 3 sample sets

* Change default splitPoint of FASTCOVER to 0.75 and benchmark first 3 sample sets

* Initialize variables outside of for loop in benchmark.c

* Update benchmark result for hg-manifest

* Remove cover.h from install-includes

* Add explanation of f

* Set default compression level for trainFromBuffer to 3

* Add assertion of fastCoverParams in DiB_trainFromFiles

* Add checkTotalCompressedSize function + some minor fixes

* Add test for multithreading fastCover

* Initialize segmentFreqs in every FASTCOVER_selectSegment and move mutex_unlock to end of COVER_best_finish

* Free segmentFreqs

* Initialize segmentFreqs before calling FASTCOVER_buildDictionary instead of in FASTCOVER_selectSegment

* Add FASTCOVER_MEMMULT

* Minor fix

* Update benchmarking result
2018-08-23 12:06:20 -07:00
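The "Cast 1 to U64 before shifting" item in the commit above refers to a common C pitfall: the literal `1` has type `int`, so shifting it by 31 or more bits is undefined behavior before the result is ever widened. A minimal sketch, where `pow2_u64` is a hypothetical helper and the `U64` typedef is an assumption standing in for the one in `mem.h`:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t U64;   /* assumption: stand-in for the U64 typedef in mem.h */

/* Hypothetical helper: returns 2^shift as a 64-bit value.
 * Casting before the shift makes the shift itself happen in 64 bits;
 * `(U64)(1 << shift)` would shift an int first and only then widen. */
static U64 pow2_u64(unsigned shift)
{
    assert(shift < 64);
    return (U64)1 << shift;
}
```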
Yann Collet 42a02ab745 fixed minor warnings issued by scan-build 2018-08-15 14:36:02 -07:00
Jennifer Liu 8afcb8eea7 Update documentation 2018-07-01 19:59:37 -07:00
Nick Terrell dab8cfa3c7 Combine definitions of SEC_TO_MICRO 2017-11-30 19:40:53 -08:00
Nick Terrell 9a2f6f477b Use util.h for timing 2017-11-30 14:57:25 -08:00
Yann Collet 18b795374a UTIL_getFileSize() returns UTIL_FILESIZE_UNKNOWN on failure
UTIL_getFileSize() used to return zero on failure.
This made it impossible to distinguish a failure from a genuine empty file.
Both cases were coalesced.

Adding UTIL_FILESIZE_UNKNOWN constant has many consequences on user code,
since in many places, the `0` was assumed to mean "error".
This is no longer the case, and the error code must be actively checked.
2017-10-17 16:14:25 -07:00
Yann Collet 1722055799 add comment on using -B# to split input file for dictionary training 2017-09-15 16:23:50 -07:00
Yann Collet c68d17f2da ensures that sampleSizes table is large enough
as recommended by @terrelln
2017-09-15 15:31:31 -07:00
Yann Collet 25a60488dd fixed 64-to-32 conversion warnings 2017-09-15 11:55:13 -07:00
Yann Collet a9694231ca fixed minor conversion warning 2017-09-15 10:16:26 -07:00
Yann Collet 086b9597d9 added ability to split input files for dictionary training
using command -B#
This is the same behavior as benchmark module,
which can also split input into arbitrary size blocks, using -B#.
2017-09-14 16:45:10 -07:00
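Splitting an input file into fixed-size blocks, as `-B#` does for training samples, can be sketched as below. This is only an illustration of the idea: `count_chunks` and `split_into_samples` are hypothetical names, and the capacity check reflects the general concern of keeping a sample-size table large enough, not the actual dictBuilder code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of samples produced when a file of
 * fileSize bytes is cut into chunks of at most blockSize bytes. */
static size_t count_chunks(size_t fileSize, size_t blockSize)
{
    assert(blockSize > 0);
    return fileSize / blockSize + (fileSize % blockSize != 0);
}

/* Hypothetical helper: writes the size of each chunk into sampleSizes,
 * never writing more than `capacity` entries.
 * Returns the number of entries written. */
static size_t split_into_samples(size_t fileSize, size_t blockSize,
                                 size_t* sampleSizes, size_t capacity)
{
    size_t n = 0;
    assert(blockSize > 0);
    while (fileSize > 0 && n < capacity) {
        size_t const chunk = (fileSize < blockSize) ? fileSize : blockSize;
        sampleSizes[n++] = chunk;
        fileSize -= chunk;
    }
    return n;
}
```

Calling `count_chunks` first lets the caller size the table before filling it, so that the last, possibly shorter, chunk is not dropped.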
Yann Collet 77c137b3ae minor comment refactor 2017-09-14 15:12:57 -07:00
Yann Collet 3128e03be6 updated license header
to clarify dual-license meaning as "or"
2017-09-08 00:09:23 -07:00
Yann Collet 32fb407c9d updated a bunch of headers
for the new license
2017-08-18 16:52:05 -07:00
Nick Terrell 5b7fd7c422 [zdict] Make COVER the default algorithm 2017-06-26 21:09:22 -07:00
Sean Purcell 42bac7fa84 Change ifndef's to undef's 2017-04-13 15:35:05 -07:00
Sean Purcell f876f1200c Fix compilation on macOS 2017-04-13 12:33:45 -07:00
Sean Purcell 042ba122ae Change g_displayLevel to int and fix DISPLAYUPDATE flush 2017-03-23 11:21:59 -07:00
Nick Terrell c220d4c74d Use COVER_MEMMULT when training with COVER. 2017-01-09 16:49:04 -08:00
Nick Terrell 3a1fefcf00 Simplify COVER parameters 2017-01-02 17:51:38 -08:00
Nick Terrell df8415c502 Add COVER to the zstd cli 2017-01-02 14:43:08 -08:00
Przemyslaw Skibinski 7a8a03c20d util.h: restore BSD license for Facebook Open-Source 2016-12-21 15:08:44 +01:00
Przemyslaw Skibinski e679741b18 _CRT_SECURE_NO_WARNINGS moved to util.h 2016-12-21 13:47:11 +01:00
Przemyslaw Skibinski 2f6ccee6af platform.h: removed Compiler Options 2016-12-21 13:23:34 +01:00
Przemyslaw Skibinski 16ae6563a2 executables use new util.h and platform.h 2016-12-21 09:06:14 +01:00
Przemyslaw Skibinski f8046b8e72 Merge remote-tracking branch 'refs/remotes/facebook/dev' into v112
# Conflicts:
#	appveyor.yml
2016-12-19 08:20:26 +01:00
Yann Collet 1496c3dc47 Fix : size estimation when some samples are very large 2016-12-18 11:58:23 +01:00
Yann Collet d46ecb58a5 added dll compilation tests 2016-12-17 16:28:12 +01:00
Przemyslaw Skibinski b866e72826 tools use platform.h 2016-12-16 14:24:01 +01:00
Yann Collet 4ded9e591c added boilerplate 2016-08-30 11:06:28 -07:00
Yann Collet 49d105cfcf better warning and error messages in case of dictionary training failure (#292) 2016-08-18 15:02:11 +02:00
Yann Collet dd25a27702 added tutorial warning messages for dictBuilder 2016-07-27 12:43:09 +02:00
Yann Collet a3d03a3973 added <errno.h> dependency 2016-07-06 16:27:17 +02:00
Yann Collet bcb5f77efa dictBuilder better handles samples of null size (0) and large size (> 128 KB) 2016-07-06 15:41:03 +02:00
Yann Collet 290aaa7521 Added : ability to manually select the dictionary ID of a newly created dictionary 2016-05-30 21:18:52 +02:00
inikep 3733797fcd bench.c: experimental -r (operate recursively on directories) for Windows and _POSIX_C_SOURCE >= 200112L 2016-05-10 14:22:55 +02:00
inikep ed9a08538c Merge remote-tracking branch 'refs/remotes/Cyan4973/dev' into dev
# Conflicts:
#	lib/common/util.h
#	programs/paramgrill.c
#	visual/2013/fullbench/fullbench.vcxproj.filters
#	visual/2013/fuzzer/fuzzer.vcxproj.filters
2016-05-10 13:20:01 +02:00
Yann Collet f6ca09b5ff Reduced console display on loading lots of files with `zstd --train`. Reported by @KrzysFR, see #177 2016-05-09 04:44:45 +02:00
inikep 13c8424ea0 code cleaning 2016-05-05 13:58:56 +02:00
inikep 9c22e57bfb Compiler Options moved to util.h 2016-05-05 11:53:42 +02:00
inikep bab4317961 util.h must be the first include to #define _POSIX_C_SOURCE 2016-04-29 15:19:40 +02:00
inikep 55d047aa92 getTotalFileSize moved to common/util.h 2016-04-28 16:50:13 +02:00
inikep d5ff2c3d9a ordering of #include 2016-04-28 14:40:45 +02:00
inikep 69fcd7c0ae getFileSize moved to common/util.h 2016-04-28 12:23:33 +02:00
inikep 23a0889301 separation of lib/ into common/, compress/, decompress/, dictBuilder/, legacy/ 2016-04-22 12:43:18 +02:00
Yann Collet 7de4f9fd81 minor cosmetic 2016-02-23 21:34:18 +01:00