Commit Graph

41 Commits (b399b4746796f3f363ff19df89fcb1d478829b61)

Author SHA1 Message Date
Yann Collet 23dd28df67 minor improvements to benchmark display 2021-10-26 23:23:30 -07:00
senhuang42 b5c35d7ea3 Use new paramSwitch enum for LCM, row matchfinder, and block splitter 2021-09-21 14:22:02 -04:00
Yann Collet f58e63bee7 Merge branch 'dev' into opt_investigation 2021-09-12 01:42:49 -07:00
Yann Collet 640c5b1f77 fix automated_benchmarking
make it able to process text output sent into either stdout or stderr
2021-09-12 01:36:18 -07:00
Yann Collet 08ceda3dfc new statistics update policy
small general compression ratio improvement for btopt+ strategies/
2021-09-04 00:52:44 -07:00
Yann Collet eab692211e removed pretty-print of sizes in benchmark
This is less appropriate for this mode :
benchmark is about accuracy,
it's important to read the exact values.
2021-09-03 12:51:02 -07:00
W. Felix Handte ab8aa49b8d Fix Benchmark Corruption Display 2021-09-01 14:15:03 -04:00
senhuang42 dce48f53df Fix benchzstd error message 2021-08-23 19:10:16 -04:00
W. Felix Handte 7e0058848c Fix Whitespace 2021-06-10 12:53:07 -04:00
W. Felix Handte bbb81c8801 Avoid `snprintf()` in Preparing Human-Readable Sizes; Improve Formatting
This produces the following formatting:

   Size    | `zstd` | `ls -lh`
---------- | ------ | --------
1          | 1      | 1
12         | 12     | 12
123        | 123    | 123
1234       | 1.21K  | 1.3K
12345      | 12.1K  | 13K
123456     | 121K   | 121K
1234567    | 1.18M  | 1.2M
12345678   | 11.8M  | 12M
123456789  | 118M   | 118M
1234567890 | 1.15G  | 1.2G
999        | 999    | 999
1000       | 1000   | 1000
1001       | 1001   | 1001
1023       | 1023   | 1023
1024       | 1.000K | 1.0K
1025       | 1.00K  | 1.1K
999999     | 977K   | 977K
1000000    | 977K   | 977K
1000001    | 977K   | 977K
1023999    | 1000K  | 1000K
1024000    | 1000K  | 1000K
1024001    | 1000K  | 1001K
1048575    | 1024K  | 1.0M
1048576    | 1.000M | 1.0M
1048577    | 1.00M  | 1.1M

This was produced with the following invocation:

```
for N in 1 12 123 1234 12345 123456 1234567 12345678 123456789 1234567890 999 1000 1001 1023 1024 1025 999999 1000000 1000001 1023999 1024000 1024001 1048575 1048576 1048577; do
  head -c $N /dev/urandom > r$N
done
./zstd -i1 -b1 -S r1 r12 r123 r1234 r12345 r123456 r1234567 r12345678 r123456789 r1234567890 r999 r1000 r1001 r1023 r1024 r1025 r999999 r1000000 r1000001 r1023999 r1024000 r1024001 r1048575 r1048576 r1048577
```
2021-06-10 12:53:07 -04:00
Scott Baker 35576e63ce Convert tabs to spaces 2021-06-10 12:53:07 -04:00
Scott Baker 894698d3b6 Use human_size() in the benchmark output also 2021-06-10 12:53:07 -04:00
Yann Collet 9750f3c87b improved benchmark experience on Windows
benchmark results are not progressively displayed on Windows terminal.
For long benchmark sessions, nothing is displayed,
until the end, where everything is flushed.

Force display to be flushed after each update.
Updates happen roughtly every second, or even less,
so it's not a substantial workload.
2021-05-05 16:52:21 -07:00
Nick Terrell 09149beaf8 [1.5.0] Move `zstd_errors.h` and `zdict.h` to `lib/` root
`zstd_errors.h` and `zdict.h` are public headers, so they deserve to be
in the root `lib/` directory with `zstd.h`, not mixed in with our private
headers.
2021-04-30 15:13:54 -07:00
Nick Terrell 4694423c4f Add and integrate lazy row hash strategy 2021-04-07 09:53:34 -07:00
Nick Terrell a494308ae9 [copyright][license] Switch to yearless copyright and some cleanup in the linux-kernel files
* Switch to yearless copyright per FB policy
* Fix up SPDX-License-Identifier lines in `contrib/linux-kernel` sources
* Add zstd copyright/license header to the `contrib/linux-kernel` sources
* Update the `tests/test-license.py` to check for yearless copyright
* Improvements to `tests/test-license.py`
* Check `contrib/linux-kernel` in `tests/test-license.py`
2021-03-30 10:30:43 -07:00
Nick Terrell 66e811d782 [license] Update year to 2021 2021-01-04 17:53:52 -05:00
W. Felix Handte 7dcca6bc64 Also Move programs/ Directory to Relative Includes 2020-05-04 15:20:26 -04:00
Nick Terrell ac58c8d720 Fix copyright and license lines
* All copyright lines now have -2020 instead of -present
* All copyright lines include "Facebook, Inc"
* All licenses are now standardized

The copyright in `threading.{h,c}` is not changed because it comes from
zstdmt.

The copyright and license of `divsufsort.{h,c}` is not changed.
2020-03-26 17:02:06 -07:00
Bimba Shrestha f33baa21c6 Removing assert and changing ratio cSize 2020-01-31 11:54:14 -08:00
Yann Collet 6309be677c minor comments & refactoring 2019-10-15 16:09:18 -07:00
Josh Soref a880ca239b Spelling (#1582)
* spelling: accidentally

* spelling: across

* spelling: additionally

* spelling: addresses

* spelling: appropriate

* spelling: assumed

* spelling: available

* spelling: builder

* spelling: capacity

* spelling: compiler

* spelling: compressibility

* spelling: compressor

* spelling: compression

* spelling: contract

* spelling: convenience

* spelling: decompress

* spelling: description

* spelling: deflate

* spelling: deterministically

* spelling: dictionary

* spelling: display

* spelling: eliminate

* spelling: preemptively

* spelling: exclude

* spelling: failure

* spelling: independence

* spelling: independent

* spelling: intentionally

* spelling: matching

* spelling: maximum

* spelling: meaning

* spelling: mishandled

* spelling: memory

* spelling: occasionally

* spelling: occurrence

* spelling: official

* spelling: offsets

* spelling: original

* spelling: output

* spelling: overflow

* spelling: overridden

* spelling: parameter

* spelling: performance

* spelling: probability

* spelling: receives

* spelling: redundant

* spelling: recompression

* spelling: resources

* spelling: sanity

* spelling: segment

* spelling: series

* spelling: specified

* spelling: specify

* spelling: subtracted

* spelling: successful

* spelling: return

* spelling: translation

* spelling: update

* spelling: unrelated

* spelling: useless

* spelling: variables

* spelling: variety

* spelling: verbatim

* spelling: verification

* spelling: visited

* spelling: warming

* spelling: workers

* spelling: with
2019-04-12 11:18:11 -07:00
Yann Collet 59a7116cc2 benchfn dependencies reduced to only timefn
benchfn used to rely on mem.h, and util,
which in turn relied on platform.h.
Using benchfn outside of zstd required to bring all these dependencies.

Now, dependency is reduced to timefn only.
This required to create a separate timefn from util,
and rewrite benchfn and timefn to no longer need mem.h.

Separating timefn from util has a wide effect accross the code base,
as usage of time functions is widespread.
A lot of build scripts had to be updated to also include timefn.
2019-04-10 12:37:03 -07:00
Yann Collet 094c000904 Merge branch 'dev' into benchfn 2019-04-10 11:57:05 -07:00
Nick Terrell 19ca3fbc03 [zstdcli] Respect --[no-]compress-literals in benchmark mode 2019-02-15 16:27:39 -08:00
Yann Collet b8701102e0 fixed benchzstd to use new version of benchfn
returning a double type
2019-01-25 15:11:50 -08:00
Yann Collet ededcfca57 fix confusion between unsigned <-> U32
as suggested in #1441.

generally U32 and unsigned are the same thing,
except when they are not ...

case : 32-bit compilation for MIPS (uint32_t == unsigned long)

A vast majority of transformation consists in transforming U32 into unsigned.
In rare cases, it's the other way around (typically for internal code, such as seeds).

Among a few issues this patches solves :
- some parameters were declared with type `unsigned` in *.h,
  but with type `U32` in their implementation *.c .
- some parameters have type unsigned*,
  but the caller user a pointer to U32 instead.

These fixes are useful.

However, the bulk of changes is about %u formating,
which requires unsigned type,
but generally receives U32 values instead,
often just for brevity (U32 is shorter than unsigned).
These changes are generally minor, or even annoying.

As a consequence, the amount of code changed is larger than I would expect for such a patch.

Testing is also a pain :
it requires manually modifying `mem.h`,
in order to lie about `U32`
and force it to be an `unsigned long` typically.
On a 64-bit system, this will break the equivalence unsigned == U32.
Unfortunately, it will also break a few static_assert(), controlling structure sizes.
So it also requires modifying `debug.h` to make `static_assert()` a noop.
And then reverting these changes.

So it's inconvenient, and as a consequence,
this property is currently not checked during CI tests.
Therefore, these problems can emerge again in the future.

I wonder if it is worth ensuring proper distinction of U32 != unsigned in CI tests.
It's another restriction for coding, adding more frustration during merge tests,
since most platforms don't need this distinction (hence contributor will not see it),
and while this can matter in theory, the number of platforms impacted seems minimal.

Thoughts ?
2018-12-21 18:09:41 -08:00
Yann Collet 18434d76b8 added strerror in comment
as suggested by @felixhandte
2018-12-20 17:27:08 -08:00
Yann Collet ed2fb6bd57 fixed : better error message when dictionary missing
during benchmark.
Also : refactored ZSTD_fillHashTable(),
just for readability (it does the same thing)
2018-12-20 17:20:07 -08:00
Yann Collet d613fd9afe linked btultra2 as strategy9
and ensure zstdbench detects out-of-bound parameters
2018-12-06 19:27:37 -08:00
Yann Collet be9e561da4 changed ZSTD_c_compressionStrategy into ZSTD_c_strategy
also : fixed paramgrill, and limit conditions
2018-12-06 15:00:52 -08:00
Yann Collet 3583d19c4e changed parameter names from ZSTD_p_* to ZSTD_c_*
for naming consistency
2018-12-05 17:26:02 -08:00
Yann Collet 34e146f548 advanced decompression function replaces by normal streaming one
advanced parameters compatible with ZSTD_decompressStream().
2018-12-04 10:28:36 -08:00
Yann Collet d8e215cbee created ZSTD_compress2() and ZSTD_compressStream2()
ZSTD_compress_generic() is renamed ZSTD_compressStream2().

Note that, for the time being,
the "stable" API and advanced one use different parameter planes :
setting parameters using the advanced API does not influence ZSTD_compressStream()
and using ZSTD_initCStream() does not influence parameters for ZSTD_compressStream2().
2018-11-30 11:25:56 -08:00
Yann Collet 41c7d0b1e1 changed hashEveryLog into hashRateLog 2018-11-21 14:36:57 -08:00
Yann Collet e874dacc08 changed searchLength into minMatch
refactored all relevant API and calls
for consistency.
2018-11-20 14:56:07 -08:00
Yann Collet 5c68639186 updated ZSTD_DCtx_reset()
signature and behavior is now the same as ZSTD_CCtx_reset()
2018-11-15 16:12:39 -08:00
Yann Collet 7b0391e37e finalized retrofit of ZSTD_CCtx_reset()
updated all depending sources
2018-11-14 13:05:35 -08:00
Yann Collet 3ba0d6dd27 fixed decode-only test condition 2018-11-13 14:15:12 -08:00
Yann Collet b830ccca5c changed benchfn api
to use structure for function parameters
as it expresses much clearer than a long list of parameters,
since each parameter can now be named.
2018-11-13 13:12:50 -08:00
Yann Collet d38063f8ae separated bench module into benchfn and benchzstd
it shall be possible to use benchfn
without any dependency on zstd.
2018-11-13 11:01:59 -08:00