
# Fuzzing

Each fuzzing target can be built with multiple engines.
Zstd provides a fuzz corpus for each target that can be downloaded with
the command:

```
make corpora
```

This downloads each corpus into `./corpora/TARGET`.

## fuzz.py

`fuzz.py` is a helper script for building and running fuzzers.
Run `./fuzz.py -h` for the commands and run `./fuzz.py COMMAND -h` for
command-specific help.

### Generating Data

`fuzz.py` provides a utility to generate seed data for each fuzzer.

```
make -C ../tests decodecorpus
./fuzz.py gen TARGET
```

By default it outputs 100 samples, each at most 8KB, into `corpora/TARGET-seed`,
but that can be configured with the `--number`, `--max-size-log`, and `--seed`
flags.
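
As a rough illustration of how the size limit could map to `--max-size-log`, the sketch below assumes the flag takes the base-2 logarithm of the maximum sample size in bytes. That reading is consistent with the 8KB default but is not stated above. The helper `max_size_bytes` and the values passed to `--number` and `--seed` are hypothetical, and the final command is printed rather than executed.

```shell
# Hypothetical helper: convert a size log to a byte count.
# Assumption: --max-size-log is log2 of the maximum sample size in bytes.
max_size_bytes() {
    echo $((1 << $1))
}

max_size_bytes 13   # prints 8192 (8KB, the documented default size)
max_size_bytes 16   # prints 65536 (64KB)

# A gen invocation with all three flags set explicitly (printed, not run).
# The flag values are illustrative only.
echo "./fuzz.py gen simple_decompress --number 500 --max-size-log 16 --seed 42"
```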

### Build

`./fuzz.py build` respects the usual build environment variables `CC`, `CFLAGS`, etc.
The environment variables can be overridden with the corresponding flags
`--cc`, `--cflags`, etc.
The fuzzing engine is selected with `LIB_FUZZING_ENGINE` or
`--lib-fuzzing-engine`; the default is `libregression.a`.
Alternatively, you can use Clang's built-in fuzzing engine with
`--enable-fuzzer`.
Sanitizers are enabled with `--enable-{a,ub,m}san`, and coverage
instrumentation with `--enable-coverage`.
The script sets sane defaults, which can be overridden with flags such as
`--debug` and `--enable-ubsan-pointer-overflow`.
Run `./fuzz.py build -h` for help.
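
The `--enable-{a,ub,m}san` notation is shell-style shorthand for three separate flags. The sketch below spells the expansion out with a hypothetical helper, `sanitizer_flags`, and then prints (rather than runs) a build command in which `--cc` overrides the `CC` environment variable. The choice of `gcc` as the overridden value is only for illustration.

```shell
# Hypothetical helper: turn sanitizer prefixes (a, ub, m) into fuzz.py flags.
sanitizer_flags() {
    flags=""
    for name in "$@"; do
        flags="$flags --enable-${name}san"
    done
    # Drop the leading space left by the loop.
    printf '%s\n' "${flags# }"
}

sanitizer_flags a ub m
# prints: --enable-asan --enable-ubsan --enable-msan

# CC comes from the environment, but --cc takes precedence, so clang is used.
# The command is printed, not executed.
echo "CC=gcc ./fuzz.py build all --cc clang $(sanitizer_flags a ub)"
```

The examples in this README build with MSAN separately from ASAN and UBSAN, so in practice the three flags are not all passed to one build.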

### Running Fuzzers

`./fuzz.py` can run `libfuzzer`, `afl`, and `regression` tests.
See the help of the relevant command for options.
Flags not parsed by `fuzz.py` are passed to the fuzzing engine.
The command used to run the fuzzer is printed for debugging.

## LibFuzzer

```
# Build the fuzz targets
./fuzz.py build all --enable-fuzzer --enable-asan --enable-ubsan --cc clang --cxx clang++
# OR equivalently
CC=clang CXX=clang++ ./fuzz.py build all --enable-fuzzer --enable-asan --enable-ubsan
# Run the fuzzer
./fuzz.py libfuzzer TARGET <libfuzzer args like -jobs=4>
```

where `TARGET` could be `simple_decompress`, `stream_round_trip`, etc.
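
Because flags that `fuzz.py` does not parse are forwarded to the engine, libFuzzer options can be appended directly. The sketch below prints one time-boxed invocation per target. `libfuzzer_cmd` is a hypothetical helper, `-max_total_time` is a libFuzzer option measured in seconds, and the 60-second budget is an arbitrary choice.

```shell
# Hypothetical helper: print a time-boxed libfuzzer invocation for one target.
# -max_total_time is not parsed by fuzz.py, so it is forwarded to libFuzzer.
libfuzzer_cmd() {
    target=$1
    seconds=$2
    printf '%s\n' "./fuzz.py libfuzzer $target -max_total_time=$seconds"
}

# Print the command for each target; pipe the output to sh to run them.
for target in simple_decompress stream_round_trip; do
    libfuzzer_cmd "$target" 60
done
```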

### MSAN

Fuzzing with `libFuzzer` and `MSAN` is as easy as:

```
CC=clang CXX=clang++ ./fuzz.py build all --enable-fuzzer --enable-msan
./fuzz.py libfuzzer TARGET <libfuzzer args>
```

`fuzz.py` respects the environment variables / flags `MSAN_EXTRA_CPPFLAGS`,
`MSAN_EXTRA_CFLAGS`, `MSAN_EXTRA_CXXFLAGS`, and `MSAN_EXTRA_LDFLAGS`, which pass
extra parameters to MSAN builds only.
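
A minimal sketch of the environment-variable form follows. The flag `-fno-omit-frame-pointer` is only an illustrative choice of extra parameter (it is a Clang option that tends to give more readable sanitizer stack traces), not something this README prescribes. The build command is printed rather than executed.

```shell
# Illustrative values only: extra compiler flags applied to MSAN builds.
MSAN_EXTRA_CFLAGS="-fno-omit-frame-pointer"
MSAN_EXTRA_CXXFLAGS="-fno-omit-frame-pointer"
export MSAN_EXTRA_CFLAGS MSAN_EXTRA_CXXFLAGS

# With the variables exported, the MSAN build command itself is unchanged.
echo "CC=clang CXX=clang++ ./fuzz.py build all --enable-fuzzer --enable-msan"
```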

## AFL

The default `LIB_FUZZING_ENGINE` is `libregression.a`, which produces a binary
that AFL can use.

```
# Build the fuzz targets
CC=afl-clang CXX=afl-clang++ ./fuzz.py build all --enable-asan --enable-ubsan
# Run the fuzzer without a memory limit because of ASAN
./fuzz.py afl TARGET -m none
```

## Regression Testing

The regression test supports the `all` target to run all the fuzzers in one
command.

```
CC=clang CXX=clang++ ./fuzz.py build all --enable-asan --enable-ubsan
./fuzz.py regression all
CC=clang CXX=clang++ ./fuzz.py build all --enable-msan
./fuzz.py regression all
```
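
The two passes above can be wrapped in a small script that stops at the first failure. This is a sketch, not part of the zstd tooling: `run`, `regression_pass`, and the `DRY_RUN` variable are hypothetical names, and by default the script only prints the commands it would execute.

```shell
#!/bin/sh
# Stop at the first failing build or regression run.
set -e

# Print each command; execute it only when DRY_RUN=0 (default is a dry run).
run() {
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# One build plus one regression run; "$@" holds the sanitizer flags.
regression_pass() {
    run env CC=clang CXX=clang++ ./fuzz.py build all "$@"
    run ./fuzz.py regression all
}

regression_pass --enable-asan --enable-ubsan
regression_pass --enable-msan
```

Running it with `DRY_RUN=0` set in the environment executes the commands instead of only printing them.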