4238 Commits

Author SHA1 Message Date
Elliot Gorokhovsky
f936dd89cb Minor lint fix 2022-01-20 11:54:43 -07:00
H.J. Lu
d6fcdd123c x86: Append -z cet-report=error to LDFLAGS
Append -z cet-report=error to LDFLAGS if -fcf-protection is enabled by
default in the compiler, to catch a missing Intel CET marker:

compiling multi-threaded dynamic library 1.5.1
/usr/local/bin/ld: obj/conf_f408b4c825de923ffc88f7f21b6884b1/dynamic/huf_decompress_amd64.o: error: missing IBT and SHSTK properties
collect2: error: ld returned 1 exit status
...
LINK obj/conf_dbc0b41e36c44111bb0bb918e093d7c1/zstd
/usr/local/bin/ld: obj/conf_dbc0b41e36c44111bb0bb918e093d7c1/huf_decompress_amd64.o: error: missing IBT and SHSTK properties
collect2: error: ld returned 1 exit status
2022-01-20 08:32:01 -08:00
Felix Handte
4dfc4eca9a Merge pull request #2992 from hjl-tools/hjl/cet/dev
x86-64: Enable Intel CET
2022-01-20 11:24:47 -05:00
Wojciech Muła
e74ca7979e Simplify HUF_decompress4X2_usingDTable_internal_bmi2_asm_loop
Get rid of three divisions. The original expression was:

    opmin := min((oend0 - op0) / 10, (oend1 - op1) / 10, (oend2 - op2) / 10, (oend3 - op3) / 10)
    r15   := min(r15, opmin)

The division by 10 can be moved outside the `min`:

    opmin := min(oend0 - op0, oend1 - op1, oend2 - op2, oend3 - op3)
    r15   := min(r15, opmin/10)
2022-01-19 18:38:46 +01:00
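The rewrite in e74ca7979e relies on division by a positive constant being monotonic, so the minimum can be taken before dividing. A minimal sketch of that equivalence follows, assuming the four operands (`oend - op`) are non-negative integers; the function names are illustrative and do not come from the zstd sources.

```python
import itertools


def opmin_before(remaining):
    """Original form: divide each stream's remaining space by 10, then take the min."""
    return min(r // 10 for r in remaining)


def opmin_after(remaining):
    """Simplified form: take the min first, then divide once."""
    return min(remaining) // 10


# Exhaustively compare both forms on small non-negative inputs (four streams).
for remaining in itertools.product(range(21), repeat=4):
    assert opmin_before(remaining) == opmin_after(remaining)
```

The check passes because integer division by 10 never reorders its inputs: if `a <= b` then `a // 10 <= b // 10`.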
Nick Terrell
8ea3d57de4 [build][asm] Pass ASFLAGS to the assembler instead of CFLAGS
* Add `-Wa,--noexecstack` to both `ASFLAGS` and `CFLAGS`
* Pass `ASFLAGS` to `.S` compilation instead of `CFLAGS`

Fixes #3006.
2022-01-18 15:11:29 -08:00
Elliot Gorokhovsky
9b6dfedf0c Documentation and minor refactor to clarify MT memory management. 2022-01-18 09:43:05 -07:00
hjl-tools
ff92884a89 Merge branch 'facebook:dev' into hjl/cet/dev 2022-01-16 10:07:32 -08:00
Felix Handte
f4a552a3fa Merge pull request #2987 from felixhandte/prepare-v1.5.2
Prepare v1.5.2
2022-01-11 17:47:39 -05:00
H.J. Lu
51ab182bd4 x86-64: Enable Intel CET
Intel Control-flow Enforcement Technology (CET):

https://en.wikipedia.org/wiki/Control-flow_integrity#Intel_Control-flow_Enforcement_Technology

requires that on Linux, all linker input files are marked as CET enabled
in the .note.gnu.property section.  For high-level language source code,
the .note.gnu.property section is added by the compiler with the -fcf-protection
option.  For assembly sources, include <cet.h> to add the .note.gnu.property
section.
2022-01-11 13:19:16 -08:00
H.J. Lu
568c69a4eb x86-64: Hide internal assembly functions
Hide x86-64 internal assembly functions. Before

$ nm -D lib/libzstd.so.1 | grep usingDTable_internal_bmi2_asm_loop
00000000000c23c0 T _HUF_decompress4X1_usingDTable_internal_bmi2_asm_loop
00000000000c23c0 T HUF_decompress4X1_usingDTable_internal_bmi2_asm_loop
00000000000c283d T _HUF_decompress4X2_usingDTable_internal_bmi2_asm_loop
00000000000c283d T HUF_decompress4X2_usingDTable_internal_bmi2_asm_loop
$

After

$ nm -D lib/libzstd.so.1 | grep usingDTable_internal_bmi2_asm_loop
$

This fixes issue #2990.
2022-01-11 10:12:24 -08:00
Yann Collet
ca0135c2fd new Formulation
presumes faster
2022-01-07 14:37:53 -08:00
Yann Collet
9e1b4828e5 enforce a minimum price of 1 bit per literal in the optimal parser 2022-01-07 13:53:48 -08:00
W. Felix Handte
46ad9377e8 Bump Version Number to 1.5.2 2022-01-07 14:14:26 -05:00
Nick Terrell
5f2c3d9720 Merge pull request #2981 from terrelln/asm-license
[license] Fix license header of huf_decompress_amd64.S
2022-01-07 11:06:30 -08:00
Nick Terrell
c7b03c217c [license] Fix license header of huf_decompress_amd64.S
* Add the license header for `huf_decompress_amd64.S`
* Add `.S` files to the `test-license.py` test
2022-01-07 09:35:27 -08:00
Nick Terrell
4d8a2132d0 [opt] Fix oss-fuzz bug in optimal parser
oss-fuzz uncovered a scenario where we're evaluating the cost of litLength = 131072,
which can't be represented in the zstd format, so we accessed 1 beyond LL_bits.

Fix the issue by making it cost 1 bit more than litLength = 131071.

There are still follow-ups:
1. This happened because literals_cost[0] = 0, so the optimal parser chose 36 literals
   over a match. Should we bound literals_cost[literal] > 0, unless the block truly only
   has one literal value?
2. When no matches are found, the cost model isn't updated. In this case no matches were
   found for an entire block. So the literals cost model wasn't updated at all. That made
   the optimal parser think literals_cost[0] = 0, where it is actually quite high, since
   the block was entirely random noise.

Credit to OSS-Fuzz.
2022-01-06 16:10:18 -08:00
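The fix described in 4d8a2132d0 can be pictured as a guard in front of the cost lookup. The sketch below is only an illustration of that pattern: `toy_base_price` is a hypothetical stand-in for the real cost model, and the cost unit of one bit is an assumption. Only the values 131071 and 131072 and the "1 bit more" rule come from the commit message.

```python
MAX_REPRESENTABLE_LIT_LENGTH = 131071  # per the commit message, 131072 is not representable
ONE_BIT = 1  # hypothetical cost unit: one bit


def toy_base_price(lit_length):
    """Hypothetical stand-in for the real cost model; valid only for representable lengths."""
    assert 0 <= lit_length <= MAX_REPRESENTABLE_LIT_LENGTH
    return lit_length.bit_length()


def lit_length_price(lit_length):
    """Price a literal length, special-casing the one unrepresentable value."""
    if lit_length == MAX_REPRESENTABLE_LIT_LENGTH + 1:
        # Never index the table out of bounds: charge one bit more than the maximum.
        return ONE_BIT + lit_length_price(MAX_REPRESENTABLE_LIT_LENGTH)
    return toy_base_price(lit_length)
```

The guard keeps the out-of-range value from reaching the table while still making it strictly more expensive than any representable length.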
W. Felix Handte
8dd943e42c Improve Module Map File
This commit makes several changes:

1. It adds modules for the dictionary builder and errors headers.
2. It captures all of the macros that are used to configure these headers.
   When the headers are imported as modules and one of these macros is defined
   the compiler issues a warning that it needs to be defined on the CLI.
3. It promotes the modulemap file into the root of the lib directory.
   Experimentation shows that clang's `-fimplicit-module-maps` will find the
   modulemap when placed here, but not when it's put in a subdirectory.
2022-01-05 18:32:53 -05:00
Felix Handte
7e679511a8 Merge pull request #2964 from felixhandte/noexecstack-all-archs
Mark Huffman Decoder Assembly `noexecstack` on All Architectures
2022-01-05 16:52:39 -05:00
W. Felix Handte
ff5d1daf33 Clean Up Debugging Statements 2022-01-05 16:13:00 -05:00
W. Felix Handte
ef1f9e80ff Restrict GNU-stack Note to GNU Assemblers 2022-01-05 16:03:32 -05:00
W. Felix Handte
b12edddb37 Write GNU-stack Section on All ELF Architectures
Previously we did this only on Linux, which missed other Unices.
2022-01-05 15:44:40 -05:00
W. Felix Handte
4620ce6a9a Makefiles: Add noexecstack Options to Compilation and Linking
Hopefully this marks the binary artifacts `noexecstack` even on platforms
where binaries default to an executable stack.
2022-01-05 15:12:31 -05:00
Yann Collet
41ad7332dd Updated expression for better readability 2022-01-04 09:07:11 -08:00
Yann Collet
8c53e526db fix performance issue in scenario #2966 (part 1)
When re-using a compression state, across multiple successive compressions,
the state should minimize the amount of allocation and initialization required.

This mostly matters in situations where initialization is an overwhelming task
compared to compression itself.
This can happen when the amount to compress is small,
while the compression state was given the impression that it would be much larger,
i.e., streaming mode without providing a srcSize hint.

This lean-initialization optimization was broken in 980f3bbf8354edec0ad32b4430800f330185de6a .

This commit fixes it, making this scenario once again on par with v1.4.9.

Note that this does not completely fix #2966,
since another heavy initialization, specific to row mode,
is also happening (and was not present in v1.4.9).
This will be fixed in a separate commit.
2021-12-31 15:16:19 -08:00
Yann Collet
6211bfee5e fixed backup prototype for POOL_sizeof() 2021-12-30 14:33:21 -08:00
Yann Collet
b1978d60ee POOL_sizeof() only needs a const read-only reference 2021-12-30 14:08:51 -08:00
Yann Collet
03903f5701 fixed minor compression difference in btlazy2
subtle dependency on sumtype numeric representation
2021-12-29 18:51:03 -08:00
W. Felix Handte
9a9d1ec6f4 Mark Huffman Decoder Assembly noexecstack on All Architectures
Apparently, even when the assembly file is empty (because
`ZSTD_ENABLE_ASM_X86_64_BMI2` is false), it still is marked as possibly
needing an executable stack and so the whole library is marked as such. This
commit applies a simple patch for this problem by moving the noexecstack
indication outside the macro guard.

This commit builds on #2857.

This commit addresses #2963.
2021-12-29 17:47:12 -08:00
Yann Collet
7a18d709ae updated all names to offBase convention 2021-12-29 17:30:43 -08:00
Yann Collet
f92ec5ea54 change the offset|repcode sumtype format to match offBase
directly at ZSTD_storeSeq() interface.

In the process, remove ZSTD_REP_MOVE.

This makes it possible, in future commits,
to update and effectively simplify the naming scheme
to properly label the updated processing pipeline :
offset | repcode => offBase => offCode + offBits
2021-12-29 12:03:36 -08:00
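The pipeline `offset | repcode => offBase => offCode + offBits` named in f92ec5ea54 can be sketched as follows. This assumes the mapping given in the zstd format specification (repeat codes occupy values 1 to 3, real offsets are shifted up by 3, and the code is the position of the highest set bit). The function names are illustrative and are not the macros used in the zstd sources.

```python
REP_NUM = 3  # number of repeat offsets in the zstd format


def offset_to_offbase(offset):
    """Real offsets are shifted past the values reserved for repcodes."""
    assert offset >= 1
    return offset + REP_NUM


def repcode_to_offbase(repcode):
    """Repcodes 1..3 map to themselves."""
    assert 1 <= repcode <= REP_NUM
    return repcode


def split_offbase(off_base):
    """Return (offCode, extra): the highest set bit position and the remaining low bits."""
    assert off_base >= 1
    off_code = off_base.bit_length() - 1
    extra = off_base - (1 << off_code)  # stored using off_code extra bits
    return off_code, extra
```

With a single `offBase` value carrying both cases, callers no longer need to know how the sum type is laid out numerically.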
Yann Collet
ad7c9fc11e use ZSTD_memcpy(), for proper redirection within Linux Kernel 2021-12-28 17:41:47 -08:00
Yann Collet
8da414231d found a few more places which were dependent on seqStore offcode sumtype numeric representation 2021-12-28 17:03:24 -08:00
Yann Collet
de9f52e945 regroup all mentions of ZSTD_REP_MOVE within zstd_compress_internal.h 2021-12-28 13:47:57 -08:00
Yann Collet
a34ccad9a6 fixed minor conversion warnings 2021-12-28 13:21:22 -08:00
Yann Collet
92a08eec72 abstracted storeSeq() sumtype numeric representation from zstd_lazy.c 2021-12-28 12:23:39 -08:00
Yann Collet
e909fa627f abstracted storeSeq() sumtype numeric representation from zstd_opt.c 2021-12-28 12:14:33 -08:00
Yann Collet
6fa640ef70 separate newRep() from updateRep()
the new contracts seem to make more sense:
updateRep() updates an array of repeat offsets _in place_,
while newRep() generates a new structure with the updated repeat-offset array.

Most callers actually expect the in-place variant,
and a limited subset, mainly in `zstd_opt.c`, prefers `newRep()`.
2021-12-28 11:52:33 -08:00
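The two contracts from 6fa640ef70 can be contrasted in a simplified sketch. It assumes a three-entry repeat-offset history and deliberately ignores the special case where the literal length is zero. The `kind`/`value` arguments are a hypothetical interface chosen for readability, not the signature used in zstd.

```python
def update_rep(rep, kind, value):
    """In-place variant: mutate the caller's repeat-offset list."""
    if kind == "offset":
        # A real offset becomes the most recent entry; the oldest one falls off.
        rep[:] = [value, rep[0], rep[1]]
    else:
        # A repcode (1..3) moves the referenced entry to the front.
        idx = value - 1
        chosen = rep[idx]
        rep[:] = [chosen] + [r for i, r in enumerate(rep) if i != idx]


def new_rep(rep, kind, value):
    """Copying variant: return an updated history and leave the input untouched."""
    updated = list(rep)
    update_rep(updated, kind, value)
    return updated
```

The copying variant suits code that explores several candidate sequences from the same starting history, which is presumably why the optimal parser prefers it.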
Yann Collet
321583ccf5 fixed minor typecast warnings 2021-12-28 11:38:21 -08:00
Yann Collet
b7630a474b abstracted usage of offBase sumtype within zstd_lazy.c 2021-12-28 10:59:47 -08:00
Yann Collet
435f5a2e6d fixed regression test assert
optLdm->offset might be == 0 in invalid case.
Only use STORE_OFFSET() after validating it's a correct case.
2021-12-28 09:55:31 -08:00
Yann Collet
2068889146 created STORED_*() macros
to act on values stored / expressed in the sumtype numeric representation required by `storedSeq()`.

This makes it possible to abstract away this representation by using the macros to extract these values.

First user: ZSTD_updateRep().
2021-12-28 06:59:07 -08:00
Yann Collet
1aed962216 introduce macros STORE_OFFSET() and STORE_REPCODE()
these are meant to abstract the sumtype representation required
to transfer `offcode` to `ZSTD_storeSeq()`.

Unfortunately, the sumtype numeric representation is currently a leaky abstraction
that has permeated many other parts of the code,
especially within `zstd_lazy.c` and also within `zstd_opt.c` and `zstd_compress.c`.

While this PR does a good job of transferring a large number of call sites
to the new macros, there are still a few sites where this transformation is more complex,
or where the numeric representation itself is used "as is".

One of the problematic areas is the decision to use the numeric format of the sumtype
within the match finders of `zstd_lazy`.

This commit doesn't change the behavior; it only introduces and employs the macros,
so the resulting code remains identical.

Ultimately, if the numeric representation of the sumtype can be completely abstracted
and no other part of the code depends on it,
it will be possible to move it towards something slightly more efficient.
2021-12-23 22:03:30 -08:00
Yann Collet
bec7bbb5a4 Merge branch 'dev' into seqStore_off 2021-12-23 18:03:17 -08:00
Yann Collet
aeff128331 change seqDef.offset into seqDef.offBase
to better reflect the value stored in this field.
2021-12-23 17:56:08 -08:00
Yann Collet
75525fcb9f library optimization flag can be selected on command line again
`CFLAGS=-O0 make`
will now use `-O0` instead of enforcing `-O3`
which used to be the behavior before introduction of `libzstd.mk`.

This should result in faster tests,
since a few tests depend on this capability for faster roundtrips.
2021-12-23 17:43:12 -08:00
Yann Collet
e145b58cfd changed seqDef.matchLength into seqDef.mlBase
since this is effectively what is stored in this field (== matchLength - MINMATCH).
This makes it clearer what needs to be done when reading from / writing to this field.
2021-12-23 13:39:46 -08:00
Yann Collet
b77fcac61f change ZSTD_storeSeq() interface to accept matchLength
instead of mlBase.

This removes the need to do `- MINMATCH` at every call site.

The new interface contract is checked with an `assert()`.
2021-12-23 12:03:33 -08:00
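The interface change in b77fcac61f moves the `- MINMATCH` subtraction from every call site into the store function and guards it with an assertion. A minimal sketch of that idea follows, assuming `MINMATCH` is 3 and using a plain list of dictionaries as a hypothetical stand-in for the sequence store.

```python
MINMATCH = 3  # minimum match length assumed for this sketch


def store_seq(seq_store, lit_length, off_base, match_length):
    """Append one sequence; callers pass the real match length, not mlBase."""
    # The new contract: the caller must supply a full match length.
    assert match_length >= MINMATCH
    seq_store.append({
        "litLength": lit_length,
        "offBase": off_base,
        "mlBase": match_length - MINMATCH,  # the subtraction now lives in one place
    })


sequences = []
store_seq(sequences, lit_length=2, off_base=7, match_length=10)
```

Centralizing the conversion means a caller that still passes an already-reduced value below the minimum trips the assertion instead of silently storing a wrong length.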
Yann Collet
a9e43b37d0 Revert "Limit ZSTD_maxCLevel to 21 for 32-bit binaries." 2021-12-20 11:43:14 -08:00
Yann Collet
f829c32258 forgot the chainlog is effectively a "fake" value with rowHash
the only value which makes sense is `hashlog-1`
as it mimics the real memory usage.
2021-12-16 11:37:40 -08:00
Yann Collet
db1b408a2f rebalance lazy compression levels 2021-12-15 21:33:31 -08:00