W. Felix Handte
95bdf20a87
Moar Renames
2018-05-23 17:53:03 -04:00
W. Felix Handte
7e0402e738
Also Attach Dict When Source Size is Unknown
2018-05-23 17:53:03 -04:00
W. Felix Handte
3ba70cc759
Clear the Dictionary When Sliding the Window
2018-05-23 17:53:03 -04:00
W. Felix Handte
b05ae9b608
Refine ip Initialization to Avoid ARM Weirdness
2018-05-23 17:53:03 -04:00
W. Felix Handte
1a7b34ef28
Use New Index Invariant to Simplify Conditionals
2018-05-23 17:53:03 -04:00
W. Felix Handte
2d598e6fed
Force Working Context Indices Greater than Dict Indices
2018-05-23 17:53:03 -04:00
W. Felix Handte
d005e5daf4
Whitespace Fix
2018-05-23 17:53:03 -04:00
W. Felix Handte
154eb09419
Switch to Original Match Calc for noDict Repcode Check
2018-05-23 17:53:03 -04:00
W. Felix Handte
191fc74a51
Rename 'hasDict' to 'dictMode'
2018-05-23 17:53:03 -04:00
W. Felix Handte
ae4fcf7816
Respond to PR Comments; Formatting/Style/Lint Fixes
2018-05-23 17:53:03 -04:00
W. Felix Handte
ca26cecc7a
Rename and Reformat
2018-05-23 17:53:03 -04:00
W. Felix Handte
66bc1ca641
Change Cut-Off to 8 KB
2018-05-23 17:53:03 -04:00
W. Felix Handte
c31ee3c7f8
Fix Rep Code Initialization
2018-05-23 17:53:03 -04:00
W. Felix Handte
b67196f30d
Coalesce hasDictMatchState and extDict Checks into One Enum and Rename Stuff
2018-05-23 17:53:03 -04:00
W. Felix Handte
265c2869d1
Split Wrapper Functions to Cause Inlining
2018-05-23 17:53:03 -04:00
W. Felix Handte
6929964d65
Add bounds check in repcode tests
2018-05-23 17:53:03 -04:00
W. Felix Handte
70a537d1d7
Initial Repcode Check Support for Ext Dict Ctx
2018-05-23 17:53:03 -04:00
W. Felix Handte
8d24ff0353
Preliminary Support in ZSTD_compressBlock_fast_generic() for Ext Dict Ctx
2018-05-23 17:53:03 -04:00
W. Felix Handte
d18a405779
Refer to the Dictionary Match State In-Place (Sometimes)
2018-05-23 17:53:03 -04:00
Nick Terrell
e3959d5eba
Fixes
2018-05-22 16:06:33 -07:00
Yann Collet
7a8b3496b4
Merge branch 'dev' into staticDictCost
2018-05-22 15:10:05 -07:00
Yann Collet
a8ddf1d370
disable 2-passes strategy
2018-05-22 15:06:36 -07:00
Nick Terrell
49cf880513
Approximate FSE encoding costs for selection
...
Estimate the cost for using FSE modes `set_basic`, `set_compressed`, and
`set_repeat`, and select the one with the lowest cost.
* The cost of `set_basic` is computed using the cross-entropy cost
function `ZSTD_crossEntropyCost()`, using the normalized default count
and the count.
* The cost of `set_repeat` is computed using `FSE_bitCost()`. We check the
previous table to see if it is able to represent the distribution.
* The cost of `set_compressed` is computed with the entropy cost function
`ZSTD_entropyCost()`, together with the cost of writing the normalized
count `ZSTD_NCountCost()`.
2018-05-22 14:33:22 -07:00
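The selection rule described in commit 49cf880513 can be sketched in C as follows. This is a minimal illustration under stated assumptions, not zstd's code: `log2_x256()`, `cross_entropy_cost()`, `select_mode()` and `fse_mode_e` are hypothetical names, and costs are expressed in units of 1/256 bit. The `set_repeat` cost is approximated here by a cross-entropy against the previous normalized counts, whereas the commit uses `FSE_bitCost()` on the previous table. The header cost is supplied by the caller instead of being computed the way `ZSTD_NCountCost()` would.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define COST_UNIT 256u        /* assumption: 1 bit == 256 cost units */
#define COST_MAX  UINT64_MAX  /* marks "this table cannot represent the data" */

typedef enum { MODE_BASIC, MODE_REPEAT, MODE_COMPRESSED } fse_mode_e;

/* Approximate log2(x) * 256 for x >= 1, using the highest set bit plus a
 * linear interpolation of the remainder. Exact when x is a power of two. */
static uint64_t log2_x256(uint32_t x)
{
    uint32_t hb = 0;
    assert(x >= 1);
    while ((x >> hb) > 1) hb++;
    return (uint64_t)hb * COST_UNIT + ((((uint64_t)x << 8) >> hb) - COST_UNIT);
}

/* Cost of coding the histogram `count` with probabilities norm[s] / normTotal.
 * Passing norm == count (and normTotal == sum of count) yields an
 * approximate entropy cost. */
static uint64_t cross_entropy_cost(const uint32_t *count, const uint32_t *norm,
                                   uint32_t normTotal, size_t n)
{
    uint64_t cost = 0;
    size_t s;
    assert(normTotal >= 1);
    for (s = 0; s < n; s++) {
        if (count[s] == 0) continue;            /* absent symbols cost nothing */
        if (norm[s] == 0) return COST_MAX;      /* present but not representable */
        assert(norm[s] <= normTotal);
        cost += (uint64_t)count[s] * (log2_x256(normTotal) - log2_x256(norm[s]));
    }
    return cost;
}

/* Pick the cheapest of the three modes. prevNorm may be NULL when no
 * previous table exists; headerCost is the cost of writing a new table. */
static fse_mode_e select_mode(const uint32_t *count, uint32_t total, size_t n,
                              const uint32_t *defaultNorm, uint32_t defaultTotal,
                              const uint32_t *prevNorm, uint32_t prevTotal,
                              uint64_t headerCost)
{
    uint64_t const basic = cross_entropy_cost(count, defaultNorm, defaultTotal, n);
    uint64_t const repeat = prevNorm
        ? cross_entropy_cost(count, prevNorm, prevTotal, n) : COST_MAX;
    uint64_t const compressed = cross_entropy_cost(count, count, total, n) + headerCost;
    fse_mode_e mode = MODE_BASIC;
    uint64_t best = basic;
    if (repeat < best) { best = repeat; mode = MODE_REPEAT; }
    if (compressed < best) { mode = MODE_COMPRESSED; }
    return mode;
}
```

In this sketch, a previous table that assigns zero probability to a symbol present in the block is priced at `COST_MAX`, so it can never be selected. This mirrors the commit's check that the previous table is able to represent the distribution.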
Yann Collet
5381369cb1
Merge branch 'dev' into tableLevels
2018-05-18 18:23:27 -07:00
Yann Collet
b0b3fb517d
updated compression levels for blocks of 256KB
2018-05-18 17:17:12 -07:00
Yann Collet
5cbef6e094
Merge branch 'dev' into staticDictCost
2018-05-18 16:03:06 -07:00
Yann Collet
a95e9e80d1
adding some debug functions to observe statistics
2018-05-18 14:09:42 -07:00
Yann Collet
af3da079d1
fixed minor conversion warning
2018-05-17 17:27:27 -07:00
Yann Collet
8572b4d09f
fixed a pretty complex bug when combining ldm + btultra
2018-05-17 16:13:53 -07:00
Yann Collet
134388ba6b
collect statistics for first block in ultra mode
...
this patch makes btultra do 2 passes on the first block,
the first one being dedicated to collecting statistics
so that the 2nd pass is more accurate.
It translates into a very small compression ratio gain :
enwik7, level 20:
blocks 4K : 2.142 -> 2.153
blocks 16K : 2.447 -> 2.457
blocks 64K : 2.716 -> 2.726
On the other hand, the cpu cost is doubled.
The trade off looks bad.
Though, that's ultimately a price to pay to reach better compression ratio.
So it's only enabled when setting btultra.
2018-05-17 12:24:30 -07:00
Yann Collet
a243020d37
slightly improved weight calculation
...
translating into a tiny compression ratio improvement
2018-05-17 11:19:44 -07:00
Yann Collet
63eeeaa1dd
update table levels for blocks <= 16K
...
also : allow hlog to be slightly larger than windowlog,
as it's apparently good for both speed and compression ratio.
2018-05-16 16:13:37 -07:00
Yann Collet
18fc3d3cd5
introduced bit-fractional cost evaluation
...
this improves compression ratio by a *tiny* amount.
It also reduces speed by a small amount.
Consequently, bit-fractional evaluation is only turned on for btultra.
2018-05-16 14:53:35 -07:00
Nick Terrell
30d9c84b1a
Fix failing Travis tests
2018-05-15 09:46:20 -07:00
Yann Collet
0b31304c8d
Merge branch 'dev' into staticDictCost
2018-05-14 18:09:26 -07:00
Yann Collet
2c26df0e13
opt: removed static prices
...
after testing, it's actually always better to use dynamic prices
albeit initialised from dictionary.
2018-05-14 18:04:08 -07:00
Yann Collet
f372ffc64d
Merge pull request #1127 from facebook/staticDictCost
...
Improved optimal parser with dictionary
2018-05-14 17:45:50 -07:00
Yann Collet
c9227ee16b
update table for 128 KB blocks
2018-05-13 17:15:07 -07:00
Yann Collet
b4250489cf
update compression levels for large inputs
2018-05-13 01:53:38 -07:00
Yann Collet
761758982e
replaced FSE_count by FSE_count_simple
...
to reduce usage of stack memory.
Also : tweaked a few comments, as suggested by @terrelln
2018-05-11 16:03:37 -07:00
Yann Collet
99ddca43a6
fixed wrong assertion
...
base can actually overflow
2018-05-10 19:48:09 -07:00
Yann Collet
09d0fa29ee
minor adjusting of weights
2018-05-10 18:13:48 -07:00
Yann Collet
1a26ec6e8d
opt: init statistics from dictionary
...
instead of starting from fake "default" statistics.
2018-05-10 17:59:12 -07:00
Yann Collet
74b1c75d64
btopt : minor adjustment of update frequencies
2018-05-10 16:32:36 -07:00
Yann Collet
ac6105463a
opt: minor improvements to log traces
...
slight improvement when using fractional-bit evaluation (opt:dictionary)
2018-05-09 15:46:11 -07:00
Yann Collet
c39061cb7b
fixed declaration-after-statement warning
2018-05-09 12:07:25 -07:00
Yann Collet
4d5bd32a00
added traces to look at symbol costs
...
evaluation looks correct.
2018-05-09 12:00:12 -07:00
Yann Collet
c0da0f5e9e
switchable bit-approximation / fractional-bit accuracy modes
...
also : makes it possible to select nb of fractional bits.
2018-05-09 10:48:09 -07:00
Yann Collet
ba2ad9b6b9
implemented fractional bit cost evaluation
...
for FSE symbols.
While it seems to work, the gains are negligible compared to rough maxNbBits evaluation.
There are even a few losses sometimes, that still need to be explained.
Furthermore, there are still cases where btlazy2 does a better job than btopt,
which seems rather strange too.
2018-05-08 17:43:13 -07:00
Yann Collet
1aff63b114
opt: shift all costs by 8 bits (* 256)
...
making it possible to represent fractional bit costs.
2018-05-08 16:19:04 -07:00
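The idea behind shifting all costs by 8 bits (commit 1aff63b114) can be illustrated with a small fixed-point sketch. It is not taken from zstd: `highbit32()`, `cost_whole_bits()`, `log2_frac()` and `cost_frac_bits()` are hypothetical helpers, and the fractional part is only a linear approximation of the logarithm. Once one bit is represented as 256 units, a symbol cost no longer has to be rounded to a whole number of bits.

```c
#include <assert.h>
#include <stdint.h>

#define FRAC_BITS 8u                 /* number of fractional bits (assumption) */
#define ONE_BIT   (1u << FRAC_BITS)  /* one whole bit == 256 cost units */

/* Index of the highest set bit, i.e. floor(log2(x)), for x >= 1. */
static uint32_t highbit32(uint32_t x)
{
    uint32_t hb = 0;
    assert(x >= 1);
    while ((x >> hb) > 1) hb++;
    return hb;
}

/* Whole-bit estimate of -log2(count / total), scaled by 256. */
static uint32_t cost_whole_bits(uint32_t count, uint32_t total)
{
    assert(count >= 1 && count <= total);
    return (highbit32(total) - highbit32(count)) * ONE_BIT;
}

/* Approximate log2(x) * 256: integer part from the highest set bit,
 * fractional part by linear interpolation. Exact for powers of two. */
static uint32_t log2_frac(uint32_t x)
{
    uint32_t const hb = highbit32(x);
    return hb * ONE_BIT + (uint32_t)((((uint64_t)x << FRAC_BITS) >> hb) - ONE_BIT);
}

/* Fractional estimate of -log2(count / total), scaled by 256. */
static uint32_t cost_frac_bits(uint32_t count, uint32_t total)
{
    assert(count >= 1 && count <= total);
    return log2_frac(total) - log2_frac(count);
}
```

With these helpers, a symbol seen 3 times out of 4 is priced at 256 units (one full bit) by `cost_whole_bits(3, 4)` but at 128 units (half a bit) by `cost_frac_bits(3, 4)`. The two estimates agree whenever both counts are powers of two.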