223 Commits

Author SHA1 Message Date
Yann Collet
de10f56be2 improve high compression ratio for files like #3793
This works great for 32-bit arrays,
notably synthetic ones with extreme regularity.
Unfortunately, it's not universal,
and in some cases it's a loss.
Crucially, on average, it's a loss on silesia.
The most negatively impacted file is x-ray.
It deserves an investigation before suggesting it as an evolution.
2024-01-29 23:25:24 -08:00
Nick Terrell
8193250615 Modernize macros to use do { } while (0)
This PR introduces no functional changes. It attempts to change all
macros currently using `{ }` or some variant of that to
`do { } while (0)`, and introduces trailing `;` where necessary.
There were no bugs found during this migration.

The Visual Studio bug that warned on this construct has been fixed since VS2015.
Additionally, we have several instances of `do { } while (0)` which have
been present for several releases, so we don't have to worry about
breaking people's builds.

Fixes Issue #3830.
2023-11-21 20:05:17 -05:00
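A minimal sketch of the idiom described in the commit above, using a hypothetical two-statement macro (not taken from the zstd sources): wrapping the body in `do { } while (0)` makes the macro behave like a single statement, so it composes safely with `if`/`else` and requires the trailing `;`.
```
/* Hypothetical example, not from zstd: a macro with two statements. */

/* Brace-only version: breaks when used as the body of an if/else,
 * because the ';' written at the call site ends the if statement. */
#define SWAP_BAD(a, b) { int tmp_ = (a); (a) = (b); (b) = tmp_; }

/* do { } while (0) version: behaves as one statement and demands the ';'. */
#define SWAP_OK(a, b) do { int tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

void example(int x, int y, int cond)
{
    if (cond)
        SWAP_OK(x, y);   /* fine */
    else
        SWAP_OK(y, x);   /* also fine; SWAP_BAD here would not compile */
}
```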
Nick Terrell
43118da8a7 Stop suppressing pointer-overflow UBSAN errors
* Remove all pointer-overflow suppressions from our UBSAN builds/tests.
* Add `ZSTD_ALLOW_POINTER_OVERFLOW_ATTR` macro to suppress
  pointer-overflow at a per-function level. This is a superior approach
  because it also applies to users who build zstd with UBSAN.
* Add `ZSTD_wrappedPtr{Diff,Add,Sub}()` that use these suppressions.
  The end goal is to only tag these functions with
  `ZSTD_ALLOW_POINTER_OVERFLOW`. But we can start by annotating functions
  that rely on pointer overflow, and gradually transition to using
  these.
* Add `ZSTD_maybeNullPtrAdd()` to simplify pointer addition when the
  pointer may be `NULL`.
* Fix all the fuzzer issues that came up. I'm sure there will be a lot
  more, but these are the ones that came up within a few minutes of
  running the fuzzers, and while running GitHub CI.
2023-09-28 17:35:05 -04:00
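A hedged sketch of the approach described in the commit above, using the clang `no_sanitize` function attribute. The helper names and definitions below are illustrative assumptions, not the actual zstd implementations.
```
#include <stddef.h>

/* Assumed definition for illustration: suppress pointer-overflow UBSAN
 * checks on a per-function basis when the compiler supports it. */
#if defined(__clang__)
#  define ALLOW_POINTER_OVERFLOW_ATTR __attribute__((no_sanitize("pointer-overflow")))
#else
#  define ALLOW_POINTER_OVERFLOW_ATTR
#endif

/* Sketch of a wrapped pointer add: callers that intentionally rely on
 * out-of-bounds pointer arithmetic route through a helper like this,
 * so only the helper needs the suppression attribute. */
ALLOW_POINTER_OVERFLOW_ATTR
static void* wrappedPtrAdd_sketch(void* ptr, ptrdiff_t add)
{
    return (char*)ptr + add;
}

/* Sketch of a NULL-tolerant add: adding to a NULL pointer is UB in C,
 * so guard against it instead of suppressing the sanitizer. */
static void* maybeNullPtrAdd_sketch(void* ptr, ptrdiff_t add)
{
    return add > 0 ? (char*)ptr + add : ptr;
}
```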
W. Felix Handte
1b65803fe7 Reorder Definitions in zstd_opt.c to Group Under Macro Guards (Slightly) 2023-05-22 12:41:48 -04:00
W. Felix Handte
b12e8cb3e7 Merge Ultra and Ultra2 Exclusion
Ultra2 does not exist for dictionary compression, which falls back to ultra. So ultra must
be present if ultra2 is.
2023-05-04 12:18:58 -04:00
W. Felix Handte
6761e1c949 Tweak Ultra/Opt Guards 2023-05-04 12:18:58 -04:00
W. Felix Handte
50cdf84f58 Macro-Exclude Block Compressors from Declaration/Definition 2023-05-04 12:18:58 -04:00
Yonatan Komornik
9420bce8a4
Add init once memory (#3528) (#3529)
- Adds memory type that is guaranteed to have been initialized at least once in the workspace's lifetime.
- Changes tag space in row hash to be based on init once memory.
2023-03-13 13:20:49 -07:00
Yann Collet
71dbe8f9d4 minor: fix conversion warnings 2023-01-04 20:00:04 -08:00
Yann Collet
5434de01e2 improve compression ratio of small alphabets
fix #3328

In situations where the alphabet size is very small,
the evaluation of literal costs from the Optimal Parser is initially incorrect.
It takes some time to converge, during which compression is less efficient.
This is especially important for small files,
because there will not be enough data to converge,
so most of the parsing is selected based on incorrect metrics.

After this patch, scenario #3328 is fixed,
delivering the expected 29-byte compressed size (the smallest known compressed size).
2023-01-03 12:22:37 -08:00
Yann Collet
d07e72bb13 fixed incorrect assert
commented Fweight instead
2022-12-28 17:23:40 -08:00
Yann Collet
4a1a79a512 just add some comments to zstd_opt for improved clarity 2022-12-28 16:24:12 -08:00
W. Felix Handte
5d693cc38c Coalesce Almost All Copyright Notices to Standard Phrasing
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f); do sed -i '/Copyright .* \(Yann Collet\)\|\(Meta Platforms\)/ s/Copyright .*/Copyright (c) Meta Platforms, Inc. and affiliates./' $f; done

git checkout HEAD -- build/VS2010/libzstd-dll/libzstd-dll.rc build/VS2010/zstd/zstd.rc tests/test-license.py contrib/linux-kernel/test/include/linux/xxhash.h examples/streaming_compression_thread_pool.c lib/legacy/zstd_v0*.c lib/legacy/zstd_v0*.h
nano ./programs/windres/zstd.rc
nano ./build/VS2010/zstd/zstd.rc
nano ./build/VS2010/libzstd-dll/libzstd-dll.rc
```
2022-12-20 12:52:34 -05:00
W. Felix Handte
8927f985ff Update Copyright Headers 'Facebook' -> 'Meta Platforms'
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora \) -prune -o -type f);
do
  sed -i 's/Facebook, Inc\./Meta Platforms, Inc. and affiliates./' $f;
done
```
2022-12-20 12:37:57 -05:00
Yann Collet
cc7d23bcec
Merge pull request #2965 from facebook/offbase
Converge sumtype (offset | repcode) numeric representation towards offBase
2022-01-24 15:47:42 -08:00
Yann Collet
ca0135c2fd new Formulation
presumably faster
2022-01-07 14:37:53 -08:00
Yann Collet
9e1b4828e5 enforce a minimum price of 1 bit per literal in the optimal parser 2022-01-07 13:53:48 -08:00
Nick Terrell
4d8a2132d0 [opt] Fix oss-fuzz bug in optimal parser
oss-fuzz uncovered a scenario where we're evaluating the cost of litLength = 131072,
which can't be represented in the zstd format, so we accessed 1 beyond LL_bits.

Fix the issue by making it cost 1 bit more than litLength = 131071.

There are still follow ups:
1. This happened because literals_cost[0] = 0, so the optimal parser chose 36 literals
   over a match. Should we bound literals_cost[literal] > 0, unless the block truly only
   has one literal value?
2. When no matches are found, the cost model isn't updated. In this case no matches were
   found for an entire block. So the literals cost model wasn't updated at all. That made
   the optimal parser think literals_cost[0] = 0, where it is actually quite high, since
   the block was entirely random noise.

Credit to OSS-Fuzz.
2022-01-06 16:10:18 -08:00
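A minimal sketch of the clamp described in the commit above, under the assumption that 131071 is the largest literal length representable in a single sequence; the helper and its pricing callback are hypothetical, not the actual zstd functions.
```
/* Assumption for illustration: the last LL code carries 16 extra bits,
 * so 131071 is the largest litLength encodable in one sequence. */
#define LL_MAX_REPRESENTABLE 131071u

/* Hypothetical pricing helper: basePrice() stands in for the real
 * statistics-based cost lookup. Lengths beyond the representable maximum
 * are charged one extra bit instead of reading past the LL_bits table. */
static unsigned priceLitLength(unsigned litLength, unsigned (*basePrice)(unsigned))
{
    if (litLength > LL_MAX_REPRESENTABLE)
        return basePrice(LL_MAX_REPRESENTABLE) + 1;
    return basePrice(litLength);
}
```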
Yann Collet
7a18d709ae updated all names to offBase convention 2021-12-29 17:30:43 -08:00
Yann Collet
e909fa627f abstracted storeSeq() sumtype numeric representation from zstd_opt.c 2021-12-28 12:14:33 -08:00
Yann Collet
6fa640ef70 separate newRep() from updateRep()
the new contracts seem to make more sense:
updateRep() updates an array of repeat offsets _in place_,
while newRep() generates a new structure with the updated repeat-offset array.

Most callers actually expect the in-place variant,
and a limited subset, mainly in `zstd_opt.c`, prefers `newRep()`.
2021-12-28 11:52:33 -08:00
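An illustrative sketch of the two contracts, with simplified names and a placeholder update rule; the actual zstd signatures and repcode update logic may differ.
```
/* Simplified sketch; zstd keeps 3 repeat offsets. */
typedef struct { unsigned rep[3]; } Repcodes_sketch;

/* updateRep(): mutate the caller's repeat-offset array in place. */
static void updateRep_sketch(unsigned rep[3], unsigned newOffset)
{
    /* placeholder update rule: push the new offset to the front */
    rep[2] = rep[1];
    rep[1] = rep[0];
    rep[0] = newOffset;
}

/* newRep(): leave the input untouched and return an updated copy.
 * Convenient in zstd_opt.c, where several candidate parses are explored
 * without committing to any of them. */
static Repcodes_sketch newRep_sketch(const unsigned rep[3], unsigned newOffset)
{
    Repcodes_sketch r = { { rep[0], rep[1], rep[2] } };
    updateRep_sketch(r.rep, newOffset);
    return r;
}
```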
Yann Collet
b7630a474b abstracted usage of offBase sumtype within zstd_lazy.c 2021-12-28 10:59:47 -08:00
Yann Collet
435f5a2e6d fixed regression test assert
optLdm->offset might be == 0 in an invalid case.
Only use STORE_OFFSET() after validating that the case is valid.
2021-12-28 09:55:31 -08:00
Yann Collet
1aed962216 introduce macros STORE_OFFSET() and STORE_REPCODE()
these are meant to abstract the sumtype representation required
to transfer `offcode` to `ZSTD_storeSeq()`.

Unfortunately, the sumtype numeric representation is currently a leaky abstraction
that has permeated many other parts of the code,
especially within `zstd_lazy.c`, and also within `zstd_opt.c` and `zstd_compress.c`.

While this PR does a good job of transferring a large number of call sites
to the new macros, there are still a few sites where the transformation is more complex,
or where the numeric representation itself is used "as is".

One of the problematic areas is the decision to use the numeric format of the sumtype
within the match finders of `zstd_lazy`.

This commit doesn't change the behavior: it only introduces and employs the macros,
so the resulting code remains identical.

Ultimately, if the numeric representation of the sumtype can be completely abstracted
and no other part of the code depends on it,
it will be possible to move it towards something slightly more efficient.
2021-12-23 22:03:30 -08:00
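A sketch of what such sumtype macros could look like, assuming the lowest values are reserved for repcodes and real offsets are shifted above them; the exact constants and assertions in zstd may differ.
```
#include <assert.h>

/* Assumption: 3 repeat codes occupy the lowest values of the sumtype,
 * and real offsets are stored shifted above them. */
#define REP_NUM   3
#define REP_MOVE  (REP_NUM - 1)

/* repcode 1..3 -> sumtype values 0..2 */
#define STORE_REPCODE_SKETCH(r)  (assert((r) >= 1 && (r) <= 3), (unsigned)((r) - 1))
/* real offset o (> 0) -> sumtype value o + REP_MOVE, strictly above the repcode range */
#define STORE_OFFSET_SKETCH(o)   (assert((o) > 0), (unsigned)((o) + REP_MOVE))
```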
Yann Collet
b77fcac61f change ZSTD_storeSeq() interface to accept matchLength
instead of mlBase.

This removes the need to do `- MINMATCH` at every call site.

The new interface contract is checked with an `assert()`.
2021-12-23 12:03:33 -08:00
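A sketch of the contract change described above; parameter names and order are illustrative rather than copied from the actual zstd prototype, and `storeSeq_mlBase` is declared only to stand in for the previous entry point. MINMATCH is zstd's minimum match length (3).
```
#include <assert.h>
#include <stddef.h>

#define MINMATCH 3

typedef struct SeqStore_sketch SeqStore_sketch;   /* placeholder type */

/* stand-in for the old entry point, which expected the already-biased mlBase */
void storeSeq_mlBase(SeqStore_sketch* seqStore, size_t litLength,
                     const unsigned char* literals, unsigned offBase, size_t mlBase);

/* new-style interface: callers pass the real matchLength, and the bias
 * is applied (and the contract checked) in one place. */
static void storeSeq_matchLength(SeqStore_sketch* seqStore, size_t litLength,
                                 const unsigned char* literals, unsigned offBase,
                                 size_t matchLength)
{
    assert(matchLength >= MINMATCH);   /* new interface contract */
    storeSeq_mlBase(seqStore, litLength, literals, offBase, matchLength - MINMATCH);
}
```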
Nick Terrell
e5bfaeede7 Improve zstd_opt build speed and size
Use the same trick as we did for zstd_lazy in PR #2828:
* Create one search function specialization for each (dictMode, mls).
* Select the search function pointer at the top of the match finder.

Additionally, we no longer inline `ZSTD_compressBlock_opt_generic` into
every function, since `dictMode` is no longer used as a template. Create
two specializations, for opt levels 0 and 2, and call one of the two
specializations.

Lastly, remove the hack that disabled inlining for zstd_opt for the
Linux Kernel, as we've gotten most of the benefit already.

Compilation time sees a ~4x reduction:

| Compiler | Flags                            | Dev Time (s) | PR Time (s) | Delta |
|----------|----------------------------------|--------------|-------------|-------|
| gcc      | -O3                              |         10.1 |         2.3 |  -77% |
| gcc      | -O3 -fsanitize=address,undefined |         61.1 |        10.2 |  -83% |
| clang    | -O3                              |          9.0 |         2.1 |  -76% |
| clang    | -O3 -fsanitize=address,undefined |         33.5 |         5.1 |  -84% |

Build size is reduced by 150KB - 200KB:

| Compiler | Dev libzstd.a Size (B) | PR libzstd.a Size (B) | Delta |
|----------|------------------------|-----------------------|-------|
| gcc      |                1327476 |               1177108 |  -11% |
| clang    |                1378324 |               1167780 |  -15% |

There is a <2% speed loss in all cases:

| Compiler | Level | Dev Speed (MB/s) | PR Speed (MB/s) | Delta  |
|----------|-------|------------------|-----------------|--------|
| gcc      |    16 |             4.78 |            4.72 | -1.25% |
| gcc      |    17 |             3.49 |            3.46 | -0.85% |
| gcc      |    18 |             2.92 |            2.86 | -2.04% |
| gcc      |    19 |             2.61 |            2.61 |  0.00% |
| clang    |    16 |             4.69 |            4.80 |  2.34% |
| clang    |    17 |             3.53 |            3.49 | -1.13% |
| clang    |    18 |             2.86 |            2.85 | -0.34% |
| clang    |    19 |             2.61 |            2.61 |  0.00% |

Fixes Issue #2862.
2021-12-02 14:19:41 -08:00
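A simplified sketch of the specialization technique described in the commit above; in zstd the specializations are generated with macros around a force-inlined generic body, whereas the names and shapes below are assumptions for illustration.
```
#include <stddef.h>

/* Hypothetical generic search body: when dictMode and mls are compile-time
 * constants, the compiler fully specializes it. */
static size_t searchMax_generic(const unsigned char* ip, const unsigned char* iend,
                                int dictMode, int mls)
{
    (void)ip; (void)iend; (void)dictMode; (void)mls;
    return 0;  /* placeholder: real code returns the best match found */
}

typedef size_t (*SearchMax_f)(const unsigned char* ip, const unsigned char* iend);

/* One thin wrapper per (dictMode, mls) pair. */
static size_t search_noDict_mls4(const unsigned char* ip, const unsigned char* iend)
{ return searchMax_generic(ip, iend, /*dictMode=*/0, /*mls=*/4); }
static size_t search_noDict_mls5(const unsigned char* ip, const unsigned char* iend)
{ return searchMax_generic(ip, iend, /*dictMode=*/0, /*mls=*/5); }

/* Selected once at the top of the match finder, not per position. */
static SearchMax_f selectSearchFn(int dictMode, int mls)
{
    if (dictMode == 0 && mls == 5) return search_noDict_mls5;
    return search_noDict_mls4;  /* fallback for this sketch */
}
```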
Nick Terrell
19eb459da3 [linux-kernel] Don't inline function in zstd_opt.c
The optimal parser is unlikely to be used in the linux kernel in
practice. There is no reason these functions should be force inlined,
since we aren't gaining anything, and are losing build size.

| Compiler | Before (Bytes) | After (Bytes) | Delta (Bytes) |
|----------|----------------|---------------|---------------|
| gcc-11   |        1142090 |        952754 |       -189336 |
| clang-12 |        1228402 |        976290 |       -252112 |

This is a temporary solution pending the resolution of PR #2862 in the
`dev` branch.
2021-11-15 20:37:30 -08:00
Nick Terrell
c6c482fe07 [binary-tree] Fix underflow of nbCompares
Fix underflow of `nbCompares` by switching to an `int` and comparing
`nbCompares > 0`. This is a minimal fix, because I don't want to change
the logic. These loops seem to be doing `nbCompares + 1` comparisons.

The bug was reported by Dan Carpenter and found by Smatch static
checker.

https://lore.kernel.org/all/20211008063704.GA5370@kili/
2021-10-08 13:22:55 -07:00
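A rough before/after illustration of the underflow pattern described above, using simplified variables rather than the actual binary-tree search loop.
```
/* Illustration only; the real loop also walks the binary tree. */
static void searchLoop_sketch(unsigned searchLog, unsigned matchIndex, unsigned windowLow)
{
    /* before: unsigned counter; the trailing post-decrement still executes
     * when the counter reaches 0 and wraps around, which static checkers flag. */
    unsigned nbComparesU = 1u << searchLog;
    while (nbComparesU-- && matchIndex >= windowLow) {
        /* ... compare candidate, follow tree ... */
        --matchIndex;
    }

    /* after: signed counter with an explicit `> 0` test, no wraparound. */
    int nbCompares = (int)(1u << searchLog);
    for (; nbCompares > 0 && matchIndex >= windowLow; --nbCompares) {
        /* ... same comparison logic ... */
        --matchIndex;
    }
}
```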
Yann Collet
fa2a4d77c7 constify MatchState* parameter when possible
turns out, it's possible to constify the MatchState* parameter
in some parts of the binary tree algorithm,
making it a pure read-only parameter,
as opposed to a mutable state.

This is supposed to be helpful for both maintenance and the compiler.
2021-09-23 08:27:44 -07:00
senhuang42
b5c35d7ea3 Use new paramSwitch enum for LCM, row matchfinder, and block splitter 2021-09-21 14:22:02 -04:00
sen
9d2a45a705
Merge pull request #2778 from senhuang42/opt_inlining_revert
Revert opt outlining change
2021-09-15 14:22:10 -04:00
Sen Huang
bd84e4a9d3 Revert opt outlining change 2021-09-15 09:08:41 -07:00
Yann Collet
b7f46ebc23 use ZSTD_memcpy() for better portability
notably within kernel space
2021-09-08 14:45:53 -07:00
Yann Collet
7fce9a41b5 change update rate to 12/11/11/11
better for large files, and sources with relatively "stable" entropy,
like silesia.tar.
slightly worse for files with rapidly changing entropy,
like Calgary.tar.

Updated small-file tests in fuzzer.
2021-09-08 14:05:57 -07:00
Yann Collet
ef78611c26 change update rate to 11/10/10/10
better for larger blocks,
with a very small inefficiency on small blocks.
2021-09-08 08:58:28 -07:00
Yann Collet
42a3ed752a removed frequency booster for stat initialization of btultra2
it used to be necessary to counter-balance the fixed-weight frequency update,
which has recently been changed to an adaptive rate (targeting stable starting frequency stats).
2021-09-08 07:56:43 -07:00
Yann Collet
08ceda3dfc new statistics update policy
small general compression ratio improvement for btopt+ strategies.
2021-09-04 00:52:44 -07:00
Yann Collet
23a9368c45 new starting offcode table for zstd_opt 2021-09-03 17:41:42 -07:00
Yann Collet
27a8bbe265 new initializer for ll price 2021-09-03 16:07:31 -07:00
Yann Collet
f0fc8cb3e1 Disable console notification by default within the library
As a library, the default shouldn't be to write anything to the console.
`cover` and `fastcover` have a `g_displayLevel` variable to control this behavior.
It's now set to 0 (no display) by default.
Setting notification to a higher level should be an explicit operation by a console application.
2021-09-03 13:44:07 -07:00
Yann Collet
eab692211e removed pretty-print of sizes in benchmark
This is less appropriate for this mode:
benchmarking is about accuracy,
so it's important to read the exact values.
2021-09-03 12:51:02 -07:00
Sen Huang
d88c1d95ce Remove inlining for opt 2021-09-01 16:58:57 -04:00
Nick Terrell
46f2710562 [HUF] Improve Huffman encoding speed
Improve Huffman encoding speed by 20% for gcc and 10% for clang.

| Compiler |     Benchmark     | Config  |   Dataset   | Ratio | Speed MB/s (dev) | Speed MB/s (huf-cspeed) | Speed MB/s (huf-cspeed - dev) |
|----------|-------------------|---------|-------------|-------|------------------|-------------------------|-------------------------------|
| gcc      | compress          | level_1 | enwik7      | 2.43  | 253.70           | 258.72                  | 2.0%                          |
| gcc      | compress          | level_1 | silesia     | 2.88  | 341.90           | 348.15                  | 1.8%                          |
| gcc      | compress_literals | level_1 | enwik7      | 1.49  | 761.83           | 912.76                  | 19.8%                         |
| gcc      | compress_literals | level_1 | silesia     | 1.28  | 754.83           | 902.37                  | 19.5%                         |
| gcc      | compress_literals | level_7 | enwik7      | 1.29  | 502.81           | 552.79                  | 9.9%                          |
| gcc      | compress_literals | level_7 | silesia     | 1.11  | 675.97           | 776.44                  | 14.9%                         |
| clang    | compress          | level_1 | enwik7      | 2.43  | 277.54           | 280.98                  | 1.2%                          |
| clang    | compress          | level_1 | silesia     | 2.88  | 369.98           | 375.46                  | 1.5%                          |
| clang    | compress_literals | level_1 | enwik7      | 1.49  | 828.83           | 918.41                  | 10.8%                         |
| clang    | compress_literals | level_1 | silesia     | 1.28  | 815.81           | 905.41                  | 11.0%                         |
| clang    | compress_literals | level_7 | enwik7      | 1.29  | 533.13           | 553.30                  | 3.8%                          |
| clang    | compress_literals | level_7 | silesia     | 1.11  | 714.52           | 775.38                  | 8.5%                          |
2021-07-27 15:10:35 -07:00
Nick Terrell
91c9a247b6 [lib] Fix determinism bug in the optimal parser
`ZSTD_insertBt1()` has a speed optimization that skips the prefix of
very long matches.

40def70387/lib/compress/zstd_opt.c (L476)

This optimization is based off the length of the longest match found. However,
when indices are reset, we only ensure that we can reference the whole
window starting from `ip`. If the previous block ended with a long match
then `nextToUpdate` could be much less than `ip`. It might be far enough
back that `nextToUpdate < maxDist`, so it doesn't have a full window of
data to reference. This can cause non-determinism bugs, because we may
find a match that is beyond `ip - maxDist`, and may sometimes be
un-referencable, and that match triggers the speed optimization.

The fix is to base the `windowLow` off of the `target` of
`ZSTD_updateTree_internal()`, because anything below that value will be
obsolete by the time `ZSTD_updateTree_internal()` completes.
2021-05-13 17:05:59 -07:00
Nick Terrell
a494308ae9 [copyright][license] Switch to yearless copyright and some cleanup in the linux-kernel files
* Switch to yearless copyright per FB policy
* Fix up SPDX-License-Identifier lines in `contrib/linux-kernel` sources
* Add zstd copyright/license header to the `contrib/linux-kernel` sources
* Update the `tests/test-license.py` to check for yearless copyright
* Improvements to `tests/test-license.py`
* Check `contrib/linux-kernel` in `tests/test-license.py`
2021-03-30 10:30:43 -07:00
Nick Terrell
66e811d782 [license] Update year to 2021 2021-01-04 17:53:52 -05:00
senhuang42
d6911b86be Require LDM matches to be strictly greater in length 2020-10-09 12:56:18 -04:00
senhuang42
b9c8033cde Define kNullRawSeqStore for every file 2020-10-07 19:02:41 -04:00
senhuang42
a6165c1b28 Change matchState_t::ldmSeqStore to pointer 2020-10-07 14:13:57 -04:00
senhuang42
abce708a56 Move posInSequence correction to correct location 2020-10-07 13:56:25 -04:00