238 Commits

Author SHA1 Message Date
Federico Maresca
5e9a6c2fe4
Refactor dictionary matchfinder index safety check (#4039) 2024-05-29 12:35:24 -04:00
Christoph Grüninger
b921f1aad6 Reduce scope of variables
This improves readability, keeps variables local, and
prevents the unintended use (e.g. typo) later on.
Found by Cppcheck (variableScope)
2024-02-11 22:00:03 +01:00
Yann Collet
b0e8580dc7 fix fuzz issue 5131069967892480 2024-02-08 16:38:20 -08:00
Yann Collet
22574d848d fix issue 5921623844651008
ossfuzz managed to create a scenario which triggers an `assert`.
This fixes it, by giving +1 more space for the backward search pass.
2024-02-06 13:01:14 -08:00
Yann Collet
b88c593d8f added or updated code comments
as suggested by @terrelln,
to make the code of the optimal parser a bit more understandable.
2024-02-05 18:32:25 -08:00
Yann Collet
6c35fb2e8c fix msan warnings 2024-02-05 01:21:06 -08:00
Yann Collet
641749fc09 fix uasan dictionary_stream_round_trip fuzz test 2024-02-05 00:36:10 -08:00
Yann Collet
fe2e2ad36d use ZSTD_memcpy()
which can be redirected in Linux kernel mode
2024-02-03 19:57:38 -08:00
Yann Collet
0ae21d8c31 removed trace control 2024-02-03 19:32:59 -08:00
Yann Collet
5474edbe60 fixed wrong assert
by introducing ZSTD_OPT_SIZE
2024-02-03 19:31:53 -08:00
Yann Collet
e5af24c5fa fixed wrong assert 2024-02-03 17:48:29 -08:00
Yann Collet
8168a451e5 minor optimization, mostly for clarity 2024-02-03 17:26:47 -08:00
Yann Collet
d31018e223 finally, a version that generalizes well
While it's not always strictly a win,
it's a win for files that see a noticeably compression ratio increase,
while it's a very small noise for other files.

Downside is, this patch is less efficient for 32-bit arrays of integer
than the previous patch which was introducing losses for other files,
but it's still a net improvement on this scenario.
2024-02-03 14:26:18 -08:00
Yann Collet
0166b2ba80 modification: differentiate literal update at pos+1
helps when litlen==1 is cheaper than litlen==0

works great on pathological arr[u32] examples
but doesn't generalize well on other files.

silesia/x-ray is amoung the most negatively affected ones.
2024-01-31 11:20:43 -08:00
Yann Collet
4683667785 refactor optimal parser
store stretches as intermediate solution instead of sequences.
makes it possible to link a solution to a predecessor.
2024-01-31 02:51:46 -08:00
Yann Collet
de10f56be2 improve high compression ratio for file like #3793
this works great for 32-bit arrays,
notably the synthetic ones, with extreme regularity,
unfortunately, it's not universal,
and in some cases, it's a loss.
Crucially, on average, it's a loss on silesia.
The most negatively impacted file is x-ray.
It deserves an investigation before suggesting it as an evolution.
2024-01-29 23:25:24 -08:00
Nick Terrell
8193250615 Modernize macros to use do { } while (0)
This PR introduces no functional changes. It attempts to change all
macros currently using `{ }` or some variant of that to to
`do { } while (0)`, and introduces trailing `;` where necessary.
There were no bugs found during this migration.

The bug in Visual Studios warning on this has been fixed since VS2015.
Additionally, we have several instances of `do { } while (0)` which have
been present for several releases, so we don't have to worry about
breaking peoples builds.

Fixes Issue #3830.
2023-11-21 20:05:17 -05:00
Nick Terrell
43118da8a7 Stop suppressing pointer-overflow UBSAN errors
* Remove all pointer-overflow suppressions from our UBSAN builds/tests.
* Add `ZSTD_ALLOW_POINTER_OVERFLOW_ATTR` macro to suppress
  pointer-overflow at a per-function level. This is a superior approach
  because it also applies to users who build zstd with UBSAN.
* Add `ZSTD_wrappedPtr{Diff,Add,Sub}()` that use these suppressions.
  The end goal is to only tag these functions with
  `ZSTD_ALLOW_POINTER_OVERFLOW`. But we can start by annoting functions
  that rely on pointer overflow, and gradually transition to using
  these.
* Add `ZSTD_maybeNullPtrAdd()` to simplify pointer addition when the
  pointer may be `NULL`.
* Fix all the fuzzer issues that came up. I'm sure there will be a lot
  more, but these are the ones that came up within a few minutes of
  running the fuzzers, and while running GitHub CI.
2023-09-28 17:35:05 -04:00
W. Felix Handte
1b65803fe7 Reorder Definitions in zstd_opt.c to Group Under Macro Guards (Slightly) 2023-05-22 12:41:48 -04:00
W. Felix Handte
b12e8cb3e7 Merge Ultra and Ultra2 Exclusion
Ultra2 does not exist for dict compression, and so uses ultra. So ultra must
be present if ultra2 is.
2023-05-04 12:18:58 -04:00
W. Felix Handte
6761e1c949 Tweak Ultra/Opt Guards 2023-05-04 12:18:58 -04:00
W. Felix Handte
50cdf84f58 Macro-Exclude Block Compressors from Declaration/Definition 2023-05-04 12:18:58 -04:00
Yonatan Komornik
9420bce8a4
Add init once memory (#3528) (#3529)
- Adds memory type that is guaranteed to have been initialized at least once in the workspace's lifetime.
- Changes tag space in row hash to be based on init once memory.
2023-03-13 13:20:49 -07:00
Yann Collet
71dbe8f9d4 minor: fix conversion warnings 2023-01-04 20:00:04 -08:00
Yann Collet
5434de01e2 improve compression ratio of small alphabets
fix #3328

In situations where the alphabet size is very small,
the evaluation of literal costs from the Optimal Parser is initially incorrect.
It takes some time to converge, during which compression is less efficient.
This is especially important for small files,
because there will not be enough data to converge,
so most of the parsing is selected based on incorrect metrics.

After this patch, the scenario ##3328 gets fixed,
delivering the expected 29 bytes compressed size (smallest known compressed size).
2023-01-03 12:22:37 -08:00
Yann Collet
d07e72bb13 fixed incorrect assert
commented Fweight instead
2022-12-28 17:23:40 -08:00
Yann Collet
4a1a79a512 just add some comments to zstd_opt for improved clarity 2022-12-28 16:24:12 -08:00
W. Felix Handte
5d693cc38c Coalesce Almost All Copyright Notices to Standard Phrasing
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f); do sed -i '/Copyright .* \(Yann Collet\)\|\(Meta Platforms\)/ s/Copyright .*/Copyright (c) Meta Platforms, Inc. and affiliates./' $f; done

git checkout HEAD -- build/VS2010/libzstd-dll/libzstd-dll.rc build/VS2010/zstd/zstd.rc tests/test-license.py contrib/linux-kernel/test/include/linux/xxhash.h examples/streaming_compression_thread_pool.c lib/legacy/zstd_v0*.c lib/legacy/zstd_v0*.h
nano ./programs/windres/zstd.rc
nano ./build/VS2010/zstd/zstd.rc
nano ./build/VS2010/libzstd-dll/libzstd-dll.rc
```
2022-12-20 12:52:34 -05:00
W. Felix Handte
8927f985ff Update Copyright Headers 'Facebook' -> 'Meta Platforms'
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora \) -prune -o -type f);
do
  sed -i 's/Facebook, Inc\./Meta Platforms, Inc. and affiliates./' $f;
done
```
2022-12-20 12:37:57 -05:00
Yann Collet
cc7d23bcec
Merge pull request #2965 from facebook/offbase
Converge sumtype (offset | repcode) numeric representation towards offBase
2022-01-24 15:47:42 -08:00
Yann Collet
ca0135c2fd new Formulation
presumes faster
2022-01-07 14:37:53 -08:00
Yann Collet
9e1b4828e5 enforce a minimum price of 1 bit per literal in the optimal parser 2022-01-07 13:53:48 -08:00
Nick Terrell
4d8a2132d0 [opt] Fix oss-fuzz bug in optimal parser
oss-fuzz uncovered a scenario where we're evaluating the cost of litLength = 131072,
which can't be represented in the zstd format, so we accessed 1 beyond LL_bits.

Fix the issue by making it cost 1 bit more than litLength = 131071.

There are still follow ups:
1. This happened because literals_cost[0] = 0, so the optimal parser chose 36 literals
   over a match. Should we bound literals_cost[literal] > 0, unless the block truly only
   has one literal value?
2. When no matches are found, the cost model isn't updated. In this case no matches were
   found for an entire block. So the literals cost model wasn't updated at all. That made
   the optimal parser think literals_cost[0] = 0, where it is actually quite high, since
   the block was entirely random noise.

Credit to OSS-Fuzz.
2022-01-06 16:10:18 -08:00
Yann Collet
7a18d709ae updated all names to offBase convention 2021-12-29 17:30:43 -08:00
Yann Collet
e909fa627f abstracted storeSeq() sumtype numeric representation from zstd_opt.c 2021-12-28 12:14:33 -08:00
Yann Collet
6fa640ef70 separate newRep() from updateRep()
the new contracts seems to make more sense :
updateRep() updates an array of repeat offsets _in place_,
while newRep() generates a new structure with the updated repeat-offset array.

Most callers are actually expecting the in-place variant,
and a limited sub-section, in `zstd_opt.c` mainly, prefer `newRep()`.
2021-12-28 11:52:33 -08:00
Yann Collet
b7630a474b abstracted usage of offBase sumtype within zstd_lazy.c 2021-12-28 10:59:47 -08:00
Yann Collet
435f5a2e6d fixed regression test assert
optLdm->offset might be == 0 in invalid case.
Only use STORE_OFFSET() after validating it's a correct case.
2021-12-28 09:55:31 -08:00
Yann Collet
1aed962216 introduce macros STORE_OFFSET() and STORE_REPCODE()
this meant to abstract the sumtype representation required
to transfert `offcode` to `ZSTD_storeSeq()`.

Unfortunately, the sumtype numeric representation is currently a leaky abstraction
that has permeated many other parts of the code,
especially within `zstd_lazy.c` and also within `zstd_opt.c` and `zstd_compress.c`.

While this PR makes a good job a transfering a large nb of call sites
to using the new macros, there are still a few sites where this transformation is more complex,
or where the numeric representation itself it used "as is".

One of the problematics area is the decision to use the numeric format of the sumtype
within the match finders of `zstd_lazy`.

This commit doesn't change the behavior, it only introduces and employes the macros,
but eventually the resulting code remains identical.

At target, if the numeric representation of the sumtype can be completely abstracted
and no other part of the code depends on it,
it will be possible to move it towards something slightly more efficient.
2021-12-23 22:03:30 -08:00
Yann Collet
b77fcac61f change ZSTD_storeSeq() interface to accept matchLength
instead of mlBase.

This removes the need to do `- MINMATCH` at every call site.

The new interface contract is checked with an `assert()`.
2021-12-23 12:03:33 -08:00
Nick Terrell
e5bfaeede7 Improve zstd_opt build speed and size
Use the same trick as we did for zstd_lazy in PR #2828:
* Create one search function specialization for each (dictMode, mls).
* Select the search function pointer at the top of the match finder.

Additionally, we no longer inline `ZSTD_compressBlock_opt_generic` into
every function, since `dictMode` is no longer used as a template. Create
two specializations, for opt levels 0 and 2, and call one of the two
specializations.

Lastly, remove the hack that disabled inlining for zstd_opt for the
Linux Kernel, as we've gotten most of the benefit already.

Compilation time sees a ~4x reduction:

| Compiler | Flags                            | Dev Time (s) | PR Time (s) | Delta |
|----------|----------------------------------|--------------|-------------|-------|
| gcc      | -O3                              |         10.1 |         2.3 |  -77% |
| gcc      | -O3 -fsanitize=address,undefined |         61.1 |        10.2 |  -83% |
| clang    | -O3                              |          9.0 |         2.1 |  -76% |
| clang    | -O3 -fsanitize=address,undefined |         33.5 |         5.1 |  -84% |

Build size is reduced by 150KB - 200KB:

| Compiler | Dev libzstd.a Size (B) | PR libzstd.a Size (B) | Delta |
|----------|------------------------|-----------------------|-------|
| gcc      |                1327476 |               1177108 |  -11% |
| clang    |                1378324 |               1167780 |  -15% |

There is a <2% speed loss in all cases:

| Compiler | Level | Dev Speed (MB/s) | PR Speed (MB/s) | Delta  |
|----------|-------|------------------|-----------------|--------|
| gcc      |    16 |             4.78 |            4.72 | -1.25% |
| gcc      |    17 |             3.49 |            3.46 | -0.85% |
| gcc      |    18 |             2.92 |            2.86 | -2.04% |
| gcc      |    19 |             2.61 |            2.61 |  0.00% |
| clang    |    16 |             4.69 |            4.80 |  2.34% |
| clang    |    17 |             3.53 |            3.49 | -1.13% |
| clang    |    18 |             2.86 |            2.85 | -0.34% |
| clang    |    19 |             2.61 |            2.61 |  0.00% |

Fixes Issue #2862.
2021-12-02 14:19:41 -08:00
Nick Terrell
19eb459da3 [linux-kernel] Don't inline function in zstd_opt.c
The optimal parser is unlikely to be used in the linux kernel in
practice. There is no reason these functions should be force inlined,
since we aren't gaining anything, and are losing build size.

| Compiler | Before (Bytes) | After (Bytes) | Delta (Bytes) |
|----------|----------------|---------------|---------------|
| gcc-11   |        1142090 |        952754 |       -189336 |
| clang-12 |        1228402 |        976290 |       -252112 |

This is a temporary solution pending the resolution of PR #2862 in the
`dev` branch.
2021-11-15 20:37:30 -08:00
Nick Terrell
c6c482fe07 [binary-tree] Fix underflow of nbCompares
Fix underflow of `nbCompares` by switching to an `int` and comparing
`nbCompares > 0`. This is a minimal fix, because I don't want to change
the logic. These loops seem to be doing `nbCompares + 1` comparisons.

The bug was reported by Dan Carpenter and found by Smatch static
checker.

https://lore.kernel.org/all/20211008063704.GA5370@kili/
2021-10-08 13:22:55 -07:00
Yann Collet
fa2a4d77c7 constify MatchState* parameter when possible
turns out, it's possible to constify MatchState* parameter
in some parts of the binary tree algorithm,
making it a pure read-only parameter,
as opposed to a mutable state.

This is supposed to be helpful for both maintenance and the compiler.
2021-09-23 08:27:44 -07:00
senhuang42
b5c35d7ea3 Use new paramSwitch enum for LCM, row matchfinder, and block splitter 2021-09-21 14:22:02 -04:00
sen
9d2a45a705
Merge pull request #2778 from senhuang42/opt_inlining_revert
Revert opt outlining change
2021-09-15 14:22:10 -04:00
Sen Huang
bd84e4a9d3 Revert opt outlining change 2021-09-15 09:08:41 -07:00
Yann Collet
b7f46ebc23 use ZSTD_memcpy() for better portability
notably within kernel space
2021-09-08 14:45:53 -07:00
Yann Collet
7fce9a41b5 change update rate to 12/11/11/11
better for large files, and sources with relatively "stable" entropy,
like silesia.tar.
slightly worse for files with rapidly changing entropy,
like Calgary.tar/.

Updated small files tests in fuzzer
2021-09-08 14:05:57 -07:00
Yann Collet
ef78611c26 change update rate to 11/10/10/10
better for larger blocks,
very small inefficiency on small block.
2021-09-08 08:58:28 -07:00