since it's a type name.
Note: in contrast with previous names, this one is on the public API side.
A #define is therefore provided so that existing programs using
ZSTD_sequenceFormat_e still work.
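A minimal sketch of that compatibility shim (the new spelling `ZSTD_SequenceFormat_e` is an assumption for illustration, not confirmed by this note):
```
/* Sketch only: the type gets a new spelling, and a #define keeps the old
 * public name compiling. The new spelling shown here is an assumption. */
typedef enum {
    ZSTD_sf_noBlockDelimiters = 0,       /* enumerators keep their existing names */
    ZSTD_sf_explicitBlockDelimiters = 1
} ZSTD_SequenceFormat_e;

#define ZSTD_sequenceFormat_e ZSTD_SequenceFormat_e  /* existing programs keep working */
```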
* Remove all pointer-overflow suppressions from our UBSAN builds/tests.
* Add `ZSTD_ALLOW_POINTER_OVERFLOW_ATTR` macro to suppress
pointer-overflow at a per-function level. This is a superior approach
because it also applies to users who build zstd with UBSAN.
* Add `ZSTD_wrappedPtr{Diff,Add,Sub}()` helpers that use these suppressions.
The end goal is to only tag these functions with
`ZSTD_ALLOW_POINTER_OVERFLOW`, but we can start by annotating functions
that rely on pointer overflow, and gradually transition to using these
wrappers (see the sketch after this list).
* Add `ZSTD_maybeNullPtrAdd()` to simplify pointer addition when the
pointer may be `NULL`.
* Fix all the fuzzer issues that came up. There will surely be more, but
these are the ones that surfaced within a few minutes of running the
fuzzers and during GitHub CI.
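A hedged sketch of what these helpers can look like (the attribute spelling is the standard Clang/GCC sanitizer attribute; the bodies and exact signatures are illustrative, not necessarily zstd's):
```
#include <stddef.h>

/* Illustrative definitions only; the real zstd versions may differ. */
#if defined(__clang__) || defined(__GNUC__)
#  define ZSTD_ALLOW_POINTER_OVERFLOW_ATTR __attribute__((no_sanitize("pointer-overflow")))
#else
#  define ZSTD_ALLOW_POINTER_OVERFLOW_ATTR
#endif

/* Pointer arithmetic that is allowed to wrap: the attribute disables UBSAN's
 * pointer-overflow check for these functions only, so the suppression also
 * applies to users who build zstd with UBSAN enabled. */
ZSTD_ALLOW_POINTER_OVERFLOW_ATTR
static ptrdiff_t ZSTD_wrappedPtrDiff(const unsigned char* lhs, const unsigned char* rhs)
{
    return lhs - rhs;
}

ZSTD_ALLOW_POINTER_OVERFLOW_ATTR
static const unsigned char* ZSTD_wrappedPtrAdd(const unsigned char* ptr, ptrdiff_t add)
{
    return ptr + add;
}

ZSTD_ALLOW_POINTER_OVERFLOW_ATTR
static const unsigned char* ZSTD_wrappedPtrSub(const unsigned char* ptr, ptrdiff_t sub)
{
    return ptr - sub;
}

/* Illustrative semantics: only perform the addition when there is something
 * to add, so a NULL pointer with add == 0 never evaluates NULL + 0. */
static const unsigned char* ZSTD_maybeNullPtrAdd(const unsigned char* ptr, ptrdiff_t add)
{
    return add > 0 ? ptr + add : ptr;
}
```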
For every 256 bytes the lazy match finders process without finding a match,
they increase their step size by 1: for bytes [0, 256) they search every
position, for bytes [256, 512) every other position, and so on. However,
they currently still insert every position into their hash tables. This
differs from fast & dfast, which only insert the positions they search.
This PR changes that: once we've searched 2KB without finding any matches,
at which point we're only searching one in nine positions, we stop inserting
every position and insert only the positions we search. The exact 2KB cutoff
isn't terribly important; it was just chosen to be reasonably large, to
minimize the impact on "normal" data.
This PR only adds skipping to greedy, lazy, and lazy2, but does not
touch btlazy2.
| Dataset | Level | Compiler | CSize ∆ | Speed ∆ |
|---------|-------|--------------|---------|---------|
| Random | 5 | clang-14.0.6 | 0.0% | +704% |
| Random | 5 | gcc-12.2.0 | 0.0% | +670% |
| Random | 7 | clang-14.0.6 | 0.0% | +679% |
| Random | 7 | gcc-12.2.0 | 0.0% | +657% |
| Random | 12 | clang-14.0.6 | 0.0% | +1355% |
| Random | 12 | gcc-12.2.0 | 0.0% | +1331% |
| Silesia | 5 | clang-14.0.6 | +0.002% | +0.35% |
| Silesia | 5 | gcc-12.2.0 | +0.002% | +2.45% |
| Silesia | 7 | clang-14.0.6 | +0.001% | -1.40% |
| Silesia | 7 | gcc-12.2.0 | +0.007% | +0.13% |
| Silesia | 12 | clang-14.0.6 | +0.011% | +22.70% |
| Silesia | 12 | gcc-12.2.0 | +0.011% | -6.68% |
| Enwik8 | 5 | clang-14.0.6 | 0.0% | -1.02% |
| Enwik8 | 5 | gcc-12.2.0 | 0.0% | +0.34% |
| Enwik8 | 7 | clang-14.0.6 | 0.0% | -1.22% |
| Enwik8 | 7 | gcc-12.2.0 | 0.0% | -0.72% |
| Enwik8 | 12 | clang-14.0.6 | 0.0% | +26.19% |
| Enwik8 | 12 | gcc-12.2.0 | 0.0% | -5.70% |
The speed difference for clang at level 12 is real, but is probably
caused by some sort of alignment or codegen issue. Before this PR, clang
was significantly slower than gcc; with it, clang gets up to parity.
I also measured the ratio difference for the HC match finder, and it
looks basically the same as for the row-based match finder. The speedup on
random data is similar, and performance is otherwise about neutral, without
the big level-12 difference for either clang or gcc.
* Mark all bufferless and block level functions as deprecated
* Update documentation to suggest not using these functions
* Add `_deprecated()` wrappers for functions that we use internally, and
call those instead (see the sketch below)
Part 2 of #3528
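A rough sketch of the wrapper pattern (the macro definition and the particular function shown are illustrative; the real declarations live in zstd's public and internal headers):
```
#include <stddef.h>

/* Illustrative pattern only. */
#if defined(__GNUC__) || defined(__clang__)
#  define ZSTD_DEPRECATED(msg) __attribute__((deprecated(msg)))
#else
#  define ZSTD_DEPRECATED(msg)
#endif

typedef struct ZSTD_CCtx_s ZSTD_CCtx;

/* The public prototype gains a deprecation attribute... */
ZSTD_DEPRECATED("The buffer-less API is deprecated; prefer the regular or streaming API")
size_t ZSTD_compressBegin(ZSTD_CCtx* cctx, int compressionLevel);

/* ...and internal call sites use an un-deprecated _deprecated() wrapper,
 * so the library itself builds cleanly with deprecation warnings enabled. */
size_t ZSTD_compressBegin_deprecated(ZSTD_CCtx* cctx, int compressionLevel);
```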
Adds a hash salt that helps avoid regressions where consecutive compressions use the same tag space with similar data (running `zstd -b5e7 enwik8 -B128K` reproduces this regression).
Allocates half the memory for the tag space, which means we get one less slot for an actual tag (it is needed for the next position index).
The result is a slight loss in compression ratio (up to 0.2%) and some regressions/improvements in speed depending on level and sample. In turn, we save 16% of the hash table's space (5 bytes per entry instead of 6).
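A minimal sketch of the salting idea (the hash shape and constant are made up; the point is only that a per-compression salt is mixed into the row hash, so back-to-back compressions of similar data don't produce identical tags):
```
#include <stdint.h>

/* Illustrative only: mix a per-CCtx salt into the hash so consecutive
 * compressions of similar inputs don't populate the tag space identically.
 * hashLog is assumed to be in [1, 32]. */
static uint32_t rowHashWithSalt(uint64_t bytes, uint64_t salt, uint32_t hashLog)
{
    uint64_t const prime = 0x9E3779B185EBCA87ULL;  /* arbitrary odd constant */
    uint64_t const h = (bytes ^ salt) * prime;
    return (uint32_t)(h >> (64 - hashLog));        /* keep the top hashLog bits */
}
```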
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora \) -prune -o -type f);
do
sed -i 's/Facebook, Inc\./Meta Platforms, Inc. and affiliates./' $f;
done
```
* first attempt at fast DMS short cache
* significant wins for some scenarios
* fix all clang regressions
* nits
* fix 1.5% gcc11 regression on hot 110Kdict scenario
* fix CI
* nit
* Add tags to doublefast hash table
* use tags in doublefast DMS
* Fix CI
* Clean up some hardcoded logic / constants
* Switch forCCtx to an enum
* nit
* add short cache to ip+1 long search
* Move tag size into hashLog
* Minor nits
* Truncate dictionaries greater than 16MB in short cache mode
* Helper function for tag comparison (see the sketch after this list)
* Cap short cache hashLog at 24 to prevent overflow
* size_t dictTagsMatch -> int dictTagsMatch
* nit
* Clean up and comment dictionary truncation
* Move ZSTD_tableFillPurpose_e next to ZSTD_dictTableLoadMethod_e
* Comment and expand helper functions
* Asserts and documentation
* nit
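The commits above pack a small tag into each hash-table entry so that a dictionary candidate can be rejected without touching dictionary memory. A rough sketch of that packing (bit widths and names are assumptions, not the exact zstd layout):
```
#include <stdint.h>

#define SHORT_CACHE_TAG_BITS 8
#define SHORT_CACHE_TAG_MASK ((1u << SHORT_CACHE_TAG_BITS) - 1)

/* Upper bits hold the position index, low bits hold a short tag from the hash.
 * Shifting the index left by 8 bits leaves 24 bits for positions, which is
 * why dictionaries are truncated to 16MB in short cache mode. */
static uint32_t packIndexAndTag(uint32_t index, uint32_t hash)
{
    return (index << SHORT_CACHE_TAG_BITS) | (hash & SHORT_CACHE_TAG_MASK);
}

/* Compare only the tag bits; a mismatch rejects the candidate cheaply. */
static int dictTagsMatch(uint32_t packedEntry, uint32_t hash)
{
    return (packedEntry & SHORT_CACHE_TAG_MASK) == (hash & SHORT_CACHE_TAG_MASK);
}
```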
credit to oss-fuzz
This issue could happen when using the new Sequence Compression API in Explicit Delimiter Mode
with a dstCapacity that is too small: in that case, there was one place where the buffer size
wasn't checked.