4649 Commits

Author SHA1 Message Date
Yann Collet
0166b2ba80 modification: differentiate literal update at pos+1
helps when litlen==1 is cheaper than litlen==0

works great on pathological arr[u32] examples
but doesn't generalize well on other files.

silesia/x-ray is among the most negatively affected ones.
2024-01-31 11:20:43 -08:00
Yann Collet
4683667785 refactor optimal parser
store stretches as intermediate solution instead of sequences.
makes it possible to link a solution to a predecessor.
2024-01-31 02:51:46 -08:00
Yann Collet
de10f56be2 improve high compression ratio for files like #3793
this works great for 32-bit arrays,
notably the synthetic ones, with extreme regularity,
unfortunately, it's not universal,
and in some cases, it's a loss.
Crucially, on average, it's a loss on silesia.
The most negatively impacted file is x-ray.
It deserves an investigation before suggesting it as an evolution.
2024-01-29 23:25:24 -08:00
Yann Collet
e6f4b46493 playTests.sh no longer needs grep -E
it makes the test script more portable across posix systems
because `grep -E` is not guaranteed
while `grep` is fairly common.
2024-01-15 11:16:46 -08:00
Like Ma
66269e74a0 Fix building xxhash on AIX 5.1 2024-01-14 00:09:48 +08:00
Yann Collet
a07cae3976
Merge pull request #3847 from michoecho/fix_nullptr_deref_in_createCDict
Fix a nullptr dereference in ZSTD_createCDict_advanced2()
2023-12-30 13:23:39 -08:00
Elliot Gorokhovsky
c6cabf9441
Make offload API compatible with static CCtx (#3854)
* Add ZSTD_CCtxParams_registerSequenceProducer() to public API

* add unit test

* add docs to zstd.h

* nits

* Add ZSTDLIB_STATIC_API prefix

* Add asserts
2023-12-28 14:48:46 -05:00
Michał Chojnowski
9a3b17c4d6 Fix a nullptr dereference in ZSTD_createCDict_advanced2()
If the relevant allocation returns NULL, ZSTD_createCDict_advanced_internal()
will return NULL. But ZSTD_createCDict_advanced2() doesn't check for
this and attempts to use the returned pointer anyway, which leads to
a segfault.
2023-12-16 13:02:18 +01:00
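The failure mode described in this commit can be sketched in a few lines of C. This is a minimal illustration, not the zstd code: `Dict`, `dict_create_internal()` and `dict_create()` are hypothetical stand-ins, and the `simulateFailure` flag only exists to make the NULL path easy to exercise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for illustration; not the zstd dictionary type. */
typedef struct { int level; } Dict;

/* Inner constructor: returns NULL when the allocation fails. */
Dict* dict_create_internal(int simulateFailure)
{
    if (simulateFailure) return NULL;
    return (Dict*)calloc(1, sizeof(Dict));
}

/* Outer constructor: checks the result before dereferencing it. */
Dict* dict_create(int level, int simulateFailure)
{
    Dict* d;
    assert(level >= 0);
    d = dict_create_internal(simulateFailure);
    if (d == NULL) return NULL;   /* without this check, the next line segfaults */
    d->level = level;
    return d;
}
```

The caller then sees a NULL return on allocation failure instead of a crash inside the constructor.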
aimuz
468bb17378
lib/decompress: check for reserved bit corruption in zstd
The patch adds a validation to ensure that the last field, which is
reserved, must be all-zeroes in ZSTD_decodeSeqHeaders. This prevents
potential corruption from going undetected.

Fixes an issue where corrupted input could lead to undefined behavior
due to improper validation of reserved bits.

Signed-off-by: aimuz <mr.imuz@gmail.com>
2023-11-28 21:04:37 +08:00
Elliot Gorokhovsky
d151a4880b Move offload API params into ZSTD_CCtx_params 2023-11-27 08:11:01 -08:00
Elliot Gorokhovsky
809c7eb6bf Refactor ZSTD_sequenceProducer_F typedef to ZSTD_sequenceProducer_F* 2023-11-27 06:56:37 -08:00
Nick Terrell
8193250615 Modernize macros to use do { } while (0)
This PR introduces no functional changes. It attempts to change all
macros currently using `{ }` or some variant of that to
`do { } while (0)`, and introduces trailing `;` where necessary.
There were no bugs found during this migration.

The bug in Visual Studio's warning on this has been fixed since VS2015.
Additionally, we have several instances of `do { } while (0)` which have
been present for several releases, so we don't have to worry about
breaking people's builds.

Fixes Issue #3830.
2023-11-21 20:05:17 -05:00
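The reason for preferring `do { } while (0)` over bare braces shows up in an unbraced `if`/`else`. The macros and the function below are hypothetical examples written for this note, not macros from zstd.

```c
#include <assert.h>

/* Brace-only form: the caller's trailing `;` becomes an empty statement,
 * which detaches a following `else` from its `if`. */
#define CLAMP_MAX_BRACES(v, max)   { if ((v) > (max)) (v) = (max); }

/* do/while(0) form: expands to a single statement that requires the `;`. */
#define CLAMP_MAX(v, max)   do { if ((v) > (max)) (v) = (max); } while (0)

int clamp_if(int value, int max, int enabled)
{
    if (enabled)
        CLAMP_MAX(value, max);   /* CLAMP_MAX_BRACES here would not compile */
    else
        value = 0;
    assert(!enabled || value <= max);
    return value;
}
```

With the brace-only variant, `if (enabled) { ... }; else ...` is a syntax error because the stray `;` ends the `if` statement before the `else`.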
Yann Collet
6b3d12fe54
Merge pull request #3820 from facebook/xxh082
update xxhash library to v0.8.2
2023-11-21 09:11:40 -08:00
Nick Terrell
dd4de1dd7a [huf] Fix null pointer addition
`HUF_DecompressFastArgs_init()` was adding 0 to NULL. Fix it by exiting
early for empty outputs. This is no change in behavior, because the
function was already exiting 0 in this case, just slightly later.
2023-11-20 17:13:01 -05:00
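The early-exit approach can be illustrated as follows. This is only a sketch: `OutRange` and `out_range_init()` are hypothetical names, not the actual `HUF_DecompressFastArgs_init()` code. The point is that the empty-output check runs before any pointer arithmetic, so a `NULL` destination is never used in an addition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical output descriptor for illustration. */
typedef struct { unsigned char* start; unsigned char* end; } OutRange;

/* Returns 1 when `range` was initialised, 0 when there is nothing to decode. */
int out_range_init(OutRange* range, void* dst, size_t dstSize)
{
    assert(range != NULL);
    if (dstSize == 0) return 0;          /* exit before touching `dst` */
    assert(dst != NULL);
    range->start = (unsigned char*)dst;
    range->end   = range->start + dstSize;
    return 1;
}
```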
Nick Terrell
5ab78c0418 [huf] Improve fast C & ASM performance on small data
* Rename `ilimit` to `ilowest` and set it equal to `src` instead of
  `src + 6 + 8`. This is safe because the fast decoding loops guarantee
  to never read below `ilowest` already. This allows the fast decoder to
  run for at least two more iterations, because it consumes at most 7
  bytes per iteration.
* Continue the fast loop all the way until the number of safe iterations
 is 0. Initially, I thought that when it got towards the end, the
 computation of how many iterations are safe might become expensive. But
 it ends up being slower to have to decode each of the 4 streams
 individually, which makes sense.

This drastically speeds up the Huffman decoder on the `github` dataset
for the issue raised in #3762, measured with `zstd -b1e1r github/`.

| Decoder  | Speed before | Speed after |
|----------|--------------|-------------|
| Fallback | 477 MB/s     | 477 MB/s    |
| Fast C   | 384 MB/s     | 492 MB/s    |
| Assembly | 385 MB/s     | 501 MB/s    |

We can also look at the speed delta for different block sizes of silesia
using `zstd -b1e1r silesia.tar -B#`.

| Decoder  | -B1K ∆ | -B2K ∆ | -B4K ∆ | -B8K ∆ | -B16K ∆ | -B32K ∆ | -B64K ∆ | -B128K ∆ |
|----------|--------|--------|--------|--------|---------|---------|---------|----------|
| Fast C   | +11.2% | +8.2%  | +6.1%  | +4.4%  | +2.7%   | +1.5%   | +0.6%   | +0.2%    |
| Assembly | +12.5% | +9.0%  | +6.2%  | +3.6%  | +1.5%   | +0.7%   | +0.2%   | +0.03%   |
2023-11-20 17:13:01 -05:00
Nick Terrell
c7269add7e [huf] Improve fast huffman decoding speed in linux kernel
gcc in the linux kernel was not unrolling the inner loops of the Huffman
decoder, which was destroying decoding performance. The compiler was
generating crazy code with all sorts of branches. I suspect because of
Spectre mitigations, but I'm not certain. Once the loops were manually
unrolled, performance was restored.

Additionally, when gcc couldn't prove that the variable left shift in
the 4X2 decode loop wasn't greater than 63, it inserted checks to verify
it. To fix this, mask `entry.nbBits & 0x3F`, which allows gcc to eliete
this check. This is a no op, because `entry.nbBits` is guaranteed to be
less than 64.

Lastly, introduce the `HUF_DISABLE_FAST_DECODE` macro to disable the
fast C loops for Issue #3762. So if even after this change, there is a
performance regression, users can opt-out at compile time.
2023-11-20 14:56:46 -05:00
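The masking trick from the second paragraph can be sketched like this. `Entry` and `consume_bits()` are hypothetical; only the field name `nbBits` and the `& 0x3F` mask come from the commit message. Whether a given compiler actually drops its range check depends on the compiler and flags.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical table entry; the field name follows the commit message. */
typedef struct { uint8_t nbBits; } Entry;

/* Shifts out the bits consumed by `entry`. */
uint64_t consume_bits(uint64_t bits, Entry entry)
{
    assert(entry.nbBits < 64);              /* invariant stated in the commit */
    /* The mask is a no-op under that invariant, but it makes the
     * "shift count is below 64" fact visible to the compiler. */
    return bits << (entry.nbBits & 0x3F);
}
```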
Nick Terrell
e122fcbf58 [debug] Don't define g_debuglevel in the kernel
We only use this constant when `DEBUGLEVEL>=2`, but we get
-Werror=pedantic errors for empty translation units, so still define it
except in kernel environments.

Backport from the kernel:

https://lore.kernel.org/lkml/20230616144400.172683-1-ben.dooks@codethink.co.uk/
2023-11-17 09:54:10 -08:00
Yann Collet
59dcc47579 update license text 2023-11-16 16:19:25 -08:00
Yann Collet
3fd5f9f52d fix the copyright linter 2023-11-13 15:50:42 -08:00
Yann Collet
592b1acb18 update xxhash to v0.8.2
List of updates : https://github.com/Cyan4973/xxHash/releases/tag/v0.8.2

This is also a preparation task before taking care of #3819
2023-11-13 15:42:07 -08:00
Yann Collet
24dabde507 revert to manually defining DTable
thus preventing the analyzer and ubsan from associating DTable with a size of 1.
2023-10-18 22:45:57 -07:00
Yann Collet
d988e00a7f baby-step towards solving flexArray issue #3785
the flexArray in structure FSE_DecompressWksp
is just a way to derive a pointer easily,
without risk/complexity of calculating it manually.

Not sure if this change is good enough to avoid ubsan warnings though.
2023-10-18 16:21:39 -07:00
Yann Collet
6bb1688c1a extended the fix to ZSTDMT's Buffer Pool 2023-10-08 00:25:17 -07:00
Yann Collet
ea4027c003 removed unused macro constant 2023-10-07 23:32:22 -07:00
Yann Collet
c87ad5bdb5 fixes suggested by @ebiggers 2023-10-07 23:29:42 -07:00
Yann Collet
e8ff7d18eb removed FlexArray pattern from CCtxPool
within ZSTDMT_.
This pattern is flagged by less forgiving variants of ubsan
notably used during compilation of the Linux Kernel.

There are 2 other places in the code where this pattern is used.
This fixes just one of them.
2023-10-07 21:30:08 -07:00
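One common way to drop this pattern is to replace the trailing one-element array with a pointer to a separately allocated array. The sketch below shows that general idea with a hypothetical `Pool` type; it is not the actual `ZSTDMT_` change, which may be structured differently.

```c
#include <assert.h>
#include <stdlib.h>

/* Before (flagged pattern): a trailing `void* items[1];` indexed past [0].
 * After: keep a pointer to a separately allocated array instead. */
typedef struct {
    int    total;
    void** items;
} Pool;

Pool* pool_create(int total)
{
    Pool* pool;
    assert(total > 0);
    pool = (Pool*)calloc(1, sizeof(*pool));
    if (pool == NULL) return NULL;
    pool->items = (void**)calloc((size_t)total, sizeof(*pool->items));
    if (pool->items == NULL) { free(pool); return NULL; }
    pool->total = total;
    return pool;
}

void pool_free(Pool* pool)
{
    if (pool == NULL) return;
    free(pool->items);
    free(pool);
}
```

The cost is one extra allocation and one extra indirection; in exchange, every index into `items` stays within the bounds of a declared object.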
Yann Collet
2b31cb0698
Merge pull request #3763 from dloidolt/fix_lib/README.md
Fix a very small formatting typo in the lib/README.md file
2023-10-07 19:31:36 -07:00
Yann Collet
c1e588fcb4
Merge pull request #3771 from DimitriPapadopoulos/codespell
Fix new typos found by codespell
2023-10-07 19:29:41 -07:00
Nick Terrell
43118da8a7 Stop suppressing pointer-overflow UBSAN errors
* Remove all pointer-overflow suppressions from our UBSAN builds/tests.
* Add `ZSTD_ALLOW_POINTER_OVERFLOW_ATTR` macro to suppress
  pointer-overflow at a per-function level. This is a superior approach
  because it also applies to users who build zstd with UBSAN.
* Add `ZSTD_wrappedPtr{Diff,Add,Sub}()` that use these suppressions.
  The end goal is to only tag these functions with
  `ZSTD_ALLOW_POINTER_OVERFLOW`. But we can start by annotating functions
  that rely on pointer overflow, and gradually transition to using
  these.
* Add `ZSTD_maybeNullPtrAdd()` to simplify pointer addition when the
  pointer may be `NULL`.
* Fix all the fuzzer issues that came up. I'm sure there will be a lot
  more, but these are the ones that came up within a few minutes of
  running the fuzzers, and while running GitHub CI.
2023-09-28 17:35:05 -04:00
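A helper in the spirit of `ZSTD_maybeNullPtrAdd()` could look like the sketch below. The name `maybe_null_ptr_add()` and its exact signature are assumptions made for illustration; the real function in zstd may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Adds `add` bytes to `ptr`, skipping the arithmetic entirely when the
 * offset is zero so that a NULL pointer is never used in an addition. */
void* maybe_null_ptr_add(void* ptr, ptrdiff_t add)
{
    assert(add >= 0);
    assert(ptr != NULL || add == 0);   /* NULL may only be "advanced" by 0 */
    return add > 0 ? (void*)((char*)ptr + add) : ptr;
}
```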
Nick Terrell
3daed7017a Revert "Work around nullptr-with-nonzero-offset warning"
This reverts commit c27fa399042f466080e79bb4fd8a4871bc0bcf28.
2023-09-28 17:35:05 -04:00
Dimitri Papadopoulos
fe34776c20
Fix new typos found by codespell 2023-09-23 18:56:01 +02:00
Nick Terrell
cdceb0fce5 Improve macro guards for ZSTD_assertValidSequence
Refine the macro guards to define the functions exactly when they are
needed.

This fixes the chromium build with zstd.

Thanks to @GregTho for reporting!
2023-09-22 16:36:14 -04:00
Dominik Loidolt
48b5a7bd8b Fix a very small formatting typo in the lib/README.md file 2023-09-19 16:22:47 +02:00
Yann Collet
3fc14e411b added some documentation on ZSTD_estimate*Size() variants
as a follow up for #3747
2023-09-13 11:35:19 -07:00
Yann Collet
607933a2ff minor simplification for dependency generation
also : fix zstd-nomt exclusion and test
2023-09-12 13:46:03 -07:00
Yann Collet
f4dbfce79c define LIB_SRCDIR and LIB_BINDIR 2023-09-12 13:46:03 -07:00
Yann Collet
feaa8ac50d renamed STATLIB into STATICLIB
for improved clarity
2023-09-12 13:46:03 -07:00
Yann Collet
b69d06a810 add include guards
alleviate risks of double inclusion (typically via transitive includes)
2023-09-12 13:46:03 -07:00
Yann Collet
1de57bb271
Merge pull request #3733 from ldv-alt/zdictlib_fix_prototype_mismatch
zdictlib: fix prototype mismatch
2023-09-10 06:31:52 -07:00
Nick Terrell
396ef5b434 Fix & refactor Huffman repeat tables for dictionaries
The Huffman repeat mode checker assumed that the CTable was zeroed in the region `[maxSymbolValue + 1, 256)`.
This assumption didn't hold for tables built in the dictionaries, because it didn't go through the same codepath.

Since this code was originally written, we added a header to the CTable that specifies the `tableLog`.
Add `maxSymbolValue` to that header, and check that the table's `maxSymbolValue` is at least the block's `maxSymbolValue`.

This solution is cleaner because we write this header for every CTable we build, so it can't be missed in any code path.

Credit to OSS-Fuzz
2023-08-25 13:21:58 -04:00
Nick Terrell
c27fa39904 Work around nullptr-with-nonzero-offset warning
See comment.
2023-08-25 13:20:59 -04:00
Dmitry V. Levin
ecb86d8286 zdictlib: fix prototype mismatch
Fix the following warnings reported by the compiler when
ZDICTLIB_STATIC_API is not defined to ZDICTLIB_API:

lib/dictBuilder/cover.c:1122:21: warning: redeclaration of 'ZDICT_optimizeTrainFromBuffer_cover' with different visibility (old visibility preserved)
lib/dictBuilder/cover.c:736:21: warning: redeclaration of 'ZDICT_trainFromBuffer_cover' with different visibility (old visibility preserved)
lib/dictBuilder/fastcover.c:549:1: warning: redeclaration of 'ZDICT_trainFromBuffer_fastCover' with different visibility (old visibility preserved)
lib/dictBuilder/fastcover.c:618:1: warning: redeclaration of 'ZDICT_optimizeTrainFromBuffer_fastCover' with different visibility (old visibility preserved)
2023-08-23 08:00:00 +00:00
Yann Collet
0fcb28c5d2
Merge pull request #3720 from QBos07/cygwin-msys2-support
Updated Makefiles for full MSYS2 and Cygwin installation and testing …
2023-08-22 16:29:34 -07:00
Nick Terrell
bd02c9be6e No longer reject dictionaries with literals maxSymbolValue < 255
We already have logic in our Huffman encoder to validate Huffman tables with missing symbols.
We use this for higher compression levels to re-use the previous block's statistics, or when the dictionary's table has zero-weighted symbols.
This check was left over as an oversight from before we added validation for Huffman tables.

I validated that the `dictionary_loader` fuzzer has coverage of every line in the `ZSTD_loadCEntropy()` function to validate that it is correctly testing this function.
2023-08-22 13:22:35 -04:00
W. Felix Handte
9987d2f594 Unpoison Workspace Memory Before Freeing to Custom Free
MSAN is hooked into the system malloc, but when the user provides a custom
allocator, it may not provide the same cleansing behavior. So if we leave
memory poisoned and return it to the user's allocator, where it is re-used
elsewhere, our poisoning can blow up in some other context.
2023-08-16 12:09:12 -04:00
W. Felix Handte
5f5bdc1e5d Easy: Move Helper Functions Up 2023-08-16 12:08:52 -04:00
Quentin Boswank
78dbba76b8 Updated Makefiles for full MSYS2 and Cygwin installation and testing support.
They are Linux-like environments under Windows and have all the tools needed to support staged installation and testing.

Beware: this only affects the make build system.
2023-08-13 19:44:15 +02:00
Jacob Greenfield
55ff3e4e17 Save one byte on the frame epilogue 2023-07-20 18:59:44 -04:00
Yann Collet
118200f7b9
Merge pull request #3677 from facebook/detectOverflow
Changed the decoding loop to detect more invalid cases of corruption sooner
2023-07-05 00:59:08 -07:00
Yann Collet
25822342be
Merge pull request #3688 from nidhijaju/hide-asm-apple
Hide ASM symbols on Apple platforms
2023-06-29 19:40:37 -07:00