2304 Commits

Author SHA1 Message Date
Adenilson Cavalcanti
a40bad8ec0 [zstd][leak] Avoid memory leak on early return of ZSTD_generateSequence
Sanity checks on a few of the context parameters (i.e. workers and block size)
may prompt an early return on ZSTD_generateSequences.

Allocating the destination buffer past those return points avoids a potential
memory leak.

This patch should fix issue #4112.
2024-08-06 18:01:20 -07:00
Dimitri Papadopoulos
44e83e9180
Fix typos not found by codespell 2024-06-20 20:16:25 +02:00
Dimitri Papadopoulos
2d736d9c50
Fix new typos found by codespell 2024-06-20 20:12:16 +02:00
Federico Maresca
5e9a6c2fe4
Refactor dictionary matchfinder index safety check (#4039) 2024-05-29 12:35:24 -04:00
Yann Collet
c6e5257240
Merge pull request #3977 from facebook/doc_advanced
Doc update
2024-03-21 12:33:15 -07:00
Nick Terrell
731f4b70fc Fix & fuzz ZSTD_generateSequences
This function was seriously flawed:
* It didn't do output bounds checks
* It produced invalid sequences when an uncompressed or RLE block was emitted
* It produced invalid sequences when the block splitter was enabled
* It produced invalid sequences when ZSTD_c_targetCBlockSize was enabled

I've attempted to fix these issues, but this function is just a bad idea,
so I've marked it as deprecated and unsafe. We should replace it with
`ZSTD_extractSequences()` which operates on a compressed frame.
2024-03-21 07:18:05 -07:00
Yann Collet
6f1215b874 fix ZSTD_TARGETCBLOCKSIZE_MIN test
when requested CBlockSize is too low,
bound it to the minimum
instead of returning an error.
2024-03-18 14:10:08 -07:00
Yann Collet
f5728da365 update targetCBlockSize documentation 2024-03-18 12:04:02 -07:00
Yann Collet
c8ab027227 reduce the amount of includes in "cover.h" 2024-03-13 11:29:28 -07:00
Yonatan Komornik
b20703f273
Updates ZSTD_RowFindBestMatch comment (#3947)
Updates the comment on the head of `ZSTD_RowFindBestMatch` to make sure it's aligned with recent changes to the hash table.
2024-03-12 15:10:07 -07:00
Yann Collet
aed172a8fe minor: fix incorrect debug level 2024-03-08 14:29:44 -08:00
Yann Collet
8d31e8ec42 sizeBlockSequences() also tracks uncompressed size
and only defines a sub-block boundary when
it believes that it is compressible.

It's effectively an optimization,
avoiding a compression cycle to reach the same conclusion.
2024-02-26 14:31:12 -08:00
Yann Collet
d23b95d21d minor refactor for clarity
since we can ensure that nbSubBlocks>0
2024-02-26 14:06:34 -08:00
Yann Collet
86db60752d optimization: bail out faster in presence of incompressible data 2024-02-26 13:27:59 -08:00
Yann Collet
ef82b214ad nit: comment indentation
as reported by @terrelln
2024-02-26 13:23:59 -08:00
Yann Collet
aa8592c532 minor: reformulate nbSubBlocks assignment 2024-02-26 13:21:14 -08:00
Yann Collet
e0412c2062 fix extraneous semicolon ';'
as reported by @terrelln
2024-02-26 12:26:54 -08:00
Yann Collet
1fafd0c4ae fix minor visual static analyzer warning
it's a false positive,
but change the code nonetheless to make it more obvious to the static analyzer.
2024-02-25 19:45:32 -08:00
Yann Collet
038a8a906b targetCBlockSize: modified splitting strategy to generate blocks of more regular size
notably avoiding to feature a larger first block
2024-02-25 17:39:29 -08:00
Yann Collet
f8372191f5 reduced minimum compressed block size
with the intention to match the transport layer size,
such as Ethernet and 4G mobile networks.
2024-02-24 01:59:16 -08:00
Yann Collet
4b51526412 fix partial block uncompressed 2024-02-24 01:24:58 -08:00
Yann Collet
6719794379 fixed some regressionTests
but not all
2024-02-23 18:48:29 -08:00
Yann Collet
0591e7eea1 minor: fix overly cautious conversion warning 2024-02-23 16:05:09 -08:00
Yann Collet
3b40100058 fix long sequences (> 64 KB) 2024-02-23 15:35:12 -08:00
Yann Collet
6b11fc436c fix issue with incompressible sections 2024-02-23 14:53:56 -08:00
Yann Collet
cc4530924b speed optimized version of targetCBlockSize
note that the size of individual compressed blocks will vary more wildly with this modification.
But it seems good enough for a first test, and fix the speed regression issue.
Further refinements can be attempted later.
2024-02-23 14:03:26 -08:00
Christoph Grüninger
b921f1aad6 Reduce scope of variables
This improves readability, keeps variables local, and
prevents the unintended use (e.g. typo) later on.
Found by Cppcheck (variableScope)
2024-02-11 22:00:03 +01:00
Yann Collet
b0e8580dc7 fix fuzz issue 5131069967892480 2024-02-08 16:38:20 -08:00
Yann Collet
22574d848d fix issue 5921623844651008
ossfuzz managed to create a scenario which triggers an `assert`.
This fixes it, by giving +1 more space for the backward search pass.
2024-02-06 13:01:14 -08:00
Yann Collet
b88c593d8f added or updated code comments
as suggested by @terrelln,
to make the code of the optimal parser a bit more understandable.
2024-02-05 18:32:25 -08:00
Yann Collet
6c35fb2e8c fix msan warnings 2024-02-05 01:21:06 -08:00
Yann Collet
641749fc09 fix uasan dictionary_stream_round_trip fuzz test 2024-02-05 00:36:10 -08:00
Yann Collet
fe2e2ad36d use ZSTD_memcpy()
which can be redirected in Linux kernel mode
2024-02-03 19:57:38 -08:00
Yann Collet
0ae21d8c31 removed trace control 2024-02-03 19:32:59 -08:00
Yann Collet
5474edbe60 fixed wrong assert
by introducing ZSTD_OPT_SIZE
2024-02-03 19:31:53 -08:00
Yann Collet
e5af24c5fa fixed wrong assert 2024-02-03 17:48:29 -08:00
Yann Collet
8168a451e5 minor optimization, mostly for clarity 2024-02-03 17:26:47 -08:00
Yann Collet
d31018e223 finally, a version that generalizes well
While it's not always strictly a win,
it's a win for files that see a noticeably compression ratio increase,
while it's a very small noise for other files.

Downside is, this patch is less efficient for 32-bit arrays of integer
than the previous patch which was introducing losses for other files,
but it's still a net improvement on this scenario.
2024-02-03 14:26:18 -08:00
Yann Collet
0166b2ba80 modification: differentiate literal update at pos+1
helps when litlen==1 is cheaper than litlen==0

works great on pathological arr[u32] examples
but doesn't generalize well on other files.

silesia/x-ray is amoung the most negatively affected ones.
2024-01-31 11:20:43 -08:00
Yann Collet
4683667785 refactor optimal parser
store stretches as intermediate solution instead of sequences.
makes it possible to link a solution to a predecessor.
2024-01-31 02:51:46 -08:00
Yann Collet
de10f56be2 improve high compression ratio for file like #3793
this works great for 32-bit arrays,
notably the synthetic ones, with extreme regularity,
unfortunately, it's not universal,
and in some cases, it's a loss.
Crucially, on average, it's a loss on silesia.
The most negatively impacted file is x-ray.
It deserves an investigation before suggesting it as an evolution.
2024-01-29 23:25:24 -08:00
Yann Collet
a07cae3976
Merge pull request #3847 from michoecho/fix_nullptr_deref_in_createCDict
Fix a nullptr dereference in ZSTD_createCDict_advanced2()
2023-12-30 13:23:39 -08:00
Elliot Gorokhovsky
c6cabf9441
Make offload API compatible with static CCtx (#3854)
* Add ZSTD_CCtxParams_registerSequenceProducer() to public API

* add unit test

* add docs to zstd.h

* nits

* Add ZSTDLIB_STATIC_API prefix

* Add asserts
2023-12-28 14:48:46 -05:00
Michał Chojnowski
9a3b17c4d6 Fix a nullptr dereference in ZSTD_createCDict_advanced2()
If the relevant allocation returns NULL, ZSTD_createCDict_advanced_internal()
will return NULL. But ZSTD_createCDict_advanced2() doesn't check for
this and attempts to use the returned pointer anyway, which leads to
a segfault.
2023-12-16 13:02:18 +01:00
Elliot Gorokhovsky
d151a4880b Move offload API params into ZSTD_CCtx_params 2023-11-27 08:11:01 -08:00
Elliot Gorokhovsky
809c7eb6bf Refactor ZSTD_sequenceProducer_F typedef to ZSTD_sequenceProducer_F* 2023-11-27 06:56:37 -08:00
Nick Terrell
8193250615 Modernize macros to use do { } while (0)
This PR introduces no functional changes. It attempts to change all
macros currently using `{ }` or some variant of that to to
`do { } while (0)`, and introduces trailing `;` where necessary.
There were no bugs found during this migration.

The bug in Visual Studios warning on this has been fixed since VS2015.
Additionally, we have several instances of `do { } while (0)` which have
been present for several releases, so we don't have to worry about
breaking peoples builds.

Fixes Issue #3830.
2023-11-21 20:05:17 -05:00
Yann Collet
d988e00a7f baby-step towards solving flexArray issue #3785
the flexArray in structure FSE_DecompressWksp
is just a way to derive a pointer easily,
without risk/complexity of calculating it manually.

Not sure if this change is good enough to avoid ubsan warnings though.
2023-10-18 16:21:39 -07:00
Yann Collet
6bb1688c1a extended the fix to ZSTDMT's Buffer Pool 2023-10-08 00:25:17 -07:00
Yann Collet
ea4027c003 removed unused macro constant 2023-10-07 23:32:22 -07:00