Elliot Gorokhovsky
3fe5f1fbb9
assert externalRepSearch != ZSTD_ps_auto
2023-02-01 18:24:46 -08:00
Elliot Gorokhovsky
7f8189ca57
add ZSTD_c_fastExternalSequenceParsing cctxParam
2023-02-01 09:09:53 -08:00
Elliot Gorokhovsky
64052ef57d
Guard against invalid sequences from external matchfinders ( #3465 )
2023-01-31 13:55:48 -05:00
daniellerozenblit
00176638e3
Merge pull request #3460 from daniellerozenblit/fix-long-offsets-resolution-pointer
...
fix long offset resolution
2023-01-30 14:02:51 -05:00
daniellerozenblit
2bde9fbf85
Update lib/compress/zstd_compress.c
...
Co-authored-by: Nick Terrell <nickrterrell@gmail.com>
2023-01-27 16:58:53 -05:00
Nick Terrell
423a74986f
[fse] Delete unused functions
...
Delete all unused FSE functions, now that we are no longer syncing
to/from upstream.
This avoids confusion about Zstd's stack usage like in Issue #3453 .
It also removes dead code, which is always a plus.
2023-01-27 13:15:07 -08:00
Danielle Rozenblit
9e4c66b9e9
record long offsets in ZSTD_symbolEncodingTypeStats_t + add test case
2023-01-27 12:04:29 -08:00
Danielle Rozenblit
814f4bfb99
fix long offset resolution
2023-01-27 08:21:47 -08:00
daniellerozenblit
f3255bfeff
Merge pull request #3447 from daniellerozenblit/fuzz-sequence-compression
...
Fuzz large offsets through sequence compression api
2023-01-25 09:27:34 -05:00
Yonatan Komornik
1d636b4ba0
Bug fix redzones by unpoisoning only the intended buffer and not the followup redzone.
2023-01-24 12:54:43 -08:00
Danielle Rozenblit
7d600c628a
fix bound check for ZSTD_copySequencesToSeqStoreNoBlockDelim()
2023-01-24 06:40:40 -08:00
daniellerozenblit
9116000be6
Merge pull request #3439 from daniellerozenblit/sequence-validation-bug-fix
...
Fix sequence validation and seqStore bounds check
2023-01-23 13:50:37 -05:00
Danielle Rozenblit
815d1d4eda
update external sequence error to fit error naming scheme
2023-01-23 09:58:34 -08:00
Danielle Rozenblit
1b65727e74
fix nits and add new error code for invalid external sequences
2023-01-23 07:59:02 -08:00
Nick Terrell
b4467c1061
Fix bufferless API with attached dictionary
...
Fixes #3102 .
2023-01-20 16:15:16 -08:00
Nick Terrell
329169189c
Replace Huffman boolean args with flags bit set
2023-01-20 14:12:53 -08:00
Nick Terrell
0cc1b0cb22
Delete unused Huffman functions
...
Remove all Huffman functions that aren't used by zstd.
2023-01-20 14:12:53 -08:00
Yann Collet
6742f20a7f
Merge pull request #3435 from facebook/c89build
...
added c89 build test to CI
2023-01-20 14:07:12 -08:00
Nick Terrell
666944fbe6
Cap hashLog & chainLog to ensure that we only use 32 bits of hash
...
* Cap shortCache chainLog to 24
* Cap row match finder hashLog so that rowLog <= 24
* Add unit tests to expose all cases. The row match finder unit tests
are only run in 64-bit mode, because they allocate ~1GB.
Fixes #3336
2023-01-20 14:05:26 -08:00
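The capping idea in commit 666944fbe6 above can be sketched as follows. This is a minimal illustration, not zstd's actual code: it assumes a table entry that packs an 8-bit tag next to the index, so index bits plus tag bits must fit inside a 32-bit hash. `HASH_BITS`, `TAG_BITS`, `cap_log`, and `split_hash` are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdint.h>

#define HASH_BITS 32u  /* only 32 bits of hash are available (per the commit) */
#define TAG_BITS   8u  /* assumption: an 8-bit tag is stored beside the index */

/* Clamp a requested table log so that log + TAG_BITS never exceeds HASH_BITS. */
static unsigned cap_log(unsigned requestedLog)
{
    unsigned const maxLog = HASH_BITS - TAG_BITS;  /* 24 under these assumptions */
    return requestedLog > maxLog ? maxLog : requestedLog;
}

/* Split the top (log + TAG_BITS) bits of a 32-bit hash into index and tag. */
static void split_hash(uint32_t hash, unsigned log, uint32_t* index, uint32_t* tag)
{
    uint32_t hashAndTag;
    assert(log >= 1 && log + TAG_BITS <= HASH_BITS);
    hashAndTag = hash >> (HASH_BITS - (log + TAG_BITS));
    *index = hashAndTag >> TAG_BITS;
    *tag   = hashAndTag & ((1u << TAG_BITS) - 1u);
}
```

Without the cap, a log above 24 would ask for more than 32 hash bits, and the shift amount in `split_hash` would underflow.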
Danielle Rozenblit
aa385ece13
fix sequence validation and bounds check in ZSTD_copySequencesToSeqStore()
2023-01-20 10:32:35 -08:00
Yann Collet
ea684c335a
added c89 build test to CI
2023-01-19 14:59:30 -08:00
Elliot Gorokhovsky
bce0382c82
Bugfixes for the External Matchfinder API ( #3433 )
...
* external matchfinder bugfixes + tests
* small doc fix
2023-01-19 10:41:24 -05:00
daniellerozenblit
dc1c6cc5df
Merge pull request #3418 from daniellerozenblit/fuzz-max-block-size
...
Fuzz on maxBlockSize
2023-01-19 08:18:04 -05:00
Danielle Rozenblit
8353a4b095
fix maxBlockSize resolution + add test cases
2023-01-17 12:24:18 -08:00
Yann Collet
ac45e078a5
add explanation about new test
...
as requested by @terrelln
2023-01-12 15:49:01 -08:00
Yann Collet
796699c0bc
fix root cause of #3416
...
A minor change in 5434de0 changed a `<=` into a `<`,
and as an indirect consequence allowed compression attempt of literals when there are only 6 literals to compress
(previous limit was effectively 7 literals).
This is not in itself a problem, as the threshold is merely a heuristic,
but it exposed a bug that has always been there, and was just never triggered so far due to the previous limit.
This bug would make the literal compressor believe that all literals are the same symbol,
but for the exact case where nbLiterals==6, plus a pretty wild combination of other limit conditions,
this outcome could be false, resulting in data corruption.
Replaced the blind heuristic by an actual test for all limit cases,
so that even if the threshold is changed again in the future,
the detection of RLE mode will remain reliable.
2023-01-12 15:41:08 -08:00
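The commit above (796699c0bc) replaces a size-based heuristic with an explicit check. The log does not show how zstd implements that check, so the following is only a sketch of the general idea: decide RLE by actually scanning the literals rather than inferring it from thresholds. `all_bytes_identical` is a hypothetical helper name.

```c
#include <stddef.h>

/* Returns 1 when every one of the `size` bytes equals the first byte,
 * 0 otherwise. An empty input is reported as "not RLE". */
static int all_bytes_identical(const unsigned char* src, size_t size)
{
    size_t i;
    if (size == 0) return 0;
    for (i = 1; i < size; i++) {
        if (src[i] != src[0]) return 0;  /* found a second distinct symbol */
    }
    return 1;
}
```

A direct scan like this stays correct regardless of where a size threshold is placed, which is the property the commit message asks for.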
Danielle Rozenblit
06b096db47
additional tests and documentation updates + allow maxBlockSize to be set to 0 (goes to default)
2023-01-12 13:41:50 -08:00
Danielle Rozenblit
53eb5a758c
add simple test for maxBlockSize expected functionality
2023-01-12 08:55:39 -08:00
Danielle Rozenblit
1fffcfe01d
update minimum threshold for max block size
2023-01-11 11:09:57 -08:00
Danielle Rozenblit
fe08137d9a
resolve max block value in cctx and use when calculating the max block size
2023-01-09 07:53:53 -08:00
Yann Collet
71dbe8f9d4
minor: fix conversion warnings
2023-01-04 20:00:04 -08:00
daniellerozenblit
d913417f72
Merge branch 'dev' into fuzz-max-block-size
2023-01-04 16:34:07 -05:00
Danielle Rozenblit
908e812733
initial commit
2023-01-04 13:01:54 -08:00
Yann Collet
ebba9ff425
update regression results
2023-01-03 14:04:23 -08:00
Yann Collet
5434de01e2
improve compression ratio of small alphabets
...
fix #3328
In situations where the alphabet size is very small,
the evaluation of literal costs from the Optimal Parser is initially incorrect.
It takes some time to converge, during which compression is less efficient.
This is especially important for small files,
because there will not be enough data to converge,
so most of the parsing is selected based on incorrect metrics.
After this patch, the scenario #3328 gets fixed,
delivering the expected 29 bytes compressed size (smallest known compressed size).
2023-01-03 12:22:37 -08:00
daniellerozenblit
1c818e3a0a
Merge pull request #3302 from daniellerozenblit/optimal-huff-depth-speed
...
Optimal huff depth speed improvements
2023-01-03 12:51:51 -05:00
Danielle Rozenblit
df714ddb0f
implement suggestions
2023-01-03 07:20:21 -08:00
Yann Collet
d07e72bb13
fixed incorrect assert
...
commented Fweight instead
2022-12-28 17:23:40 -08:00
Yann Collet
4a1a79a512
just add some comments to zstd_opt for improved clarity
2022-12-28 16:24:12 -08:00
Yann Collet
481a2e1010
Merge pull request #3403 from facebook/setCParams
...
ZSTD_CCtx_setCParams
2022-12-28 14:07:13 -08:00
Elliot Gorokhovsky
2a402626dd
External matchfinder API ( #3333 )
...
* First building commit with sample matchfinder
* Set up ZSTD_externalMatchCtx struct
* move seqBuffer to ZSTD_Sequence*
* support non-contiguous dictionary
* clean up parens
* add clearExternalMatchfinder, handle allocation errors
* Add useExternalMatchfinder cParam
* validate useExternalMatchfinder cParam
* Disable LDM + external matchfinder
* Check for static CCtx
* Validate mState and mStateDestructor
* Improve LDM check to cover both branches
* Error API with optional fallback
* handle RLE properly for external matchfinder
* nit
* Move to a CDict-like model for resource ownership
* Add hidden useExternalMatchfinder bool to CCtx_params_s
* Eliminate malloc, move to cwksp allocation
* Handle CCtx reset properly
* Ensure seqStore has enough space for external sequences
* fix capitalization
* Add DEBUGLOG statements
* Add compressionLevel param to matchfinder API
* fix c99 issues and add a param combination error code
* nits
* Test external matchfinder API
* C90 compat for simpleExternalMatchFinder
* Fix some @nocommits and an ASAN bug
* nit
* nit
* nits
* forward declare copySequencesToSeqStore functions in zstd_compress_internal.h
* nit
* nit
* nits
* Update copyright headers
* Fix CMake zstreamtest build
* Fix copyright headers (again)
* typo
* Add externalMatchfinder demo program to make contrib
* Reduce memory consumption for small blockSize
* ZSTD_postProcessExternalMatchFinderResult nits
* test sum(matchlen) + sum(litlen) == srcSize in debug builds
* refExternalMatchFinder -> registerExternalMatchFinder
* C90 nit
* zstreamtest nits
* contrib nits
* contrib nits
* allow block splitter + external matchfinder, refactor
* add windowSize param
* add contrib/externalMatchfinder/README.md
* docs
* go back to old RLE heuristic because of the first block issue
* fix initializer element is not a constant expression
* ref contrib from zstd.h
* extremely pedantic compiler warning fix, meson fix, typo fix
* Additional docs on API limitations
* minor nits
* Refactor maxNbSeq calculation into a helper function
* Fix copyright
2022-12-28 16:45:14 -05:00
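One item in the commit above (2a402626dd) is the debug-build check that sum(matchlen) + sum(litlen) == srcSize. A minimal sketch of such a check is shown below. The `Seq` struct is a simplified stand-in defined here for illustration, not the library's own sequence type, and `sequences_cover_source` is a hypothetical name.

```c
#include <stddef.h>

/* Simplified stand-in for a sequence produced by an external matchfinder. */
typedef struct {
    unsigned offset;       /* match offset (unused by this check) */
    unsigned litLength;    /* literals preceding the match */
    unsigned matchLength;  /* length of the match */
} Seq;

/* Returns 1 when the literal and match lengths add up to exactly srcSize,
 * 0 when they fall short or would overrun the source. */
static int sequences_cover_source(const Seq* seqs, size_t nbSeqs, size_t srcSize)
{
    size_t remaining = srcSize;
    size_t i;
    for (i = 0; i < nbSeqs; i++) {
        if (seqs[i].litLength > remaining) return 0;
        remaining -= seqs[i].litLength;
        if (seqs[i].matchLength > remaining) return 0;
        remaining -= seqs[i].matchLength;
    }
    return remaining == 0;
}
```

Counting down from `srcSize` instead of summing upward avoids any overflow in the running total when the sequences are malformed.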
Yann Collet
b17743e41b
Signal parameter change during MT compression
2022-12-28 13:14:58 -08:00
Yann Collet
89342d1e07
New xp library symbol : ZSTD_CCtx_setCParams()
...
Inspired by #3395 ,
offer a new capability to set all parameters defined in a ZSTD_compressionParameters structure
with a single symbol invocation
to improve user code brevity.
2022-12-27 23:49:22 -08:00
Yann Collet
089b2797e3
Merge pull request #3398 from facebook/fix3316
...
spec update : require minimum nb of literals for 4-streams mode
2022-12-22 16:57:05 -08:00
Yann Collet
6a9c525903
spec update : require minimum nb of literals for 4-streams mode
...
Reported by @shulib :
the specification for 4-streams mode
doesn't work when the amount of literals to compress is 5 bytes.
Extending it, it also doesn't work for sizes 1 or 2.
This patch updates the specification and the implementation
to require a minimum of 6 literals to trigger or accept the 4-streams mode.
The impact is expected to be a no-op :
the 4-streams mode is never triggered for such a small quantity of literals anyway,
since it would be wasteful (it costs ~7.3 bytes more than single-stream mode).
An informal lower limit is set at ~256 bytes,
so the technical minimum is very far from this limit.
This is just meant for completeness of the specification.
2022-12-22 16:14:34 -08:00
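The sizes named in the commit above (6a9c525903) can be reproduced with a little arithmetic. The sketch below assumes that each of the first three streams regenerates (n + 3) / 4 bytes and the fourth takes the remainder; that formula is my reading of the format, not something stated in this log. `four_streams_layout_is_valid` is a hypothetical name.

```c
#include <stddef.h>

/* Returns 1 when the first three streams fit within nbLiterals,
 * i.e. the fourth stream has a non-negative size; 0 otherwise.
 * Assumption: the first three streams each hold (nbLiterals + 3) / 4 bytes. */
static int four_streams_layout_is_valid(size_t nbLiterals)
{
    size_t const segmentSize = (nbLiterals + 3) / 4;
    return 3 * segmentSize <= nbLiterals;
}
```

Under that assumption the layout fails for exactly 1, 2, and 5 literals, matching the sizes the commit message lists, and it holds for every size from 6 upward, which is consistent with the minimum of 6 that the patch adopts.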
Yann Collet
ea2895cef4
Support decompression of compressed blocks of size ZSTD_BLOCKSIZE_MAX exactly
2022-12-22 12:40:27 -08:00
Nick Terrell
40a7188130
Fix make clangbuild & add CI
...
Fix the errors for:
* `-Wdocumentation`
* `-Wconversion` except `-Wsign-conversion`
2022-12-21 17:31:04 -08:00
Danielle Rozenblit
c26f348dc8
fix CI errors
2022-12-20 12:43:46 -08:00
Danielle Rozenblit
482689b995
huf log speed optimization: unidirectional scan of logs + break when regressing
2022-12-20 12:27:38 -08:00
W. Felix Handte
5d693cc38c
Coalesce Almost All Copyright Notices to Standard Phrasing
...
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f); do sed -i '/Copyright .* \(Yann Collet\)\|\(Meta Platforms\)/ s/Copyright .*/Copyright (c) Meta Platforms, Inc. and affiliates./' $f; done
git checkout HEAD -- build/VS2010/libzstd-dll/libzstd-dll.rc build/VS2010/zstd/zstd.rc tests/test-license.py contrib/linux-kernel/test/include/linux/xxhash.h examples/streaming_compression_thread_pool.c lib/legacy/zstd_v0*.c lib/legacy/zstd_v0*.h
nano ./programs/windres/zstd.rc
nano ./build/VS2010/zstd/zstd.rc
nano ./build/VS2010/libzstd-dll/libzstd-dll.rc
```
2022-12-20 12:52:34 -05:00