1493 Commits

Author SHA1 Message Date
Yann Collet
ffa66a6971 fix speed of --patch-from at high compression mode 2025-02-05 18:41:59 -08:00
Yann Collet
e117d79e22 fix minor alignment warning 2025-02-05 16:13:58 -08:00
Yann Collet
c39424ea87 fix minor alignment warning
this is a prototype definition error:
`_mm_storeu_si128()` should accept a `void*` pointer,
since it explicitly states that it accepts unaligned addresses
yet requiring a `__m128i*` tells otherwise, and requires the compiler the enforce this alignment.
2025-02-05 16:11:54 -08:00
Yann Collet
32dff04d32 fix one minor alignment warning
seems like a prototype interface error:
input parameter should have been `const void*`,
since the documentation is explicit that input doesn't have to be aligned,
but `const __m256i*` makes the compiler enforce it.
2025-02-05 15:46:44 -08:00
Yann Collet
f0b5f65bca fixed minor static function declaration issue
in AVX2 mode only
2025-01-18 22:49:16 -08:00
Yann Collet
19025f3da0
Merge pull request #4238 from szsam/patch-1
fix out-of-bounds array index access
2025-01-15 17:56:41 -08:00
Yann Collet
87f0a4fbe0 restore full equation
do not solve the equation, even though some members cancel each other,
this is done for clarity,
we'll let the compiler do the resolution at compile time.
2025-01-15 17:11:27 -08:00
Yann Collet
8bff69af86 Alignment instruction ZSTD_ALIGNED() in common/compiler.h 2025-01-15 17:11:27 -08:00
Yann Collet
2f3ee8b530 changed code compilation test to employ ZSTD_ARCH_X86_AVX2 2025-01-15 17:11:27 -08:00
Yann Collet
debe3d20d9 removed unused branch 2025-01-15 17:11:27 -08:00
Yann Collet
e3181cfd32 minor code doc update 2025-01-15 17:11:27 -08:00
Yann Collet
aa2cdf964f added compilation-time checks to ensure AVX2 code is valid
since it depends on a specific definition of ZSTD_Sequence structure.
2025-01-15 17:11:27 -08:00
Yann Collet
57a4554192 removed unused variable 2025-01-15 17:11:27 -08:00
Yann Collet
4aaf9cefe9 fix minor conversion warning 2025-01-15 17:11:27 -08:00
Yann Collet
db3d48823a no need for specialized variant
the branch is not in the hot loop
2025-01-15 17:11:27 -08:00
Yann Collet
cd53924eff removed erroneous #includes
that were automatically added by the editor without notification
2025-01-15 17:11:27 -08:00
Yann Collet
ed0a8b8be1 AVX2 version of ZSTD_get1BlockSummary() 2025-01-15 17:11:27 -08:00
Yann Collet
b6a4d5a8ba minor +10% speed improvement for scalar ZSTD_get1BlockSummary() 2025-01-15 17:11:27 -08:00
Yann Collet
8eb2587432 added benchmark for get1BlockSummary() 2025-01-15 17:11:27 -08:00
Yann Collet
8d62164589 control long length within AVX2 implementation 2025-01-15 17:11:27 -08:00
Yann Collet
d1f0e5fb97 fullbench can run a verification function
compressSequencesAndLiterals: fixed long lengths in scalar mode
2025-01-15 17:11:27 -08:00
Yann Collet
886720442f initial implementation (incomplete)
needs to take care of long lengths > 65535
2025-01-15 17:11:27 -08:00
Mingjie Shen
afff3d2cce
return error if block delimiter is not found 2025-01-13 20:52:06 -05:00
Mingjie Shen
e490be895c
fix out-of-bounds array index access 2025-01-13 16:39:34 -05:00
Victor Zhang
d88651e604 Do not vary row matchfinder selection based on availability of SSE2/Neon
Move towards a stronger guarantee of reproducibility by removing this small difference for machines without SSE2/Neon.
The SIMD behavior is now the default for all platforms.
2025-01-03 09:35:18 -08:00
Nick Terrell
1548bfc349 [opt] Fix too short of match getting generated
The optimal parser with LDM enabled using minMatch > 3 could generate a match
length of 3 when minMatch >= 4. This is not allowed.

1. Fix the bug
2. Add validation logic to `ZSTD_buildSeqStore()` in debug mode for all block
   compressors that checks we never generate too short a match. This way we don't
   rely on the `generate_sequences` fuzzer to find this issue.

Credit to OSS-Fuzz
2025-01-03 11:38:41 -05:00
Yann Collet
47cbfc87a9 restore invocation of ZSTD_entropyCompressSeqStore()
in the ZSTD_compressSequences() pipeline
2024-12-20 10:37:01 -08:00
Yann Collet
522adc34eb minor: use MEM_writeLE24()
so that an empty frame needs only 3 bytes of dstCapacity.
2024-12-20 10:37:01 -08:00
Yann Collet
b7a9e69d8d added parameter litCapacity
to ZSTD_compressSequencesAndLiterals()
to enforce the litCapacity >= litSize+8 condition.
2024-12-20 10:37:01 -08:00
Yann Collet
76445bb379 add a check, to return an error if Sequence validation is enabled
since ZSTD_compressSequencesAndLiterals() doesn't support it.
2024-12-20 10:37:01 -08:00
Yann Collet
ab0f1798e8 ensure that srcSize is controlled 2024-12-20 10:37:00 -08:00
Yann Collet
b339efff2b add dedicated error code for special case
ZSTD_compressSequencesAndLiterals() cannot produce an uncompressed block
2024-12-20 10:37:00 -08:00
Yann Collet
0a54f6f288 ZSTD_compressSequencesAndLiterals requires srcSize as parameter
this makes it possible to adjust windowSize to its tightest.
2024-12-20 10:37:00 -08:00
Yann Collet
b7b4e86347 fixed minor conversion warning 2024-12-20 10:37:00 -08:00
Yann Collet
12c47d3262 improved speed of the Sequences converter 2024-12-20 10:37:00 -08:00
Yann Collet
95ad9e47ff added benchmark for ZSTD_convertBlockSequences_wBlockDelim() 2024-12-20 10:37:00 -08:00
Yann Collet
d48e330ae1 change name to ZSTD_convertSequences*() 2024-12-20 10:37:00 -08:00
Yann Collet
31b5ef2539 ZSTD_compressSequencesAndLiterals() now supports multi-blocks frames. 2024-12-20 10:36:59 -08:00
Yann Collet
5164d44dab change advanced parameter name: ZSTD_c_repcodeResolution
and updated its documentation.
Note: older name ZSTD_c_searchForExternalRepcodes remains supported via #define
2024-12-20 10:36:59 -08:00
Yann Collet
ca8bd83373 minor: cleaner function parameter repcodeResolution 2024-12-20 10:36:59 -08:00
Yann Collet
d2d0fdac42 updated documentation on validateSequence 2024-12-20 10:36:59 -08:00
Yann Collet
1f6d6815c3 optimization: instantiate specialized version without Sequence checking code
results in +4% compression speed,
thanks to removal of branches in the hot loop.
2024-12-20 10:36:59 -08:00
Yann Collet
a288751de7 minor optimization: only track seqPos->posInSrc when validateSequences is enabled
note: very minor saving, no performance impact
2024-12-20 10:36:59 -08:00
Yann Collet
f281497aef fullbench: new scenario: compressSequencesAndLiterals() 2024-12-20 10:36:59 -08:00
Yann Collet
8ab04097ed add the compressSequences() benchmark scenario 2024-12-20 10:36:59 -08:00
Yann Collet
0b013b2688 added unit tests to ZSTD_compressSequencesAndLiterals()
seems to work as expected,
correctly control that `litSize` and `srcSize` are exactly correct.
2024-12-20 10:36:58 -08:00
Yann Collet
14a21e43b3 produced ZSTD_compressSequencesAndLiterals() as a separate pipeline
only supports explicit delimiter mode, at least for the time being
2024-12-20 10:36:58 -08:00
Yann Collet
bcb15091aa minor: more accurate variable scope 2024-12-20 10:36:58 -08:00
Yann Collet
047db4f1f8 ZSTD_SequenceCopier_f no returns the nb of bytes consumed from input
which feels much more natural
2024-12-20 10:36:58 -08:00
Yann Collet
4ef9d7d585 codemod: ZSTD_cParamMode_e -> ZSTD_CParamMode_e 2024-12-20 10:36:58 -08:00