Yann Collet
d48e330ae1
change name to ZSTD_convertSequences*()
2024-12-20 10:37:00 -08:00
Yann Collet
31b5ef2539
ZSTD_compressSequencesAndLiterals() now supports multi-blocks frames.
2024-12-20 10:36:59 -08:00
Yann Collet
5164d44dab
change advanced parameter name: ZSTD_c_repcodeResolution
...
and updated its documentation.
Note: older name ZSTD_c_searchForExternalRepcodes remains supported via #define
2024-12-20 10:36:59 -08:00
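The rename above keeps the older name working through a `#define`. A minimal sketch of that pattern follows; the `DEMO_` identifiers and enum values are hypothetical and are not taken from the zstd headers.

```c
/* Hypothetical parameter enum, for illustration only (not the zstd header). */
typedef enum {
    DEMO_c_compressionLevel  = 100,
    DEMO_c_repcodeResolution = 1016   /* new, preferred name */
} DEMO_cParameter;

/* The older name is kept as an alias, so existing callers still compile. */
#define DEMO_c_searchForExternalRepcodes DEMO_c_repcodeResolution

/* Returns 1 if `param` selects the repcode-resolution setting, 0 otherwise. */
static int demo_isRepcodeParam(DEMO_cParameter param)
{
    return param == DEMO_c_repcodeResolution;
}
```

Because the alias is resolved by the preprocessor, both names produce the same value and the same code.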
Yann Collet
ca8bd83373
minor: cleaner function parameter repcodeResolution
2024-12-20 10:36:59 -08:00
Yann Collet
d2d0fdac42
updated documentation on validateSequence
2024-12-20 10:36:59 -08:00
Yann Collet
1f6d6815c3
optimization: instantiate specialized version without Sequence checking code
...
results in +4% compression speed,
thanks to removal of branches in the hot loop.
2024-12-20 10:36:59 -08:00
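One common way to obtain a specialized version without checking code is to write a single inline body that takes a constant flag, then instantiate it twice. The sketch below illustrates that general technique under assumed names (`DemoSeq`, `demo_consumeSeqs_*`, `DEMO_ERROR` are all hypothetical); it is not the code from this commit and makes no claim about the measured speed gain.

```c
#include <stddef.h>

#define DEMO_ERROR ((size_t)-1)

/* Hypothetical sequence record, for illustration only. */
typedef struct { unsigned litLength; unsigned matchLength; unsigned offset; } DemoSeq;

/* Shared body. `validate` is a constant at each call site below, so an
 * optimizing compiler can drop the checking branches from the
 * non-validating instance once the body is inlined. */
static inline size_t demo_consumeSeqs_body(const DemoSeq* seqs, size_t nbSeqs,
                                           size_t srcSize, const int validate)
{
    size_t pos = 0;
    size_t n;
    for (n = 0; n < nbSeqs; n++) {
        size_t const len = (size_t)seqs[n].litLength + seqs[n].matchLength;
        if (validate) {
            /* the sequence must not run past the end of the source */
            if (len > srcSize - pos) return DEMO_ERROR;
            /* a match must not reach before the start of the source */
            if (seqs[n].matchLength > 0
             && seqs[n].offset > pos + seqs[n].litLength) return DEMO_ERROR;
        }
        pos += len;
    }
    return pos;  /* number of source bytes covered by the sequences */
}

static size_t demo_consumeSeqs_checked(const DemoSeq* seqs, size_t nbSeqs, size_t srcSize)
{
    return demo_consumeSeqs_body(seqs, nbSeqs, srcSize, 1);
}

static size_t demo_consumeSeqs_unchecked(const DemoSeq* seqs, size_t nbSeqs, size_t srcSize)
{
    return demo_consumeSeqs_body(seqs, nbSeqs, srcSize, 0);
}

typedef size_t (*DemoSeqConsumer_f)(const DemoSeq*, size_t, size_t);

/* Pick the instance once, outside the hot loop. */
static DemoSeqConsumer_f demo_selectConsumer(int validateSequences)
{
    return validateSequences ? demo_consumeSeqs_checked : demo_consumeSeqs_unchecked;
}
```

The selection happens once per call rather than once per sequence, which is what keeps the branch out of the loop.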
Yann Collet
a288751de7
minor optimization: only track seqPos->posInSrc when validateSequences is enabled
...
note: very minor saving, no performance impact
2024-12-20 10:36:59 -08:00
Yann Collet
f281497aef
fullbench: new scenario: compressSequencesAndLiterals()
2024-12-20 10:36:59 -08:00
Yann Collet
8ab04097ed
add the compressSequences() benchmark scenario
2024-12-20 10:36:59 -08:00
Yann Collet
0b013b2688
added unit tests to ZSTD_compressSequencesAndLiterals()
...
seems to work as expected,
correctly checks that `litSize` and `srcSize` are exactly correct.
2024-12-20 10:36:58 -08:00
Yann Collet
14a21e43b3
produced ZSTD_compressSequencesAndLiterals() as a separate pipeline
...
only supports explicit delimiter mode, at least for the time being
2024-12-20 10:36:58 -08:00
Yann Collet
bcb15091aa
minor: more accurate variable scope
2024-12-20 10:36:58 -08:00
Yann Collet
047db4f1f8
ZSTD_SequenceCopier_f now returns the nb of bytes consumed from input
...
which feels much more natural
2024-12-20 10:36:58 -08:00
Yann Collet
4ef9d7d585
codemod: ZSTD_cParamMode_e -> ZSTD_CParamMode_e
2024-12-20 10:36:58 -08:00
Yann Collet
56cfb7816a
codemod: ZSTD_paramSwitch_e -> ZSTD_ParamSwitch_e
2024-12-20 10:36:58 -08:00
Yann Collet
13b9296d79
minor simplification
2024-12-20 10:36:58 -08:00
Yann Collet
08edecb78c
codemod: ZSTD_blockCompressor -> ZSTD_BlockCompressor_f
2024-12-20 10:36:57 -08:00
Yann Collet
25bef24c5c
codemod: rawSeqStore_t -> RawSeqStore_t
2024-12-20 10:36:57 -08:00
Yann Collet
41c667c0fd
codemod: repcodes_t -> Repcodes_t
2024-12-20 10:36:57 -08:00
Yann Collet
5df80acedb
codemod: ZSTD_matchState_t -> ZSTD_MatchState_t
2024-12-20 10:36:57 -08:00
Yann Collet
fa468944f2
codemod: ZSTD_buildSeqStore_e -> ZSTD_BuildSeqStore_e
2024-12-20 10:36:57 -08:00
Yann Collet
30671d77af
codemod: ZSTD_sequencePosition -> ZSTD_SequencePosition
2024-12-20 10:36:57 -08:00
Yann Collet
5359d16d8d
enable proper type
2024-12-20 10:36:57 -08:00
Yann Collet
76dd3a98c4
scope: ZSTD_copySequencesToSeqStore*() are private to ZSTD_compress.c
...
no need to publish them outside of this unit.
2024-12-20 10:36:57 -08:00
Yann Collet
1ac79ba1b6
minor: simplify ZSTD_selectSequenceCopier
2024-12-20 10:36:56 -08:00
Yann Collet
894ea31281
codemod: ZSTD_sequenceCopier -> ZSTD_SequenceCopier_f
2024-12-20 10:36:56 -08:00
Yann Collet
c97522f7fb
codemod: ZSTD_sequenceFormat_e -> ZSTD_SequenceFormat_e
...
since it's a type name.
Note: in contrast with previous names, this one is on the Public API side.
So there is a #define, so that existing programs using ZSTD_sequenceFormat_e still work.
2024-12-20 10:36:56 -08:00
Yann Collet
0165eeb441
created ZSTD_entropyCompressSeqStore_wExtLitBuffer()
...
can receive externally defined buffer of literals
2024-12-20 10:36:56 -08:00
Yann Collet
e9f8a119b4
ZSTD_entropyCompressSeqStore_internal() can accept an externally defined literals buffer
2024-12-20 10:36:56 -08:00
Yann Collet
0442e43aca
codemod: ZSTD_defaultPolicy_e -> ZSTD_DefaultPolicy_e
2024-12-20 10:36:56 -08:00
Yann Collet
477a01067f
codemod: symbolEncodingType_e -> SymbolEncodingType_e
2024-12-20 10:36:56 -08:00
Yann Collet
a2245721ca
codemod: seqStore_t -> SeqStore_t
...
same idea, SeqStore_t is a type name, it should start with a Capital letter.
2024-12-20 10:36:55 -08:00
Yann Collet
9671813375
codemod: seqDef -> SeqDef
...
SeqDef is a type name, so it should start with a Capital letter.
It's an internal symbol, no impact on public API.
2024-12-20 10:36:55 -08:00
Yann Collet
b4a40a845f
move Sequences definition to zstd_compress_internal.h
...
they should not be in common/zstd_internal.h,
since these definitions are not shared beyond lib/compress/.
2024-12-20 10:36:55 -08:00
Yann Collet
a00f45a037
created ZSTD_storeSeqOnly()
...
makes it possible to register a sequence without copying its literals.
2024-12-20 10:36:04 -08:00
Yann Collet
bbaba45589
change experimental parameter name
...
from ZSTD_c_useBlockSplitter to ZSTD_c_splitAfterSequences.
2024-10-31 13:43:40 -07:00
Yann Collet
4f93206d62
changed variable name to ZSTD_c_blockSplitterLevel
...
suggested by @terrelln
2024-10-29 11:12:09 -07:00
Yann Collet
fcbf6b014a
fixed minor conversion warning
2024-10-28 16:47:38 -07:00
Yann Collet
37706a677c
added a test
...
test both that the new parameter works as intended,
and that the over-split protection works as intended
2024-10-28 16:31:15 -07:00
Yann Collet
226ae73311
expose new parameter ZSTD_c_blockSplitter_level
2024-10-28 16:31:15 -07:00
Yann Collet
01474bf73b
add internal compression parameter preBlockSplitter_level
...
not yet exposed to the interface.
Also: renames `useBlockSplitter` to `postBlockSplitter`
to better qualify the difference between the 2 settings.
2024-10-28 16:31:15 -07:00
Yann Collet
e557abc8a0
new block splitting variant _fromBorders
...
less precise but still suitable for `fast` strategy.
2024-10-25 16:13:55 -07:00
Yann Collet
da2c0dffd8
add faster block splitting heuristic, suitable for dfast strategy
2024-10-24 14:37:00 -07:00
Yann Collet
ca6e55cbf5
reduce splitBlock arguments
2024-10-24 13:17:56 -07:00
Yann Collet
566763fdc9
new variant, sampling by 11
2024-10-24 13:17:56 -07:00
Yann Collet
90095f056d
apply limit conditions for all splitting strategies
...
instead of just for blind split.
This is in anticipation of adversarial input,
that would intentionally target the sampling pattern of the split detector.
Note that, even without this protection, splitting can never expand beyond ZSTD_COMPRESSBOUND(),
because this upper limit uses a 1KB block size worst case scenario,
and splitting never creates blocks that small.
The protection is more to ensure that data is not expanded by more than 3 bytes per 128 KB full block,
which is a much stricter limit.
2024-10-24 11:36:56 -07:00
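The two limits mentioned above can be compared with a little arithmetic. The sketch below assumes that the worst-case cost of a block is its 3-byte block header (the figure quoted in the message); the `demo_` names are hypothetical.

```c
#include <stddef.h>

#define DEMO_BLOCK_HEADER_SIZE 3   /* assumed per-block cost, in bytes */

/* Total header bytes when `srcSize` bytes are cut into blocks of
 * `blockSize` bytes. `blockSize` must be non-zero. */
static size_t demo_headerOverhead(size_t srcSize, size_t blockSize)
{
    size_t const nbBlocks = (srcSize + blockSize - 1) / blockSize;  /* round up */
    return nbBlocks * DEMO_BLOCK_HEADER_SIZE;
}
```

Under this assumption, one unsplit 128 KB block costs `demo_headerOverhead(128 << 10, 128 << 10)`, that is 3 bytes, while a 1 KB block size over the same input costs `demo_headerOverhead(128 << 10, 1 << 10)`, that is 384 bytes. This is why the 3-bytes-per-full-block target is the much stricter of the two.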
Yann Collet
c80645a055
stricter limits to ensure expansion factor with blind-split strategy
...
issue reported by @terrelln
2024-10-23 14:55:10 -07:00
Yann Collet
7d3e5e3ba1
split all full 128 KB blocks
...
this helps make the streaming behavior more consistent,
since it no longer depends on having more data presented on the input.
suggested by @terrelln
2024-10-23 14:18:48 -07:00
Yann Collet
b68ddce818
rewrite fingerprint storage to no longer need 64-bit members
...
so that it can be stored using standard alignment requirement (sizeof(void*)).
Distance function still requires 64-bit signed multiplication though,
so it won't change the issue regarding the bug in ubsan for clang 32-bit on github ci.
2024-10-23 11:50:57 -07:00
Yann Collet
0be334d208
fixes static state allocation check
...
detected by @felixhandte
2024-10-23 11:50:57 -07:00