Yann Collet
14a21e43b3
produced ZSTD_compressSequencesAndLiterals() as a separate pipeline
...
only supports explicit delimiter mode, at least for the time being
2024-12-20 10:36:58 -08:00
Yann Collet
bcb15091aa
minor: more accurate variable scope
2024-12-20 10:36:58 -08:00
Yann Collet
047db4f1f8
ZSTD_SequenceCopier_f now returns the nb of bytes consumed from input
...
which feels much more natural
2024-12-20 10:36:58 -08:00
Yann Collet
4ef9d7d585
codemod: ZSTD_cParamMode_e -> ZSTD_CParamMode_e
2024-12-20 10:36:58 -08:00
Yann Collet
56cfb7816a
codemod: ZSTD_paramSwitch_e -> ZSTD_ParamSwitch_e
2024-12-20 10:36:58 -08:00
Yann Collet
13b9296d79
minor simplification
2024-12-20 10:36:58 -08:00
Yann Collet
08edecb78c
codemod: ZSTD_blockCompressor -> ZSTD_BlockCompressor_f
2024-12-20 10:36:57 -08:00
Yann Collet
25bef24c5c
codemod: rawSeqStore_t -> RawSeqStore_t
2024-12-20 10:36:57 -08:00
Yann Collet
41c667c0fd
codemod: repcodes_t -> Repcodes_t
2024-12-20 10:36:57 -08:00
Yann Collet
5df80acedb
codemod: ZSTD_matchState_t -> ZSTD_MatchState_t
2024-12-20 10:36:57 -08:00
Yann Collet
fa468944f2
codemod: ZSTD_buildSeqStore_e -> ZSTD_BuildSeqStore_e
2024-12-20 10:36:57 -08:00
Yann Collet
30671d77af
codemod: ZSTD_sequencePosition -> ZSTD_SequencePosition
2024-12-20 10:36:57 -08:00
Yann Collet
5359d16d8d
enable proper type
2024-12-20 10:36:57 -08:00
Yann Collet
76dd3a98c4
scope: ZSTD_copySequencesToSeqStore*() are private to ZSTD_compress.c
...
no need to publish them outside of this unit.
2024-12-20 10:36:57 -08:00
Yann Collet
1ac79ba1b6
minor: simplify ZSTD_selectSequenceCopier
2024-12-20 10:36:56 -08:00
Yann Collet
894ea31281
codemod: ZSTD_sequenceCopier -> ZSTD_SequenceCopier_f
2024-12-20 10:36:56 -08:00
Yann Collet
c97522f7fb
codemod: ZSTD_sequenceFormat_e -> ZSTD_SequenceFormat_e
...
since it's a type name.
Note: in contrast with previous names, this one is on the Public API side.
So there is a #define, so that existing programs using ZSTD_sequenceFormat_e still work.
2024-12-20 10:36:56 -08:00
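Note on c97522f7fb: the compatibility `#define` described above can be sketched as follows. This is a minimal illustration, not the actual zstd header: the `Demo_` type, its enum values, and the helper function are hypothetical names introduced only for this example.

```c
#include <assert.h>

/* New, capitalized type name.
 * Hypothetical enum and values, for illustration only. */
typedef enum {
    Demo_sf_noBlockDelimiters = 0,
    Demo_sf_explicitBlockDelimiters = 1
} Demo_SequenceFormat_e;

/* Backward-compatibility alias:
 * programs written against the old spelling still compile. */
#define Demo_sequenceFormat_e Demo_SequenceFormat_e

/* Legacy caller (hypothetical), still using the old type name. */
static int demo_isExplicit(Demo_sequenceFormat_e format)
{
    return format == Demo_sf_explicitBlockDelimiters;
}
```

Since the old name is only a preprocessor alias, both spellings designate the same type, so no conversion is needed at call sites.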
Yann Collet
0165eeb441
created ZSTD_entropyCompressSeqStore_wExtLitBuffer()
...
can receive externally defined buffer of literals
2024-12-20 10:36:56 -08:00
Yann Collet
e9f8a119b4
ZSTD_entropyCompressSeqStore_internal() can accept an externally defined literals buffer
2024-12-20 10:36:56 -08:00
Yann Collet
0442e43aca
codemod: ZSTD_defaultPolicy_e -> ZSTD_DefaultPolicy_e
2024-12-20 10:36:56 -08:00
Yann Collet
477a01067f
codemod: symbolEncodingType_e -> SymbolEncodingType_e
2024-12-20 10:36:56 -08:00
Yann Collet
a2245721ca
codemod: seqStore_t -> SeqStore_t
...
same idea, SeqStore_t is a type name, it should start with a Capital letter.
2024-12-20 10:36:55 -08:00
Yann Collet
9671813375
codemod: seqDef -> SeqDef
...
SeqDef is a type name, so it should start with a Capital letter.
It's an internal symbol, no impact on public API.
2024-12-20 10:36:55 -08:00
Yann Collet
b4a40a845f
move Sequences definition to zstd_compress_internal.h
...
they should not be in common/zstd_internal.h,
since these definitions are not shared beyond lib/compress/.
2024-12-20 10:36:55 -08:00
Yann Collet
a00f45a037
created ZSTD_storeSeqOnly()
...
makes it possible to register a sequence without copying its literals.
2024-12-20 10:36:04 -08:00
Yann Collet
bbaba45589
change experimental parameter name
...
from ZSTD_c_useBlockSplitter to ZSTD_c_splitAfterSequences.
2024-10-31 13:43:40 -07:00
Yann Collet
4f93206d62
changed variable name to ZSTD_c_blockSplitterLevel
...
suggested by @terrelln
2024-10-29 11:12:09 -07:00
Yann Collet
fcbf6b014a
fixed minor conversion warning
2024-10-28 16:47:38 -07:00
Yann Collet
37706a677c
added a test
...
test both that the new parameter works as intended,
and that the over-split protection works as intended
2024-10-28 16:31:15 -07:00
Yann Collet
226ae73311
expose new parameter ZSTD_c_blockSplitter_level
2024-10-28 16:31:15 -07:00
Yann Collet
01474bf73b
add internal compression parameter preBlockSplitter_level
...
not yet exposed to the interface.
Also: renames `useBlockSplitter` to `postBlockSplitter`
to better qualify the difference between the 2 settings.
2024-10-28 16:31:15 -07:00
Yann Collet
e557abc8a0
new block splitting variant _fromBorders
...
less precise but still suitable for `fast` strategy.
2024-10-25 16:13:55 -07:00
Yann Collet
da2c0dffd8
add faster block splitting heuristic, suitable for dfast strategy
2024-10-24 14:37:00 -07:00
Yann Collet
ca6e55cbf5
reduce splitBlock arguments
2024-10-24 13:17:56 -07:00
Yann Collet
566763fdc9
new variant, sampling by 11
2024-10-24 13:17:56 -07:00
Yann Collet
90095f056d
apply limit conditions for all splitting strategies
...
instead of just for blind split.
This is in anticipation of adversarial input,
that would intentionally target the sampling pattern of the split detector.
Note that, even without this protection, splitting can never expand beyond ZSTD_COMPRESSBOUND(),
because this upper limit uses a 1KB block size worst case scenario,
and splitting never creates blocks that small.
The protection is more to ensure that data is not expanded by more than 3-bytes per 128 KB full block,
which is a much stricter limit.
2024-10-24 11:36:56 -07:00
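Note on 90095f056d: the arithmetic behind the two limits can be sketched as below. It assumes that, for inputs of at least 128 KB, the margin granted by ZSTD_COMPRESSBOUND() is its `srcSize >> 8` term, and that each block stored raw costs one 3-byte block header. The `demo_` helpers are hypothetical and not part of zstd.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_BLOCK_HEADER_SIZE 3              /* a zstd block header is 3 bytes */
#define DEMO_FULL_BLOCK ((size_t)128 << 10)   /* 128 KB full block */

/* Margin over srcSize allowed by the bound,
 * assuming it is the (srcSize >> 8) term (srcSize >= 128 KB). */
static size_t demo_boundMargin(size_t srcSize)
{
    return srcSize >> 8;
}

/* Worst-case expansion when a block is cut into nbParts pieces
 * and every piece ends up stored raw (incompressible data). */
static size_t demo_rawSplitOverhead(size_t nbParts)
{
    return nbParts * DEMO_BLOCK_HEADER_SIZE;
}
```

Under these assumptions, a full block gets a margin of 512 bytes, while cutting it into 128 raw pieces of 1 KB each costs 384 bytes, so the bound holds. The stricter target of 3 bytes per full block corresponds to the cost of a single raw block header.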
Yann Collet
c80645a055
stricter limits to ensure expansion factor with blind-split strategy
...
issue reported by @terrelln
2024-10-23 14:55:10 -07:00
Yann Collet
7d3e5e3ba1
split all full 128 KB blocks
...
this helps make the streaming behavior more consistent,
since it no longer depends on having more data presented on the input.
suggested by @terrelln
2024-10-23 14:18:48 -07:00
Yann Collet
b68ddce818
rewrite fingerprint storage to no longer need 64-bit members
...
so that it can be stored using standard alignment requirement (sizeof(void*)).
Distance function still requires 64-bit signed multiplication though,
so it won't change the issue regarding the bug in ubsan for clang 32-bit on github ci.
2024-10-23 11:50:57 -07:00
Yann Collet
0be334d208
fixes static state allocation check
...
detected by @felixhandte
2024-10-23 11:50:57 -07:00
Yann Collet
ea85dc7af6
conservatively estimate over-splitting in presence of incompressible loss
...
ensure data can never be expanded by more than 3 bytes per full block.
2024-10-23 11:50:57 -07:00
Yann Collet
5ae34e4c96
ensure lastBlock is correctly determined
...
reported by @terrelln
2024-10-23 11:50:57 -07:00
Yann Collet
a167571db5
added a faster block splitter variant
...
that samples 1 in 5 positions.
This variant is fast enough for lazy2 and btlazy2,
but it's less good in combination with post-splitter at higher levels (>= btopt).
2024-10-23 11:50:57 -07:00
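Note on a167571db5: the idea of sampling 1 in 5 positions can be sketched as below. This is a simplified illustration under assumptions, not the splitter's actual code: the `Demo_Fingerprint` structure, the byte-value histogram, and `demo_recordSampled()` are hypothetical, and the real fingerprint may hash its input differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fingerprint: histogram of sampled byte values. */
typedef struct {
    unsigned events[256];
    size_t nbEvents;
} Demo_Fingerprint;

/* Record one event every `samplingRate` positions.
 * samplingRate == 1 scans every byte; samplingRate == 5 reads 1 position in 5,
 * trading precision for speed. */
static void demo_recordSampled(Demo_Fingerprint* fp,
                               const void* src, size_t srcSize,
                               size_t samplingRate)
{
    const unsigned char* const p = (const unsigned char*)src;
    size_t n;
    assert(samplingRate >= 1);
    memset(fp, 0, sizeof(*fp));
    for (n = 0; n < srcSize; n += samplingRate) {
        fp->events[p[n]]++;
        fp->nbEvents++;
    }
}
```

A sparser sampling touches fewer bytes per block, which is presumably why such a variant can keep up with faster strategies, at the cost of a coarser estimate of where the statistics change.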
Yann Collet
4ce91cbf2b
fixed workspace alignment on non 64-bit systems
2024-10-23 11:50:57 -07:00
Yann Collet
cae8d13294
splitter workspace is now provided by ZSTD_CCtx*
2024-10-23 11:50:56 -07:00
Yann Collet
73a6653653
ZSTD_splitBlock_4k() uses externally provided workspace
...
ideally, this workspace would be provided from the ZSTD_CCtx* state
2024-10-23 11:50:56 -07:00
Yann Collet
20c3d176cd
fix assert
2024-10-23 11:50:56 -07:00
Yann Collet
0d4b520657
only split full blocks
...
short term simplification
2024-10-23 11:50:56 -07:00
Yann Collet
f83ed087f6
fixed RLE detection test
2024-10-23 11:50:56 -07:00
Yann Collet
83a3402a92
fix overlap write scenario in presence of incompressible data
2024-10-23 11:50:56 -07:00