sharpetronics/zstd

mirror of https://github.com/facebook/zstd.git synced 2025-10-08 00:04:02 -04:00

Author	SHA1	Message	Date
Yann Collet	c97522f7fb	codemod: ZSTD_sequenceFormat_e -> ZSTD_SequenceFormat_e since it's a type name. Note: in contrast with previous names, this one is on the Public API side. So there is a #define, so that existing programs using ZSTD_sequenceFormat_e still work.	2024-12-20 10:36:56 -08:00
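The compatibility `#define` described in c97522f7fb can be sketched as follows. This is a minimal illustration, not the actual text of zstd.h: the two type names come from the commit message, while the enumerator names and the `legacy_is_format_b()` helper are hypothetical placeholders.

```c
#include <assert.h>

/* New, capitalized type name (from the commit message).
 * The enumerator names below are hypothetical placeholders. */
typedef enum {
    demo_formatA = 0,
    demo_formatB = 1
} ZSTD_SequenceFormat_e;

/* Alias so that programs written against the old spelling still compile. */
#define ZSTD_sequenceFormat_e ZSTD_SequenceFormat_e

/* Hypothetical caller written before the rename: it uses the old name. */
static int legacy_is_format_b(ZSTD_sequenceFormat_e fmt)
{
    return fmt == demo_formatB;
}
```

Because the alias is a plain macro, the old and new spellings name the same type, so no cast or conversion is involved.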
Yann Collet	477a01067f	codemod: symbolEncodingType_e -> SymbolEncodingType_e	2024-12-20 10:36:56 -08:00
Yann Collet	8d4506bc94	codemod: ZSTD_sequenceLength -> ZSTD_SequenceLength	2024-12-20 10:36:55 -08:00
Yann Collet	a2245721ca	codemod: seqStore_t -> SeqStore_t same idea, SeqStore_t is a type name, it should start with a Capital letter.	2024-12-20 10:36:55 -08:00
Yann Collet	9671813375	codemod: seqDef -> SeqDef SeqDef is a type name, so it should start with a Capital letter. It's an internal symbol, no impact on public API.	2024-12-20 10:36:55 -08:00
Yann Collet	b4a40a845f	move Sequences definition to zstd_compress_internal.h they should not be in common/zstd_internal.h, since these definitions are not shared beyond lib/compress/.	2024-12-20 10:36:55 -08:00
Yann Collet	a00f45a037	created ZSTD_storeSeqOnly() makes it possible to register a sequence without copying its literals.	2024-12-20 10:36:04 -08:00
Yann Collet	01474bf73b	add internal compression parameter preBlockSplitter_level not yet exposed to the interface. Also: renames `useBlockSplitter` to `postBlockSplitter` to better qualify the difference between the 2 settings.	2024-10-28 16:31:15 -07:00
Yann Collet	cae8d13294	splitter workspace is now provided by ZSTD_CCtx*	2024-10-23 11:50:56 -07:00
Yann Collet	31d48e9ffa	fixing minor formatting issue in 32-bit mode with logs enabled	2024-10-23 11:50:56 -07:00
Yann Collet	8c38bda935	Merge pull request #4165 from facebook/cspeed_cmov Improve compression speed on small blocks	2024-10-11 16:20:19 -07:00
Yann Collet	186b132495	made search strategy switchable between cmov and branch and use a simple heuristic based on wlog to select between them. note: performance is not good on clang (yet)	2024-10-08 13:52:56 -07:00
Yann Collet	2cc600bab2	refactor search into an inline function for easier swapping with a parameter	2024-10-08 11:10:48 -07:00
Yann Collet	1e7fa242f4	minor refactor zstd_fast make hot variables more local	2024-10-07 11:22:40 -07:00
Ilya Tokar	e8fce38954	Optimize compression by avoiding unpredictable branches
Avoid unpredictable branch. Use conditional move to generate the address that is guaranteed to be safe and compare unconditionally.
Instead of
    if (idx < limit && x[idx] == val ) // mispredicted idx < limit branch
Do
    addr = cmov(safe,x+idx)
    if (*addr == val && idx < limit) // almost always false so well predicted
Using microbenchmarks from https://github.com/google/fleetbench, I get about ~10% speed-up:
name                                                        old cpu/op   new cpu/op   delta
BM_ZSTD_COMPRESS_Fleet/compression_level:-7/window_log:15   1.46ns ± 3%  1.31ns ± 7%   -9.88%  (p=0.000 n=35+38)
BM_ZSTD_COMPRESS_Fleet/compression_level:-7/window_log:16   1.41ns ± 3%  1.28ns ± 3%   -9.56%  (p=0.000 n=36+39)
BM_ZSTD_COMPRESS_Fleet/compression_level:-5/window_log:15   1.61ns ± 1%  1.43ns ± 3%  -10.70%  (p=0.000 n=30+39)
BM_ZSTD_COMPRESS_Fleet/compression_level:-5/window_log:16   1.54ns ± 2%  1.39ns ± 3%   -9.21%  (p=0.000 n=37+39)
BM_ZSTD_COMPRESS_Fleet/compression_level:-3/window_log:15   1.82ns ± 2%  1.61ns ± 3%  -11.31%  (p=0.000 n=37+40)
BM_ZSTD_COMPRESS_Fleet/compression_level:-3/window_log:16   1.73ns ± 3%  1.56ns ± 3%   -9.50%  (p=0.000 n=38+39)
BM_ZSTD_COMPRESS_Fleet/compression_level:-1/window_log:15   2.12ns ± 2%  1.79ns ± 3%  -15.55%  (p=0.000 n=34+39)
BM_ZSTD_COMPRESS_Fleet/compression_level:-1/window_log:16   1.99ns ± 3%  1.72ns ± 3%  -13.70%  (p=0.000 n=38+38)
BM_ZSTD_COMPRESS_Fleet/compression_level:0/window_log:15    3.22ns ± 3%  2.94ns ± 3%   -8.67%  (p=0.000 n=38+40)
BM_ZSTD_COMPRESS_Fleet/compression_level:0/window_log:16    3.19ns ± 4%  2.86ns ± 4%  -10.55%  (p=0.000 n=40+38)
BM_ZSTD_COMPRESS_Fleet/compression_level:1/window_log:15    2.60ns ± 3%  2.22ns ± 3%  -14.53%  (p=0.000 n=40+39)
BM_ZSTD_COMPRESS_Fleet/compression_level:1/window_log:16    2.46ns ± 3%  2.13ns ± 2%  -13.67%  (p=0.000 n=39+36)
BM_ZSTD_COMPRESS_Fleet/compression_level:2/window_log:15    2.69ns ± 3%  2.46ns ± 3%   -8.63%  (p=0.000 n=37+39)
BM_ZSTD_COMPRESS_Fleet/compression_level:2/window_log:16    2.63ns ± 3%  2.36ns ± 3%  -10.47%  (p=0.000 n=40+40)
BM_ZSTD_COMPRESS_Fleet/compression_level:3/window_log:15    3.20ns ± 2%  2.95ns ± 3%   -7.94%  (p=0.000 n=35+40)
BM_ZSTD_COMPRESS_Fleet/compression_level:3/window_log:16    3.20ns ± 4%  2.87ns ± 4%  -10.33%  (p=0.000 n=40+40)
I've also measured the impact on internal workloads and saw similar ~10% improvement in performance, measured by cpu usage/byte of data.	2024-09-20 16:07:01 -04:00
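The idea in commit e8fce38954 (compare unconditionally against an address that is always safe to read, then validate the index) might look like this in portable C. This is only a sketch under stated assumptions: `read32()`, `match_branchy()` and `match_select()` are hypothetical names, not zstd functions, and whether the ternary compiles to a conditional move is up to the compiler and target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read 4 bytes without assuming alignment. */
static uint32_t read32(const uint8_t* p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Branchy form: the bounds check guards the load, so the CPU has to
 * predict `idx < limit` before it can even start the comparison. */
static int match_branchy(const uint8_t* base, size_t idx, size_t limit,
                         uint32_t val)
{
    return idx < limit && read32(base + idx) == val;
}

/* Select form: choose an address that is always readable, load and
 * compare unconditionally, and only then validate the index.
 * `safe` must point to at least 4 readable bytes. */
static int match_select(const uint8_t* base, size_t idx, size_t limit,
                        uint32_t val, const uint8_t* safe)
{
    const uint8_t* addr = (idx < limit) ? base + idx : safe;
    return read32(addr) == val && idx < limit;
}
```

Both functions return the same result for every input. The difference is only in which condition the hardware has to predict first: in the select form, the value comparison is almost always false, so the branch on it is well predicted.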
Yann Collet	09cb37cbb1	Limit range of operations on Indexes in 32-bit mode and use unsigned type. This reduces the risk that an operation produces a negative number when crossing the 2 GB limit.	2024-08-21 11:03:43 -07:00
Yann Collet	cb784edf5d	added android-ndk-build	2024-07-30 11:34:49 -07:00
Federico Maresca	5e9a6c2fe4	Refactor dictionary matchfinder index safety check (#4039)	2024-05-29 12:35:24 -04:00
Yann Collet	22574d848d	fix issue 5921623844651008 ossfuzz managed to create a scenario which triggers an `assert`. This fixes it, by giving +1 more space for the backward search pass.	2024-02-06 13:01:14 -08:00
Yann Collet	b88c593d8f	added or updated code comments as suggested by @terrelln, to make the code of the optimal parser a bit more understandable.	2024-02-05 18:32:25 -08:00
Yann Collet	5474edbe60	fixed wrong assert by introducing ZSTD_OPT_SIZE	2024-02-03 19:31:53 -08:00
Yann Collet	4683667785	refactor optimal parser store stretches as intermediate solution instead of sequences. makes it possible to link a solution to a predecessor.	2024-01-31 02:51:46 -08:00
Elliot Gorokhovsky	d151a4880b	Move offload API params into ZSTD_CCtx_params	2023-11-27 08:11:01 -08:00
Elliot Gorokhovsky	809c7eb6bf	Refactor ZSTD_sequenceProducer_F typedef to ZSTD_sequenceProducer_F*	2023-11-27 06:56:37 -08:00
Yann Collet	c1e588fcb4	Merge pull request #3771 from DimitriPapadopoulos/codespell Fix new typos found by codespell	2023-10-07 19:29:41 -07:00
Nick Terrell	43118da8a7	Stop suppressing pointer-overflow UBSAN errors * Remove all pointer-overflow suppressions from our UBSAN builds/tests. * Add `ZSTD_ALLOW_POINTER_OVERFLOW_ATTR` macro to suppress pointer-overflow at a per-function level. This is a superior approach because it also applies to users who build zstd with UBSAN. * Add `ZSTD_wrappedPtr{Diff,Add,Sub}()` that use these suppressions. The end goal is to only tag these functions with `ZSTD_ALLOW_POINTER_OVERFLOW`. But we can start by annotating functions that rely on pointer overflow, and gradually transition to using these. * Add `ZSTD_maybeNullPtrAdd()` to simplify pointer addition when the pointer may be `NULL`. * Fix all the fuzzer issues that came up. I'm sure there will be a lot more, but these are the ones that came up within a few minutes of running the fuzzers, and while running GitHub CI.	2023-09-28 17:35:05 -04:00
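A helper along the lines of the `ZSTD_maybeNullPtrAdd()` mentioned in 43118da8a7 could be sketched as below. The name `maybe_null_ptr_add()` and its exact signature are assumptions made for illustration; the real helper lives in the zstd sources and may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper modeled on the commit's description: add an offset
 * to a pointer that may be NULL. Pointer arithmetic on NULL is undefined
 * behavior in C, even when adding 0, so the addition is skipped unless
 * there is something to add. */
static const void* maybe_null_ptr_add(const void* ptr, ptrdiff_t add)
{
    return add > 0 ? (const void*)((const char*)ptr + add) : ptr;
}
```

A caller holding an empty buffer represented as `(NULL, 0)` can then compute its end pointer without tripping UBSAN's pointer-overflow check.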
Dimitri Papadopoulos	fe34776c20	Fix new typos found by codespell	2023-09-23 18:56:01 +02:00
Elliot Gorokhovsky	c6a888c073	suppress false error message in LDM mode	2023-06-21 19:19:02 -07:00
Nick Terrell	a3c3a38b9b	[lazy] Skip over incompressible data
Every 256 bytes the lazy match finders process without finding a match, they will increase their step size by 1. So for bytes [0, 256) they search every position, for bytes [256, 512) they search every other position, and so on. However, they currently still insert every position into their hash tables. This is different from fast & dfast, which only insert the positions they search.
This PR changes that, so now after we've searched 2KB without finding any matches, at which point we'll only be searching one in 9 positions, we'll stop inserting every position, and only insert the positions we search. The exact cutoff of 2KB isn't terribly important, I've just selected a cutoff that is reasonably large, to minimize the impact on "normal" data.
This PR only adds skipping to greedy, lazy, and lazy2, but does not touch btlazy2.
| Dataset | Level | Compiler     | CSize ∆ | Speed ∆ |
|---------|-------|--------------|---------|---------|
| Random  | 5     | clang-14.0.6 | 0.0%    | +704%   |
| Random  | 5     | gcc-12.2.0   | 0.0%    | +670%   |
| Random  | 7     | clang-14.0.6 | 0.0%    | +679%   |
| Random  | 7     | gcc-12.2.0   | 0.0%    | +657%   |
| Random  | 12    | clang-14.0.6 | 0.0%    | +1355%  |
| Random  | 12    | gcc-12.2.0   | 0.0%    | +1331%  |
| Silesia | 5     | clang-14.0.6 | +0.002% | +0.35%  |
| Silesia | 5     | gcc-12.2.0   | +0.002% | +2.45%  |
| Silesia | 7     | clang-14.0.6 | +0.001% | -1.40%  |
| Silesia | 7     | gcc-12.2.0   | +0.007% | +0.13%  |
| Silesia | 12    | clang-14.0.6 | +0.011% | +22.70% |
| Silesia | 12    | gcc-12.2.0   | +0.011% | -6.68%  |
| Enwik8  | 5     | clang-14.0.6 | 0.0%    | -1.02%  |
| Enwik8  | 5     | gcc-12.2.0   | 0.0%    | +0.34%  |
| Enwik8  | 7     | clang-14.0.6 | 0.0%    | -1.22%  |
| Enwik8  | 7     | gcc-12.2.0   | 0.0%    | -0.72%  |
| Enwik8  | 12    | clang-14.0.6 | 0.0%    | +26.19% |
| Enwik8  | 12    | gcc-12.2.0   | 0.0%    | -5.70%  |
The speed difference for clang at level 12 is real, but is probably caused by some sort of alignment or codegen issues. clang is significantly slower than gcc before this PR, but gets up to parity with it.
I also measured the ratio difference for the HC match finder, and it looks basically the same as the row-based match finder. The speedup on random data looks similar. And performance is about neutral, without the big difference at level 12 for either clang or gcc.	2023-03-20 11:18:29 -07:00
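The stepping rule described in a3c3a38b9b can be written down as a small sketch. The function and macro names here are hypothetical, not the identifiers used in zstd; only the numbers (a step increase every 256 bytes, a 2KB cutoff, one in 9 positions searched at the cutoff) come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define STEP_GROWTH_LOG   8     /* step grows by 1 per 256 unmatched bytes */
#define INSERT_ALL_LIMIT  2048  /* the 2KB cutoff from the commit message  */

/* Distance between searched positions after `unmatched_bytes` bytes
 * have been processed without finding a match. */
static size_t search_step(size_t unmatched_bytes)
{
    return (unmatched_bytes >> STEP_GROWTH_LOG) + 1;
}

/* Below the cutoff every position is still inserted into the hash table;
 * at or above it, only the searched positions are inserted. */
static int insert_every_position(size_t unmatched_bytes)
{
    return unmatched_bytes < INSERT_ALL_LIMIT;
}
```

With these definitions, `search_step(2048)` evaluates to 9, which matches the "one in 9 positions" figure quoted for the 2KB cutoff.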
Nick Terrell	fbd97f305a	Deprecated bufferless and block level APIs * Mark all bufferless and block level functions as deprecated * Update documentation to suggest not using these functions * Add `_deprecated()` wrappers for functions that we use internally and call those instead	2023-03-16 10:04:15 -07:00
Yonatan Komornik	91f4c23e63	Add salt into row hash (#3528 part 2) (#3533) Part 2 of #3528 Adds hash salt that helps to avoid regressions where consecutive compressions use the same tag space with similar data (running zstd -b5e7 enwik8 -B128K reproduces this regression).	2023-03-13 15:34:13 -07:00
Yonatan Komornik	33e39094e7	Reduce RowHash's tag space size by x2 (#3543) Allocate half the memory for tag space, which means that we get one less slot for an actual tag (needs to be used for next position index). The result is a slight loss in compression ratio (up to 0.2%) and some regressions/improvements to speed depending on level and sample. In turn, we get to save 16% of the hash table's space (5 bytes per entry instead of 6 bytes per entry).	2023-03-10 14:15:04 -08:00
Elliot Gorokhovsky	ff42ed1582	Rename "External Matchfinder" to "Block-Level Sequence Producer" (#3484) * change "external matchfinder" to "external sequence producer" * migrate contrib/ to new naming convention * fix contrib build * fix error message * update debug strings * fix def of invalid sequences in zstd.h * nit * update CHANGELOG * fix .gitignore	2023-02-09 17:01:17 -05:00
Elliot Gorokhovsky	7f8189ca57	add ZSTD_c_fastExternalSequenceParsing cctxParam	2023-02-01 09:09:53 -08:00
Nick Terrell	b4467c1061	Fix bufferless API with attached dictionary Fixes #3102.	2023-01-20 16:15:16 -08:00
Yann Collet	ea684c335a	added c89 build test to CI	2023-01-19 14:59:30 -08:00
daniellerozenblit	dc1c6cc5df	Merge pull request #3418 from daniellerozenblit/fuzz-max-block-size Fuzz on maxBlockSize	2023-01-19 08:18:04 -05:00
Yann Collet	71dbe8f9d4	minor: fix conversion warnings	2023-01-04 20:00:04 -08:00
Danielle Rozenblit	908e812733	initial commit	2023-01-04 13:01:54 -08:00
Elliot Gorokhovsky	2a402626dd	External matchfinder API (#3333) * First building commit with sample matchfinder * Set up ZSTD_externalMatchCtx struct * move seqBuffer to ZSTD_Sequence* * support non-contiguous dictionary * clean up parens * add clearExternalMatchfinder, handle allocation errors * Add useExternalMatchfinder cParam * validate useExternalMatchfinder cParam * Disable LDM + external matchfinder * Check for static CCtx * Validate mState and mStateDestructor * Improve LDM check to cover both branches * Error API with optional fallback * handle RLE properly for external matchfinder * nit * Move to a CDict-like model for resource ownership * Add hidden useExternalMatchfinder bool to CCtx_params_s * Eliminate malloc, move to cwksp allocation * Handle CCtx reset properly * Ensure seqStore has enough space for external sequences * fix capitalization * Add DEBUGLOG statements * Add compressionLevel param to matchfinder API * fix c99 issues and add a param combination error code * nits * Test external matchfinder API * C90 compat for simpleExternalMatchFinder * Fix some @nocommits and an ASAN bug * nit * nit * nits * forward declare copySequencesToSeqStore functions in zstd_compress_internal.h * nit * nit * nits * Update copyright headers * Fix CMake zstreamtest build * Fix copyright headers (again) * typo * Add externalMatchfinder demo program to make contrib * Reduce memory consumption for small blockSize * ZSTD_postProcessExternalMatchFinderResult nits * test sum(matchlen) + sum(litlen) == srcSize in debug builds * refExternalMatchFinder -> registerExternalMatchFinder * C90 nit * zstreamtest nits * contrib nits * contrib nits * allow block splitter + external matchfinder, refactor * add windowSize param * add contrib/externalMatchfinder/README.md * docs * go back to old RLE heuristic because of the first block issue * fix initializer element is not a constant expression * ref contrib from zstd.h * extremely pedantic compiler warning fix, meson fix, typo fix * Additional docs on API limitations * minor nits * Refactor maxNbSeq calculation into a helper function * Fix copyright	2022-12-28 16:45:14 -05:00
W. Felix Handte	5d693cc38c	Coalesce Almost All Copyright Notices to Standard Phrasing ``` for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f); do sed -i '/Copyright .* \(Yann Collet\)\|\(Meta Platforms\)/ s/Copyright ./Copyright (c) Meta Platforms, Inc. and affiliates./' $f; done git checkout HEAD -- build/VS2010/libzstd-dll/libzstd-dll.rc build/VS2010/zstd/zstd.rc tests/test-license.py contrib/linux-kernel/test/include/linux/xxhash.h examples/streaming_compression_thread_pool.c lib/legacy/zstd_v0.c lib/legacy/zstd_v0*.h nano ./programs/windres/zstd.rc nano ./build/VS2010/zstd/zstd.rc nano ./build/VS2010/libzstd-dll/libzstd-dll.rc ```	2022-12-20 12:52:34 -05:00
W. Felix Handte	8927f985ff	Update Copyright Headers 'Facebook' -> 'Meta Platforms' ``` for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora \) -prune -o -type f); do sed -i 's/Facebook, Inc\./Meta Platforms, Inc. and affiliates./' $f; done ```	2022-12-20 12:37:57 -05:00
Yann Collet	832c1a6a1c	minor reformatting and minor reliability and maintenance changes	2022-12-18 11:26:57 -08:00
Qiongsi Wu	1b445c1c2e	Fix hash4Ptr for big endian (#3227)	2022-08-01 10:41:24 -07:00
Elliot Gorokhovsky	cb9e341129	Nits	2022-06-23 16:59:21 -04:00
Elliot Gorokhovsky	93b89fb24b	Add docs	2022-06-22 16:13:07 -04:00
Elliot Gorokhovsky	2a128110d0	Add prefetchCDictTables CCtxParam	2022-06-22 16:13:07 -04:00
Elliot Gorokhovsky	f6ef14329f	"Short cache" optimization for level 1-4 DMS (+5-30% compression speed) (#3152) * first attempt at fast DMS short cache * significant wins for some scenarios * fix all clang regressions * nits * fix 1.5% gcc11 regression on hot 110Kdict scenario * fix CI * nit * Add tags to doublefast hash table * use tags in doublefast DMS * Fix CI * Clean up some hardcoded logic / constants * Switch forCCtx to an enum * nit * add short cache to ip+1 long search * Move tag size into hashLog * Minor nits * Truncate dictionaries greater than 16MB in short cache mode * Helper function for tag comparison * Cap short cache hashLog at 24 to prevent overflow * size_t dictTagsMatch -> int dictTagsMatch * nit * Clean up and comment dictionary truncation * Move ZSTD_tableFillPurpose_e next to ZSTD_dictTableLoadMethod_e * Comment and expand helper functions * Asserts and documentation * nit	2022-06-21 17:27:19 -04:00
Elliot Gorokhovsky	db2f4a6532	Move bitwise builtins into bits.h	2022-02-14 11:16:03 -05:00
Yann Collet	cad9f8d5f9	fix 44239 credit to oss-fuzz This issue could happen when using the new Sequence Compression API in Explicit Delimiter Mode with a too small dstCapacity. In which case, there was one place where the buffer size wasn't checked.	2022-02-01 10:49:38 -08:00

295 Commits