sharpetronics/zstd - zstd - Gitea: Git with a cup of tea

mirror of https://github.com/facebook/zstd.git synced 2025-12-04 00:04:23 -05:00

Author	SHA1	Message	Date
Yann Collet	ac45e078a5	add explanation about new test as requested by @terrelln	2023-01-12 15:49:01 -08:00
Yann Collet	796699c0bc	fix root cause of #3416. A minor change in 5434de0 changed a `<=` into a `<`, and as an indirect consequence allowed compression attempt of literals when there are only 6 literals to compress (previous limit was effectively 7 literals). This is not in itself a problem, as the threshold is merely a heuristic, but it exposed a bug that has always been there, and was just never triggered so far due to the previous limit. This bug would make the literal compressor believe that all literals are the same symbol, but for the exact case where nbLiterals==6, plus a pretty wild combination of other limit conditions, this outcome could be false, resulting in data corruption. Replaced the blind heuristic by an actual test for all limit cases, so that even if the threshold is changed again in the future, the detection of RLE mode will remain reliable.	2023-01-12 15:41:08 -08:00
Danielle Rozenblit	06b096db47	additional tests and documentation updates + allow maxBlockSize to be set to 0 (goes to default)	2023-01-12 13:41:50 -08:00
Danielle Rozenblit	53eb5a758c	add simple test for maxBlockSize expected functionality	2023-01-12 08:55:39 -08:00
Danielle Rozenblit	1fffcfe01d	update minimum threshold for max block size	2023-01-11 11:09:57 -08:00
Danielle Rozenblit	fe08137d9a	resolve max block value in cctx and use when calculating the max block size	2023-01-09 07:53:53 -08:00
Yann Collet	71dbe8f9d4	minor: fix conversion warnings	2023-01-04 20:00:04 -08:00
daniellerozenblit	d913417f72	Merge branch 'dev' into fuzz-max-block-size	2023-01-04 16:34:07 -05:00
Danielle Rozenblit	908e812733	initial commit	2023-01-04 13:01:54 -08:00
Yann Collet	ebba9ff425	update regression results	2023-01-03 14:04:23 -08:00
Yann Collet	5434de01e2	improve compression ratio of small alphabets. Fix #3328. In situations where the alphabet size is very small, the evaluation of literal costs from the Optimal Parser is initially incorrect. It takes some time to converge, during which compression is less efficient. This is especially important for small files, because there will not be enough data to converge, so most of the parsing is selected based on incorrect metrics. After this patch, the scenario #3328 gets fixed, delivering the expected 29 bytes compressed size (smallest known compressed size).	2023-01-03 12:22:37 -08:00
daniellerozenblit	1c818e3a0a	Merge pull request #3302 from daniellerozenblit/optimal-huff-depth-speed Optimal huff depth speed improvements	2023-01-03 12:51:51 -05:00
Danielle Rozenblit	df714ddb0f	implement suggestions	2023-01-03 07:20:21 -08:00
Yann Collet	d07e72bb13	fixed incorrect assert commented Fweight instead	2022-12-28 17:23:40 -08:00
Yann Collet	4a1a79a512	just add some comments to zstd_opt for improved clarity	2022-12-28 16:24:12 -08:00
Yann Collet	9fbbd74871	Merge pull request #3400 from danlark1/dev Move deprecated annotation before static to allow C++ compilation for clang	2022-12-28 15:50:26 -08:00
Yann Collet	00c85b28e7	update ZSTD_CCtx_setCParams() inline documentation: specify behavior when changing compression parameters during MT compression, reported by @embg	2022-12-28 15:08:18 -08:00
Yann Collet	481a2e1010	Merge pull request #3403 from facebook/setCParams ZSTD_CCtx_setCParams	2022-12-28 14:07:13 -08:00
Elliot Gorokhovsky	2a402626dd	External matchfinder API (#3333) * First building commit with sample matchfinder * Set up ZSTD_externalMatchCtx struct * move seqBuffer to ZSTD_Sequence* * support non-contiguous dictionary * clean up parens * add clearExternalMatchfinder, handle allocation errors * Add useExternalMatchfinder cParam * validate useExternalMatchfinder cParam * Disable LDM + external matchfinder * Check for static CCtx * Validate mState and mStateDestructor * Improve LDM check to cover both branches * Error API with optional fallback * handle RLE properly for external matchfinder * nit * Move to a CDict-like model for resource ownership * Add hidden useExternalMatchfinder bool to CCtx_params_s * Eliminate malloc, move to cwksp allocation * Handle CCtx reset properly * Ensure seqStore has enough space for external sequences * fix capitalization * Add DEBUGLOG statements * Add compressionLevel param to matchfinder API * fix c99 issues and add a param combination error code * nits * Test external matchfinder API * C90 compat for simpleExternalMatchFinder * Fix some @nocommits and an ASAN bug * nit * nit * nits * forward declare copySequencesToSeqStore functions in zstd_compress_internal.h * nit * nit * nits * Update copyright headers * Fix CMake zstreamtest build * Fix copyright headers (again) * typo * Add externalMatchfinder demo program to make contrib * Reduce memory consumption for small blockSize * ZSTD_postProcessExternalMatchFinderResult nits * test sum(matchlen) + sum(litlen) == srcSize in debug builds * refExternalMatchFinder -> registerExternalMatchFinder * C90 nit * zstreamtest nits * contrib nits * contrib nits * allow block splitter + external matchfinder, refactor * add windowSize param * add contrib/externalMatchfinder/README.md * docs * go back to old RLE heuristic because of the first block issue * fix initializer element is not a constant expression * ref contrib from zstd.h * extremely pedantic compiler warning fix, meson fix, typo fix * Additional docs on API limitations * minor nits * Refactor maxNbSeq calculation into a helper function * Fix copyright	2022-12-28 16:45:14 -05:00
Yann Collet	b17743e41b	Signal parameter change during MT compression	2022-12-28 13:14:58 -08:00
Yann Collet	89342d1e07	New xp library symbol : ZSTD_CCtx_setCParams() Inspired by #3395, offer a new capability to set all parameters defined in a ZSTD_compressionParameters structure with a single symbol invocation to improve user code brevity.	2022-12-27 23:49:22 -08:00
Daniel Kutenin	48f4aa7307	Move deprecated annotation before static to allow C++ compilation for clang This fixes last 2 instances of https://github.com/facebook/zstd/issues/3250	2022-12-23 12:07:31 +00:00
Yann Collet	089b2797e3	Merge pull request #3398 from facebook/fix3316 spec update : require minimum nb of literals for 4-streams mode	2022-12-22 16:57:05 -08:00
Yann Collet	6a9c525903	spec update : require minimum nb of literals for 4-streams mode Reported by @shulib : the specification for 4-streams mode doesn't work when the amount of literals to compress is 5 bytes. Extending it, it also doesn't work for sizes 1 or 2. This patch updates the specification and the implementation to require a minimum of 6 literals to trigger or accept the 4-streams mode. The impact is expected to be a no-op : the 4-streams mode is never triggered for such a small quantity of literals anyway, since it would be wasteful (it costs ~7.3 bytes more than single-stream mode). An informal lower limit is set at ~256 bytes, so the technical minimum is very far from this limit. This is just meant for completeness of the specification.	2022-12-22 16:14:34 -08:00
Yann Collet	ea2895cef4	Support decompression of compressed blocks of size ZSTD_BLOCKSIZE_MAX exactly	2022-12-22 12:40:27 -08:00
Nick Terrell	40a7188130	Fix `make clangbuild` & add CI Fix the errors for: * `-Wdocumentation` * `-Wconversion` except `-Wsign-conversion`	2022-12-21 17:31:04 -08:00
Danielle Rozenblit	c26f348dc8	fix CI errors	2022-12-20 12:43:46 -08:00
Danielle Rozenblit	482689b995	huf log speed optimization: unidirectional scan of logs + break when regressing	2022-12-20 12:27:38 -08:00
Nick Terrell	e4018c4e7f	[docs] Clarify dictionary loading documentation Reinforce that loading a new dictionary clears the current dictionary. Except for the multiple-ddict mode.	2022-12-20 11:10:49 -08:00
W. Felix Handte	5d693cc38c	Coalesce Almost All Copyright Notices to Standard Phrasing ``` for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f); do sed -i '/Copyright .* \(Yann Collet\)\|\(Meta Platforms\)/ s/Copyright .*/Copyright (c) Meta Platforms, Inc. and affiliates./' $f; done; git checkout HEAD -- build/VS2010/libzstd-dll/libzstd-dll.rc build/VS2010/zstd/zstd.rc tests/test-license.py contrib/linux-kernel/test/include/linux/xxhash.h examples/streaming_compression_thread_pool.c lib/legacy/zstd_v0.c lib/legacy/zstd_v0*.h; nano ./programs/windres/zstd.rc; nano ./build/VS2010/zstd/zstd.rc; nano ./build/VS2010/libzstd-dll/libzstd-dll.rc ```	2022-12-20 12:52:34 -05:00
W. Felix Handte	8927f985ff	Update Copyright Headers 'Facebook' -> 'Meta Platforms' ``` for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora \) -prune -o -type f); do sed -i 's/Facebook, Inc\./Meta Platforms, Inc. and affiliates./' $f; done ```	2022-12-20 12:37:57 -05:00
Yonatan Komornik	a8add436ce	Merge pull request #3364 from yoniko/fix-windows-mt-thread-resize-bug Windows MT layer bug fixes	2022-12-19 15:54:01 -08:00
Yonatan Komornik	26f1bf7d70	CR fixes	2022-12-19 15:13:43 -08:00
Yann Collet	832c1a6a1c	minor reformatting and minor reliability and maintenance changes	2022-12-18 11:26:57 -08:00
Yonatan Komornik	ec42c92aaa	Fix race condition in the Windows thread / pthread translation layer When spawning a Windows thread we have small worker wrapper function that translates between the interfaces of Windows and POSIX threads. This wrapper is given a pointer that might get stale before the worker starts running, resulting in UB and crashes. This commit adds synchronization so that we know the wrapper has finished reading the data it needs before we allow the main thread to resume execution.	2022-12-17 13:38:02 -08:00
Yonatan Komornik	500f02eb66	Fixes two bugs in the Windows thread / pthread translation layer 1. If threads are resized the threads' `ZSTD_pthread_t` might move while the worker still holds a pointer into it (see more details in #3120). 2. The join operation was waiting for a thread and then returning its `thread.arg` as a return value, but since the `ZSTD_pthread_t thread` was passed by value it would have a stale `arg` that wouldn't match the thread's actual return value. This fix changes the `ZSTD_pthread_join` API and removes support for returning a value. This means that we are diverging from the `pthread_join` API and this is no longer just an alias. In the future, if needed, we could return a Windows thread's return value using `GetExitCodeThread`, but as this path wouldn't be exercised in any case, it's preferable to not add it right now.	2022-12-17 13:38:02 -08:00
Yann Collet	2f4238e47a	make ZSTD_DECOMPRESSBOUND() compatible with input size 0 for environments with stringent compilation warnings.	2022-12-16 16:05:39 -08:00
Yann Collet	ea24b88667	decompressBound() tests: fixed an overflow in an intermediate result on 32-bit platforms. Checked that the new test catches this bug in 32-bit mode.	2022-12-16 15:43:26 -08:00
Yann Collet	51355e1f70	Merge pull request #3362 from facebook/compressBound check potential overflow of compressBound()	2022-12-16 14:22:22 -08:00
Nick Terrell	2f7b8d47fb	[zdict] Fix static linking only include guards Fix `zdict.h` static linking only section so if you include it twice it still exposes the static linking only symbols. E.g. this pattern: ``` ``` This can easily happen when a header you include includes `zdict.h`.	2022-12-16 12:55:20 -08:00
Nick Terrell	0c42424a1e	[build] Fix ZSTD_LIB_MINIFY build option `ZSTD_LIB_MINIFY` broke in 8bf699aa59372d7c2bb4216bcf8037cab7dae51e. This commit fixes the macro and the static library shrinks from ~600K to 324K with ZSTD_LIB_MINIFY set. Fixes #3066.	2022-12-16 12:55:05 -08:00
Nick Terrell	358a237484	[api][visibility] Make the visibility macros more consistent 1. Follow the scheme introduced in PR #2501 for both `zdict.h` and `zstd_errors.h`. 2. If the `_VISIBLE` macro isn't set, but the `_VISIBILITY` macro is, use that. Also make this change for `zstd.h`, since we probably shouldn't have changed that macro name without backward compatibility in the first place. 3. Change all references to `_VISIBILITY` to `_VISIBLE`. Fixes #3359.	2022-12-16 12:54:45 -08:00
Yann Collet	97f63ce2b5	added unit tests for compressBound() and rephrased the code documentation, as suggested by @terrelln	2022-12-16 12:35:14 -08:00
Nick Terrell	ee6475cbbd	Add missing parens around macro definition Fixes #3301.	2022-12-15 17:18:23 -08:00
Yann Collet	45ed0df18a	check potential overflow of compressBound() fixed #3323, reported by @nigeltao Completed documentation around this risk (which is largely theoretical, I can't see that happening in any "real world" scenario, but an erroneous @srcSize value could indeed trigger it).	2022-12-15 15:23:15 -08:00
Nick Terrell	a91e7ec175	Fix corruption that rarely occurs in 32-bit mode with wlog=25 Fix an off-by-one error in the compressor that emits corrupt blocks if: * Zstd is compiled in 32-bit mode * The windowLog == 25 exactly * An offset of 2^25-3, 2^25-2, 2^25-1, or 2^25 is emitted * The bitstream had 7 bits leftover before writing the offset This bug has been present since before v1.0, but wasn't able to easily be triggered, since until somewhat recently zstd wasn't able to find matches that were within 128KB of the window size. Add a test case, and fix 2 bugs in `ZSTD_compressSequences()`: * The `ZSTD_isRLE()` check was incorrect. It wouldn't produce corruption, but it could waste CPU and not emit RLE even if the block was RLE * One windowSize was `1 << windowLog`, not `1u << windowLog` Thanks to @tansy for finding the issue, and giving us a reproducer! Fixes Issue #3350.	2022-12-15 14:41:50 -08:00
daniellerozenblit	e2fc93340f	Merge branch 'dev' into http-to-https	2022-12-15 10:46:13 -05:00
Nick Terrell	728e73ebb4	[legacy] Remove FORCE_MEMORY_ACCESS and only use memcpy Delete unaligned memory access code from the legacy codebase by removing all the non-memcpy functions. We don't care about speed at all for this codebase, only simplicity.	2022-12-14 17:54:35 -08:00
Nick Terrell	f31b83ff34	[decompress] Fix nullptr addition & improve fuzzer Fix an instance of `NULL + 0` in `ZSTD_decompressStream()`. Also, improve our `stream_decompress` fuzzer to pass `NULL` in/out buffers to `ZSTD_decompressStream()`, and fix 2 issues that were immediately surfaced. Fixes #3351	2022-12-14 17:54:22 -08:00
Alex Xu (Hello71)	a78c91ae59	Use proper unaligned access attributes Instead of using packed attribute hack, just use aligned attribute. It improves code generation on armv6 and armv7, and slightly improves code generation on aarch64. GCC generates identical code to regular aligned access on ARMv6 for all versions between 4.5 and trunk, except GCC 5 which is buggy and generates the same (bad) code as packed access: https://gcc.godbolt.org/z/hq37rz7sb	2022-12-14 16:00:37 -08:00


4683 Commits