sharpetronics/zstd - zstd - Gitea: Git with a cup of tea

mirror of https://github.com/facebook/zstd.git synced 2025-10-05 00:07:15 -04:00

Author	SHA1	Message	Date
Yonatan Komornik	91f4c23e63	Add salt into row hash (#3528 part 2) (#3533 ) Part 2 of #3528 Adds hash salt that helps to avoid regressions where consecutive compressions use the same tag space with similar data (running zstd -b5e7 enwik8 -B128K reproduces this regression).	2023-03-13 15:34:13 -07:00
Yonatan Komornik	9420bce8a4	Add init once memory (#3528 ) (#3529 ) - Adds memory type that is guaranteed to have been initialized at least once in the workspace's lifetime. - Changes tag space in row hash to be based on init once memory.	2023-03-13 13:20:49 -07:00
dependabot[bot]	e2965edd10	Bump github/codeql-action from 2.2.5 to 2.2.6 (#3549 ) Bumps [github/codeql-action](https://github.com/github/codeql-action) from 2.2.5 to 2.2.6. - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](`32dc499307...16964e90ba`) --- updated-dependencies: - dependency-name: github/codeql-action dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-03-13 10:07:20 -07:00
Yonatan Komornik	a91e91d614	[Bugfix] row hash tries to match position 0 (#3548 ) #3543 decreases the size of the tagTable by a factor of 2, which requires using the first tag position in each row for head position instead of a tag. Although position 0 stopped being a valid match, it still persisted in mask calculation resulting in the matches loops possibly terminating before it should have. The fix skips position 0 to solve this problem.	2023-03-13 10:00:03 -07:00
Yann Collet	dd8cb5a0f1	added documentation for the seekable format and notably provide additional context for the Maximum Frame Size parameter. requested by @P-E-Meunier at `1df9f36c6c (commitcomment-103856979)`.	2023-03-10 15:54:31 -08:00
Yonatan Komornik	33e39094e7	Reduce RowHash's tag space size by x2 (#3543 ) Allocate half the memory for tag space, which means that we get one less slot for an actual tag (needs to be used for next position index). The results is a slight loss in compression ratio (up to 0.2%) and some regressions/improvements to speed depending on level and sample. In turn, we get to save 16% of the hash table's space (5 bytes per entry instead of 6 bytes per entry).	2023-03-10 14:15:04 -08:00
Yann Collet	134d332b10	Merge pull request #3544 from facebook/seek_faster Improved seekable format ingestion speed for small frame size	2023-03-10 12:33:33 -08:00
Yann Collet	1df9f36c6c	Improved seekable format ingestion speed for small frame size As reported by @P-E-Meunier in https://github.com/facebook/zstd/issues/2662#issuecomment-1443836186, seekable format ingestion speed can be particularly slow when selected `FRAME_SIZE` is very small, especially in combination with the recent row_hash compression mode. The specific scenario mentioned was `pijul`, using frame sizes of 256 bytes and level 10. This is improved in this PR, by providing approximate parameter adaptation to the compression process. Tested locally on a M1 laptop, ingestion of `enwik8` using `pijul` parameters went from 35sec. (before this PR) to 2.5sec (with this PR). For the specific corner case of a file full of zeroes, this is even more pronounced, going from 45sec. to 0.5sec. These benefits are unrelated to (and come on top of) other improvement efforts currently being made by @yoniko for the row_hash compression method specifically. The `seekable_compress` test program has been updated to allows setting compression level, in order to produce these performance results.	2023-03-09 18:00:30 -08:00
Felix Handte	d55a6483d7	Merge pull request #3542 from felixhandte/pin-moar-action-deps Pin Moar Action Dependencies	2023-03-09 16:22:11 -08:00
W. Felix Handte	cd9486031d	Also Pin Dockerfile Dependency Hashes	2023-03-09 17:01:22 -05:00
Felix Handte	283c228abe	Merge pull request #3541 from felixhandte/fix-setvbuf-segfault Avoid Segfault Caused by Calling `setvbuf()` on Null File Pointer	2023-03-09 13:54:11 -08:00
Yann Collet	e769da1645	Merge pull request #3526 from facebook/bench_zstd_api Simplify benchmark unit invocation API from CLI	2023-03-09 13:11:11 -08:00
Yann Collet	6bedef8095	Merge pull request #3538 from facebook/doc_huffman added clarifications for sizes of compressed huffman blocks and streams.	2023-03-09 13:09:42 -08:00
daniellerozenblit	e0fc9fd90b	Merge pull request #3486 from daniellerozenblit/patch-from-low-memory-mode Mmap large dictionaries in patch-from mode	2023-03-09 15:30:09 -05:00
Nick Terrell	c40c7378c6	Clarify dstCapacity requirements Clarify `dstCapacity` requirements for single-pass functions. Fixes #3524.	2023-03-09 10:18:30 -08:00
W. Felix Handte	1ec556238e	Pin Moar Action Dependencies An offering to the Scorecard gods, may they have mercy on our souls.	2023-03-09 12:54:07 -05:00
W. Felix Handte	957a0ae52d	Add CLI Test	2023-03-09 12:48:11 -05:00
W. Felix Handte	c4c3e11958	Avoid Calling `setvbuf()` on Null File Pointer	2023-03-09 12:47:40 -05:00
W. Felix Handte	50e8f55e7d	Fix Python 3.6 Incompatibility in CLI Tests	2023-03-09 12:46:37 -05:00
Dmitriy Voropaev	b7080f4c67	Increase tests timeout Current timeout is too small for some slower machines, e.g. most modern riscv64 boards, where tests fail with the following diagnostics: Traceback (most recent call last): File "/usr/src/RPM/BUILD/zstd-1.5.4-alt2/tests/./cli-tests/run.py", line 734, in <module> success = run_tests(tests, opts) File "/usr/src/RPM/BUILD/zstd-1.5.4-alt2/tests/./cli-tests/run.py", line 601, in run_tests tests[test_case.name] = test_case.run() File "/usr/src/RPM/BUILD/zstd-1.5.4-alt2/tests/./cli-tests/run.py", line 285, in run return self.analyze() File "/usr/src/RPM/BUILD/zstd-1.5.4-alt2/tests/./cli-tests/run.py", line 275, in analyze self._join_test() File "/usr/src/RPM/BUILD/zstd-1.5.4-alt2/tests/./cli-tests/run.py", line 330, in _join_test (stdout, stderr) = self._test_process.communicate(timeout=self._opts.timeout) File "/usr/lib64/python3.10/subprocess.py", line 1154, in communicate stdout, stderr = self._communicate(input, endtime, timeout) File "/usr/lib64/python3.10/subprocess.py", line 2006, in _communicate self._check_timeout(endtime, orig_timeout, stdout, stderr) File "/usr/lib64/python3.10/subprocess.py", line 1198, in _check_timeout raise TimeoutExpired( subprocess.TimeoutExpired: Command '['/usr/src/RPM/BUILD/zstd-1.5.4-alt2/tests/cli-tests/compression/window-resize.sh']' timed out after 60 seconds	2023-03-09 16:31:05 +04:00
Danielle Rozenblit	70850eb72b	assert to ensure that dict buffer type is valid	2023-03-08 16:54:57 -08:00
Yann Collet	64e8511b26	added clarifications for sizes of compressed huffman blocks and streams.	2023-03-08 15:31:36 -08:00
Nick Terrell	07a2a33135	Add ZSTD_set{C,F,}Params() helper functions * Add ZSTD_setFParams() and ZSTD_setParams() * Modify ZSTD_setCParams() to use ZSTD_setParameter() to avoid a second path setting parameters * Add unit tests * Update documentation to suggest using them to replace deprecated functions Fixes #3396.	2023-03-08 09:57:35 -08:00
Danielle Rozenblit	96e55c14f2	ability to disable mmap + struct to manage FIO dictionary	2023-03-08 08:06:10 -08:00
Nick Terrell	6313a58e45	[linux-kernel] Fix assert definition Backport upstream fix of the assert definition. This code is currently unused, and can be enabled for testing, which is why it wasn't caught. https://lore.kernel.org/lkml/20230129131436.1343228-1-j.neuschaefer@gmx.net/	2023-03-07 16:53:36 -08:00
Yonatan Komornik	988ce61a0c	Adds initialization of clevel to static cdict (#3525 ) (#3527 ) - Initializes clevel in `ZSTD_CCtxParams_init` - Adds CI workflow for msan fuzzers runs without optimization (`-O0`) - Fixes Makefile to correctly pass on user defined `MOREFLAGS` and `FUZZER_FLAGS` in cases they have been overwritten	2023-03-06 18:05:12 -08:00
Yann Collet	1e38e07b3d	simplified BMK_benchFilesAdvanced()	2023-03-06 12:34:13 -08:00
Yann Collet	9efc14804e	minor: fixed zlib wrapper internal benchmark another possibility could be to link it to programs/benchfn . Not worth the effort.	2023-03-06 12:20:06 -08:00
Yann Collet	db79219f70	simplify BMK_syntheticTest()	2023-03-06 12:15:22 -08:00
Yann Collet	db7d7b6974	Merge pull request #3516 from dloidolt/fullbench_2_files fullbench with two files	2023-03-06 11:56:30 -08:00
dependabot[bot]	1be95291a8	Bump github/codeql-action from 2.2.4 to 2.2.5 (#3518 ) Bumps [github/codeql-action](https://github.com/github/codeql-action) from 2.2.4 to 2.2.5. - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](`17573ee1cc...32dc499307`) --- updated-dependencies: - dependency-name: github/codeql-action dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-03-02 10:44:06 -08:00
Yann Collet	bd86e24637	Merge pull request #3513 from DimitriPapadopoulos/codespell Fix typos found by codespell	2023-02-27 11:44:31 -08:00
Yann Collet	e1ab6913ad	Merge pull request #3514 from facebook/spec_huffman Clarify zstd specification for Huffman blocks	2023-02-23 15:35:00 -08:00
Nick Terrell	395a2c5462	[bug-fix] Fix rare corruption bug affecting the block splitter The block splitter confuses sequences with literal length == 65536 that use a repeat offset code. It interprets this as literal length == 0 when deciding the meaning of the repeat offset, and corrupts the repeat offset history. This is benign, merely causing suboptimal compression performance, if the confused history is flushed before the end of the block, e.g. if there are 3 consecutive non-repeat code sequences after the mistake. It also is only triggered if the block splitter decided to split the block. All that to say: This is a rare bug, and requires quite a few conditions to trigger. However, the good news is that if you have a way to validate that the decompressed data is correct, e.g. you've enabled zstd's checksum or have a checksum elsewhere, the original data is very likely recoverable. So if you were affected by this bug please reach out. The fix is to remind the block splitter that the literal length is actually 64K. The test case is a bit tricky to set up, but I've managed to reproduce the issue. Thanks to @danlark1 for alerting us to the issue and providing us a reproducer!	2023-02-23 10:54:31 -08:00
Dominik Loidolt	4b9e3d11a6	When benchmarking two files with fullbench, the second file will not be benchmarked because the benchNb has not been reset to zero.	2023-02-20 16:36:26 +01:00
Yann Collet	832f559b0b	clarify zstd specification for Huffman blocks Following detailed comments from @dweiller in #3508.	2023-02-18 18:18:16 -08:00
Dimitri Papadopoulos	547794ef40	Fix typos found by codespell	2023-02-18 10:31:48 +01:00
Yann Collet	4ebaf36582	Merge pull request #3490 from eli-schwartz/meson-tests-noprograms meson: always build the zstd binary when tests are enabled	2023-02-16 11:27:27 -08:00
Sutou Kouhei	8420502ef9	Don't require CMake 3.18 or later fix #3500 CMake 3.18 or later was required by #3392. Because it uses `CheckLinkerFlag`. But requiring CMake 3.18 or later is a bit aggressive. Because Ubuntu 20.04 LTS still uses CMake 3.16.3: https://packages.ubuntu.com/search?keywords=cmake This change disables `-z noexecstack` check with old CMake. This will not break any existing users. Because users who need `-z noexecstack` must already use CMake 3.18 or later.	2023-02-16 10:08:45 -08:00
Felix Handte	1c42844668	Merge pull request #3479 from felixhandte/faster-file-ops Use `f`-variants of `chmod()` and `chown()`	2023-02-16 13:07:34 -05:00
Felix Handte	3c50854c05	Merge pull request #3511 from felixhandte/fix-release-artifact-upload-permission Fix Permissions on Publish Release Artifacts Job	2023-02-15 13:35:04 -05:00
daniellerozenblit	345ed63976	Merge pull request #3509 from daniellerozenblit/fix-window-resize-test Fix cli-tests issues	2023-02-15 13:32:07 -05:00
W. Felix Handte	d54ad3c234	Fix Permissions on Publish Release Artifacts Job Publishing release artifacts requires the `contents` permission, as documented by: https://docs.github.com/en/rest/overview/permissions-required-for-github-apps.	2023-02-15 11:05:54 -05:00
Danielle Rozenblit	d3d0b92e5e	add make test for 32bit	2023-02-15 06:03:02 -08:00
Danielle Rozenblit	7da1c6ddbf	fix cli-tests issues	2023-02-14 11:33:26 -08:00
Danielle Rozenblit	2d8afd9ce1	add manual flag to mmap dictionary	2023-02-14 09:42:23 -08:00
Yonatan Komornik	6a86db11a4	CI workflow to test external compressors dependencies Implemented CI workflow for testing compilation with external compressors and without them. This serves as a sanity check to avoid any code dependencies on libraries that may not always be present. (Reference: #3497 for a bug fix related to this issue.)	2023-02-13 18:00:13 -08:00
Yonatan Komornik	727d03161f	Make Github workflows permissions read-only by default (#3488 ) * Make Github workflows permissions read-only by default * Pins `skx/github-action-publish-binaries` action to specific hash	2023-02-13 16:57:05 -08:00
Alex Xu	886de7bc04	Use correct types in LZMA comp/decomp (#3497 ) Bytef and uInt are zlib types, not available when zlib is disabled Fixes: 1598e6c634ac ("Async write for decompression") Fixes: cc0657f27d81 ("AsyncIO compression part 2 - added async read and asyncio to compression code (#3022)")	2023-02-13 16:30:56 -08:00
Danielle Rozenblit	8a189b1b29	refactor dictionary file stat	2023-02-13 15:23:06 -08:00

... 3 4 5 6 7 ...

10316 Commits