```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora \) -prune -o -type f -print);
do
sed -i 's/Facebook, Inc\./Meta Platforms, Inc. and affiliates./' "$f";
done
```
When spawning a Windows thread we have a small worker wrapper function that
translates between the Windows and POSIX thread interfaces.
This wrapper is given a pointer that might get stale before the worker starts running,
resulting in UB and crashes.
This commit adds synchronization so that we know the wrapper has finished reading the data
it needs before we allow the main thread to resume execution.
1. If threads are resized, the threads' `ZSTD_pthread_t` might move
while the worker still holds a pointer into it (see more details in #3120).
2. The join operation waited for a thread and then returned its `thread.arg`
as the return value, but since the `ZSTD_pthread_t thread` was passed by value it
would have a stale `arg` that wouldn't match the thread's actual return value.
This fix changes the `ZSTD_pthread_join` API and removes support for returning
a value. This means that we are diverging from the `pthread_join` API and this
is no longer just an alias.
In the future, if needed, we could return a Windows thread's return value using
`GetExitCodeThread`, but as this path wouldn't be exercised in any case, it's
preferable not to add it right now.
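A minimal sketch of the start-up handshake described above, using hypothetical names and a Win32 event rather than the actual `ZSTD_pthread_create` internals:
```
/* Hypothetical sketch, not the actual zstd code: the wrapper copies what it
 * needs out of the (possibly short-lived) parameter struct, then signals the
 * spawning thread that it is safe to continue. */
#include <windows.h>
#include <process.h>

typedef struct {
    void* (*start_routine)(void*);  /* POSIX-style entry point */
    void* arg;                      /* argument for start_routine */
    HANDLE initialized;             /* event: wrapper has read its inputs */
} thread_params_t;

static unsigned __stdcall worker_wrapper(void* opaque)
{
    thread_params_t* const params = (thread_params_t*)opaque;
    /* Copy everything we need *before* signaling: once SetEvent() fires,
     * the spawning thread may return and `params` may move or be reused. */
    void* (*const routine)(void*) = params->start_routine;
    void* const arg = params->arg;
    SetEvent(params->initialized);
    routine(arg);
    return 0;
}

static int thread_create(HANDLE* thread, void* (*routine)(void*), void* arg)
{
    thread_params_t params;
    params.start_routine = routine;
    params.arg = arg;
    params.initialized = CreateEvent(NULL, FALSE, FALSE, NULL);
    if (params.initialized == NULL) return -1;

    *thread = (HANDLE)_beginthreadex(NULL, 0, worker_wrapper, &params, 0, NULL);
    if (*thread == NULL) { CloseHandle(params.initialized); return -1; }

    /* Block until the wrapper has finished reading `params`; only then is it
     * safe for this stack frame (and thus `params`) to go away. */
    WaitForSingleObject(params.initialized, INFINITE);
    CloseHandle(params.initialized);
    return 0;
}
```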
Instead of using the packed attribute hack, just use the aligned attribute. It
improves code generation on armv6 and armv7, and slightly improves code
generation on aarch64. GCC generates identical code to regular aligned
access on ARMv6 for all versions between 4.5 and trunk, except GCC 5
which is buggy and generates the same (bad) code as packed access:
https://gcc.godbolt.org/z/hq37rz7sb
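For illustration, the two approaches look roughly like this (a sketch of the assumed shapes, not the exact zstd `MEM_read32` definitions):
```
#include <stdint.h>

/* Old approach: wrap the value in a packed struct so the compiler emits an
 * unaligned-safe load. */
typedef struct { uint32_t v; } __attribute__((packed)) unalignPacked32;
static uint32_t read32_packed(const void* ptr)
{
    return ((const unalignPacked32*)ptr)->v;
}

/* New approach: a typedef with alignment 1; same semantics, but it tends to
 * produce better code on armv6/armv7 (and slightly better on aarch64). */
typedef uint32_t unalign32 __attribute__((aligned(1)));
static uint32_t read32_aligned1(const void* ptr)
{
    return *(const unalign32*)ptr;
}
```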
Use a switch statement to select the search function instead of an
indirect function call. This results in a sizable performance win.
This PR is a modification of the approach taken in PR #2828.
When I measured performance for that commit, it was neutral.
However, I now see a performance regression on gcc, but still
neutral on clang. I'm measuring on the same platform, but with
newer compilers. The new approach beats both the current dev
branch and the baseline before PR #2828 was merged.
This PR is necessary for Issue #3275, to update zstd in the kernel.
Without this PR there is a large regression in compression speed for the
greedy through btlazy2 strategies. With this PR it is about neutral.
gcc version: 12.2.0
clang version: 14.0.6
dataset: silesia.tar
| Compiler | Level | Dev Speed (MB/s) | PR Speed (MB/s) | Delta |
|----------|-------|------------------|-----------------|--------|
| gcc | 5 | 102.6 | 113.7 | +10.8% |
| gcc | 7 | 66.6 | 74.8 | +12.3% |
| gcc | 9 | 51.5 | 58.9 | +14.3% |
| gcc | 13 | 14.3 | 14.3 | +0.0% |
| clang | 5 | 108.1 | 114.8 | +6.2% |
| clang | 7 | 68.5 | 72.3 | +5.5% |
| clang | 9 | 53.2 | 56.2 | +5.6% |
| clang | 13 | 14.3 | 14.7 | +2.8% |
The binary size stays just about the same for clang and gcc, measured
using the `size` command:
| Compiler | Branch | Text (bytes) | Data (bytes) | BSS (bytes) | Total (bytes) |
|----------|--------|--------------|--------------|-------------|---------------|
| gcc | dev | 1127950 | 3312 | 280 | 1131542 |
| gcc | PR | 1123422 | 2512 | 280 | 1126214 |
| clang | dev | 1046254 | 3256 | 216 | 1049726 |
| clang | PR | 1048198 | 2296 | 216 | 1050710 |
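The shape of the change described above, roughly (hypothetical names, not the actual zstd search code): an indirect call through a function pointer is replaced by a switch over an enum, so the compiler sees every possible callee and can inline them.
```
#include <stddef.h>

typedef enum { search_hashChain, search_binaryTree, search_rowHash } searchMethod_e;

static size_t searchMax_hashChain (void* ms, const void* ip) { (void)ms; (void)ip; return 0; /* ... */ }
static size_t searchMax_binaryTree(void* ms, const void* ip) { (void)ms; (void)ip; return 0; /* ... */ }
static size_t searchMax_rowHash   (void* ms, const void* ip) { (void)ms; (void)ip; return 0; /* ... */ }

/* Before: ms->searchFn(ms, ip) -- an indirect call the compiler cannot inline.
 * After: a switch with a statically known set of targets. */
static size_t searchMax(searchMethod_e method, void* ms, const void* ip)
{
    switch (method) {
    case search_hashChain:  return searchMax_hashChain (ms, ip);
    case search_binaryTree: return searchMax_binaryTree(ms, ip);
    case search_rowHash:    return searchMax_rowHash   (ms, ip);
    default:                return 0;  /* unreachable for valid inputs */
    }
}
```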
Fixes #3212.
Long literal and match lengths had an off-by-one error in ZSTD_getSequenceLength.
Fix the off-by-one error, and add a golden compression test that catches the bug.
Also run all the golden tests in the cli-tests framework.
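Schematically, the long-length correction works like this (hypothetical code with assumed field names, not the actual `ZSTD_getSequenceLength`; the point is only that the extension must be a full 0x10000, where 0xFFFF is off by one):
```
#include <stdint.h>

/* Lengths are stored in 16-bit fields; a sequence may be flagged "long",
 * meaning its true length exceeds the 16-bit range and the stored value
 * must be extended by 64 KiB (assumption for this sketch). */
static uint32_t fullLitLength(uint16_t storedLitLength, int isFlaggedLong)
{
    uint32_t len = storedLitLength;
    if (isFlaggedLong)
        len += 0x10000;   /* adding 0xFFFF here would be the off-by-one */
    return len;
}
```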
* Initial commit to address #3090. Added support to decompress an empty block
* Update zstd_decompress_block.c
Addressed review comments for the case of 'set_basic'
* Update lib/decompress/zstd_decompress_block.c
Co-authored-by: Nick Terrell <nickrterrell@gmail.com>
* Update lib/decompress/zstd_decompress_block.c
Co-authored-by: Nick Terrell <nickrterrell@gmail.com>
Co-authored-by: Nick Terrell <nickrterrell@gmail.com>
On aarch64 ZSTD_wildcopy uses a simple loop to do a
16B-based memory copy. There is an existing optimized
two-stage copy that can achieve better performance.
Applying it to aarch64 also yields an observed ~1%
uplift on the silesia corpus.
Signed-off-by: Jun He <jun.he@arm.com>
Change-Id: Ic1253308e7a8a7df2d08963ba544e086c81ce8be
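The two-stage copy referred to here looks roughly like this (a sketch with hypothetical names; the real `ZSTD_wildcopy` also handles overlapping matches and relies on guaranteed output slack):
```
#include <string.h>
#include <stddef.h>

/* Sketch of a two-stage wildcopy: callers guarantee the destination buffer
 * has slack, so copying in fixed 16-byte chunks past `length` is allowed. */
static void wildcopy16(unsigned char* dst, const unsigned char* src, size_t length)
{
    unsigned char* const oend = dst + length;

    /* Stage 1: one unconditional 16-byte copy covers all short lengths. */
    memcpy(dst, src, 16);
    if (length <= 16) return;
    dst += 16; src += 16;

    /* Stage 2: bulk loop, 16 bytes at a time, until the end is covered. */
    do {
        memcpy(dst, src, 16);
        dst += 16; src += 16;
    } while (dst < oend);
}
```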
We already have BIT_getLowerBits, so use it. The benefits are twofold:
1) Somewhat cleaner code.
2) We now use bzhi instructions when available. The performance
delta is too small for microbenchmarks, but avoiding the load still helps
larger applications by reducing data cache pressure.
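For reference, the pattern in question (a sketch, not the exact `BIT_getLowerBits` source): mask off the low `nbBits` of a bit container, either with a plain mask or, when BMI2 is known to be available, with `bzhi`, which needs no mask at all.
```
#include <stddef.h>
#if defined(__BMI2__)
#  include <immintrin.h>
#endif

/* Assumes nbBits < 64. */
static size_t getLowerBits(size_t bitContainer, unsigned nbBits)
{
#if defined(__BMI2__) && defined(__x86_64__)
    /* bzhi zeroes all bits above position nbBits -- no mask load needed. */
    return (size_t)_bzhi_u64(bitContainer, nbBits);
#else
    /* Portable fallback: compute the mask directly. */
    return bitContainer & (((size_t)1 << nbBits) - 1);
#endif
}
```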
* It couldn't detect that `fastCoverParams` can't be NULL, since that was only enforced by an assertion.
* It thought we were accessing `wksp->dtable` beyond its bounds because we were using it to set the `workSpace` value. Instead, compute the workspace size used in a different way.
discovered by oss-fuzz
It's a bug in the test itself :
ZSTD_compressBound() is only a valid upper bound on the compressed size
for data compressed "normally".
But in situations where many flushes are forcefully introduced,
this creates many more blocks,
each of which has the potential to increase the size by 3 bytes.
In extreme cases (lots of small incompressible blocks), the expansion can go beyond ZSTD_compressBound().
The situation is similar when using the ZSTD_compressSequences() API
with explicit block delimiters,
in which case each explicit block acts like a deliberate flush.
When employed by a fuzzer, it's possible to generate scenarios like the one described above,
with tons of small incompressible blocks,
thus going beyond ZSTD_compressBound().
fix : when using explicit block delimiters, use a larger bound to account for this scenario.
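A worked illustration of the arithmetic, with a hypothetical loosened bound (not the exact formula used by the fix): `ZSTD_compressBound()` budgets for normally-sized blocks, but each forced flush or explicit block delimiter can start an extra block, and every incompressible block costs up to 3 additional header bytes.
```
#include <zstd.h>

/* Hypothetical illustration only -- not the exact bound used by the fix. */
static size_t boundWithExplicitDelimiters(size_t srcSize, size_t nbExtraBlocks)
{
    /* Each extra (e.g. explicitly delimited, incompressible) block can add
     * up to 3 bytes of block header on top of the regular bound. */
    return ZSTD_compressBound(srcSize) + nbExtraBlocks * 3;
}
```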
Introduce a constant which properly represents the maximum bit size of compressed literals (11) as defined in the specification.
It is to be preferred over HUF_TABLELOG_DEFAULT, which represents the same value, but only by accident.
The name was selected to keep the same convention as the existing width definitions,
MLFSELog, LLFSELog and OffFSELog.
* Async IO decompression:
- Added --[no-]asyncio flag for CLI decompression.
- Replaced dstBuffer in decompression with a pool of write jobs (sketched below).
- Added the ability to execute write jobs in a separate thread.
- Added the ability to wait (join) on all jobs in a thread pool (queued and running).
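A minimal sketch of the write-job idea, using hypothetical names and a plain pthread queue rather than the actual zstd CLI code: decompressed buffers are queued as write jobs, a dedicated writer thread drains the queue, and the main thread can join until everything queued or running has completed.
```
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct WriteJob_s {
    void* buffer;            /* decompressed data to write */
    size_t size;
    struct WriteJob_s* next;
} WriteJob;

typedef struct {
    FILE* dstFile;
    WriteJob* head;          /* FIFO of pending jobs */
    WriteJob* tail;
    size_t pending;          /* queued + currently running jobs */
    int shuttingDown;
    pthread_mutex_t lock;
    pthread_cond_t cond;     /* signaled on enqueue and on job completion */
} WritePool;

/* Runs on the dedicated writer thread: pop jobs and execute them. */
static void* writerLoop(void* opaque)
{
    WritePool* const pool = (WritePool*)opaque;
    for (;;) {
        pthread_mutex_lock(&pool->lock);
        while (pool->head == NULL && !pool->shuttingDown)
            pthread_cond_wait(&pool->cond, &pool->lock);
        if (pool->head == NULL) { pthread_mutex_unlock(&pool->lock); return NULL; }
        WriteJob* const job = pool->head;
        pool->head = job->next;
        if (pool->head == NULL) pool->tail = NULL;
        pthread_mutex_unlock(&pool->lock);

        fwrite(job->buffer, 1, job->size, pool->dstFile);   /* execute the job */
        free(job->buffer);
        free(job);

        pthread_mutex_lock(&pool->lock);
        pool->pending--;
        pthread_cond_broadcast(&pool->cond);                /* wake joiners */
        pthread_mutex_unlock(&pool->lock);
    }
}

/* Queue a write job; ownership of `buffer` moves to the pool. */
static void enqueueWrite(WritePool* pool, void* buffer, size_t size)
{
    WriteJob* const job = (WriteJob*)malloc(sizeof(*job));
    job->buffer = buffer; job->size = size; job->next = NULL;
    pthread_mutex_lock(&pool->lock);
    if (pool->tail) pool->tail->next = job; else pool->head = job;
    pool->tail = job;
    pool->pending++;
    pthread_cond_broadcast(&pool->cond);
    pthread_mutex_unlock(&pool->lock);
}

/* Join: wait until every queued and running job has completed. */
static void joinAll(WritePool* pool)
{
    pthread_mutex_lock(&pool->lock);
    while (pool->pending > 0)
        pthread_cond_wait(&pool->cond, &pool->lock);
    pthread_mutex_unlock(&pool->lock);
}
```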
Intel Control-flow Enforcement Technology (CET):
https://en.wikipedia.org/wiki/Control-flow_integrity#Intel_Control-flow_Enforcement_Technology
requires that, on Linux, all linker input files are marked as CET-enabled
in their .note.gnu.property section. For high-level language sources, the
.note.gnu.property section is added by the compiler with the -fcf-protection
option. For assembly sources, include <cet.h> to add the .note.gnu.property
section.
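For example, an assembly source that goes through the C preprocessor (a .S file) can pull the header in conditionally, so toolchains without CET support still build (a sketch, not the exact lines used in zstd):
```
/* In a preprocessed assembly source (.S): emit the .note.gnu.property
 * section only when the toolchain builds with -fcf-protection (which
 * defines __CET__) on an ELF target. */
#if defined(__ELF__) && defined(__CET__)
# include <cet.h>
#endif
```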