sharpetronics/zstd - zstd - Gitea: Git with a cup of tea

mirror of https://github.com/facebook/zstd.git synced 2025-10-18 00:03:50 -04:00

Author	SHA1	Message	Date
Yann Collet	321583ccf5	fixed minor typecast warnings	2021-12-28 11:38:21 -08:00
Yann Collet	b7630a474b	abstracted usage of offBase sumtype within zstd_lazy.c	2021-12-28 10:59:47 -08:00
Yann Collet	435f5a2e6d	fixed regression test assert optLdm->offset might be == 0 in invalid case. Only use STORE_OFFSET() after validating it's a correct case.	2021-12-28 09:55:31 -08:00
Yann Collet	2068889146	created STORED_*() macros to act on values stored / expressed in the sumtype numeric representation required by `storedSeq()`. This makes it possible to abstract away this representation by using the macros to extract these values. First user : ZSTD_updateRep() .	2021-12-28 06:59:07 -08:00
Yann Collet	1aed962216	introduce macros STORE_OFFSET() and STORE_REPCODE(). This is meant to abstract the sumtype representation required to transfer `offcode` to `ZSTD_storeSeq()`. Unfortunately, the sumtype numeric representation is currently a leaky abstraction that has permeated many other parts of the code, especially within `zstd_lazy.c` and also within `zstd_opt.c` and `zstd_compress.c`. While this PR does a good job of transferring a large number of call sites to the new macros, there are still a few sites where this transformation is more complex, or where the numeric representation itself is used "as is". One of the problematic areas is the decision to use the numeric format of the sumtype within the match finders of `zstd_lazy`. This commit doesn't change the behavior: it only introduces and employs the macros, and the resulting code remains identical. Ultimately, if the numeric representation of the sumtype can be completely abstracted and no other part of the code depends on it, it will be possible to move it towards something slightly more efficient.	2021-12-23 22:03:30 -08:00
Yann Collet	bec7bbb5a4	Merge branch 'dev' into seqStore_off	2021-12-23 18:03:17 -08:00
Yann Collet	aeff128331	change seqDef.offset into seqDef.offBase to better reflect the value stored in this field.	2021-12-23 17:56:08 -08:00
Yann Collet	75525fcb9f	library optimization flag can be selected on command line again `CFLAGS=-O0 make` will now use `-O0` instead of enforcing `-O3` which used to be the behavior before introduction of `libzstd.mk`. This should result in faster tests, since a few tests depend on this capability for faster roundtrips.	2021-12-23 17:43:12 -08:00
Yann Collet	e145b58cfd	changed seqDef.matchLength into seqDef.mlBase since this is effectively what is stored in this field (== matchLength - MINMATCH). This makes it clearer what needs to be done when reading from / writing to this field.	2021-12-23 13:39:46 -08:00
Yann Collet	b77fcac61f	change ZSTD_storeSeq() interface to accept matchLength instead of mlBase. This removes the need to do `- MINMATCH` at every call site. The new interface contract is checked with an `assert()`.	2021-12-23 12:03:33 -08:00
Yann Collet	a9e43b37d0	Revert "Limit `ZSTD_maxCLevel` to 21 for 32-bit binaries."	2021-12-20 11:43:14 -08:00
Yann Collet	f829c32258	forgot the chainlog is effectively a "fake" value with rowHash the only value which makes sense is `hashlog-1` as it mimics the real memory usage.	2021-12-16 11:37:40 -08:00
Yann Collet	db1b408a2f	rebalance lazy compression levels	2021-12-15 21:33:31 -08:00
Yann Collet	c8d6067615	fixed incorrect rowlog initialization the variable has only very limited usage, being only used once at the beginning of the block for prefetching only, hence the error had no impact on compression ratio.	2021-12-15 14:37:05 -08:00
Yann Collet	eaf786242d	Merge pull request #2929 from facebook/sse_row_lazy simplify SSE implementation of row_lazy match finder	2021-12-15 11:47:15 -08:00
Norbert Lange	2fbb1d10c1	Reduce bit tables to 8bit This saves some 1.7Kb in rodata section (x86_64, zstd tool), while assembler code stays the same except the type of a few load/extend instructions. Should not have negative performance implications.	2021-12-14 23:47:57 +01:00
Norbert Lange	99923dfc1a	Add typedefs for 8bit (un)signed To make code more expressive, add U8 and S8 typedefs	2021-12-14 23:47:57 +01:00
binhdvo	64205b7832	Fix performance degradation with -m32 (#2926)	2021-12-14 15:53:50 -05:00
Felix Handte	5e2fede604	Merge pull request #2921 from felixhandte/neg-lvl-stagger-step Stagger Stepping in Negative Levels	2021-12-14 14:13:57 -05:00
Yann Collet	05430b25a8	roll SSE implementation of row_lazy match finder mostly for maintenance convenience. Performance-wise, there is very little change, slightly faster for slog 3 & 4, neutral or very slightly negative for slog 5 & 6.	2021-12-14 10:44:23 -08:00
W. Felix Handte	82a49c88f9	Increment Step by 1 not 2 I couldn't find a good way to spread `ip0` and `ip1` apart when we accelerate due to incompressible inputs. (The methods I tried slowed things down quite a bit.) Since we aren't splaying ip0 and ip1 apart (which would be like `0_1_2_3_`, as opposed to the `01__23__` we were actually doing), it's a bit ambitious to increment `step` by 2. Instead, let's increment it by 1, which has the benefit of sliiightly improving compression. Speed remains pretty much unchanged.	2021-12-13 16:59:33 -05:00
W. Felix Handte	6ca5f42402	Rewrite `step` to Track Increment Between Pairs of Positions The position updates are rewritten from `ip[N] = ip[N-1] + step` to be `ip[N] = ip[N-2] + step`. This lets us only deal with the asymmetric spacing of gaps at setup and then we only have to keep a single `step` variable. This seems to work quite well on GCC and Clang!	2021-12-13 14:48:26 -05:00
W. Felix Handte	b8434cb754	Allow Templating `ZSTD_fast` Matchfinders on Acceleration (Lvl < -1)	2021-12-13 14:46:57 -05:00
Yann Collet	e1ab2200ff	fixed x32 compatibility	2021-12-10 21:02:17 -08:00
W. Felix Handte	ace6a7e746	Decompose `step` into Two Variables This avoids an additional addition, at the cost of an additional variable.	2021-12-10 16:44:23 -05:00
W. Felix Handte	22501cd283	Stagger Application of `stepSize` in ZSTD_fast This replicates the behavior of @terrelln's `ZSTD_fast` implementation. That is, it always looks at adjacent pairs of positions, and only applies the acceleration every other position. This produces a more fine-grained acceleration.	2021-12-10 16:44:23 -05:00
Yann Collet	57383d2317	Merge pull request #2914 from facebook/xxhash081 updated xxHash to latest v0.8.1	2021-12-08 16:48:46 -08:00
Yann Collet	3ce265fea8	remove offending static assert lines no idea why visual + clang-cl + appveyor don't like them, I've not been able to reproduce the issue locally, but these static assert are very unlikely to deliver a useful signal, I can't imagine a situation where they will be wrong, and if they are, then a ton of other things will be broken way before reaching that point.	2021-12-08 15:05:17 -08:00
Nick Terrell	8b40095b3f	Merge pull request #2916 from terrelln/issue-2906 Remove possible NULL pointer addition	2021-12-08 16:51:10 -05:00
Yann Collet	16241b7d26	altered copyright title	2021-12-08 13:18:41 -08:00
Yann Collet	a9cd6164d7	removed declarations of XXH3 symbols when XXH_NO_XXH3 is defined on top of implementations, which were already scoped out.	2021-12-08 12:56:16 -08:00
Yann Collet	27e706de88	replaces malloc / free / memcpy by Zstandard's version	2021-12-08 12:51:04 -08:00
Nick Terrell	b94407b6cf	Remove possible NULL pointer addition Refactor `ZSTDMT_isOverlapped()` to do NULL checks before computing the end pointer. Fixes #2906.	2021-12-08 12:40:40 -08:00
Nick Terrell	859e0500ab	Merge pull request #2915 from terrelln/oss-fuzz-build-fix Fix oss-fuzz build	2021-12-08 15:32:49 -05:00
Nick Terrell	aa7729c9f3	Fix oss-fuzz build Disable assembly when dataflow sanitizer is enabled. This regressed in PR #2893, which accidentally removed the check for dataflow sanitizer.	2021-12-08 11:01:52 -08:00
W. Felix Handte	9f1dee8fa5	Fix Up #2659; Build libzstd.pc Whenever Building the Lib on Unix	2021-12-08 12:43:34 -05:00
Yann Collet	1c7d2c4dd5	updated xxHash to latest v0.8.1 with minor modifications directly embedded in source: (1) does not compile XXH3; (2) namespace emulation (ZSTD_ prefix). Incidentally fix #2824	2021-12-07 21:16:15 -08:00
Felix Handte	9118ee04c2	Merge pull request #2659 from ericonr/pc [lib] Fix libzstd.pc for lib-mt builds	2021-12-07 14:18:38 -05:00
Nick Terrell	b6b4c9a3da	Merge pull request #2907 from Hello71/armv6-fix-legacy Apply FORCE_MEMORY_ACCESS=1 to legacy	2021-12-06 15:41:22 -05:00
Alex Xu (Hello71)	3d773d7013	Apply FORCE_MEMORY_ACCESS=1 to legacy See #2633, #2881.	2021-12-05 22:51:44 -05:00
Nick Terrell	486472c453	Merge pull request #2893 from terrelln/issue-2789 [asm] Share portability macros and restrict ASM further	2021-12-03 14:07:30 -05:00
Felix Handte	d2c86ec898	Merge pull request #2897 from felixhandte/zstd-deprecated-avoid-deprecated Avoid Using Deprecated Functions in Deprecated Code	2021-12-03 12:09:58 -05:00
Nick Terrell	c284569457	[asm] Share portability macros and restrict ASM further Move portability macros to `lib/common/portability_macros.h`. This file only contains platform/feature detection (e.g. 0/1 macros). This file is shared between C and ASM code, so it cannot include any C code. Rename `HUF_` ASM macros to be `ZSTD_` prefixed, and move to the new header. Restrict `ZSTD_ASM_SUPPORTED` to `__GNUC__`, because we need the GAS assembler. Finally, only include the ASM code if we are actually going to use it. This disables it on all Windows platforms, which should resolve the problem brought up in Issue #2789.	2021-12-02 16:58:04 -08:00
Nick Terrell	014bbb29f8	Merge pull request #2898 from terrelln/issue-2862 Improve zstd_opt build speed and size	2021-12-02 19:49:43 -05:00
Yann Collet	1bf3d8a475	Merge pull request #2896 from facebook/m68k Zstandard compiles and runs on m68k cpus	2021-12-02 14:25:45 -08:00
Nick Terrell	e5bfaeede7	Improve zstd_opt build speed and size

Use the same trick as we did for zstd_lazy in PR #2828:

* Create one search function specialization for each (dictMode, mls).
* Select the search function pointer at the top of the match finder.

Additionally, we no longer inline `ZSTD_compressBlock_opt_generic` into every function, since `dictMode` is no longer used as a template. Create two specializations, for opt levels 0 and 2, and call one of the two specializations.

Lastly, remove the hack that disabled inlining for zstd_opt for the Linux Kernel, as we've gotten most of the benefit already.

Compilation time sees a ~4x reduction:

| Compiler | Flags                            | Dev Time (s) | PR Time (s) | Delta |
|----------|----------------------------------|--------------|-------------|-------|
| gcc      | -O3                              | 10.1         | 2.3         | -77%  |
| gcc      | -O3 -fsanitize=address,undefined | 61.1         | 10.2        | -83%  |
| clang    | -O3                              | 9.0          | 2.1         | -76%  |
| clang    | -O3 -fsanitize=address,undefined | 33.5         | 5.1         | -84%  |

Build size is reduced by 150KB - 200KB:

| Compiler | Dev libzstd.a Size (B) | PR libzstd.a Size (B) | Delta |
|----------|------------------------|-----------------------|-------|
| gcc      | 1327476                | 1177108               | -11%  |
| clang    | 1378324                | 1167780               | -15%  |

There is a <2% speed loss in all cases:

| Compiler | Level | Dev Speed (MB/s) | PR Speed (MB/s) | Delta  |
|----------|-------|------------------|-----------------|--------|
| gcc      | 16    | 4.78             | 4.72            | -1.25% |
| gcc      | 17    | 3.49             | 3.46            | -0.85% |
| gcc      | 18    | 2.92             | 2.86            | -2.04% |
| gcc      | 19    | 2.61             | 2.61            | 0.00%  |
| clang    | 16    | 4.69             | 4.80            | 2.34%  |
| clang    | 17    | 3.53             | 3.49            | -1.13% |
| clang    | 18    | 2.86             | 2.85            | -0.34% |
| clang    | 19    | 2.61             | 2.61            | 0.00%  |

Fixes Issue #2862.	2021-12-02 14:19:41 -08:00
W. Felix Handte	e688317652	Fix Include Path	2021-12-02 16:53:52 -05:00
Nick Terrell	01ecd6ffc0	Merge pull request #2892 from terrelln/issue-2785 [CircleCI] Fix short-tests-0	2021-12-02 16:20:56 -05:00
W. Felix Handte	d82d67d073	Migrate to `FORWARD_IF_ERROR`	2021-12-02 16:06:07 -05:00
Yann Collet	30b9db8ae4	changed macro name to ZSTD_ALIGNOF for better consistency	2021-12-02 12:57:42 -08:00


4151 Commits
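
Several of the December 2021 commits above describe storing sequences through "base" fields: `mlBase` holds `matchLength - MINMATCH`, `offBase` holds a sumtype that is either a repcode or a real offset, and `ZSTD_storeSeq()` now takes `matchLength` directly and checks its contract with `assert()`. The following is a minimal sketch of that idea, not the library's code. Every `demo_` / `DEMO_` name is hypothetical, and both the `MINMATCH` value of 3 and the numeric layout (repcodes in the lowest values, real offsets shifted above them) are assumptions made for illustration rather than facts taken from the commit messages.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed constants, for illustration only. */
#define DEMO_MINMATCH 3u                    /* smallest match length that can be stored */
#define DEMO_REP_NUM  3u                    /* number of repcodes in this sketch */
#define DEMO_REP_MOVE (DEMO_REP_NUM - 1u)   /* shift applied to real offsets */

/* Hypothetical sequence record using the "base" field names from the log. */
typedef struct {
    unsigned       offBase;    /* sumtype: repcode or shifted offset */
    unsigned short litLength;
    unsigned short mlBase;     /* matchLength - DEMO_MINMATCH */
} demo_seqDef;

/* Encode a repcode (1..DEMO_REP_NUM) into the numeric sumtype. */
static unsigned demo_store_repcode(unsigned r)
{
    assert(r >= 1u && r <= DEMO_REP_NUM);
    return r - 1u;
}

/* Encode a real offset (> 0) into the numeric sumtype. */
static unsigned demo_store_offset(unsigned o)
{
    assert(o > 0u);
    return o + DEMO_REP_MOVE;
}

/* Accessors: callers never need to know the numeric layout. */
static int demo_stored_is_offset(unsigned offBase)
{
    return offBase > DEMO_REP_MOVE;
}

static unsigned demo_stored_offset(unsigned offBase)
{
    assert(demo_stored_is_offset(offBase));
    return offBase - DEMO_REP_MOVE;
}

static unsigned demo_stored_repcode(unsigned offBase)
{
    assert(!demo_stored_is_offset(offBase));
    return offBase + 1u;
}

/* Takes the real matchLength; the subtraction happens once, here,
 * instead of at every call site. The contract is checked with assert(). */
static demo_seqDef demo_storeSeq(size_t litLength, unsigned offBase, size_t matchLength)
{
    demo_seqDef seq;
    assert(matchLength >= DEMO_MINMATCH);
    assert(litLength <= 0xFFFFu);
    assert(matchLength - DEMO_MINMATCH <= 0xFFFFu);
    seq.offBase   = offBase;
    seq.litLength = (unsigned short)litLength;
    seq.mlBase    = (unsigned short)(matchLength - DEMO_MINMATCH);
    return seq;
}

/* Reading back the real match length from the stored base value. */
static size_t demo_matchLength(demo_seqDef seq)
{
    return (size_t)seq.mlBase + DEMO_MINMATCH;
}
```

A caller would write `demo_storeSeq(litLength, demo_store_offset(offset), matchLength)` and later read the fields back only through the accessors. Because no call site touches the numeric layout directly, the encoding could in principle be changed in one place, which is the motivation the commit messages give for introducing the macros.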