4475 Commits

Author SHA1 Message Date
Carl Woffenden
0168914490 Fix for MSVC C4267 error 2022-11-18 11:31:17 +01:00
Danielle Rozenblit
c2638212af Change threshold for benchmarking 2022-10-27 13:13:17 -07:00
Danielle Rozenblit
db74d043d6 Speed optimizations with macro 2022-10-27 10:20:44 -07:00
Danielle Rozenblit
401331909e Commit for benchmarking 2022-10-24 12:35:16 -07:00
Nick Terrell
dcc7228de9
[lazy] Use switch instead of indirect function calls. (#3295)
Use a switch statement to select the search function instead of an
indirect function call. This results in a sizable performance win.

This PR is a modification of the approach taken in PR #2828.
When I measured performance for that commit, it was neutral.
However, I now see a performance regression on gcc, but still
neutral on clang. I'm measuring on the same platform, but with
newer compilers. The new approach beats both the current dev
branch and the baseline before PR #2828 was merged.

This PR is necessary for Issue #3275, to update zstd in the kernel.
Without this PR there is a large regression in greedy - btlazy2
compression speed. With this PR it is about neutral.

gcc version: 12.2.0
clang version: 14.0.6
dataset: silesia.tar

| Compiler | Level | Dev Speed (MB/s) | PR Speed (MB/s) | Delta  |
|----------|-------|------------------|-----------------|--------|
| gcc      |     5 |            102.6 |           113.7 | +10.8% |
| gcc      |     7 |             66.6 |            74.8 | +12.3% |
| gcc      |     9 |             51.5 |            58.9 | +14.3% |
| gcc      |    13 |             14.3 |            14.3 |  +0.0% |
| clang    |     5 |            108.1 |           114.8 |  +6.2% |
| clang    |     7 |             68.5 |            72.3 |  +5.5% |
| clang    |     9 |             53.2 |            56.2 |  +5.6% |
| clang    |    13 |             14.3 |            14.7 |  +2.8% |

The binary size stays just about the same for clang and gcc, measured
using the `size` command:

| Compiler | Branch | Text    | Data | BSS | Total   |
|----------|--------|---------|------|-----|---------|
| gcc      | dev    | 1127950 | 3312 | 280 | 1131542 |
| gcc      | PR     | 1123422 | 2512 | 280 | 1126214 |
| clang    | dev    | 1046254 | 3256 | 216 | 1049726 |
| clang    | PR     | 1048198 | 2296 | 216 | 1050710 |
2022-10-21 17:14:02 -07:00
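
The difference between the two dispatch styles described in this commit can be sketched as follows. This is a minimal illustration, not the code from the PR: the mode names and the three toy search functions are hypothetical stand-ins. A switch gives the compiler direct calls that it is free to inline, which is one plausible source of the measured speedup; the commit message itself only reports the benchmark results.

```c
#include <stddef.h>

/* Hypothetical search modes; these are NOT zstd's real identifiers. */
typedef enum { TOY_MODE_RUN = 0, TOY_MODE_FULL = 1, TOY_MODE_MIN4 = 2 } toy_mode_e;

/* Toy "search" functions that stand in for the real match finders. */
static size_t toy_search_run(const char* src, size_t len)
{
    size_t n = 1;
    if (len == 0) return 0;
    while (n < len && src[n] == src[0]) n++;   /* length of the leading run */
    return n;
}
static size_t toy_search_full(const char* src, size_t len) { (void)src; return len; }
static size_t toy_search_min4(const char* src, size_t len) { (void)src; return len >= 4 ? 4 : 0; }

typedef size_t (*toy_search_fn)(const char*, size_t);

/* Indirect dispatch: the target is only known through a pointer loaded at run time. */
static size_t search_indirect(toy_mode_e mode, const char* src, size_t len)
{
    static const toy_search_fn table[3] = { toy_search_run, toy_search_full, toy_search_min4 };
    if ((unsigned)mode >= 3) return 0;
    return table[mode](src, len);
}

/* Switch dispatch: every case is a direct call to a known function. */
static size_t search_switch(toy_mode_e mode, const char* src, size_t len)
{
    switch (mode) {
    case TOY_MODE_RUN:  return toy_search_run(src, len);
    case TOY_MODE_FULL: return toy_search_full(src, len);
    case TOY_MODE_MIN4: return toy_search_min4(src, len);
    default:            return 0;
    }
}
```

Both functions return the same result for every mode; only the way the callee is selected differs.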
Felix Handte
99d239de32
Merge pull request #3290 from felixhandte/ddict-dict-id-from-ddict
Make ZSTD_getDictID_fromDDict() Read DictID from DDict
2022-10-18 13:33:32 -04:00
daniellerozenblit
0d5d571080
Merge pull request #3285 from daniellerozenblit/optimal-huff-depth
Optimal huf depth
2022-10-18 10:31:44 -04:00
Danielle Rozenblit
b4f0d364af Merge 2022-10-17 11:24:24 -07:00
Danielle Rozenblit
a08fabd51a Rough draft speed optimization 2022-10-17 10:24:29 -07:00
Danielle Rozenblit
a910489ff5 No longer pass srcSize to minTableLog 2022-10-17 08:03:44 -07:00
Danielle Rozenblit
b34729018c Minor simplification: no longer need to check src size if using cardinality for minTableLog 2022-10-17 07:55:07 -07:00
W. Felix Handte
d7841d150b Make ZSTD_getDictID_fromDDict() Read DictID from DDict
Currently this function actually reads the dict ID from the dictionary's
header, via `ZSTD_getDictID_fromDict()`. But during decompression the
decompressor actually compares the dict ID in the frame header with the dict
ID in the DDict. Now of course the dict ID in the dictionary contents and the
dict ID in the DDict struct *should* be the same. But in cases of memory
corruption, where they can drift out of sync, it's misleading for this
function to read it again from the dict buffer rather than return the dict ID
that will actually be used.

Also doing it this way avoids rechecking the magic and so on and so it is a
tiny bit more efficient.
2022-10-14 22:53:03 -04:00
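
The behavioural difference described in this commit can be sketched with a toy model. Everything below is an assumption made for illustration: `toy_ddict`, `TOY_MAGIC`, the 8-byte header layout and all function names are hypothetical and are not zstd's real structures or API. The sketch only shows why returning the cached ID differs from re-parsing the buffer once the buffer has been corrupted.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAGIC 0x12345678u   /* hypothetical magic value, not zstd's */

/* Hypothetical digested dictionary: keeps the buffer plus the ID cached at creation. */
typedef struct {
    const unsigned char* content;
    size_t size;
    uint32_t dictID;
} toy_ddict;

static uint32_t toy_read_le32(const unsigned char* p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parse the toy header: 4-byte magic, then 4-byte ID. Returns 0 if invalid. */
static uint32_t toy_dictid_from_dict(const unsigned char* dict, size_t size)
{
    if (dict == NULL || size < 8) return 0;
    if (toy_read_le32(dict) != TOY_MAGIC) return 0;
    return toy_read_le32(dict + 4);
}

static toy_ddict toy_create_ddict(const unsigned char* dict, size_t size)
{
    toy_ddict d;
    d.content = dict;
    d.size = size;
    d.dictID = toy_dictid_from_dict(dict, size);   /* cached once */
    return d;
}

/* Old behaviour (sketch): re-check the magic and re-read the ID from the buffer. */
static uint32_t toy_dictid_from_ddict_reparse(const toy_ddict* d)
{
    return d == NULL ? 0 : toy_dictid_from_dict(d->content, d->size);
}

/* New behaviour (sketch): return the ID the decompressor would actually compare. */
static uint32_t toy_dictid_from_ddict(const toy_ddict* d)
{
    return d == NULL ? 0 : d->dictID;
}

/* Usage: corrupt the buffer after creation and query both ways. */
static int toy_cached_id_survives_corruption(void)
{
    unsigned char buf[8] = { 0x78, 0x56, 0x34, 0x12, 42, 0, 0, 0 };
    toy_ddict d = toy_create_ddict(buf, sizeof buf);
    buf[4] = 7;   /* simulated memory corruption of the dictionary contents */
    return toy_dictid_from_ddict(&d) == 42 && toy_dictid_from_ddict_reparse(&d) == 7;
}
```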
Danielle Rozenblit
75cd42afd7 Update regression results and better variable naming for HUF_cardinality 2022-10-14 13:37:19 -07:00
Danielle Rozenblit
c4853e1553 Update threshold to use optimal depth 2022-10-14 11:29:32 -07:00
Danielle Rozenblit
e60cae33cf Additional ratio optimizations 2022-10-14 10:37:35 -07:00
Yann Collet
b7d55cfa0d fix issue #3119
fix segfault error when running zstreamtest with MALLOC_PERTURB_
2022-10-12 23:04:23 -07:00
Danielle Rozenblit
fa7d9c1139 Set threshold to use optimal table log 2022-10-11 14:33:25 -07:00
Danielle Rozenblit
8888a2ddcc CI failure fixes 2022-10-11 13:12:19 -07:00
Fangrui Song
5635827ede Move ZSTD_DEPRECATED before ZSTDLIB_API/ZSTDLIB_STATIC_API
Clang doesn't allow [[deprecated(...)]] attribute after __attribute__.
Move [[deprecated(...)]] before __attribute__ to fix C++14/C++17 uses
with Clang.

Fix #3250
2022-09-22 12:30:44 -07:00
Yann Collet
434ffe979c minor: refactor publication of ZSTD_copyCCtx()
for improved clarity
2022-09-22 11:14:21 -07:00
Yann Collet
97c23cf615
Merge pull request #3199 from JunHe77/comp
compress:check more bytes to reduce ZSTD_count call
2022-09-19 10:49:10 -07:00
Yann Collet
e9e88753d5
Merge pull request #3245 from haampie/fix/SED_ERE_OPT
drop -E flag in sed
2022-09-19 10:48:11 -07:00
Yann Collet
f7251f88b9
Merge pull request #3247 from haampie/fix/grep
Fix make variable
2022-09-19 10:47:38 -07:00
Jun He
ce52acd7dc compress:check more bytes to reduce ZSTD_count call
Compare 4 bytes instead of 1 byte in ZSTD_noDict mode. This
avoids cases where match[ml] matches but one of
match[ml-3]..match[ml-1] does not, so the number of calls
to ZSTD_count can be reduced.

Signed-off-by: Jun He <jun.he@arm.com>
Change-Id: I3449ea423d5c8e8344f75341f19a2d1643c703f6
2022-09-18 14:45:41 +08:00
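
The filtering idea in this commit can be sketched as follows. This is an illustration under stated assumptions, not the patch itself: `toy_read32`, `toy_count` and the two `worth_counting_*` helpers are hypothetical names, `ml` stands for the best match length found so far, and the caller is assumed to guarantee `ml >= 3` and that bytes up to index `ml` are readable in both buffers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Unaligned 4-byte read via memcpy, which is well defined in C. */
static uint32_t toy_read32(const char* p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Stand-in for a match-length routine: number of equal leading bytes. */
static size_t toy_count(const char* ip, const char* match, size_t maxLen)
{
    size_t n = 0;
    while (n < maxLen && ip[n] == match[n]) n++;
    return n;
}

/* 1-byte filter: only checks the byte at index ml. */
static int worth_counting_1b(const char* ip, const char* match, size_t ml)
{
    return match[ml] == ip[ml];
}

/* 4-byte filter: checks bytes ml-3..ml in a single comparison. */
static int worth_counting_4b(const char* ip, const char* match, size_t ml)
{
    return toy_read32(match + ml - 3) == toy_read32(ip + ml - 3);
}
```

With `ip = "abcdXfgh"`, `match = "abcdYfgh"` and `ml = 5`, the 1-byte filter passes because both strings have `f` at index 5, yet the real match length is only 4, so the full count would be wasted. The 4-byte filter rejects the candidate because the byte at index 4 differs.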
Danielle Rozenblit
8bb833bb5a Merge branch 'null-buffer-decompress' of github.com:daniellerozenblit/zstd into null-buffer-decompress 2022-09-12 18:57:53 -07:00
Danielle Rozenblit
e46b12e1b4 fix indentation 2022-09-12 18:56:59 -07:00
daniellerozenblit
f59f797aa8
Merge branch 'facebook:dev' into null-buffer-decompress 2022-09-12 14:54:36 -04:00
Danielle Rozenblit
a1d89424c2 fuzzer error fix 2022-09-12 11:53:37 -07:00
Danielle Rozenblit
aa82998821 add sequence bound function 2022-09-09 12:34:25 -07:00
Elliot Gorokhovsky
6600a05949
Merge pull request #3259 from DimitriPapadopoulos/codespell
Fix typos found by codespell
2022-09-09 11:15:05 -04:00
Danielle Rozenblit
a06e953db9 some additional comments, remove apt-get from clang jobs, better test titles 2022-09-08 18:30:07 -07:00
Dimitri Papadopoulos
0015308c0f
Fix typos found by codespell 2022-09-08 23:17:00 +02:00
Danielle Rozenblit
3d7f9a90df skip flush operation in case where op is NULL 2022-09-08 13:53:13 -07:00
Danielle Rozenblit
f3ddaaddd6 ternary operator instead of if statement 2022-09-08 12:59:49 -07:00
Danielle Rozenblit
028842788b fix zero offset to nullpointer errors 2022-09-07 17:52:26 -07:00
Harmen Stoppels
efef80b75e Fix make variable 2022-08-19 12:06:43 +02:00
Harmen Stoppels
ae5f273a92 drop -E flag in sed 2022-08-19 12:00:32 +02:00
Elliot Gorokhovsky
ef60302af9
Merge pull request #3230 from grossws/fix3229-docs
Add description for ZSTD_decompressStream and ZSTD_initDStream
2022-08-16 12:48:23 -04:00
Konstantin Gribov
1c847e2e32 Add description for ZSTD_decompressStream and ZSTD_initDStream
With that these functions become visible in generated docs.

Fixes #3229
2022-08-08 18:02:50 +03:00
Nick Terrell
a70ca2bd7d
Fix off-by-one error in superblock mode (#3221)
Fixes #3212.

Long literal and match lengths had an off-by-one error in ZSTD_getSequenceLength.
Fix the off-by-one error, and add a golden compression test that catches the bug.
Also run all the golden tests in the cli-tests framework.
2022-08-03 11:28:39 -07:00
Felix Handte
7e6278a706
Merge pull request #3196 from mileshu/dev
[T124890272] Mark 2 Obsolete Functions(ZSTD_copy*Ctx) Deprecated in Zstd
2022-08-02 12:34:04 -04:00
Miles HU
c450f9f952 [T124890272] Mark 2 Obsolete Functions(ZSTD_copy*Ctx) Deprecated in Zstd
The discussion for this task is here: facebook/zstd#3128.

This task can probably be scoped to the first part: marking these functions deprecated.
We'll later look at removal when we roll out v1.6.0.
2022-08-01 22:45:52 -07:00
Nick Terrell
0f4fd28a64
Deprecate ZSTD_getDecompressedSize() (#3225)
Fixes #3158.

Mark ZSTD_getDecompressedSize() as deprecated and replaced by ZSTD_getFrameContentSize().
2022-08-01 11:52:14 -07:00
Qiongsi Wu
1b445c1c2e
Fix hash4Ptr for big endian (#3227) 2022-08-01 10:41:24 -07:00
Jun He
ec5fdcde19
lib: add hint to generate more pipeline friendly code (#3138)
Statistics gathered on the silesia test files show that
the chance of position going beyond highThreshold is very
low (~1.3%@L8 in most cases, all <2.5%), and such positions
are in the "lowprob area". Add the branch hint so the
compiler can generate more pipeline-friendly code.
With this change, an uplift of ~1% is observed on mozilla
and xml, and a slight (0.3%~0.8%) but consistent uplift on
other files on Arm N1.

Signed-off-by: Jun He <jun.he@arm.com>
Change-Id: Id9ba1d5c767e975290b5c1bf0ecce906544f4ade
2022-07-29 10:28:04 -07:00
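
A branch hint of the kind this commit describes can be sketched as below. The macro name `TOY_UNLIKELY` and the function `toy_next_position` are hypothetical and chosen for illustration; `__builtin_expect` is the GCC/Clang builtin, and the fallback makes the hint a no-op on other compilers. The loop assumes a power-of-two table (so `tableMask` is size minus one) and an odd `step`, so that stepping eventually lands at or below `highThreshold`.

```c
#include <stddef.h>

#if defined(__GNUC__) || defined(__clang__)
#  define TOY_UNLIKELY(x) (__builtin_expect(!!(x), 0))
#else
#  define TOY_UNLIKELY(x) (x)   /* no hint available: plain condition */
#endif

/* Advance `position` by `step` inside the table, skipping the rarely
 * hit region above `highThreshold`. The hint tells the compiler that
 * the skip loop body is the cold path. */
static size_t toy_next_position(size_t position, size_t step,
                                size_t tableMask, size_t highThreshold)
{
    position = (position + step) & tableMask;
    while (TOY_UNLIKELY(position > highThreshold))
        position = (position + step) & tableMask;
    return position;
}
```

The hint does not change the result, only how the compiler lays out the hot and cold paths. For a 16-entry table with step 13 and highThreshold 13, starting from position 1 first yields 14 (skipped) and then 11.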
Jun He
558cf20d0d
decomp: add prefetch for matched seq on aarch64 (#3164)
match is used for following sequence copy. It is
only updated when extDict is needed, which is a
low probability case. So it can be prefetched to
reduce cache miss.
The benchmarks on various Arm platforms showed
uplift from 1% ~ 14% with gcc-11/clang-14.

Signed-off-by: Jun He <jun.he@arm.com>
Change-Id: If201af4799d2455d74c79f8387404439d7f684ae
2022-07-29 10:27:20 -07:00
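
The prefetch idea can be sketched in a toy match-copy routine. This is an assumption-laden illustration, not the aarch64 patch: `TOY_PREFETCH`, `toy_exec_match` and the demo helper are hypothetical, and `__builtin_prefetch` is the GCC/Clang builtin, with a no-op fallback elsewhere. The point is only that the match address is known before it is needed, so a prefetch can be issued early.

```c
#include <stddef.h>
#include <string.h>

#if defined(__GNUC__) || defined(__clang__)
#  define TOY_PREFETCH(p) __builtin_prefetch((p), 0 /* read */, 3 /* keep in cache */)
#else
#  define TOY_PREFETCH(p) ((void)(p))
#endif

/* Copy `length` bytes from `offset` bytes behind `op` to `op`.
 * Returns the new output position. */
static unsigned char* toy_exec_match(unsigned char* op, size_t offset, size_t length)
{
    const unsigned char* match = op - offset;
    size_t i;

    TOY_PREFETCH(match);   /* request the cache line before it is read */

    /* A real decoder would do other work here (for example copying
     * literals), which gives the prefetch time to complete. */

    for (i = 0; i < length; i++)
        op[i] = match[i];   /* byte-wise, so overlapping matches work */
    return op + length;
}

/* Usage: expand "abcd" into "abcdabcdabcd" with an overlapping match. */
static int toy_prefetch_demo(void)
{
    unsigned char buf[16] = "abcd";
    unsigned char* end = toy_exec_match(buf + 4, 4, 8);
    return end == buf + 12 && memcmp(buf, "abcdabcdabcd", 12) == 0;
}
```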
udayanbapat
43f21a600e
Initial commit to address 3090. Added support to decompress empty block. (#3118)
* Initial commit to address 3090. Added support to decompress empty block

* Update zstd_decompress_block.c

Addressed review comments for the case of 'set_basic'

* Update lib/decompress/zstd_decompress_block.c

Co-authored-by: Nick Terrell <nickrterrell@gmail.com>

* Update lib/decompress/zstd_decompress_block.c

Co-authored-by: Nick Terrell <nickrterrell@gmail.com>

Co-authored-by: Nick Terrell <nickrterrell@gmail.com>
2022-07-14 11:54:34 -07:00
Miles HU
a5655e4017 Revert "T119975957"
This reverts commit 962746edffa5340315136af34ac3331eba82c3c8.
2022-07-12 11:17:25 -07:00
Miles HU
962746edff T119975957
Signed-off-by: Miles HU <yuanpu@fb.com>
2022-07-08 15:01:36 -07:00
Elliot Gorokhovsky
5c382bf110 1.5.3 version bump 2022-06-29 14:45:53 -04:00