9006 Commits

Author SHA1 Message Date
Sen Huang
1daf3c8dbc Use 32 buckets for log2 bucketing in huffman sort 2021-09-13 12:29:16 -04:00
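Note: the following is a minimal sketch of log2 bucketing, not the code from this commit. It assumes 32-bit symbol counts and groups them by the position of their highest set bit, which gives at most 32 buckets. The names `log2_bucket`, `count_buckets`, and `NB_BUCKETS` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NB_BUCKETS 32  /* one bucket per possible highest-bit position of a 32-bit count */

/* Hypothetical helper: floor(log2(count)) for count >= 1, and 0 for count == 0. */
static unsigned log2_bucket(uint32_t count)
{
    unsigned r = 0;
    while ((count >>= 1) != 0) r++;
    return r;
}

/* Histogram the counts by bucket. A bucketed sort would then give each
 * bucket its own output range and only sort within a bucket. */
static void count_buckets(const uint32_t* counts, size_t n, size_t sizes[NB_BUCKETS])
{
    size_t i;
    for (i = 0; i < NB_BUCKETS; i++) sizes[i] = 0;
    for (i = 0; i < n; i++) {
        unsigned const b = log2_bucket(counts[i]);
        assert(b < NB_BUCKETS);
        sizes[b]++;
    }
}
```

Under this assumption, counts 4 through 7 share bucket 2, and the largest 32-bit count lands in bucket 31.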
Yann Collet
c10067c44e
Merge pull request #2775 from eli-schwartz/meson
meson: fix type error for integer option
2021-09-10 05:47:52 -07:00
Eli Schwartz
193aa49673
meson: fix type error for integer option
meson forgave using the wrong type, but this isn't guaranteed. muon
simply failed.
2021-09-09 23:40:58 -04:00
Felix Handte
d68aa19a2f
Merge pull request #2749 from felixhandte/zstd-fast-pipelined
Pipelined Implementation of ZSTD_fast (~+5% Speed)
2021-09-09 17:05:30 -04:00
sen
bcc68275f1
Merge pull request #2769 from senhuang42/typo_fix
[easy] Fix patch-from help msg typo
2021-09-07 11:09:55 -04:00
senhuang42
30fe49af4e Fix patch-from help msg typo 2021-09-07 10:08:35 -04:00
sen
71076b7a01
Merge pull request #2763 from senhuang42/opt_compiletime
Improve compile speed and binary size in `opt`
2021-09-02 11:59:02 -04:00
Yann Collet
a8cf85ad0a
Merge pull request #2762 from facebook/level13
minor rebalancing of level 13
2021-09-01 20:32:53 -07:00
sen
4d61f10e23
Merge pull request #2761 from senhuang42/fse_wksp_fix
Add 8 bytes to FSE_buildCTable wksp
2021-09-01 17:09:45 -04:00
Sen Huang
d88c1d95ce Remove inlining for opt 2021-09-01 16:58:57 -04:00
Yann Collet
40e44bd56d updated regression tests 2021-09-01 13:26:39 -07:00
Yann Collet
70d89e5a12 minor rebalancing of level 13
This new setup is slightly better on `silesia.tar`:
Ratio : 3.649 -> 3.655
Speed : 11.9 MB/s -> 12.2 MB/s
At the cost of more memory : 24 MB -> 32 MB
The new memory budget is a reasonable interpolation between neighboring levels 12 and 14:
level 12 : 24 MB
level 13 : 32 MB (increased from 24 MB)
level 14 : 48 MB
Window size remains unaffected (4 MB)
2021-09-01 13:05:10 -07:00
senhuang42
414e24becf Add 8 bytes to FSE workspace 2021-09-01 15:56:33 -04:00
W. Felix Handte
b0977e4ed2 Update results.csv 2021-09-01 14:45:00 -04:00
W. Felix Handte
d6fd7761c9 Fix VS Build: Explicitly Cast to Narrow Ints 2021-09-01 14:15:04 -04:00
W. Felix Handte
98d3df326b Change Target Size in Fuzzer
It's a bit strange, because this is hitting the dictionary special case where
the dictionary is contiguous with the input and still runs in the single-
segment path.

We should probably change that to hit the `extDict` path instead?
2021-09-01 14:15:04 -04:00
W. Felix Handte
15e67bfa7e Deduplicate Implementations
This removes the old `ZSTD_compressBlock_fast_generic()` and renames the new
`ZSTD_compressBlock_fast_generic_pipelined()` to replace it. This is
functionally a no-op.
2021-09-01 14:15:04 -04:00
W. Felix Handte
64054dec44 Tweak Step 2021-09-01 14:15:04 -04:00
W. Felix Handte
24fcccd05c Unroll Loop Core; Reduce Frequency of Repcode Check & Step Calc (+>1% Speed)
Unrolling the loop to handle 2 positions in each iteration allows us to reduce
the frequency of some operations that don't need to happen at every position.
One such operation is the step calculation, which is a very rough heuristic
anyways. It's fine if we do this a position later. The other operation is the
repcode check. But since the repcode check already tries expanding back one
position, we're really not missing much of importance by only trying it every
other position.

This commit also slightly reorders some operations.
2021-09-01 14:15:04 -04:00
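Note: the repcode reasoning in the commit above can be illustrated with a simplified sketch. This is not the zstd implementation. It assumes a single fixed repeat offset `rep` and a 4-byte minimum match, and the names `rep_match` and `find_rep` are hypothetical. The scan tests only every other position, and a one-byte backward extension recovers a match that began at the skipped position.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: do the 4 bytes at pos equal the 4 bytes at pos - rep? */
static int rep_match(const uint8_t* src, size_t pos, size_t rep)
{
    return memcmp(src + pos, src + pos - rep, 4) == 0;
}

/* Returns the start of the first repeat match found, or len if none.
 * Only every other position is tested; on a hit, the match is extended
 * back by one byte when the preceding byte also repeats. */
static size_t find_rep(const uint8_t* src, size_t len, size_t rep)
{
    size_t pos;
    assert(rep > 0);
    /* start at rep + 1 so that pos - 1 - rep stays inside the buffer */
    for (pos = rep + 1; pos + 4 <= len; pos += 2) {
        if (rep_match(src, pos, rep)) {
            if (src[pos - 1] == src[pos - 1 - rep]) pos--;  /* expand back one position */
            return pos;
        }
    }
    return len;
}
```

With the input `"zzabcabcabc"` and `rep = 3`, the match starting at position 5 is never tested directly, but the hit at position 6 extends back to report 5.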
W. Felix Handte
57a100f6dc Add ip1 + 128 Prefetch; Tiny Cleanup 2021-09-01 14:15:04 -04:00
W. Felix Handte
991d660ea9 Nit: Only Store 2 Hash Variables 2021-09-01 14:15:04 -04:00
W. Felix Handte
8706bc115a Nit: Dedup idx0 and idx1 2021-09-01 14:15:04 -04:00
W. Felix Handte
7c24c3e6ce Give Up on Searching End of Block
Amusingly, it seems to be a non-trivial performance hit to add in final
searches or even hash table insertions during cleanup. So let's not. It seems
to not make any meaningful difference in compression ratio.
2021-09-01 14:15:03 -04:00
W. Felix Handte
35932ab2f1 Prefetch Input in Incompressible Sections (+0.25% Speed) 2021-09-01 14:15:03 -04:00
W. Felix Handte
b092dd75b7 Shrink Pipeline from 4 Positions to 3 2021-09-01 14:15:03 -04:00
W. Felix Handte
387840af79 Re-Order Operations for Slightly Better Performance 2021-09-01 14:15:03 -04:00
W. Felix Handte
bc768bccc0 Track Step Size Statefully, Rather than Recalculating Every Time 2021-09-01 14:15:03 -04:00
W. Felix Handte
80bc12b33a Initial Pipelined Implementation for ZSTD_fast 2021-09-01 14:15:03 -04:00
W. Felix Handte
ab8aa49b8d Fix Benchmark Corruption Display 2021-09-01 14:15:03 -04:00
Yann Collet
6715096611
Merge pull request #2758 from facebook/qemu
added qemu tests
2021-08-31 09:56:50 -07:00
Yann Collet
c1de65535f Merge branch 'dev' into qemu 2021-08-31 08:16:46 -07:00
Yann Collet
6933ac67d3
Merge pull request #2757 from facebook/transferGA
Reduce test time on TravisCI
2021-08-31 07:40:21 -07:00
Yann Collet
333ecf6865 add powerpc qemu emulation 2021-08-30 06:37:50 -07:00
Yann Collet
2b27d07d06 attempt at adding m68k qemu tests
with optional success (for the time being)
2021-08-29 21:39:06 -07:00
Yann Collet
1e5c90cb5b remove qemu tests
that are being transferred to GA in #2758.
This represents a saving of ~25mn of cpu time on TravisCI.
2021-08-29 20:54:18 -07:00
Yann Collet
74b4171fb8 fix alignment condition in FSE_buildCTable
2-byte alignment is enough for 16-bit fields
2021-08-29 19:05:04 -07:00
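Note: a minimal sketch of the alignment reasoning above, not the actual `FSE_buildCTable` check. It assumes the workspace only needs to hold 16-bit fields, and the helper names `is_aligned` and `wksp_ok_for_u16` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero when ptr is a multiple of align.
 * align must be a power of 2. */
static int is_aligned(const void* ptr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)ptr & (align - 1)) == 0;
}

/* A workspace that only stores 16-bit fields needs 2-byte alignment;
 * demanding 4 bytes would reject some usable buffers. */
static int wksp_ok_for_u16(const void* wksp)
{
    return is_aligned(wksp, sizeof(uint16_t));
}
```

A pointer 2 bytes past a 4-byte-aligned address passes `wksp_ok_for_u16` but would fail a 4-byte check, which is the kind of case a stricter condition rejects unnecessarily.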
Yann Collet
f21977c5e6 fix playTests.sh when EXE_PREFIX not null 2021-08-29 17:20:12 -07:00
Yann Collet
18191c85c9 adding optional QEMU_SYS 2021-08-29 16:43:32 -07:00
Yann Collet
1c97ec73d7 added qemu tests
running zstd library on emulated targets
2021-08-29 16:28:41 -07:00
Yann Collet
b341aa2f95 remove versions-compatibility test from GA
since it fails on GitHub Actions specifically.

The test is run on TravisCI for the time being.
Its duration has been reduced to ~6mn anyway.
2021-08-29 15:47:04 -07:00
Yann Collet
72bd2a83a0 reduce length of scanbuild static analyzer test
This was ~30mn, by far the longest run on travisCI.
That's because it re-analyzes the same files multiple times (library files notably).
It also performs actions that make no sense for the static analyzer purpose,
such as building the single-file library.

Reduced time spent in this test by reducing its scope :
just build the CLI, and obviously the library along with it.
These are the only ones that really deserve to be analyzed.

Unfortunately, it still results in a number of false positives when using newer versions of scanbuild
(each version of scanbuild generates a different list of false positives).
These will have to be fixed before transferring to GitHub Actions.
2021-08-29 15:26:31 -07:00
Yann Collet
7f37b8a547 accelerate versionsCompatibilityTest
by allowing parallel build of units,
and reducing optimization levels.

Parallel build is only effective on "recent" versions of `zstd`,
as previously, the list of units was passed as a list of source files,
which is something neither `make` nor `gcc` can parallelize.
So its impact is modest (-20%).

Reducing optimization level to `-O1` makes compilation much faster.
It also makes runtime slower,
but in this test, compilation time dominates run time.
The savings are very significant (-50%).

On my test system, it reduces the length of this test from 13mn to 5mn.
2021-08-29 14:48:11 -07:00
Yann Collet
ef69539849 transferred inter-versions compatibility tests to GA 2021-08-29 11:53:56 -07:00
sen
1f3fc1936c
Merge pull request #2753 from senhuang42/better_error_msg
[easy] Fix zstd bench error message
2021-08-23 20:37:48 -04:00
senhuang42
dce48f53df Fix benchzstd error message 2021-08-23 19:10:16 -04:00
Yann Collet
be82a0ab8f
Merge pull request #2746 from eli-schwartz/meson-fixup
meson fixups
2021-08-23 15:57:47 -07:00
Yann Collet
18a20b3ad7
Merge pull request #2752 from facebook/hashLog3max
make ZSTD_HASHLOG3_MAX private
2021-08-20 12:51:17 -07:00
Yann Collet
2de42174bb make ZSTD_HASHLOG3_MAX private
This is an implementation detail,
it doesn't belong to public space (zstd.h).
2021-08-20 09:52:42 -07:00
sen
ae998544de
Merge pull request #2750 from senhuang42/sb_compress
Improve branch misses on FSE symbol spreading
2021-08-20 12:47:24 -04:00
senhuang42
da095ed899 Improve branch misses on FSE symbol spreading 2021-08-18 10:22:22 -07:00