This replicates the behavior of @terrelln's `ZSTD_fast` implementation. That
is, it always looks at adjacent pairs of positions, and only applies the
acceleration every other position. This produces a more fine-grained
acceleration.
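As a rough illustration of the stepping pattern only (not the actual match finder), the sketch below records which positions a pair-wise scan would visit. The helper name `example_visit_pairs` and its `step` parameter are hypothetical; it assumes the acceleration step is applied only when moving on from the second position of each pair.

```c
#include <stddef.h>

/* Hypothetical helper: records the positions visited by a scan that always
 * looks at the adjacent pair (ip, ip+1) and applies the acceleration step
 * only after the second position of the pair.
 * Returns the number of positions written to `out`. */
static size_t example_visit_pairs(size_t start, size_t end, size_t step,
                                  size_t* out, size_t cap)
{
    size_t n = 0;
    size_t ip = start;
    while (ip + 1 < end && n + 2 <= cap) {
        out[n++] = ip;       /* first position of the pair */
        out[n++] = ip + 1;   /* adjacent position: no acceleration applied */
        ip = (ip + 1) + step; /* acceleration applied every other position */
    }
    return n;
}
```

With `step == 1` every position is visited; larger steps skip positions between pairs, never inside a pair.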
No idea why Visual Studio + clang-cl + AppVeyor don't like them;
I've not been able to reproduce the issue locally.
But these static asserts are very unlikely to deliver a useful signal:
I can't imagine a situation where they would be wrong,
and if they were, a ton of other things would break well before reaching that point.
I hadn't seen #2890, so I wrote my own version. I like this approach a little
better, since it does an explicit check for a regular file, rather than
passing a magic value.
Addresses #2874.
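For readers unfamiliar with the idea, an explicit regular-file check on POSIX could look roughly like the sketch below. The function name `example_is_regular_file` is hypothetical and this is not the code from the PR; it only assumes `stat()` and `S_ISREG` are available.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if `path` names a regular file,
 * 0 otherwise (including when stat() fails, e.g. the path does not exist). */
static int example_is_regular_file(const char* path)
{
    struct stat st;
    if (stat(path, &st) != 0) return 0;  /* cannot stat: not a usable file */
    return S_ISREG(st.st_mode) ? 1 : 0;  /* directories, devices, etc. give 0 */
}
```

Callers can then skip file-only operations for anything that is not a regular file, instead of signalling that case through a special argument value.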
Move portability macros to `lib/common/portability_macros.h`. This file
only contains platform/feature detection (e.g. 0/1 macros). This file is
shared between C and ASM code, so it cannot include any C code.
Rename `HUF_` ASM macros to be `ZSTD_` prefixed, and move to the new
header.
Restrict `ZSTD_ASM_SUPPORTED` to `__GNUC__`, because we need the GAS
assembler.
Finally, only include the ASM code if we are actually going to use it.
This disables it on all Windows platforms, which should resolve the
problem brought up in Issue #2789.
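A minimal sketch of what 0/1 detection macros of this kind can look like. The `EXAMPLE_` names are hypothetical, and the x86-64 condition is an assumption made for illustration; the real conditions live in `lib/common/portability_macros.h`.

```c
/* Detection section: preprocessor only, so it is safe to share with ASM. */
#if defined(__GNUC__)
#  define EXAMPLE_ASM_SUPPORTED 1   /* GNU-compatible toolchain assumed to provide GAS */
#else
#  define EXAMPLE_ASM_SUPPORTED 0
#endif

/* Assumption for illustration: the ASM path targets x86-64 only. */
#if EXAMPLE_ASM_SUPPORTED && defined(__x86_64__)
#  define EXAMPLE_ENABLE_ASM 1
#else
#  define EXAMPLE_ENABLE_ASM 0
#endif

/* C-only consumer of the macros: reports whether the ASM path is compiled in. */
static int example_uses_asm(void)
{
#if EXAMPLE_ENABLE_ASM
    return 1;
#else
    return 0;
#endif
}
```

Because the macros are always defined to 0 or 1, consumers can use plain `#if` rather than `#ifdef`, and a misspelled macro name is easier to catch with `-Wundef`.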
Use the same trick as we did for zstd_lazy in PR #2828:
* Create one search function specialization for each (dictMode, mls).
* Select the search function pointer at the top of the match finder.
Additionally, we no longer inline `ZSTD_compressBlock_opt_generic` into
every function, since `dictMode` is no longer used as a template
parameter. Create two specializations, for opt levels 0 and 2, and call
one of the two specializations.
Lastly, remove the hack that disabled inlining for zstd_opt for the
Linux Kernel, as we've gotten most of the benefit already.
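The specialization trick can be sketched as follows. All names here (`example_search_generic`, `EXAMPLE_GEN_SEARCH`, `example_select_search`, the two-value enum, and the placeholder search body) are hypothetical and much simpler than the real match finder; the sketch only shows the shape: one small specialization per (dictMode, mls), selected once through a function pointer.

```c
#include <stddef.h>

typedef enum { EX_NO_DICT = 0, EX_EXT_DICT = 1 } example_dict_mode_e;
typedef size_t (*example_search_fn)(const unsigned char* src, size_t len);

/* Generic body: mode and mls are compile-time constants in each
 * specialization, so the compiler can fold the branches on them.
 * The computation itself is a placeholder, not a real search. */
static inline size_t example_search_generic(const unsigned char* src, size_t len,
                                            example_dict_mode_e mode, unsigned mls)
{
    (void)src;
    return len / mls + (size_t)mode;
}

/* One specialization per (dictMode, mls) pair. */
#define EXAMPLE_GEN_SEARCH(mode, mls)                                        \
    static size_t example_search_##mode##_##mls(const unsigned char* src,    \
                                                size_t len)                  \
    { return example_search_generic(src, len, mode, mls); }

EXAMPLE_GEN_SEARCH(EX_NO_DICT, 4)
EXAMPLE_GEN_SEARCH(EX_NO_DICT, 5)
EXAMPLE_GEN_SEARCH(EX_NO_DICT, 6)
EXAMPLE_GEN_SEARCH(EX_EXT_DICT, 4)
EXAMPLE_GEN_SEARCH(EX_EXT_DICT, 5)
EXAMPLE_GEN_SEARCH(EX_EXT_DICT, 6)

/* Select the function pointer once, at the top of the caller. */
static example_search_fn example_select_search(example_dict_mode_e mode, unsigned mls)
{
    static const example_search_fn table[2][3] = {
        { example_search_EX_NO_DICT_4,  example_search_EX_NO_DICT_5,  example_search_EX_NO_DICT_6  },
        { example_search_EX_EXT_DICT_4, example_search_EX_EXT_DICT_5, example_search_EX_EXT_DICT_6 },
    };
    unsigned const clamped = mls < 4 ? 4 : (mls > 6 ? 6 : mls);
    return table[mode][clamped - 4];
}
```

The large outer function is then compiled once and calls through the pointer, instead of being instantiated once per combination, which is where the compile-time and size savings would come from in a design like this.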
Compilation time sees a ~4x reduction:
| Compiler | Flags | Dev Time (s) | PR Time (s) | Delta |
|----------|----------------------------------|--------------|-------------|-------|
| gcc | -O3 | 10.1 | 2.3 | -77% |
| gcc | -O3 -fsanitize=address,undefined | 61.1 | 10.2 | -83% |
| clang | -O3 | 9.0 | 2.1 | -76% |
| clang | -O3 -fsanitize=address,undefined | 33.5 | 5.1 | -84% |
Build size is reduced by roughly 150 KB to 210 KB:
| Compiler | Dev libzstd.a Size (B) | PR libzstd.a Size (B) | Delta |
|----------|------------------------|-----------------------|-------|
| gcc | 1327476 | 1177108 | -11% |
| clang | 1378324 | 1167780 | -15% |
Speed stays within about ±2% of dev in all cases (a loss in most, a small gain for clang at level 16):
| Compiler | Level | Dev Speed (MB/s) | PR Speed (MB/s) | Delta |
|----------|-------|------------------|-----------------|--------|
| gcc | 16 | 4.78 | 4.72 | -1.25% |
| gcc | 17 | 3.49 | 3.46 | -0.85% |
| gcc | 18 | 2.92 | 2.86 | -2.04% |
| gcc | 19 | 2.61 | 2.61 | 0.00% |
| clang | 16 | 4.69 | 4.80 | 2.34% |
| clang | 17 | 3.53 | 3.49 | -1.13% |
| clang | 18 | 2.86 | 2.85 | -0.34% |
| clang | 19 | 2.61 | 2.61 | 0.00% |
Fixes Issue #2862.
`lib/deprecated` is no longer built by zstd's bundled build files. However,
users may try to build these files when they import the source tree into
their own build systems. And if they have `-Wdeprecated-declarations` on,
this can produce warnings.
This PR migrates these files away from using deprecated declarations.
This addresses #2767.
because mem.h is dropped in the Linux kernel.
Changed the macro definition order (gcc/clang/msvc before C11)
due to a limitation in the kernel source builder.
Changed the fallback to sizeof(),
reverting to the previous behavior when no support for alignof() is detected.
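The resulting detection order can be sketched like this. `EXAMPLE_ALIGNOF` is a hypothetical name, not the macro from the patch, and the exact compiler checks are assumptions; the point is only the ordering (compiler extensions first, then C11, then the sizeof() fallback).

```c
#include <stddef.h>

#if defined(__GNUC__) || defined(__clang__)
   /* gcc/clang extension, checked before C11 */
#  define EXAMPLE_ALIGNOF(T) __alignof__(T)
#elif defined(_MSC_VER)
   /* MSVC extension */
#  define EXAMPLE_ALIGNOF(T) __alignof(T)
#elif defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 201112L)
   /* C11 keyword */
#  define EXAMPLE_ALIGNOF(T) _Alignof(T)
#else
   /* No alignof() support detected: fall back to sizeof(),
    * which is a conservative upper bound for fundamental types. */
#  define EXAMPLE_ALIGNOF(T) sizeof(T)
#endif

/* Example consumer: alignment requirement of a 32-bit-or-wider integer type. */
static size_t example_int_alignment(void)
{
    return EXAMPLE_ALIGNOF(int);
}
```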