zstd/compress at efd37a64eaff5a0a26ae2566fdb45dc4a0c91673 - zstd

mirror of https://github.com/facebook/zstd.git synced 2025-12-06 00:02:05 -05:00

Nick Terrell efd37a64ea Optimize decompression and fix wildcopy overread

* Bump `WILDCOPY_OVERLENGTH` to 16 to fix the wildcopy overread.
* Optimize `ZSTD_wildcopy()` by removing unnecessary branches and
  unrolling the loop.
* Extract `ZSTD_overlapCopy8()` into its own function.
* Add `ZSTD_safecopy()` for `ZSTD_execSequenceEnd()`. It is
  optimized for single long sequences, since that is the important
  case that can end up in `ZSTD_execSequenceEnd()`. Without this
  optimization, decompressing a block with 1 long match goes
  from 5.7 GB/s to 800 MB/s.
* Refactor `ZSTD_execSequenceEnd()`.
* Increase the literal copy shortcut to 16.
* Add a shortcut for offset >= 16.
* Simplify `ZSTD_execSequence()` by pushing more cases into
  `ZSTD_execSequenceEnd()`.
* Delete `ZSTD_execSequenceLong()` since it is exactly the
  same as `ZSTD_execSequence()`.
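
To illustrate why a "wild" copy needs slack space after the destination, here is a minimal sketch of a chunked copy. It is not the actual `ZSTD_wildcopy()` code: the names `sketch_wildcopy` and `SKETCH_OVERLENGTH` are hypothetical, and it assumes that source and destination do not overlap and that both buffers have at least `SKETCH_OVERLENGTH` accessible bytes past `length`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical constant playing the role that WILDCOPY_OVERLENGTH plays
 * in the commit message: the slack required after the copied region. */
#define SKETCH_OVERLENGTH 16

/* Copy `length` bytes in fixed 16-byte chunks with no tail handling.
 * The loop always copies whole chunks, so it may read and write up to
 * SKETCH_OVERLENGTH bytes beyond `length`. The caller must guarantee
 * that much slack in both buffers and that the buffers do not overlap. */
static void sketch_wildcopy(unsigned char *dst, const unsigned char *src,
                            size_t length)
{
    unsigned char *const end = dst + length;
    do {
        memcpy(dst, src, SKETCH_OVERLENGTH); /* fixed size: compilers can inline this */
        dst += SKETCH_OVERLENGTH;
        src += SKETCH_OVERLENGTH;
    } while (dst < end);
}
```

Dropping the tail branch is what makes the copy fast, and it is also why an undersized slack constant shows up as an overread or overwrite rather than as a wrong result.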

clang-8 sees +17.5% on silesia and +21.8% on enwik8.
gcc-9 sees +12% on silesia and +15.5% on enwik8.

TODO: More detailed measurements, and on more datasets.

Credit to OSS-Fuzz for finding the wildcopy overread.

2019-09-19 21:07:14 -07:00

fse_compress.c

Spelling (#1582)

2019-04-12 11:18:11 -07:00

hist.c

refactor HUF_compress_internal for clarity

2018-10-26 13:21:37 -07:00

hist.h

refactor HUF_compress_internal for clarity

2018-10-26 13:21:37 -07:00

huf_compress.c

fix confusion between unsigned <-> U32