mirror of
https://github.com/facebook/zstd.git
synced 2025-12-07 00:02:39 -05:00
Fix ZSTD_execSequence() performance regression
Commit ae1cb3b3d07024618269b89e3421d828adfd34d9 caused the regression. It is an instruction alignment issue, because if the loop index is declared `U64 i` instead of `U32 i`, the regression returns. This patch fixes the regression in gcc, but only gets some of the clang performance back.

Benchmarks: Run on `silesia.tar`. I only show levels 1-5 because the performance regression was uniform across all levels. I did one run on levels 1-19 and it looked good.

| Build | Level | Before | While | After |
|-------|-------|-------:|------:|------:|
| gcc   | 1     | 931.4  | 904.4 | 932.8 |
| gcc   | 2     | 849.1  | 822.6 | 851.2 |
| gcc   | 3     | 815.6  | 790.6 | 818.9 |
| gcc   | 4     | 794.1  | 770.7 | 798.0 |
| gcc   | 5     | 785.7  | 760.7 | 788.8 |
| clang | 1     | 705.5  | 683.2 | 693.8 |
| clang | 2     | 670.0  | 649.2 | 660.7 |
| clang | 3     | 659.6  | 639.8 | 651.4 |
| clang | 4     | 652.5  | 634.7 | 645.9 |
| clang | 5     | 646.9  | 625.5 | 637.7 |
parent ee5b725823
commit 10bfd0c0d5
@@ -887,7 +887,8 @@ size_t ZSTD_execSequence(BYTE* op,
             sequence.matchLength -= length1;
             match = base;
             if (op > oend_w) {
-                while (op < oMatchEnd) *op++ = *match++;
+                U32 i;
+                for (i = 0; i < sequence.matchLength; ++i) op[i] = match[i];
                 return sequenceLength;
             }
     }   }
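The two loop forms touched by this hunk can be compared in isolation. The following is a minimal sketch, not zstd code: the helper names `copy_ptr_form`, `copy_index_form`, and `forms_agree` are hypothetical, `uint32_t` stands in for zstd's `U32`, and it assumes that `oMatchEnd - op` equals the remaining `matchLength` when the loop is entered. Both forms copy forward one byte at a time, so they should behave the same even when the source and destination overlap.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the pointer-increment form removed by the patch. */
static unsigned char* copy_ptr_form(unsigned char* op,
                                    const unsigned char* match,
                                    unsigned char* const oMatchEnd)
{
    while (op < oMatchEnd) *op++ = *match++;
    return op;
}

/* Hypothetical helper: the indexed form added by the patch.
 * uint32_t is an assumed stand-in for zstd's U32. */
static unsigned char* copy_index_form(unsigned char* op,
                                      const unsigned char* match,
                                      size_t matchLength)
{
    uint32_t i;
    for (i = 0; i < matchLength; ++i) op[i] = match[i];
    return op + matchLength;
}

/* Runs both forms on an overlapping copy (offset 4, length 8)
 * and returns 1 if the resulting buffers are identical. */
static int forms_agree(void)
{
    unsigned char a[16] = "abcd";
    unsigned char b[16] = "abcd";
    copy_ptr_form(a + 4, a, a + 12);
    copy_index_form(b + 4, b, 8);
    return memcmp(a, b, sizeof a) == 0;
}
```

The sketch only checks that the two forms produce the same bytes. It says nothing about speed, which per the commit message depends on how each compiler lays out the generated instructions.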