Increment Step by 1 not 2

I couldn't find a good way to spread `ip0` and `ip1` apart when we accelerate
due to incompressible inputs. (The methods I tried slowed things down quite a
bit.)

Since we aren't splaying ip0 and ip1 apart (which would be like `0_1_2_3_`, as
opposed to the `01__23__` we were actually doing), it's a bit ambitious to
increment `step` by 2. Instead, let's increment it by 1, which has the benefit
of sliiightly improving compression. Speed remains pretty much unchanged.
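
To make the `01__23__` pattern concrete, here is a minimal toy model in C. It is
not zstd code: `count_visited`, its parameters, and the initial step of 2 are
assumptions made for illustration. It visits positions in adjacent pairs
(`p`, `p + 1`), jumps the pair start ahead by `step`, and grows `step` by
`growth` each time the position crosses the next `step_incr` boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not zstd code: count how many input positions get
 * visited when positions are checked in adjacent pairs (p, p + 1) and the
 * pair start then advances by `step`. Each time the position reaches
 * `next_step`, `step` grows by `growth` and the boundary moves ahead by
 * `step_incr` (a stand-in for kStepIncr). */
static size_t count_visited(size_t len, size_t step_incr, size_t growth)
{
    size_t step = 2;               /* assumed initial step */
    size_t next_step = step_incr;  /* position at which step next grows */
    size_t pos = 0;
    size_t visited = 0;

    assert(step_incr > 0);

    while (pos + 1 < len) {
        visited += 2;              /* the pair at pos and pos + 1 */
        pos += step;
        if (pos >= next_step) {    /* grows at most once per iteration */
            step += growth;
            next_step += step_incr;
        }
    }
    return visited;
}
```

In this toy model, `count_visited(64, 8, 1)` visits 30 positions while
`count_visited(64, 8, 2)` visits 24, which is the intuition behind the smaller
increment sampling more of the input. It says nothing about real-world ratio
or speed.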
W. Felix Handte 2021-12-13 15:46:41 -05:00
parent 6ca5f42402
commit 82a49c88f9


@@ -234,19 +234,19 @@ _start: /* Requires: ip0 */
         hash0 = hash1;
         hash1 = ZSTD_hashPtr(ip2, hlog, mls);
-        /* calculate step */
-        if (ip2 >= nextStep) {
-            PREFETCH_L1(ip1 + 64);
-            PREFETCH_L1(ip1 + 128);
-            step += 2;
-            nextStep += kStepIncr;
-        }
         /* advance to next positions */
         ip0 = ip1;
         ip1 = ip2;
         ip2 = ip0 + step;
         ip3 = ip1 + step;
+        /* calculate step */
+        if (ip2 >= nextStep) {
+            step++;
+            PREFETCH_L1(ip1 + 64);
+            PREFETCH_L1(ip1 + 128);
+            nextStep += kStepIncr;
+        }
     } while (ip3 < ilimit);
 _cleanup: