mirror of
https://github.com/facebook/zstd.git
synced 2025-12-07 00:02:39 -05:00
Increment Step by 1 not 2
I couldn't find a good way to spread `ip0` and `ip1` apart when we accelerate due to incompressible inputs. (The methods I tried slowed things down quite a bit.) Since we aren't splaying `ip0` and `ip1` apart (which would look like `0_1_2_3_`, as opposed to the `01__23__` we are actually doing), it's a bit ambitious to increment `step` by 2. Instead, let's increment it by 1, which has the benefit of sliiightly improving compression. Speed remains pretty much unchanged.
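To make the `01__23__` pattern concrete, here is a rough standalone sketch of the cursor walk, not the actual zstd code. It assumes that no match is ever found, that the loop body probes `ip0`, shifts the cursors by one, probes again, and then runs the "advance" and "calculate step" blocks from the diff. The helper name `simulate`, the starting step of 2, and the `k_step_incr` parameter (a stand-in for `kStepIncr`, whose real value isn't shown in this hunk) are all illustrative assumptions.

```c
#include <stddef.h>

/* Hypothetical helper, not part of zstd. Marks every position the loop would
 * probe in seen[0..len) with 'x' (others '_'), assuming no match is ever
 * found, and returns the number of probes. `incr` is what gets added to
 * `step` at each threshold; `k_step_incr` stands in for kStepIncr. */
static size_t simulate(char *seen, size_t len, size_t incr, size_t k_step_incr)
{
    size_t step = 2;                      /* assumed starting step */
    size_t ip0 = 0, ip1 = 1, ip2 = step, ip3 = step + 1;
    size_t next_step = k_step_incr;
    size_t probes = 0;
    size_t i;

    for (i = 0; i < len; i++) seen[i] = '_';

    /* A plain while loop (rather than do/while) keeps tiny inputs in bounds. */
    while (ip3 < len) {
        seen[ip0] = 'x'; probes++;        /* probe at ip0 */

        /* assumed first half of the unrolled body: shift cursors by one */
        ip0 = ip1; ip1 = ip2; ip2 = ip3;

        seen[ip0] = 'x'; probes++;        /* probe at the new ip0 */

        /* advance to next positions (as in the diff) */
        ip0 = ip1;
        ip1 = ip2;
        ip2 = ip0 + step;
        ip3 = ip1 + step;

        /* calculate step (as in the diff, with a configurable increment) */
        if (ip2 >= next_step) {
            step += incr;
            next_step += k_step_incr;
        }
    }
    return probes;
}
```

Calling `simulate` on the same buffer length with `incr = 1` and then `incr = 2` shows the trade-off in miniature: probes always land in adjacent pairs, and the smaller increment widens the gap between pairs more slowly, so more positions get probed.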
parent 6ca5f42402
commit 82a49c88f9
@@ -234,19 +234,19 @@ _start: /* Requires: ip0 */
         hash0 = hash1;
         hash1 = ZSTD_hashPtr(ip2, hlog, mls);
 
-        /* calculate step */
-        if (ip2 >= nextStep) {
-            PREFETCH_L1(ip1 + 64);
-            PREFETCH_L1(ip1 + 128);
-            step += 2;
-            nextStep += kStepIncr;
-        }
-
         /* advance to next positions */
         ip0 = ip1;
         ip1 = ip2;
         ip2 = ip0 + step;
         ip3 = ip1 + step;
+
+        /* calculate step */
+        if (ip2 >= nextStep) {
+            step++;
+            PREFETCH_L1(ip1 + 64);
+            PREFETCH_L1(ip1 + 128);
+            nextStep += kStepIncr;
+        }
     } while (ip3 < ilimit);
 
 _cleanup: