`zstd` CLI has progressively moved to the policy of
ignoring `--rm` command when the output is `stdout`.
The primary drive is to feature a behavior more consistent with `gzip`,
when `--rm` is the default, but is also ignored when output is `stdout`.
Other policies are certainly possible, but would break from this `gzip` convention.
The new policy was inconsistenly enforced, depending on the exact list of commands.
For example, it was possible to circumvent it by using `-c --rm` in this order,
which would re-establish source removal.
- Update the CLI so that it necessarily catch these situations and ensure that `--rm` is always disabled when output is `stdout`.
- Added a warning message in this case (for verbosity 3 `-v`).
- Added an `assert()`, which controls that `--rm` is no longer active with `stdout`
- Added tests, which control the behavior, even when `--rm` is added after `-c`
- Removed some legacy code which where trying to apply a specific policy for the `stdout` + `--rm` case, which is no longer possible
Older versions of zstandard have a bug in the dictionary builder, that
can cause dictionary building to fail. The process still exits 0, but
the dictionary is not created.
For reference, the bug is that it creates a dictionary that starts with
the zstd dictionary magic, in the process of writing the dictionary header,
but the header isn't fully written yet, and zstd fails compressions in
this case, because the dictionary is malformated. We fixed this later on
by trying to load the dictionary as a zstd dictionary, but if that fails
we fallback to content only (by default).
The fix is to:
1. Make the dictionary determinsitic by sorting the input files.
Previously the bug would only sometimes occur, when the input files
were in a particular order.
2. If dictionary creation fails, fallback to the `head` dictionary.
* Cap shortCache chainLog to 24
* Cap row match finder hashLog so that rowLog <= 24
* Add unit tests to expose all cases. The row match finder unit tests
are only run in 64-bit mode, because they allocate ~1GB.
Fixes#3336
The dictionary source files were taken from the `dev` branch before this
commit, which could introduce non-determinism on PR jobs. Instead take
the sources from the PR checkout.
This PR also adds stderr logging, and verbose output for the jobs that
are failing, to help catch the failure if it occurs again.
The timer storage type is no longer dependent on OS.
This will make it possible to re-enable posix precise timers
since the timer storage type will no longer be sensible to #include order.
See #3168 for details of pbs of previous interface.
Suggestion by @terrelln
* Add a function and macro ZSTD_decompressionMargin() that computes the
decompression margin for in-place decompression. The function computes
a tight margin that works in all cases, and the macro computes an upper
bound that will only work if flush isn't used.
* When doing in-place decompression, make sure that our output buffer
doesn't overlap with the input buffer. This ensures that we don't
decide to use the portion of the output buffer that overlaps the input
buffer for temporary memory, like for literals.
* Add a simple unit test.
* Add in-place decompression to the simple_round_trip and
stream_round_trip fuzzers. This should help verify that our margin stays
correct.
At Google we fuzz zstd without ZSTD_MULTITHREAD but we want inputs to be as much as reproducible. It allows us to test new fuzzing methods for our fuzz team internally and have more horsepower to find bugs
comparing level 19 to level 22 and expecting a stricter better result from level 22
is not that guaranteed,
because level 19 and 22 are very close to each other,
especially for small files,
so any noise in the final compression result
result in failing this test.
Level 22 could be compared to something much lower, like level 15,
But level 19 is required anyway, because there is a clamping test which depends on it.
Removed level 22, kept level 19
Inspired by #3395,
offer a new capability to set all parameters defined in a ZSTD_compressionParameters structure
with a single symbol invocation
to improve user code brevity.
Basic tests for (de)compressing in the following modes:
* file to file
* file to stdout
* stdin to file
* stdin to stdout
These are basic tests, and aren't testing more advanced scenarios, but
it adds the groundwork for more complex tests as needed.
Fixes#3010.
Thanks to @eli-schwartz for pointing it out!
We should maybe consider adding a helper function for applying
`ZSTD_parameters` and `ZSTD_compressionParameters` to a context.
That would aid the transition to the new API in situations like this.