9506 Commits

Author SHA1 Message Date
Yann Collet
8df1257c3c fix issue 44108
credit to oss-fuzz

In rare circumstances, the block-splitter might cut a block at the exact beginning of a repcode.
In that case, since litlength=0, if the repcode expected 1+ literals in front, its meaning changes.
This scenario is controlled in ZSTD_seqStore_resolveOffCodes(),
and the repcode is transformed into a raw offset when its new meaning is incorrect.

In more complex scenarios, the previous block might be emitted as uncompressed after all,
thus modifying the expected repcode history.
In the case discovered by oss-fuzz, the first block is emitted as uncompressed,
so the repcode history remains at default values: 1,4,8.

But since the starting repcode is repcode3, and the literal length is 0,
its meaning becomes: repcode1 - 1.
Since repcode1==1, it results in an offset value of 0, which is invalid.

So that's what the `assert()` was verifying : the result of the repcode translation should be a valid offset.

But actually, it doesn't matter, because this result will then be compared to reality,
and since it's an invalid offset, it will necessarily be discarded if incorrect,
then the repcode will be replaced by a raw offset.

So the `assert()` is not useful.
Furthermore, it's incorrect, because it assumes this situation cannot happen, but it does, as described in the scenario above.
2022-01-27 05:49:59 -08:00
brailovich
501a353b91
Update playTests.sh 2022-01-26 18:56:52 -08:00
Yann Collet
f2d9652ad8 more usage of new error code stabilityCondition_notRespected
as suggested by @terrelln
2022-01-26 18:30:55 -08:00
Nick Terrell
e60eba58bf Print zlib/lz4/lzma library versions in verbose version output
Knowing the version of zlib/lz4/lzma we're linking against is very
useful for debugging issues with those libraries, so print it out in the
verbosity 4 version output.

Also print this information at the top of `playTests.sh`.
2022-01-26 18:25:58 -08:00
Yann Collet
7543085013
Merge pull request #3019 from facebook/huf_traces
More traces to improve debugging of literals compression
2022-01-26 18:02:05 -08:00
brailovich
5e7523385b
Update playTests.sh 2022-01-26 16:53:11 -08:00
brailovich
beb4872241
Update zstdcli.c 2022-01-26 16:51:18 -08:00
Yann Collet
8b46895588 removed new huffman depth heuristic
results are now identical to before this PR
2022-01-26 15:22:06 -08:00
Yann Collet
a66e8bb437 introduced LitHufLog constant
which properly represents the maximum bit size of compressed literals (11) as defined in the specification.

To be preferred over HUF_TABLELOG_DEFAULT, which represents the same value but only by accident.

Name selected to keep the same convention as existing width definitions,
MLFSELog, LLFSELog and OffFSELog.
2022-01-26 14:47:24 -08:00
Yann Collet
2d154e627a renamed HufLog into ZSTD_HUFFDTABLE_CAPACITY_LOG
old name was not descriptive and actually misleading
2022-01-26 14:47:24 -08:00
Yann Collet
32a5d95dcb moved HufLog to lib/decompress
it's only used to size decompression tables
2022-01-26 14:47:24 -08:00
Yann Collet
e9dd923fa4 only declare debug functions in debug mode 2022-01-26 14:47:24 -08:00
Yann Collet
5db717af10 proper max limit to 11 2022-01-26 14:47:24 -08:00
Yann Collet
4684836f4f update regression tests
minor compression ratio benefits in some cases,
no compression ratio regression in the measured scenarios.
2022-01-26 14:47:24 -08:00
Yann Collet
51da2d2ff2 improved compression of literals in specific corner cases
In rare cases, the default huffman depth selector is a bit too harsh,
requiring brutal adaptations to the tree,
resulting in some loss of compression ratio.
This new heuristic avoids the worst cases, favoring compression ratio.

As an example, compression of a specific distribution of 771 literals
is now improved to 441 bytes, from 601 bytes before.
2022-01-26 14:47:24 -08:00
Yann Collet
7616e39f3b adding traces to better track processing of literals 2022-01-26 14:47:21 -08:00
Yann Collet
a0acf9aa49
Merge pull request #3023 from facebook/fix_seqCompress_withDelimiter
fix sequence compression API in Explicit Delimiter mode
2022-01-26 14:15:28 -08:00
Yann Collet
dda4c10f07 added ZSTD_compressStream2() + ZSTD_c_stableInBuffer test 2022-01-26 13:33:04 -08:00
Yann Collet
cbff372d10 added helper function inBuffer_forEndFlush() 2022-01-26 11:05:57 -08:00
Yann Collet
b99ece96b9 converted checks into user validation generating error codes
had to create a new error code for this condition,
none of the existing ones were fitting enough.
2022-01-26 10:43:50 -08:00
Yann Collet
af3d9c506e added streaming test starting from non-0 pos 2022-01-26 10:31:25 -08:00
Yann Collet
c1668a00d2 fix extended case combining stableInBuffer with continue() and flush() modes 2022-01-26 10:31:25 -08:00
Yann Collet
270f9bf005 better consistency in accessing @input
as suggested by @terrelln.

Also : commented zstreamtest more
to ensure ZSTD_stableInBuffer is tested.
2022-01-26 10:31:24 -08:00
Yann Collet
8296be4a0a pretend consuming input to provide a sense of forward progress 2022-01-26 10:31:24 -08:00
Yann Collet
4b9d1dd9ff fixed incorrect comment 2022-01-26 10:31:24 -08:00
Yann Collet
27d336b099 minor behavior refinements
specifically, there is no obligation to start streaming compression with pos=0.
stableSrc mode is now compatible with this setup.
2022-01-26 10:31:24 -08:00
Yann Collet
37b87add7a make stableSrc compatible with regular streaming API
including flushStream().

Now the only condition is for `input.size` to continuously grow.
2022-01-26 10:31:24 -08:00
Yann Collet
c0c5ffa973 streaming compression : lazy parameter adaptation with stable input
effectively makes ZSTD_c_stableInput compatible with ZSTD_compressStream()
and zstd_e_continue operation mode.
2022-01-26 10:31:24 -08:00
Yann Collet
5684bae4f6 minor refactoring
on streaming compression implementation.
2022-01-26 10:31:23 -08:00
Yann Collet
fc2ea97442 refactored fuzzer tests for sequence compression api
add explicit delimiter mode to libfuzzer test
2022-01-26 00:19:35 -08:00
Yann Collet
87dcd3326a fix sequence compression API in Explicit Delimiter mode 2022-01-25 13:33:41 -08:00
brailovich
62583dc1ea
Merge pull request #1 from brailovich/brailovich-patch-1
fix for error message in recursive mode for an empty folder
2022-01-24 17:54:41 -08:00
brailovich
4021b78437
fix for error message in recursive mode for an empty folder
-r on an empty directory resulted in zstd waiting for input from stdin. Now zstd exits without error and prints a warning message explaining why no processing happened (no files or directories to process).
2022-01-24 17:42:21 -08:00
Yann Collet
cc7d23bcec
Merge pull request #2965 from facebook/offbase
Converge sumtype (offset | repcode) numeric representation towards offBase
2022-01-24 15:47:42 -08:00
Yonatan Komornik
70df5de1b2
AsyncIO compression part 1 - refactor of existing asyncio code (#3021)
* Refactored fileio.c:
- Extracted asyncio code to fileio_asyncio.c/.h
- Moved type definitions to fileio_types.h
- Moved common macro definitions needed by both fileio.c and fileio_asyncio.c to fileio_common.h

* Bugfix - rename fileio_asycio to fileio_asyncio

* Added copyrights & license to new files

* CR fixes
2022-01-24 14:43:02 -08:00
Nick Terrell
87f81d0796
Merge pull request #3026 from trixirt/from-linux-fix
cleanup double word in comment.
2022-01-24 14:31:16 -08:00
Tom Rix
2b957afec7 cleanup double word in comment.
Remove the second 'a' and 'into'

Signed-off-by: Tom Rix <trix@redhat.com>
2022-01-24 12:43:39 -08:00
Yann Collet
feaaf7a6b1 slightly shortened status and summary lines in very verbose mode 2022-01-21 21:38:35 -08:00
binhdvo
17017ac8db
Change zstdless behavior to align with zless (#2909)
* Change zstdless behavior to align with zless
2022-01-21 19:57:19 -05:00
Yann Collet
b27356fd27
Merge pull request #3005 from cwoffenden/faster-amalgamate
Use faster Python script to amalgamate
2022-01-21 16:12:22 -08:00
Yann Collet
24318093cc slightly shortened compression status update line
to fit within 80 columns limit.
2022-01-21 14:08:46 -08:00
Yonatan Komornik
8ab95f24da
Merge pull request #2985 from yoniko/zstd-output-file-buffer
ZSTD CLI: Use buffered output
2022-01-21 13:57:05 -08:00
Yonatan Komornik
1598e6c634
Async write for decompression (#2975)
* Async IO decompression:
- Added --[no-]asyncio flag for CLI decompression.
- Replaced dstBuffer in decompression with a pool of write jobs.
- Added an ability to execute write jobs in a separate thread.
- Added an ability to wait (join) on all jobs in a thread pool (queued and running).
2022-01-21 13:55:41 -08:00
Nick Terrell
2f03c1996f
Merge pull request #3013 from WojciechMula/simplify-asm
Simplify HUF_decompress4X2_usingDTable_internal_bmi2_asm_loop
2022-01-21 11:13:40 -08:00
Felix Handte
4a35912b3b
Merge pull request #3018 from felixhandte/gh-action-release-artifacts-trigger-fix
Trigger Release Artifact Generation on Publish
2022-01-20 21:08:59 -05:00
Yann Collet
71921e596f
Merge pull request #2983 from facebook/minLitPricev2
[opt] minor compression ratio improvement
2022-01-20 16:02:31 -08:00
W. Felix Handte
fa9cb4510a Trigger Release Artifact Generation on Publish
We previously triggered release artifact generation on release creation. We
sometimes observed that the action failed to run. I hypothesized that we were
hitting rate limiting or something. I just stumbled across
[this documentation](https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#release), which says:

> Note: Workflows are not triggered for the `created`, `edited`, or `deleted`
> activity types for draft releases. When you create your release through the
> GitHub browser UI, your release may automatically be saved as a draft.

This must have been what was happening. This commit therefore changes the
trigger to the `published` activity. This should be more reliable.

This does have the unfortunate side effect that artifacts won't be generated
or attached until *after* the release has been published, which is what I was
trying to avoid by using the `created` activity. Oh well.
2022-01-20 17:36:28 -05:00
Felix Handte
330c97d2bf
Merge pull request #3015 from felixhandte/enable-cet-test-illegal-instruction
Add GitHub Action Checking that Zstd Runs Successfully Under CET
2022-01-20 16:18:42 -05:00
Elliot Gorokhovsky
a8f1aa2f6d
Merge pull request #3016 from embg/macro_lint
Minor lint fix
2022-01-20 13:05:02 -07:00
Felix Handte
c7e8315d88
Merge pull request #2994 from hjl-tools/hjl/cet-report/dev
x86: Append -z cet-report=error to LDFLAGS
2022-01-20 14:58:13 -05:00