From 042419ec2acb8d6bcf42cf337ae54c966a8e576d Mon Sep 17 00:00:00 2001 From: Sean Purcell Date: Fri, 17 Feb 2017 16:24:26 -0800 Subject: [PATCH 01/22] Restructure Format Specification --- doc/zstd_compression_format.md | 1289 +++++++++++++++++--------------- 1 file changed, 677 insertions(+), 612 deletions(-) diff --git a/doc/zstd_compression_format.md b/doc/zstd_compression_format.md index df983284f..f08dc9537 100644 --- a/doc/zstd_compression_format.md +++ b/doc/zstd_compression_format.md @@ -3,7 +3,7 @@ Zstandard Compression Format ### Notices -Copyright (c) 2016 Yann Collet +Copyright (c) 2016-present Yann Collet, Facebook, Inc. Permission is granted to copy and distribute this document for any purpose and without charge, @@ -16,8 +16,7 @@ Distribution of this document is unlimited. ### Version -0.2.3 (27/01/17) - +0.2.4 (17/02/17) Introduction ------------ @@ -57,17 +56,15 @@ Whenever it does not support a parameter defined in the compressed stream, it must produce a non-ambiguous error code and associated error message explaining which parameter is unsupported. -Overall conventions ------------ +### Overall conventions In this document: - square brackets i.e. `[` and `]` are used to indicate optional fields or parameters. -- a naming convention for identifiers is `Mixed_Case_With_Underscores` +- the naming convention for identifiers is `Mixed_Case_With_Underscores` -Definitions ------------ -A content compressed by Zstandard is transformed into a Zstandard __frame__. +### Definitions +Content compressed by Zstandard is transformed into a Zstandard __frame__. Multiple frames can be appended into a single file or stream. -A frame is totally independent, has a defined beginning and end, +A frame is completely independent, has a defined beginning and end, and a set of parameters which tells the decoder how to decompress it. A frame encapsulates one or multiple __blocks__. 
@@ -77,63 +74,33 @@ Unlike frames, each block depends on previous blocks for proper decoding. However, each block can be decompressed without waiting for its successor, allowing streaming operations. +Overview +--------- +- [Frames](#frames) + - [Zstandard frames](#zstandard-frames) + - [Blocks](#blocks) + - [Literals Section](#literals-section) + - [Sequences Section](#sequences-section) + - [Sequence Execution](#sequence-execution) + - [Skippable frames](#skippable-frames) +- [Entropy Encoding](#entropy-encoding) + - [FSE](#fse) + - [Huffman Coding](#huffman-coding) +- [Dictionary Format](#dictionary-format) -Frame Concatenation -------------------- +Frames +------ +Zstandard compressed data is made up of one or more __frames__. +Each frame is independent and can be decompressed independently of other frames. +The decompressed content of multiple concatenated frames is the concatenation of +each frame's decompressed content. -In some circumstances, it may be required to append multiple frames, -for example in order to add new data to an existing compressed file -without re-framing it. +There are two frame formats defined by Zstandard: + Zstandard frames and Skippable frames. +Zstandard frames contain compressed data, while +skippable frames contain no data and can be used for metadata. -In such case, each frame brings its own set of descriptor flags. -Each frame is considered independent. -The only relation between frames is their sequential order. - -The ability to decode multiple concatenated frames -within a single stream or file is left outside of this specification. -As an example, the reference `zstd` command line utility is able -to decode all concatenated frames in their sequential order, -delivering the final decompressed result as if it was a single content. 
- - -Skippable Frames ----------------- - -| `Magic_Number` | `Frame_Size` | `User_Data` | -|:--------------:|:------------:|:-----------:| -| 4 bytes | 4 bytes | n bytes | - -Skippable frames allow the insertion of user-defined data -into a flow of concatenated frames. -Its design is pretty straightforward, -with the sole objective to allow the decoder to quickly skip -over user-defined data and continue decoding. - -Skippable frames defined in this specification are compatible with [LZ4] ones. - -[LZ4]:http://www.lz4.org - -__`Magic_Number`__ - -4 Bytes, little-endian format. -Value : 0x184D2A5?, which means any value from 0x184D2A50 to 0x184D2A5F. -All 16 values are valid to identify a skippable frame. - -__`Frame_Size`__ - -This is the size, in bytes, of the following `User_Data` -(without including the magic number nor the size field itself). -This field is represented using 4 Bytes, little-endian format, unsigned 32-bits. -This means `User_Data` can’t be bigger than (2^32-1) bytes. - -__`User_Data`__ - -The `User_Data` can be anything. Data will just be skipped by the decoder. - - - -General Structure of Zstandard Frame format -------------------------------------------- +## Zstandard frames The structure of a single Zstandard frame is following: | `Magic_Number` | `Frame_Header` |`Data_Block`| [More data blocks] | [`Content_Checksum`] | @@ -147,11 +114,11 @@ Value : 0xFD2FB528 __`Frame_Header`__ -2 to 14 Bytes, detailed in [next part](#the-structure-of-frame_header). +2 to 14 Bytes, detailed in [`Frame_Header`](#frame_header). __`Data_Block`__ -Detailed in [next chapter](#the-structure-of-data_block). +Detailed in [`Blocks`](#blocks). That’s where compressed data is stored. __`Content_Checksum`__ @@ -162,10 +129,9 @@ of [xxh64() hash function](http://www.xxhash.org) digesting the original (decoded) data as input, and a seed of zero. The low 4 bytes of the checksum are stored in little endian format. 
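
The two magic numbers given in this document (0xFD2FB528 for Zstandard frames, 0x184D2A50 to 0x184D2A5F for skippable frames) are enough to tell the frame formats apart before any further parsing. The following is a minimal Python sketch of that check, assuming the input is available as a `bytes` object; the names `classify_frame` and `skippable_frame_length` are illustrative only and do not belong to the format or to any library.

```python
ZSTD_MAGIC = 0xFD2FB528        # Magic_Number of a Zstandard frame
SKIPPABLE_MAGIC = 0x184D2A50   # 0x184D2A50 to 0x184D2A5F all mark a skippable frame


def classify_frame(data: bytes) -> str:
    """Return "zstandard", "skippable" or "unknown" from the first 4 bytes."""
    if len(data) < 4:
        raise ValueError("need at least 4 bytes to read Magic_Number")
    magic = int.from_bytes(data[:4], "little")   # Magic_Number is little-endian
    if magic == ZSTD_MAGIC:
        return "zstandard"
    if (magic & 0xFFFFFFF0) == SKIPPABLE_MAGIC:  # low 4 bits are free
        return "skippable"
    return "unknown"


def skippable_frame_length(data: bytes) -> int:
    """Total size of a skippable frame: 8 header bytes plus Frame_Size."""
    if len(data) < 8 or classify_frame(data) != "skippable":
        raise ValueError("not a complete skippable frame header")
    return 8 + int.from_bytes(data[4:8], "little")
```

A decoder walking concatenated frames could use the second helper to jump over user data without inspecting it.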
+### `Frame_Header` -The structure of `Frame_Header` ------------------------------- -The `Frame_Header` has a variable size, which uses a minimum of 2 bytes, +The `Frame_Header` has a variable size, with a minimum of 2 bytes, and up to 14 bytes depending on optional parameters. The structure of `Frame_Header` is following: @@ -173,10 +139,10 @@ The structure of `Frame_Header` is following: | ------------------------- | --------------------- | ----------------- | ---------------------- | | 1 byte | 0-1 byte | 0-4 bytes | 0-8 bytes | -### `Frame_Header_Descriptor` +#### `Frame_Header_Descriptor` The first header's byte is called the `Frame_Header_Descriptor`. -It tells which other fields are present. +It describes which other fields are present. Decoding this byte is enough to tell the size of `Frame_Header`. | Bit number | Field name | @@ -188,7 +154,7 @@ Decoding this byte is enough to tell the size of `Frame_Header`. | 2 | `Content_Checksum_flag` | | 1-0 | `Dictionary_ID_flag` | -In this table, bit 7 is highest bit, while bit 0 is lowest. +In this table, bit 7 is the highest bit, while bit 0 is the lowest. __`Frame_Content_Size_flag`__ @@ -216,7 +182,7 @@ but `Window_Descriptor` byte is skipped. As a consequence, the decoder must allocate a memory segment of size equal or bigger than `Frame_Content_Size`. -In order to preserve the decoder from unreasonable memory requirement, +In order to protect the decoder from unreasonable memory requirements, a decoder can reject a compressed frame which requests a memory size beyond decoder's authorized range. @@ -256,7 +222,7 @@ It also specifies the size of this field as `Field_Size`. | ---------- | --- | --- | --- | --- | |`Field_Size`| 0 | 1 | 2 | 4 | -### `Window_Descriptor` +#### `Window_Descriptor` Provides guarantees on maximum back-reference distance that will be used within compressed data. 
@@ -294,12 +260,12 @@ It's merely a recommendation though, decoders are free to support larger or lower limits, depending on local limitations. -### `Dictionary_ID` +#### `Dictionary_ID` This is a variable size field, which contains the ID of the dictionary required to properly decode the frame. Note that this field is optional. When it's not present, -it's up to the caller to make sure it uses the correct dictionary. +it's up to the decoder to make sure it uses the correct dictionary. Format is little-endian. Field size depends on `Dictionary_ID_flag`. @@ -319,7 +285,7 @@ the following ranges are reserved for future use and should not be used : - high range : >= (2^31) -### `Frame_Content_Size` +#### `Frame_Content_Size` This is the original (uncompressed) size. This information is optional. The `Field_Size` is provided according to value of `Frame_Content_Size_flag`. @@ -337,10 +303,14 @@ When `Field_Size` is 1, 4 or 8 bytes, the value is read directly. When `Field_Size` is 2, _the offset of 256 is added_. It's allowed to represent a small size (for example `18`) using any compatible variant. +Blocks +------- +After the magic number and header of each frame, +there are some number of blocks. +Each frame must have at least one block but there is no upper limit +on the number of blocks per frame. -The structure of `Data_Block` ------------------------------ -The structure of `Data_Block` is following: +The structure of a block is as follows: | `Last_Block` | `Block_Type` | `Block_Size` | `Block_Content` | |:------------:|:------------:|:------------:|:---------------:| @@ -351,8 +321,9 @@ The block header (`Last_Block`, `Block_Type`, and `Block_Size`) uses 3-bytes. __`Last_Block`__ The lowest bit signals if this block is the last one. -Frame ends right after this block. -It may be followed by an optional `Content_Checksum` . +The frame will end after this block. +It may be followed by an optional `Content_Checksum` +(see [Zstandard Frames](#zstandard-frames)).
__`Block_Type` and `Block_Size`__ @@ -367,15 +338,19 @@ There are 4 block types : | `Block_Type` | `Raw_Block` | `RLE_Block` | `Compressed_Block` | `Reserved`| - `Raw_Block` - this is an uncompressed block. - `Block_Size` is the number of bytes to read and copy. + `Block_Content` contains `Block_Size` bytes to read and copy + as decoded data. + - `RLE_Block` - this is a single byte, repeated N times. - In which case, `Block_Size` is the size to regenerate, - while the "compressed" block is just 1 byte (the byte to repeat). -- `Compressed_Block` - this is a [Zstandard compressed block](#the-format-of-compressed_block), - detailed in another section of this specification. - `Block_Size` is the compressed size. - Decompressed size is unknown, + `Block_Content` consists of a single byte, + and `Block_Size` is the number of times this byte should be repeated. + +- `Compressed_Block` - this is a [Zstandard compressed block](#compressed-blocks), + explained later on. + `Block_Size` is the length of `Block_Content`, the compressed data. + The decompressed size is unknown, but its maximum possible value is guaranteed (see below) + - `Reserved` - this is not a block. This value cannot be used with current version of this specification. @@ -384,42 +359,36 @@ Block sizes must respect a few rules : - Block decompressed size is always <= maximum back-reference distance. - Block decompressed size is always <= 128 KB. - -__`Block_Content`__ - -The `Block_Content` is where the actual data to decode stands. -It might be compressed or not, depending on previous field indications. 
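
The block header fields and the four block types described above can be unpacked with a few shifts. This is a hedged Python sketch: it assumes the 3-byte header is read as one little-endian integer with `Last_Block` in bit 0, `Block_Type` in bits 1-2 and `Block_Size` in the remaining 21 bits (the sizes row of the header table is not visible in this hunk, so that split is an assumption here). The function names are hypothetical.

```python
BLOCK_TYPES = ("Raw_Block", "RLE_Block", "Compressed_Block", "Reserved")


def parse_block_header(header: bytes):
    """Split a 3-byte block header into (last_block, block_type, block_size)."""
    if len(header) != 3:
        raise ValueError("a block header is exactly 3 bytes")
    value = int.from_bytes(header, "little")
    last_block = bool(value & 1)               # lowest bit
    block_type = BLOCK_TYPES[(value >> 1) & 3]  # next 2 bits
    block_size = value >> 3                     # remaining 21 bits
    if block_type == "Reserved":
        raise ValueError("Reserved block type: treat the data as corrupted")
    return last_block, block_type, block_size


def block_content_length(block_type: str, block_size: int) -> int:
    """Number of Block_Content bytes that follow the header in the stream."""
    # An RLE block stores a single byte; Block_Size is the repeat count.
    return 1 if block_type == "RLE_Block" else block_size
```

Keeping the stored length separate from `Block_Size` makes the RLE case explicit: the stream advances by one byte while the output grows by `Block_Size` bytes.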
A data block is not necessarily "full" : since an arbitrary “flush” may happen anytime, -block decompressed content can be any size, +block decompressed content can be any size (even empty), up to `Block_Maximum_Decompressed_Size`, which is the smallest of : - Maximum back-reference distance - 128 KB - - -The format of `Compressed_Block` --------------------------------- -The size of `Compressed_Block` must be provided using `Block_Size` field from `Data_Block`. -The `Compressed_Block` has a guaranteed maximum regenerated size, -in order to properly allocate destination buffer. -See [`Data_Block`](#the-structure-of-data_block) for more details. +Compressed Blocks +----------------- +To decompress a compressed block, the compressed size must be provided from +`Block_Size` field in the block header. A compressed block consists of 2 sections : -- [`Literals_Section`](#literals_section) -- [`Sequences_Section`](#sequences_section) +- [Literals Section](#literals-section) +- [Sequences Section](#sequences-section) -### Prerequisites +The results of the two sections are then combined to produce the decompressed +data in [Sequence Execution](#sequence-execution) + +#### Prerequisites To decode a compressed block, the following elements are necessary : -- Previous decoded blocks, up to a distance of `Window_Size`, - or all previous blocks when `Single_Segment_flag` is set. -- List of "recent offsets" from previous compressed block. -- Decoding tables of previous compressed block for each symbol type +- Previous decoded data, up to a distance of `Window_Size`, + or all previous data when `Single_Segment_flag` is set. +- List of "recent offsets" from the previous compressed block. +- Decoding tables of the previous compressed block for each symbol type (literals, literals lengths, match lengths, offsets). 
- -### `Literals_Section` - +Literals Section +---------------- +During sequence execution, symbols from the literals section During sequence phase, literals will be entangled with match copy operations. All literals are regrouped in the first part of the block. They can be decoded first, and then copied during sequence operations, @@ -443,7 +412,7 @@ using little-endian convention. | --------------------- | ------------- | ------------------ | ----------------- | | 2 bits | 1 - 2 bits | 5 - 20 bits | 0 - 18 bits | -In this representation, bits on the left are smallest bits. +In this representation, bits on the left are the lowest bits. __`Literals_Block_Type`__ @@ -464,14 +433,16 @@ This field uses 2 lowest bits of first byte, describing 4 different block types - `Repeat_Stats_Literals_Block` - This is a Huffman-compressed block, using Huffman tree _from previous Huffman-compressed literals block_. Huffman tree description will be skipped. + Note: If this mode is used without any previous Huffman-table in the frame + (or [dictionary](#dictionary-format)), this should be treated as corruption. __`Size_Format`__ `Size_Format` is divided into 2 families : -- For `Compressed_Block`, it requires to decode both `Compressed_Size` - and `Regenerated_Size` (the decompressed size). It will also decode the number of streams. - For `Raw_Literals_Block` and `RLE_Literals_Block` it's enough to decode `Regenerated_Size`. +- For `Compressed_Block`, it's required to decode both `Compressed_Size` + and `Regenerated_Size` (the decompressed size). It will also decode the number of streams. For values spanning several bytes, convention is little-endian. @@ -490,32 +461,595 @@ __`Size_Format` for `Raw_Literals_Block` and `RLE_Literals_Block`__ : `Literals_Section_Header` has 3 bytes. `Regenerated_Size = (Header[0]>>4) + (Header[1]<<4) + (Header[2]<<12)` +Only Stream1 is present for these cases.
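
The `Regenerated_Size` formula for raw and RLE literals blocks can be sketched as follows. Only the 3-byte case is visible in this hunk; the 1-byte and 2-byte cases below (`Header[0]>>3` and `(Header[0]>>4) + (Header[1]<<4)`) are assumptions based on the 5-bit and 12-bit field widths implied by the header table, and the function name is hypothetical.

```python
def parse_raw_rle_literals_header(data: bytes):
    """Return (header_size, regenerated_size) for a raw or RLE literals block.

    `data` must start at the first byte of Literals_Section_Header.
    """
    size_format = (data[0] >> 2) & 3   # the 2 bits above Literals_Block_Type
    if size_format in (0, 2):
        # Assumed 1-byte form: Size_Format uses 1 bit, size uses 5 bits.
        return 1, data[0] >> 3
    if size_format == 1:
        # Assumed 2-byte form: size uses 12 bits.
        return 2, (data[0] >> 4) + (data[1] << 4)
    # 3-byte form, as given in the text: size uses 20 bits.
    return 3, (data[0] >> 4) + (data[1] << 4) + (data[2] << 12)
```

Because a short value may legally use a long format, a decoder should rely on `Size_Format` alone to pick the header size and never on the magnitude of the size.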
Note : it's allowed to represent a short value (for example `13`) using a long format, accepting the increased compressed data size. __`Size_Format` for `Compressed_Literals_Block` and `Repeat_Stats_Literals_Block`__ : - Value 00 : _A single stream_. - Both `Compressed_Size` and `Regenerated_Size` use 10 bits (0-1023). + Both `Regenerated_Size` and `Compressed_Size` use 10 bits (0-1023). `Literals_Section_Header` has 3 bytes. - Value 01 : 4 streams. - Both `Compressed_Size` and `Regenerated_Size` use 10 bits (0-1023). + Both `Regenerated_Size` and `Compressed_Size` use 10 bits (0-1023). `Literals_Section_Header` has 3 bytes. - Value 10 : 4 streams. - Both `Compressed_Size` and `Regenerated_Size` use 14 bits (0-16383). + Both `Regenerated_Size` and `Compressed_Size` use 14 bits (0-16383). `Literals_Section_Header` has 4 bytes. - Value 11 : 4 streams. - Both `Compressed_Size` and `Regenerated_Size` use 18 bits (0-262143). + Both `Regenerated_Size` and `Compressed_Size` use 18 bits (0-262143). `Literals_Section_Header` has 5 bytes. Both `Compressed_Size` and `Regenerated_Size` fields follow little-endian convention. Note: `Compressed_Size` __includes__ the size of the Huffman Tree description if it is present. +### Raw Literals Block +The data in Stream1 is `Regenerated_Size` bytes long, and contains the raw literals data +to be used in sequence execution. + +### RLE Literals Block +Stream1 consists of a single byte which should be repeated `Regenerated_Size` times +to generate the decoded literals. + +### Compressed Literals Block and Repeat Stats Literals Block +Both of these modes contain Huffman encoded data + #### `Huffman_Tree_Description` - This section is only present when `Literals_Block_Type` type is `Compressed_Literals_Block` (`2`). +The format of the Huffman tree description can be found at [Huffman Tree description](#huffman-tree-description). 
+The size of the Huffman Tree description will be determined during the decoding process, +and must be used to determine where the compressed Huffman streams begin. +If repeat stats mode is used, the Huffman table used in the previous compressed block will +be used to decompress this block as well. + +Huffman compressed data consists of either 1 or 4 Huffman-coded streams. + +If only one stream is present, it is a single bitstream occupying the entire +remaining portion of the literals block, encoded as described at +[Huffman-Coded Streams](#huffman-coded-streams). + +If there are four streams, the literals section header only provides enough +information to know the regenerated and compressed sizes of all four streams combined. +The regenerated size of each stream is equal to `(totalSize+3)/4`, except for the last stream, +which may be up to 3 bytes smaller, to reach a total decompressed size matching that described +in the literals header. + +The compressed size of each stream is provided explicitly: the first 6 bytes of the compressed +data consist of three 2-byte little-endian fields, describing the compressed sizes +of the first three streams. +The last stream's size is computed from the total compressed size and the size of the other +three streams. + +`stream4CSize = totalCSize - 6 - stream1CSize - stream2CSize - stream3CSize`. + +Note: remember that `totalCSize` may be smaller than the `Compressed_Size` found in the literals +block header as `Compressed_Size` also contains the size of the Huffman Tree description if it +is present. + +Each of these 4 bitstreams is then decoded independently as a Huffman-Coded stream, +as described at [Huffman-Coded Streams](#huffman-coded-streams) + +Sequences Section +----------------- +A compressed block is a succession of _sequences_ . +A sequence is a literal copy command, followed by a match copy command. +A literal copy command specifies a length. +It is the number of bytes to be copied (or extracted) from the literal section.
+A match copy command specifies an offset and a length. + +When all _sequences_ are decoded, +if there are any literals left in the _literal section_, +these bytes are added at the end of the block. + +This is described in more detail in [Sequence Execution](#sequence-execution) + +The `Sequences_Section` regroups all symbols required to decode commands. +There are 3 symbol types : literals lengths, offsets and match lengths. +They are encoded together, interleaved, in a single _bitstream_. + +The `Sequences_Section` starts with a header, +followed by optional probability tables for each symbol type, +followed by the bitstream. + +| `Sequences_Section_Header` | [`Literals_Length_Table`] | [`Offset_Table`] | [`Match_Length_Table`] | bitStream | +| -------------------------- | ------------------------- | ---------------- | ---------------------- | --------- | + +To decode the `Sequences_Section`, it's required to know its size. +This size is deduced from `blockSize - literalSectionSize`. + + +#### `Sequences_Section_Header` + +Consists of 2 items: +- `Number_of_Sequences` +- Symbol compression modes + +__`Number_of_Sequences`__ + +This is a variable size field using between 1 and 3 bytes. +Let's call its first byte `byte0`. +- `if (byte0 == 0)` : there are no sequences. + The sequence section stops there. + Regenerated content is defined entirely by literals section. +- `if (byte0 < 128)` : `Number_of_Sequences = byte0` . Uses 1 byte. +- `if (byte0 < 255)` : `Number_of_Sequences = ((byte0-128) << 8) + byte1` . Uses 2 bytes. +- `if (byte0 == 255)`: `Number_of_Sequences = byte1 + (byte2<<8) + 0x7F00` . Uses 3 bytes. + +__Symbol compression modes__ + +This is a single byte, defining the compression mode of each symbol type.
+ +|Bit number| 7-6 | 5-4 | 3-2 | 1-0 | +| -------- | ----------------------- | -------------- | -------------------- | ---------- | +|Field name| `Literals_Lengths_Mode` | `Offsets_Mode` | `Match_Lengths_Mode` | `Reserved` | + +The last field, `Reserved`, must be all-zeroes. + +`Literals_Lengths_Mode`, `Offsets_Mode` and `Match_Lengths_Mode` define the `Compression_Mode` of +literals lengths, offsets, and match lengths respectively. + +They follow the same enumeration : + +| Value | 0 | 1 | 2 | 3 | +| ------------------ | ----------------- | ---------- | --------------------- | ------------- | +| `Compression_Mode` | `Predefined_Mode` | `RLE_Mode` | `FSE_Compressed_Mode` | `Repeat_Mode` | + +- `Predefined_Mode` : A predefined FSE distribution table is used, defined in + [default distributions](#default-distributions). + The table takes no space in the compressed data. +- `RLE_Mode` : The table description consists of a single byte. + This code will be repeated for every sequence. +- `Repeat_Mode` : The table used in the previous compressed block will be used again. + No distribution table will be present. + Note: this includes RLE mode, so if `Repeat_Mode` follows `RLE_Mode` the same symbol will be repeated. + If this mode is used without any previous sequence table in the frame + (or [dictionary](#dictionary-format)) to repeat, this should be treated as corruption. +- `FSE_Compressed_Mode` : standard FSE compression. + A distribution table will be present. + The format of this distribution table is described in [FSE Table Description](#fse-table-description). + Note that the maximum allowed accuracy log for literals length and match length tables is 9, + and the maximum accuracy log for the offsets table is 8. + +#### The codes for literals lengths, match lengths, and offsets. + +Each symbol is a _code_ in its own context, +which specifies `Baseline` and `Number_of_Bits` to add.
+_Codes_ are FSE compressed, +and interleaved with raw additional bits in the same bitstream. + +##### Literals length codes + +Literals length codes are values ranging from `0` to `35` included. +They define lengths from 0 to 131071 bytes. +The literals length is equal to the decoded `Baseline` plus +the result of reading `Number_of_Bits` bits from the bitstream, +as a little-endian value. + +| `Literals_Length_Code` | 0-15 | +| ---------------------- | ---------------------- | +| length | `Literals_Length_Code` | +| `Number_of_Bits` | 0 | + +| `Literals_Length_Code` | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | +| ---------------------- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | +| `Baseline` | 16 | 18 | 20 | 22 | 24 | 28 | 32 | 40 | +| `Number_of_Bits` | 1 | 1 | 1 | 1 | 2 | 2 | 3 | 3 | + +| `Literals_Length_Code` | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | +| ---------------------- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | +| `Baseline` | 48 | 64 | 128 | 256 | 512 | 1024 | 2048 | 4096 | +| `Number_of_Bits` | 4 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | + +| `Literals_Length_Code` | 32 | 33 | 34 | 35 | +| ---------------------- | ---- | ---- | ---- | ---- | +| `Baseline` | 8192 |16384 |32768 |65536 | +| `Number_of_Bits` | 13 | 14 | 15 | 16 | + + +##### Match length codes + +Match length codes are values ranging from `0` to `52` included. +They define lengths from 3 to 131074 bytes. +The match length is equal to the decoded `Baseline` plus +the result of reading `Number_of_Bits` bits from the bitstream, +as a little-endian value. 
+ +| `Match_Length_Code` | 0-31 | +| ------------------- | ----------------------- | +| value | `Match_Length_Code` + 3 | +| `Number_of_Bits` | 0 | + +| `Match_Length_Code` | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | +| ------------------- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | +| `Baseline` | 35 | 37 | 39 | 41 | 43 | 47 | 51 | 59 | +| `Number_of_Bits` | 1 | 1 | 1 | 1 | 2 | 2 | 3 | 3 | + +| `Match_Length_Code` | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | +| ------------------- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | +| `Baseline` | 67 | 83 | 99 | 131 | 259 | 515 | 1027 | 2051 | +| `Number_of_Bits` | 4 | 4 | 5 | 7 | 8 | 9 | 10 | 11 | + +| `Match_Length_Code` | 48 | 49 | 50 | 51 | 52 | +| ------------------- | ---- | ---- | ---- | ---- | ---- | +| `Baseline` | 4099 | 8195 |16387 |32771 |65539 | +| `Number_of_Bits` | 12 | 13 | 14 | 15 | 16 | + +##### Offset codes + +Offset codes are values ranging from `0` to `N`. + +A decoder is free to limit its maximum `N` supported. +Recommendation is to support at least up to `22`. +For information, at the time of this writing, +the reference decoder supports a maximum `N` value of `28` in 64-bits mode. + +An offset code is also the number of additional bits to read in little-endian fashion, +and can be translated into an `Offset_Value` using the following formulas : + +``` +Offset_Value = (1 << offsetCode) + readNBits(offsetCode); +if (Offset_Value > 3) offset = Offset_Value - 3; +``` +It means that maximum `Offset_Value` is `(2^(N+1))-1` and it supports back-reference distance up to `(2^(N+1))-4` +but is limited by [maximum back-reference distance](#window_descriptor). + +`Offset_Value` from 1 to 3 are special : they define "repeat codes". +This is described in more detail in [Repeat Offsets](#repeat-offsets). + +#### Decoding Sequences +FSE bitstreams are read in the reverse direction from which they are written.
In zstd, +the compressor writes bits forward into a block and the decompressor +must read the bitstream _backwards_. + +To find the start of the bitstream it is therefore necessary to +know the offset of the last byte of the block which can be found +by counting `Block_Size` bytes after the block header. + +After writing the last bit containing information, the compressor +writes a single `1`-bit and then fills the byte with 0-7 `0` bits of +padding. The last byte of the compressed bitstream cannot be `0` for +that reason. + +When decompressing, the last byte containing the padding is the first +byte to read. The decompressor needs to skip 0-7 initial `0`-bits and +the first `1`-bit it encounters. Afterwards, the useful part of the bitstream +begins. + +FSE decoding requires a 'state' to be carried from symbol to symbol. +For more explanation on FSE decoding, see the [FSE section](#fse). + +For sequence decoding, a separate state must be tracked for each of +literal lengths, offsets, and match lengths. +Some FSE primitives are also used. +For more details on the operation of these primitives, see the [FSE section](#fse). + +##### Starting states +The bitstream starts with initial FSE state values, +each using the required number of bits in their respective _accuracy_, +decoded previously from their normalized distribution. + +It starts with `Literals_Length_State`, +followed by `Offset_State`, +and finally `Match_Length_State`. + +Reminder : always keep in mind that all values are read _backward_, +so the 'start' of the bitstream is at the highest position in memory, +immediately before the last `1`-bit for padding. + +After decoding the starting states, a single sequence is decoded +`Number_of_Sequences` times. +These sequences are decoded in order from first to last. +Since the compressor writes the bitstream in the forward direction, +this means the compressor must encode the sequences starting with the last +one and ending with the first.
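
The padding rule and the backward reading order can be illustrated with a small reader. This is a sketch under stated assumptions, not the reference implementation: it loads the whole bitstream into one Python integer for simplicity, and the class name `BackwardBitReader` and its `read` method are hypothetical.

```python
class BackwardBitReader:
    """Reads a forward-written, little-endian bitstream from its end."""

    def __init__(self, stream: bytes):
        if not stream or stream[-1] == 0:
            # The final byte must hold the closing 1-bit, so it cannot be 0.
            raise ValueError("corrupted bitstream: missing final 1-bit")
        self._bits = int.from_bytes(stream, "little")
        padding = 8 - stream[-1].bit_length()   # 0-7 leading 0-bits
        # Skip the padding and the single 1-bit marker.
        self._remaining = len(stream) * 8 - padding - 1

    def read(self, nbits: int) -> int:
        """Return the next `nbits` bits, taken from the highest unread position."""
        if nbits > self._remaining:
            raise ValueError("read past the beginning of the bitstream")
        self._remaining -= nbits
        return (self._bits >> self._remaining) & ((1 << nbits) - 1)

    def exhausted(self) -> bool:
        """True once every useful bit has been consumed."""
        return self._remaining == 0
```

With such a reader, the three starting states would be obtained by three consecutive `read` calls, each asking for the accuracy log of the corresponding table, and `exhausted()` gives the end-of-block consistency check.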
+ +##### Decoding a sequence +For each of the symbol types, the FSE state can be used to determine the appropriate code. +The code then defines the baseline and number of bits to read for each type. +See the [description of the codes] for how to determine these values. + +[description of the codes]: #the-codes-for-literals-lengths-match-lengths-and-offsets + +Decoding starts by reading the `Number_of_Bits` required to decode `Offset`. +It then does the same for `Match_Length`, +and then for `Literals_Length`. +This sequence is then used for [sequence execution](#sequence-execution). + +If it is not the last sequence in the block, +the next operation is to update states. +Using the rules pre-calculated in the decoding tables, +`Literals_Length_State` is updated, +followed by `Match_Length_State`, +and then `Offset_State`. +See the [FSE section](#fse) for details on how to update states from the bitstream. + +This operation will be repeated `Number_of_Sequences` times. +At the end, the bitstream shall be entirely consumed, +otherwise the bitstream is considered corrupted. + +#### Default Distributions +If `Predefined_Mode` is selected for a symbol type, +its FSE decoding table is generated from a predefined distribution table defined here. +For details on how to convert this distribution into a decoding table, see the [FSE section]. + +[FSE section]: #from-normalized-distribution-to-decoding-tables + +Sequence Execution +------------------ +Once literals and sequences have been decoded, +they are combined to produce the decoded content of a block. + +Each sequence consists of a tuple of (`literals_length`, `offset_value`, `match_length`), +decoded as described in the [Sequences Section](#sequences-section). +To execute a sequence, first copy `literals_length` bytes from the literals section +to the output. + +Then `match_length` bytes are copied from previous decoded data.
+The offset to copy from is determined by `offset_value`: +if `offset_value > 3`, then the offset is `offset_value - 3`. +If `offset_value` is from 1-3, the offset is a special repeat offset value. +See the [repeat offset](#repeat-offsets) section for how the offset is determined +in this case. + +The offset is measured back from the current position, so an offset of 6 +and a match length of 3 means that 3 bytes should be copied from 6 bytes back. +Note that all offsets must be at most equal to the window size defined by the frame header. + +#### Repeat offsets +As seen in [Sequence Execution](#sequence-execution), +the first 3 values define a repeated offset and we will call them +`Repeated_Offset1`, `Repeated_Offset2`, and `Repeated_Offset3`. +They are sorted in recency order, with `Repeated_Offset1` meaning "most recent one". + +If `offset_value == 1`, then the offset used is `Repeated_Offset1`, etc. + +There is an exception though, when current sequence's `literals_length = 0`. +In this case, repeated offsets are shifted by one, +so an `offset_value` of 1 means `Repeated_Offset2`, +an `offset_value` of 2 means `Repeated_Offset3`, +and an `offset_value` of 3 means `Repeated_Offset1 - 1_byte`. + +In the first block, the offset history is populated with the following values : 1, 4 and 8 (in order). + +Then each block gets its starting offset history from the ending values of the most recent compressed block. +Note that non-compressed blocks are skipped, +they do not contribute to offset history. + +[Offset Codes]: #offset-codes + +###### Offset updates rules + +The newest offset takes the lead in offset history, +shifting others back (up to its previous place if it was already present). + +This means that when `Repeated_Offset1` (most recent) is used, history is unmodified. +When `Repeated_Offset2` is used, it's swapped with `Repeated_Offset1`. +If any other offset is used, it becomes `Repeated_Offset1` and the rest are shifted back by one.
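
Sequence execution and the repeat-offset rules above fit in a short routine. The following Python sketch is one reading of those rules, with hypothetical names (`resolve_offset`, `execute_sequences`); it treats a single block in isolation, starts from the initial history 1, 4, 8, and checks offsets only against the data decoded so far rather than against the window size.

```python
def resolve_offset(offset_value, literals_length, history):
    """Return (offset, new_history); history is [Repeated_Offset1, 2, 3]."""
    rep1, rep2, rep3 = history
    if offset_value > 3:
        offset = offset_value - 3
        return offset, [offset, rep1, rep2]
    index = offset_value - 1
    if literals_length == 0:
        index += 1                       # repeat codes are shifted by one
    if index == 0:
        return rep1, [rep1, rep2, rep3]  # history unmodified
    if index == 1:
        return rep2, [rep2, rep1, rep3]  # swap with Repeated_Offset1
    if index == 2:
        return rep3, [rep3, rep1, rep2]
    offset = rep1 - 1                    # offset_value 3 with no literals
    return offset, [offset, rep1, rep2]


def execute_sequences(literals, sequences, history=(1, 4, 8), previous=b""):
    """Combine literals and (literals_length, offset_value, match_length) tuples."""
    out = bytearray(previous)
    history = list(history)
    pos = 0
    for literals_length, offset_value, match_length in sequences:
        if pos + literals_length > len(literals):
            raise ValueError("sequence asks for more literals than available")
        out += literals[pos:pos + literals_length]
        pos += literals_length
        offset, history = resolve_offset(offset_value, literals_length, history)
        if offset <= 0 or offset > len(out):
            raise ValueError("offset points outside decoded data")
        for _ in range(match_length):
            out.append(out[-offset])     # byte by byte, so overlaps work
        # Leftover literals are appended once all sequences are done.
    out += literals[pos:]
    return bytes(out), history
```

Copying one byte at a time is deliberate: a match may overlap the bytes it is producing (for example offset 1 with a long match length), which a bulk slice copy would get wrong.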
+ +Skippable Frames +---------------- + +| `Magic_Number` | `Frame_Size` | `User_Data` | +|:--------------:|:------------:|:-----------:| +| 4 bytes | 4 bytes | n bytes | + +Skippable frames allow the insertion of user-defined data +into a flow of concatenated frames. +Its design is pretty straightforward, +with the sole objective to allow the decoder to quickly skip +over user-defined data and continue decoding. + +Skippable frames defined in this specification are compatible with [LZ4] ones. + +[LZ4]:http://www.lz4.org + +__`Magic_Number`__ + +4 Bytes, little-endian format. +Value : 0x184D2A5?, which means any value from 0x184D2A50 to 0x184D2A5F. +All 16 values are valid to identify a skippable frame. + +__`Frame_Size`__ + +This is the size, in bytes, of the following `User_Data` +(without including the magic number nor the size field itself). +This field is represented using 4 Bytes, little-endian format, unsigned 32-bits. +This means `User_Data` can’t be bigger than (2^32-1) bytes. + +__`User_Data`__ + +The `User_Data` can be anything. Data will just be skipped by the decoder. + +Entropy Encoding +---------------- +Two types of entropy encoding are used by the Zstandard format: +FSE, and Huffman coding. + +FSE +--- +FSE, or FiniteStateEntropy is an entropy coding based on [ANS]. +FSE encoding/decoding involves a state that is carried over between symbols, +so decoding must be done in the opposite direction as encoding. +Therefore, all FSE bitstreams are read from end to beginning. + +For additional details on FSE, see [Finite State Entropy]. + +[Finite State Entropy]:https://github.com/Cyan4973/FiniteStateEntropy/ + +FSE decoding involves a decoding table which has a power of 2 size and three elements: +`Symbol`, `Num_Bits`, and `Baseline`. +The `log2` of the table size is its `Accuracy_Log`. +The FSE state represents an index in this table. +The next symbol in the stream is the symbol indicated by the table value for that state. 
+To obtain the next state value, +the decoder should consume `Num_Bits` bits from the stream as a little-endian value and add it to `Baseline`. + +To obtain the initial state value, consume `Accuracy_Log` bits from the stream as a little-endian value. + +[ANS]: https://en.wikipedia.org/wiki/Asymmetric_Numeral_Systems + +### FSE Table Description +To decode FSE streams, it is necessary to construct the decoding table. +The Zstandard format encodes FSE table descriptions as follows: + +An FSE distribution table describes the probabilities of all symbols +from `0` to the last present one (included) +on a normalized scale of `1 << Accuracy_Log` . + +It's a bitstream which is read forward, in little-endian fashion. +It's not necessary to know its exact size, +since it will be discovered and reported by the decoding process. + +The bitstream starts by reporting on which scale it operates. +`Accuracy_Log = low4bits + 5`. + +Then follows each symbol value, from `0` to last present one. +The number of bits used by each field is variable. +It depends on : + +- Remaining probabilities + 1 : + __example__ : + Presuming an `Accuracy_Log` of 8, + and presuming 100 probability points have already been distributed, + the decoder may read any value from `0` to `256 - 100 + 1 == 157` (inclusive). + Therefore, it must read `log2sup(157) == 8` bits. + +- Value decoded : small values use 1 less bit : + __example__ : + Presuming values from 0 to 157 (inclusive) are possible, + 255-157 = 98 values are remaining in an 8-bits field. + They are used this way : + first 98 values (hence from 0 to 97) use only 7 bits, + values from 98 to 157 use 8 bits. + This is achieved through this scheme : + + | Value read | Value decoded | Number of bits used | + | ---------- | ------------- | ------------------- | + | 0 - 97 | 0 - 97 | 7 | + | 98 - 127 | 98 - 127 | 8 | + | 128 - 225 | 0 - 97 | 7 | + | 226 - 255 | 128 - 157 | 8 | + +Symbols probabilities are read one by one, in order. 
+ +Probability is obtained from Value decoded by following formula : +`Proba = value - 1` + +It means value `0` becomes negative probability `-1`. +`-1` is a special probability, which means "less than 1". +Its effect on distribution table is described in the [next section]. +For the purpose of calculating total allocated probability points, it counts as one. + +[next section]:#from-normalized-distribution-to-decoding-tables + +When a symbol has a __probability__ of `zero`, +it is followed by a 2-bits repeat flag. +This repeat flag tells how many probabilities of zeroes follow the current one. +It provides a number ranging from 0 to 3. +If it is a 3, another 2-bits repeat flag follows, and so on. + +When last symbol reaches cumulated total of `1 << Accuracy_Log`, +decoding is complete. +If the last symbol makes cumulated total go above `1 << Accuracy_Log`, +distribution is considered corrupted. + +Then the decoder can tell how many bytes were used in this process, +and how many symbols are present. +The bitstream consumes a round number of bytes. +Any remaining bit within the last byte is just unused. + +##### From normalized distribution to decoding tables + +The distribution of normalized probabilities is enough +to create a unique decoding table. + +It follows the following build rule : + +The table has a size of `Table_Size = 1 << Accuracy_Log`. +Each cell describes the symbol decoded, +and instructions to get the next state. + +Symbols are scanned in their natural order for "less than 1" probabilities. +Symbols with this probability are being attributed a single cell, +starting from the end of the table. +These symbols define a full state reset, reading `Accuracy_Log` bits. + +All remaining symbols are sorted in their natural order. +Starting from symbol `0` and table position `0`, +each symbol gets attributed as many cells as its probability. 
+Cell allocation is spread, not linear : +each successor position follows this rule : + +``` +position += (tableSize>>1) + (tableSize>>3) + 3; +position &= tableSize-1; +``` + +A position is skipped if already occupied by a "less than 1" probability symbol. +`position` does not reset between symbols; it simply iterates through +each position in the table, switching to the next symbol when enough +states have been allocated to the current one. + +The result is a list of state values. +Each state will decode the current symbol. + +To get the `Number_of_Bits` and `Baseline` required for the next state, +it's first necessary to sort all states in their natural order. +The lower states will need 1 more bit than higher ones. + +__Example__ : +Presuming a symbol has a probability of 5. +It receives 5 state values. States are sorted in natural order. + +Next power of 2 is 8. +Space of probabilities is divided into 8 equal parts. +Presuming the `Accuracy_Log` is 7, it defines 128 states. +Divided by 8, each share is 16 large. + +In order to reach 8, 8-5=3 lowest states will count "double", +taking shares twice as large, +requiring one more bit in the process. + +Numbering starts from higher states using fewer bits. + +| state order | 0 | 1 | 2 | 3 | 4 | +| ---------------- | ----- | ----- | ------ | ---- | ----- | +| width | 32 | 32 | 32 | 16 | 16 | +| `Number_of_Bits` | 5 | 5 | 5 | 4 | 4 | +| range number | 2 | 4 | 6 | 0 | 1 | +| `Baseline` | 32 | 64 | 96 | 0 | 16 | +| range | 32-63 | 64-95 | 96-127 | 0-15 | 16-31 | + +The next state is determined from current state +by reading the required `Number_of_Bits`, and adding the specified `Baseline`. + +See [Appendix A] for the results of this process applied to the default distributions. + +[Appendix A]: #appendix-a---decoding-tables-for-predefined-codes + +Huffman Coding +-------------- +Zstandard Huffman-coded streams are read backwards, +similar to the FSE bitstreams. 
+Therefore, to find the start of the bitstream, it is necessary to +know the offset of the last byte of the Huffman-coded stream. + +After writing the last bit containing information, the compressor +writes a single `1`-bit and then fills the byte with 0-7 `0` bits of +padding. The last byte of the compressed bitstream cannot be `0` for +that reason. + +When decompressing, the last byte containing the padding is the first +byte to read. The decompressor needs to skip 0-7 initial `0`-bits and +the first `1`-bit it encounters. Afterwards, the useful part of the bitstream +begins. + +The bitstream contains Huffman-coded symbols in little-endian order, +with the codes defined by the method below. + +### Huffman Tree Description Prefix coding represents symbols from an a priori known alphabet by bit sequences (codewords), one codeword for each symbol, in a manner such that different symbols may be represented @@ -598,19 +1132,7 @@ which describes how to decode the list of weights. ##### Finite State Entropy (FSE) compression of Huffman weights -FSE decoding uses three operations: `Init_State`, `Decode_Symbol`, and `Update_State`. -`Init_State` reads in the initial state value from a bitstream, -`Decode_Symbol` outputs a symbol based on the current state, -and `Update_State` goes to a new state based on the current state and some number of consumed bits. - -FSE streams must be read in reverse from the order they're encoded in, -so bitstreams start at a certain offset and works backwards towards their base. - -For more on how FSE bitstreams work, see [Finite State Entropy]. - -[Finite State Entropy]:https://github.com/Cyan4973/FiniteStateEntropy/ - -The series of Huffman weights is compressed using FSE compression. +In this case, the series of Huffman weights is compressed using FSE compression. It's a single bitstream with 2 interleaved states, sharing a single distribution table. @@ -622,17 +1144,16 @@ and last symbol's weight is not represented. 
An FSE bitstream starts by a header, describing probabilities distribution. It will create a Decoding Table. -The table must be pre-allocated, so a maximum accuracy must be fixed. -For a list of Huffman weights, maximum accuracy is 7 bits. +For a list of Huffman weights, the maximum accuracy log is 7 bits. +For more description see the [FSE header description](#fse-table-description) -The FSE header format is [described in a relevant chapter](#fse-distribution-table--condensed-format), -as well as the [FSE bitstream](#bitstream). -The main difference is that Huffman header compression uses 2 states, +The Huffman header compression uses 2 states, which share the same FSE distribution table. The first state (`State1`) encodes the even indexed symbols, and the second (`State2`) encodes the odd indexes. State1 is initialized first, and then State2, and they take turns decoding a single symbol and updating their state. +For more details on these FSE operations, see the [FSE section](#fse). The number of symbols to decode is determined by tracking bitStream overflow condition: @@ -667,39 +1188,9 @@ it gives the following distribution : | `Number_of_Bits` | 0 | 4 | 4 | 3 | 2 | 1 | | prefix codes | N/A | 0000| 0001| 001 | 01 | 1 | - -#### The content of Huffman-compressed literal stream - -##### Bitstreams sizes - -As seen in a previous paragraph, -there are 2 types of Huffman-compressed literals : -a single stream and 4 streams. - -Encoding using 4 streams is useful for CPU with multiple execution units and out-of-order operations. -Since each stream can be decoded independently, -it's possible to decode them up to 4x faster than a single stream, -presuming the CPU has enough parallelism available. - -For single stream, header provides both the compressed and regenerated size. -For 4 streams though, -header only provides compressed and regenerated size of all 4 streams combined. 
-In order to properly decode the 4 streams, -it's necessary to know the compressed and regenerated size of each stream. - -Regenerated size of each stream can be calculated by `(totalSize+3)/4`, -except for last one, which can be up to 3 bytes smaller, to reach `totalSize`. - -Compressed size is provided explicitly : in the 4-streams variant, -bitstreams are preceded by 3 unsigned little-endian 16-bits values. -Each value represents the compressed size of one stream, in order. -The last stream size is deducted from total compressed size -and from previously decoded stream sizes : - -`stream4CSize = totalCSize - 6 - stream1CSize - stream2CSize - stream3CSize`. - - -##### Bitstreams read and decode +### Huffman-coded Streams +Given a Huffman decoding table, +it's possible to decode a Huffman-coded stream. Each bitstream must be read _backward_, that is starting from the end down to the beginning. @@ -736,446 +1227,10 @@ If a bitstream is not entirely and exactly consumed, hence reaching exactly its beginning position with _all_ bits consumed, the decoding process is considered faulty. -### `Sequences_Section` - -A compressed block is a succession of _sequences_ . -A sequence is a literal copy command, followed by a match copy command. -A literal copy command specifies a length. -It is the number of bytes to be copied (or extracted) from the literal section. -A match copy command specifies an offset and a length. -The offset gives the position to copy from, -which can be within a previous block. - -When all _sequences_ are decoded, -if there is are any literals left in the _literal section_, -these bytes are added at the end of the block. - -The `Sequences_Section` regroup all symbols required to decode commands. -There are 3 symbol types : literals lengths, offsets and match lengths. -They are encoded together, interleaved, in a single _bitstream_. 
- -The `Sequences_Section` starts by a header, -followed by optional probability tables for each symbol type, -followed by the bitstream. - -| `Sequences_Section_Header` | [`Literals_Length_Table`] | [`Offset_Table`] | [`Match_Length_Table`] | bitStream | -| -------------------------- | ------------------------- | ---------------- | ---------------------- | --------- | - -To decode the `Sequences_Section`, it's required to know its size. -This size is deducted from `blockSize - literalSectionSize`. - - -#### `Sequences_Section_Header` - -Consists of 2 items: -- `Number_of_Sequences` -- Symbol compression modes - -__`Number_of_Sequences`__ - -This is a variable size field using between 1 and 3 bytes. -Let's call its first byte `byte0`. -- `if (byte0 == 0)` : there are no sequences. - The sequence section stops there. - Regenerated content is defined entirely by literals section. -- `if (byte0 < 128)` : `Number_of_Sequences = byte0` . Uses 1 byte. -- `if (byte0 < 255)` : `Number_of_Sequences = ((byte0-128) << 8) + byte1` . Uses 2 bytes. -- `if (byte0 == 255)`: `Number_of_Sequences = byte1 + (byte2<<8) + 0x7F00` . Uses 3 bytes. - -__Symbol compression modes__ - -This is a single byte, defining the compression mode of each symbol type. - -|Bit number| 7-6 | 5-4 | 3-2 | 1-0 | -| -------- | ----------------------- | -------------- | -------------------- | ---------- | -|Field name| `Literals_Lengths_Mode` | `Offsets_Mode` | `Match_Lengths_Mode` | `Reserved` | - -The last field, `Reserved`, must be all-zeroes. - -`Literals_Lengths_Mode`, `Offsets_Mode` and `Match_Lengths_Mode` define the `Compression_Mode` of -literals lengths, offsets, and match lengths respectively. 
- -They follow the same enumeration : - -| Value | 0 | 1 | 2 | 3 | -| ------------------ | ----------------- | ---------- | --------------------- | ------------- | -| `Compression_Mode` | `Predefined_Mode` | `RLE_Mode` | `FSE_Compressed_Mode` | `Repeat_Mode` | - -- `Predefined_Mode` : uses a predefined distribution table. -- `RLE_Mode` : it's a single code, repeated `Number_of_Sequences` times. -- `Repeat_Mode` : re-use distribution table from previous compressed block. -- `FSE_Compressed_Mode` : standard FSE compression. - A distribution table will be present. - It will be described in [next part](#distribution-tables). - -#### The codes for literals lengths, match lengths, and offsets. - -Each symbol is a _code_ in its own context, -which specifies `Baseline` and `Number_of_Bits` to add. -_Codes_ are FSE compressed, -and interleaved with raw additional bits in the same bitstream. - -##### Literals length codes - -Literals length codes are values ranging from `0` to `35` included. -They define lengths from 0 to 131071 bytes. 
- -| `Literals_Length_Code` | 0-15 | -| ---------------------- | ---------------------- | -| length | `Literals_Length_Code` | -| `Number_of_Bits` | 0 | - -| `Literals_Length_Code` | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | -| ---------------------- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | -| `Baseline` | 16 | 18 | 20 | 22 | 24 | 28 | 32 | 40 | -| `Number_of_Bits` | 1 | 1 | 1 | 1 | 2 | 2 | 3 | 3 | - -| `Literals_Length_Code` | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | -| ---------------------- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | -| `Baseline` | 48 | 64 | 128 | 256 | 512 | 1024 | 2048 | 4096 | -| `Number_of_Bits` | 4 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | - -| `Literals_Length_Code` | 32 | 33 | 34 | 35 | -| ---------------------- | ---- | ---- | ---- | ---- | -| `Baseline` | 8192 |16384 |32768 |65536 | -| `Number_of_Bits` | 13 | 14 | 15 | 16 | - -##### Default distribution for literals length codes - -When `Compression_Mode` is `Predefined_Mode`, -a predefined distribution is used for FSE compression. - -Its definition is below. It uses an accuracy of 6 bits (64 states). -``` -short literalsLength_defaultDistribution[36] = - { 4, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, - 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 1, 1, 1, 1, 1, - -1,-1,-1,-1 }; -``` - -##### Match length codes - -Match length codes are values ranging from `0` to `52` included. -They define lengths from 3 to 131074 bytes. 
- -| `Match_Length_Code` | 0-31 | -| ------------------- | ----------------------- | -| value | `Match_Length_Code` + 3 | -| `Number_of_Bits` | 0 | - -| `Match_Length_Code` | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | -| ------------------- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | -| `Baseline` | 35 | 37 | 39 | 41 | 43 | 47 | 51 | 59 | -| `Number_of_Bits` | 1 | 1 | 1 | 1 | 2 | 2 | 3 | 3 | - -| `Match_Length_Code` | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | -| ------------------- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | -| `Baseline` | 67 | 83 | 99 | 131 | 259 | 515 | 1027 | 2051 | -| `Number_of_Bits` | 4 | 4 | 5 | 7 | 8 | 9 | 10 | 11 | - -| `Match_Length_Code` | 48 | 49 | 50 | 51 | 52 | -| ------------------- | ---- | ---- | ---- | ---- | ---- | -| `Baseline` | 4099 | 8195 |16387 |32771 |65539 | -| `Number_of_Bits` | 12 | 13 | 14 | 15 | 16 | - -##### Default distribution for match length codes - -When `Compression_Mode` is defined as `Predefined_Mode`, -a predefined distribution is used for FSE compression. - -Its definition is below. It uses an accuracy of 6 bits (64 states). -``` -short matchLengths_defaultDistribution[53] = - { 1, 4, 3, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, - 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, - 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,-1,-1, - -1,-1,-1,-1,-1 }; -``` - -##### Offset codes - -Offset codes are values ranging from `0` to `N`. - -A decoder is free to limit its maximum `N` supported. -Recommendation is to support at least up to `22`. -For information, at the time of this writing. -the reference decoder supports a maximum `N` value of `28` in 64-bits mode. 
- -An offset code is also the number of additional bits to read, -and can be translated into an `Offset_Value` using the following formulas : - -``` -Offset_Value = (1 << offsetCode) + readNBits(offsetCode); -if (Offset_Value > 3) offset = Offset_Value - 3; -``` -It means that maximum `Offset_Value` is `(2^(N+1))-1` and it supports back-reference distance up to `(2^(N+1))-4` -but is limited by [maximum back-reference distance](#window_descriptor). - -`Offset_Value` from 1 to 3 are special : they define "repeat codes", -which means one of the previous offsets will be repeated. -They are sorted in recency order, with 1 meaning the most recent one. -See [Repeat offsets](#repeat-offsets) paragraph. - - -##### Default distribution for offset codes - -When `Compression_Mode` is defined as `Predefined_Mode`, -a predefined distribution is used for FSE compression. - -Below is its definition. It uses an accuracy of 5 bits (32 states), -and supports a maximum `N` of 28, allowing offset values up to 536,870,908 . - -If any sequence in the compressed block requires an offset larger than this, -it's not possible to use the default distribution to represent it. - -``` -short offsetCodes_defaultDistribution[29] = - { 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, - 1, 1, 1, 1, 1, 1, 1, 1,-1,-1,-1,-1,-1 }; -``` - -#### Distribution tables - -Following the header, up to 3 distribution tables can be described. -When present, they are in this order : -- Literals lengths -- Offsets -- Match Lengths - -The content to decode depends on their respective encoding mode : -- `Predefined_Mode` : no content. Use the predefined distribution table. -- `RLE_Mode` : 1 byte. This is the only code to use across the whole compressed block. -- `FSE_Compressed_Mode` : A distribution table is present. -- `Repeat_Mode` : no content. Re-use distribution from previous compressed block. 
- -##### FSE distribution table : condensed format - -An FSE distribution table describes the probabilities of all symbols -from `0` to the last present one (included) -on a normalized scale of `1 << Accuracy_Log` . - -It's a bitstream which is read forward, in little-endian fashion. -It's not necessary to know its exact size, -since it will be discovered and reported by the decoding process. - -The bitstream starts by reporting on which scale it operates. -`Accuracy_Log = low4bits + 5`. -Note that maximum `Accuracy_Log` for literal and match lengths is `9`, -and for offsets is `8`. Higher values are considered errors. - -Then follows each symbol value, from `0` to last present one. -The number of bits used by each field is variable. -It depends on : - -- Remaining probabilities + 1 : - __example__ : - Presuming an `Accuracy_Log` of 8, - and presuming 100 probabilities points have already been distributed, - the decoder may read any value from `0` to `255 - 100 + 1 == 156` (inclusive). - Therefore, it must read `log2sup(156) == 8` bits. - -- Value decoded : small values use 1 less bit : - __example__ : - Presuming values from 0 to 156 (inclusive) are possible, - 255-156 = 99 values are remaining in an 8-bits field. - They are used this way : - first 99 values (hence from 0 to 98) use only 7 bits, - values from 99 to 156 use 8 bits. - This is achieved through this scheme : - - | Value read | Value decoded | Number of bits used | - | ---------- | ------------- | ------------------- | - | 0 - 98 | 0 - 98 | 7 | - | 99 - 127 | 99 - 127 | 8 | - | 128 - 226 | 0 - 98 | 7 | - | 227 - 255 | 128 - 156 | 8 | - -Symbols probabilities are read one by one, in order. - -Probability is obtained from Value decoded by following formula : -`Proba = value - 1` - -It means value `0` becomes negative probability `-1`. -`-1` is a special probability, which means "less than 1". -Its effect on distribution table is described in [next paragraph]. 
-For the purpose of calculating cumulated distribution, it counts as one. - -[next paragraph]:#fse-decoding--from-normalized-distribution-to-decoding-tables - -When a symbol has a __probability__ of `zero`, -it is followed by a 2-bits repeat flag. -This repeat flag tells how many probabilities of zeroes follow the current one. -It provides a number ranging from 0 to 3. -If it is a 3, another 2-bits repeat flag follows, and so on. - -When last symbol reaches cumulated total of `1 << Accuracy_Log`, -decoding is complete. -If the last symbol makes cumulated total go above `1 << Accuracy_Log`, -distribution is considered corrupted. - -Then the decoder can tell how many bytes were used in this process, -and how many symbols are present. -The bitstream consumes a round number of bytes. -Any remaining bit within the last byte is just unused. - -##### FSE decoding : from normalized distribution to decoding tables - -The distribution of normalized probabilities is enough -to create a unique decoding table. - -It follows the following build rule : - -The table has a size of `tableSize = 1 << Accuracy_Log`. -Each cell describes the symbol decoded, -and instructions to get the next state. - -Symbols are scanned in their natural order for "less than 1" probabilities. -Symbols with this probability are being attributed a single cell, -starting from the end of the table. -These symbols define a full state reset, reading `Accuracy_Log` bits. - -All remaining symbols are sorted in their natural order. -Starting from symbol `0` and table position `0`, -each symbol gets attributed as many cells as its probability. -Cell allocation is spreaded, not linear : -each successor position follow this rule : - -``` -position += (tableSize>>1) + (tableSize>>3) + 3; -position &= tableSize-1; -``` - -A position is skipped if already occupied, -typically by a "less than 1" probability symbol. 
-`position` does not reset between symbols, it simply iterates through -each position in the table, switching to the next symbol when enough -states have been allocated to the current one. - -The result is a list of state values. -Each state will decode the current symbol. - -To get the `Number_of_Bits` and `Baseline` required for next state, -it's first necessary to sort all states in their natural order. -The lower states will need 1 more bit than higher ones. - -__Example__ : -Presuming a symbol has a probability of 5. -It receives 5 state values. States are sorted in natural order. - -Next power of 2 is 8. -Space of probabilities is divided into 8 equal parts. -Presuming the `Accuracy_Log` is 7, it defines 128 states. -Divided by 8, each share is 16 large. - -In order to reach 8, 8-5=3 lowest states will count "double", -taking shares twice larger, -requiring one more bit in the process. - -Numbering starts from higher states using less bits. - -| state order | 0 | 1 | 2 | 3 | 4 | -| ---------------- | ----- | ----- | ------ | ---- | ----- | -| width | 32 | 32 | 32 | 16 | 16 | -| `Number_of_Bits` | 5 | 5 | 5 | 4 | 4 | -| range number | 2 | 4 | 6 | 0 | 1 | -| `Baseline` | 32 | 64 | 96 | 0 | 16 | -| range | 32-63 | 64-95 | 96-127 | 0-15 | 16-31 | - -The next state is determined from current state -by reading the required `Number_of_Bits`, and adding the specified `Baseline`. - - -#### Bitstream - -FSE bitstreams are read in reverse direction than written. In zstd, -the compressor writes bits forward into a block and the decompressor -must read the bitstream _backwards_. - -To find the start of the bitstream it is therefore necessary to -know the offset of the last byte of the block which can be found -by counting `Block_Size` bytes after the block header. - -After writing the last bit containing information, the compressor -writes a single `1`-bit and then fills the byte with 0-7 `0` bits of -padding. 
The last byte of the compressed bitstream cannot be `0` for -that reason. - -When decompressing, the last byte containing the padding is the first -byte to read. The decompressor needs to skip 0-7 initial `0`-bits and -the first `1`-bit it occurs. Afterwards, the useful part of the bitstream -begins. - -##### Starting states - -The bitstream starts with initial state values, -each using the required number of bits in their respective _accuracy_, -decoded previously from their normalized distribution. - -It starts by `Literals_Length_State`, -followed by `Offset_State`, -and finally `Match_Length_State`. - -Reminder : always keep in mind that all values are read _backward_. - -##### Decoding a sequence - -A state gives a code. -A code provides `Baseline` and `Number_of_Bits` to add. -See [Symbol Decoding] section for details on each symbol. - -Decoding starts by reading the `Number_of_Bits` required to decode `Offset`. -It then does the same for `Match_Length`, -and then for `Literals_Length`. - -`Offset`, `Match_Length`, and `Literals_Length` define a sequence. -It starts by inserting the number of literals defined by `Literals_Length`, -then continue by copying `Match_Length` bytes from `currentPos - Offset`. - -If it is not the last sequence in the block, -the next operation is to update states. -Using the rules pre-calculated in the decoding tables, -`Literals_Length_State` is updated, -followed by `Match_Length_State`, -and then `Offset_State`. - -This operation will be repeated `Number_of_Sequences` times. -At the end, the bitstream shall be entirely consumed, -otherwise the bitstream is considered corrupted. - -[Symbol Decoding]:#the-codes-for-literals-lengths-match-lengths-and-offsets - -##### Repeat offsets - -As seen in [Offset Codes], the first 3 values define a repeated offset and we will call them `Repeated_Offset1`, `Repeated_Offset2`, and `Repeated_Offset3`. -They are sorted in recency order, with `Repeated_Offset1` meaning "most recent one". 
- -There is an exception though, when current sequence's literals length is `0`. -In this case, repeated offsets are shifted by one, -so `Repeated_Offset1` becomes `Repeated_Offset2`, `Repeated_Offset2` becomes `Repeated_Offset3`, -and `Repeated_Offset3` becomes `Repeated_Offset1 - 1_byte`. - -In the first block, the offset history is populated with the following values : 1, 4 and 8 (in order). - -Then each block gets its starting offset history from the ending values of the most recent compressed block. -Note that non-compressed blocks are skipped, -they do not contribute to offset history. - -[Offset Codes]: #offset-codes - -###### Offset updates rules - -The newest offset takes the lead in offset history, -shifting others back (up to its previous place if it was already present). - -This means that when `Repeated_Offset1` (most recent) is used, history is unmodified. -When `Repeated_Offset2` is used, it's swapped with `Repeated_Offset1`. -If any other offset is used, it becomes `Repeated_Offset1` and the rest are shift back by one. - - -Dictionary format +Dictionary Format ----------------- -`zstd` is compatible with "raw content" dictionaries, free of any format restriction, +Zstandard is compatible with "raw content" dictionaries, free of any format restriction, except that they must be at least 8 bytes. These dictionaries function as if they were just the `Content` block of a formatted dictionary. @@ -1203,10 +1258,15 @@ _Reserved ranges :_ - low range : 1 - 32767 - high range : >= (2^31) -__`Entropy_Tables`__ : following the same format as the tables in [compressed blocks]. +__`Entropy_Tables`__ : following the same format as the tables in compressed blocks. + See the relevant [FSE](#fse-table-description) + and [Huffman](#huffman-tree-description) sections for how to decode these tables. They are stored in following order : Huffman tables for literals, FSE table for offsets, FSE table for match lengths, and FSE table for literals lengths. 
+ These tables populate the Repeat Stats literals mode and + Repeat distribution mode for sequence decoding. + It's finally followed by 3 offset values, populating recent offsets (instead of using `{1,4,8}`), stored in order, 4-bytes little-endian each, for a total of 12 bytes. Each recent offset must have a value < dictionary size. @@ -1214,9 +1274,13 @@ __`Entropy_Tables`__ : following the same format as the tables in [compressed bl __`Content`__ : The rest of the dictionary is its content. The content act as a "past" in front of data to compress or decompress, so it can be referenced in sequence commands. + As long as the amount of data decoded from this frame is less than or + equal to the window size, sequence commands may specify offsets longer + than the length of total decoded output so far to reference back to the + dictionary. After the total output has surpassed the window size, however, + this is no longer allowed and the dictionary is no longer accessible. [compressed blocks]: #the-format-of-compressed_block - Appendix A - Decoding tables for predefined codes ------------------------------------------------- @@ -1402,6 +1466,7 @@ to crosscheck that an implementation implements the decoding table generation al Version changes --------------- +- 0.2.4 : section restructuring, by Sean Purcell - 0.2.3 : clarified several details, by Sean Purcell - 0.2.2 : added predefined codes, by Johannes Rudolph - 0.2.1 : clarify field names, by Przemyslaw Skibinski From 6e18d33122a17648497fdddb3ce7806e9dfb593d Mon Sep 17 00:00:00 2001 From: Soojin Nam Date: Tue, 21 Feb 2017 09:51:40 +0900 Subject: [PATCH 02/22] original size unknown --- examples/dictionary_decompression.c | 2 +- examples/simple_decompression.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/examples/dictionary_decompression.c b/examples/dictionary_decompression.c index 75183505d..deaf3888e 100644 --- a/examples/dictionary_decompression.c +++ b/examples/dictionary_decompression.c @@ 
-78,7 +78,7 @@ static void decompress(const char* fname, const ZSTD_DDict* ddict) size_t cSize; void* const cBuff = loadFile_orDie(fname, &cSize); unsigned long long const rSize = ZSTD_findDecompressedSize(cBuff, cSize); - if (rSize==0) { + if (rSize==ZSTD_CONTENTSIZE_UNKNOWN) { fprintf(stderr, "%s : original size unknown \n", fname); exit(6); } diff --git a/examples/simple_decompression.c b/examples/simple_decompression.c index 09b27baa6..e23f14887 100644 --- a/examples/simple_decompression.c +++ b/examples/simple_decompression.c @@ -63,7 +63,7 @@ static void decompress(const char* fname) size_t cSize; void* const cBuff = loadFile_orDie(fname, &cSize); unsigned long long const rSize = ZSTD_findDecompressedSize(cBuff, cSize); - if (rSize==0) { + if (rSize==ZSTD_CONTENTSIZE_UNKNOWN) { printf("%s : original size unknown. Use streaming decompression instead. \n", fname); exit(5); } From 3a751edeaedf79f320aae63de4134e0fcf54786e Mon Sep 17 00:00:00 2001 From: Przemyslaw Skibinski Date: Tue, 21 Feb 2017 15:57:03 +0100 Subject: [PATCH 03/22] uasan --- .travis.yml | 71 ++++++++++++++++++++--------------------------------- 1 file changed, 27 insertions(+), 44 deletions(-) diff --git a/.travis.yml b/.travis.yml index 0ac8efb80..7379fc504 100644 --- a/.travis.yml +++ b/.travis.yml @@ -16,6 +16,20 @@ matrix: os: linux sudo: false + - env: Ubu=12.04cont Cmd="make uasan" + os: linux + sudo: false + + - env: Ubu=14.04 Cmd='make test CC=clang-4.0 MOREFLAGS="-g -fsanitize=address -fsanitize=undefined"' + os: linux + dist: trusty + sudo: required + addons: + apt: + sources: + - llvm-toolchain-trusty-4.0 + packages: + - clang-4.0 - env: Ubu=14.04 Cmd='make -C tests test32 CC=clang-4.0 MOREFLAGS="-g -fsanitize=address"' os: linux @@ -31,9 +45,13 @@ matrix: - clang-4.0 - # Standard Ubuntu 12.04 LTS Server Edition 64 bit - - env: Ubu=12.04 Cmd='cd contrib/pzstd && make googletest && make tsan && make check && make clean && make asan && make check && make clean && cd ../..' 
+ # Ubuntu 14.04 LTS Server Edition 64 bit + - env: Ubu=14.04 Cmd='cd contrib/pzstd && make googletest pzstd tests check && make clean + && make googletest32 all32 check && make clean + && make googletest tsan check && make clean + && make asan check && make clean' os: linux + dist: trusty sudo: required install: - export CXX="g++-6" CC="gcc-6" @@ -43,30 +61,15 @@ matrix: apt: sources: - ubuntu-toolchain-r-test - packages: - - gcc-6 - - g++-6 - - - # Ubuntu 14.04 LTS Server Edition 64 bit - - env: Ubu=14.04 Cmd="make -C contrib/pzstd googletest pzstd tests check && make -C contrib/pzstd clean - && make -C contrib/pzstd googletest32 && make -C contrib/pzstd all32 && make -C contrib/pzstd check && make -C contrib/pzstd clean" - os: linux - dist: trusty - sudo: required - install: - - export CXX="g++-4.8" CC="gcc-4.8" - addons: - apt: packages: - libc6-dev-i386 - g++-multilib - - gcc-4.8 - - gcc-4.8-multilib - - g++-4.8 - - g++-4.8-multilib + - gcc-6 + - gcc-6-multilib + - g++-6 + - g++-6-multilib - - env: Ubu=14.04 Cmd="make armtest" + - env: Ubu=14.04 Cmd="make armtest && make clean && make aarch64test" dist: trusty sudo: required addons: @@ -76,19 +79,10 @@ matrix: - qemu-user-static - gcc-arm-linux-gnueabi - libc6-dev-armel-cross - - - env: Ubu=14.04 Cmd="make aarch64test" - dist: trusty - sudo: required - addons: - apt: - packages: - - qemu-system-arm - - qemu-user-static - gcc-aarch64-linux-gnu - libc6-dev-arm64-cross - - env: Ubu=14.04 Cmd='make ppctest' + - env: Ubu=14.04 Cmd='make ppctest && make clean && make ppc64test' dist: trusty sudo: required addons: @@ -98,17 +92,6 @@ matrix: - qemu-user-static - gcc-powerpc-linux-gnu - - env: Ubu=14.04 Cmd='make ppc64test' - dist: trusty - sudo: required - addons: - apt: - packages: - - qemu-system-ppc - - qemu-user-static - - gcc-powerpc-linux-gnu - - # other feature branches => short tests - env: Ubu=14.04 Cmd='make lib && CFLAGS="-O1 -g" make -C zlibWrapper valgrindTest && make -C tests valgrindTest' os: linux @@ 
-137,7 +120,7 @@ script: # cron & master => long tests, as this is the final step towards a Release # dev => normal tests # other feature branches => short tests (number > 10) - - if [ "$TRAVIS_EVENT_TYPE" = "cron" ] || [ "$TRAVIS_BRANCH" = "master" ]; then + - if [ "$TRAVIS_EVENT_TYPE" = "cron" ] || [ "$TRAVIS_BRANCH" = "asan" ]; then FUZZERTEST=-T10mn sh -c "$Cmd" || travis_terminate 1; else if [ "$TRAVIS_PULL_REQUEST" = "true" ] || [ $JOB_NUMBER -gt 10 ] || [ "$TRAVIS_BRANCH" = "dev" ]; then From 684858e7b7924d7789395ec5950d4b29315a4516 Mon Sep 17 00:00:00 2001 From: Przemyslaw Skibinski Date: Tue, 21 Feb 2017 18:17:24 +0100 Subject: [PATCH 04/22] fix memory leaks --- .travis.yml | 15 +-------------- contrib/pzstd/Makefile | 17 +++++++++++++++++ programs/zstdcli.c | 4 ++-- tests/zstreamtest.c | 2 ++ 4 files changed, 22 insertions(+), 16 deletions(-) diff --git a/.travis.yml b/.travis.yml index 7379fc504..38ed23431 100644 --- a/.travis.yml +++ b/.travis.yml @@ -8,14 +8,6 @@ matrix: # Container-based Ubuntu 12.04 LTS Server Edition 64 bit (doesn't support 32-bit includes) - - env: Ubu=12.04cont Cmd="make usan" - os: linux - sudo: false - - - env: Ubu=12.04cont Cmd="make asan" - os: linux - sudo: false - - env: Ubu=12.04cont Cmd="make uasan" os: linux sudo: false @@ -46,17 +38,12 @@ matrix: # Ubuntu 14.04 LTS Server Edition 64 bit - - env: Ubu=14.04 Cmd='cd contrib/pzstd && make googletest pzstd tests check && make clean - && make googletest32 all32 check && make clean - && make googletest tsan check && make clean - && make asan check && make clean' + - env: Ubu=14.04 Cmd='cd contrib/pzstd && make test-pzstd && make test-pzstd32 && make test-pzstd-tsan && test-pzstd-asan' os: linux dist: trusty sudo: required install: - export CXX="g++-6" CC="gcc-6" - - export LDFLAGS="-fuse-ld=gold" - - export TESTFLAGS='--gtest_filter=-*ExtremelyLarge*' addons: apt: sources: diff --git a/contrib/pzstd/Makefile b/contrib/pzstd/Makefile index f148bfd8e..10a133dd7 100644 --- 
a/contrib/pzstd/Makefile +++ b/contrib/pzstd/Makefile @@ -85,6 +85,23 @@ endif .PHONY: default default: all +.PHONY: test-pzstd +test-pzstd: TESTFLAGS='--gtest_filter=-*ExtremelyLarge*' +test-pzstd: clean googletest pzstd tests check + +.PHONY: test-pzstd32 +test-pzstd32: clean googletest32 all32 check + +.PHONY: test-pzstd-tsan +test-pzstd-tsan: LDFLAGS="-fuse-ld=gold" +test-pzstd-tsan: TESTFLAGS='--gtest_filter=-*ExtremelyLarge*' +test-pzstd-tsan: clean googletest tsan check + +.PHONY: test-pzstd-asan +test-pzstd-asan: LDFLAGS="-fuse-ld=gold" +test-pzstd-asan: TESTFLAGS='--gtest_filter=-*ExtremelyLarge*' +test-pzstd-asan: clean asan check + .PHONY: check check: $(TESTPROG) ./utils/test/BufferTest$(EXT) $(TESTFLAGS) diff --git a/programs/zstdcli.c b/programs/zstdcli.c index 588111913..a7b4fddc8 100644 --- a/programs/zstdcli.c +++ b/programs/zstdcli.c @@ -399,7 +399,7 @@ int main(int argCount, const char* argv[]) while (argument[0]!=0) { if (lastCommand) { DISPLAY("error : command must be followed by argument \n"); - return 1; + CLEAN_RETURN(1); } #ifndef ZSTD_NOCOMPRESS /* compression Level */ @@ -555,7 +555,7 @@ int main(int argCount, const char* argv[]) filenameTable[filenameIdx++] = argument; } - if (lastCommand) { DISPLAY("error : command must be followed by argument \n"); return 1; } /* forgotten argument */ + if (lastCommand) { DISPLAY("error : command must be followed by argument \n"); CLEAN_RETURN(1); } /* forgotten argument */ /* Welcome message (if verbose) */ DISPLAYLEVEL(3, WELCOME_MESSAGE); diff --git a/tests/zstreamtest.c b/tests/zstreamtest.c index 9a9fed98d..c22a284c7 100644 --- a/tests/zstreamtest.c +++ b/tests/zstreamtest.c @@ -496,6 +496,8 @@ static int basicUnitTests(U32 seed, double compressibility, ZSTD_customMem custo /* Bug will cause checksum to fail */ if (ZSTD_isError(r)) goto _output_error; } + + ZSTD_freeDStream(zds); } DISPLAYLEVEL(3, "OK \n"); From d8114e5802edfb236758d7a4d7ce24795e74afa2 Mon Sep 17 00:00:00 2001 From: Przemyslaw 
Skibinski Date: Tue, 21 Feb 2017 18:59:56 +0100 Subject: [PATCH 05/22] zstd_compress.c: fix memory leaks --- contrib/pzstd/Makefile | 10 +++++----- lib/compress/zstd_compress.c | 4 ++-- 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/contrib/pzstd/Makefile b/contrib/pzstd/Makefile index 10a133dd7..21ef935c6 100644 --- a/contrib/pzstd/Makefile +++ b/contrib/pzstd/Makefile @@ -86,20 +86,20 @@ endif default: all .PHONY: test-pzstd -test-pzstd: TESTFLAGS='--gtest_filter=-*ExtremelyLarge*' +test-pzstd: TESTFLAGS=--gtest_filter=-*ExtremelyLarge* test-pzstd: clean googletest pzstd tests check .PHONY: test-pzstd32 test-pzstd32: clean googletest32 all32 check .PHONY: test-pzstd-tsan -test-pzstd-tsan: LDFLAGS="-fuse-ld=gold" -test-pzstd-tsan: TESTFLAGS='--gtest_filter=-*ExtremelyLarge*' +test-pzstd-tsan: LDFLAGS=-fuse-ld=gold +test-pzstd-tsan: TESTFLAGS=--gtest_filter=-*ExtremelyLarge* test-pzstd-tsan: clean googletest tsan check .PHONY: test-pzstd-asan -test-pzstd-asan: LDFLAGS="-fuse-ld=gold" -test-pzstd-asan: TESTFLAGS='--gtest_filter=-*ExtremelyLarge*' +test-pzstd-asan: LDFLAGS=-fuse-ld=gold +test-pzstd-asan: TESTFLAGS=--gtest_filter=-*ExtremelyLarge* test-pzstd-asan: clean asan check .PHONY: check diff --git a/lib/compress/zstd_compress.c b/lib/compress/zstd_compress.c index 924189b0c..0e0f9d373 100644 --- a/lib/compress/zstd_compress.c +++ b/lib/compress/zstd_compress.c @@ -2786,7 +2786,7 @@ ZSTD_CDict* ZSTD_createCDict_advanced(const void* dictBuffer, size_t dictSize, u if (!cdict || !cctx) { ZSTD_free(cdict, customMem); - ZSTD_free(cctx, customMem); + ZSTD_freeCCtx(cctx); return NULL; } @@ -2804,8 +2804,8 @@ ZSTD_CDict* ZSTD_createCDict_advanced(const void* dictBuffer, size_t dictSize, u { size_t const errorCode = ZSTD_compressBegin_advanced(cctx, cdict->dictContent, dictSize, params, 0); if (ZSTD_isError(errorCode)) { ZSTD_free(cdict->dictBuffer, customMem); - ZSTD_free(cctx, customMem); ZSTD_free(cdict, customMem); + ZSTD_freeCCtx(cctx); return NULL; 
} } From 3bee41a70eaf343fbcae3637b3f6edbe52f35ed8 Mon Sep 17 00:00:00 2001 From: Sean Purcell Date: Tue, 21 Feb 2017 10:20:36 -0800 Subject: [PATCH 06/22] Add default distributions and fix typos --- doc/zstd_compression_format.md | 34 ++++++++++++++++++++++++++++++++-- 1 file changed, 32 insertions(+), 2 deletions(-) diff --git a/doc/zstd_compression_format.md b/doc/zstd_compression_format.md index f08dc9537..d4b46548a 100644 --- a/doc/zstd_compression_format.md +++ b/doc/zstd_compression_format.md @@ -776,13 +776,44 @@ For details on how to convert this distribution into a decoding table, see the [ [FSE section]: #from-normalized-distribution-to-decoding-tables +##### Literals Length +The decoding table uses an accuracy log of 6 bits (64 states). +``` +short literalsLength_defaultDistribution[36] = + { 4, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 1, 1, 1, 1, 1, + -1,-1,-1,-1 }; +``` + +##### Match Length +The decoding table uses an accuracy log of 6 bits (64 states). +``` +short matchLengths_defaultDistribution[53] = + { 1, 4, 3, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,-1,-1, + -1,-1,-1,-1,-1 }; +``` + +##### Offset Codes +The decoding table uses an accuracy log of 5 bits (32 states), +and supports a maximum `N` value of 28, allowing offset values up to 536,870,908 . + +If any sequence in the compressed block requires a larger offset than this, +it's not possible to use the default distribution to represent it. +``` +short offsetCodes_defaultDistribution[29] = + { 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, + 1, 1, 1, 1, 1, 1, 1, 1,-1,-1,-1,-1,-1 }; +``` + Sequence Execution ------------------ Once literals and sequences have been decoded, they are combined to produce the decoded content of a block. 
Each sequence consists of a tuple of (`literals_length`, `offset_value`, `match_length`), -decoded as described in the [Sequences Section)[#sequences-section]. +decoded as described in the [Sequences Section](#sequences-section). To execute a sequence, first copy `literals_length` bytes from the literals section to the output. @@ -1266,7 +1297,6 @@ __`Entropy_Tables`__ : following the same format as the tables in compressed blo FSE table for match lengths, and FSE table for literals lengths. These tables populate the Repeat Stats literals mode and Repeat distribution mode for sequence decoding. - It's finally followed by 3 offset values, populating recent offsets (instead of using `{1,4,8}`), stored in order, 4-bytes little-endian each, for a total of 12 bytes. Each recent offset must have a value < dictionary size. From 346ce32adeb57c468d40f7c4e8ed75c0c84a4f4e Mon Sep 17 00:00:00 2001 From: Przemyslaw Skibinski Date: Tue, 21 Feb 2017 20:10:21 +0100 Subject: [PATCH 07/22] legacy.c: fix memory leaks --- contrib/pzstd/Makefile | 2 +- tests/legacy.c | 3 +++ 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/contrib/pzstd/Makefile b/contrib/pzstd/Makefile index 21ef935c6..cec6959e6 100644 --- a/contrib/pzstd/Makefile +++ b/contrib/pzstd/Makefile @@ -134,7 +134,7 @@ debug: pzstd$(EXT) tests roundtrip .PHONY: tsan tsan: PZSTD_CCXXFLAGS += -fsanitize=thread -fPIC -tsan: PZSTD_LDFLAGS += -fsanitize=thread -pie +tsan: PZSTD_LDFLAGS += -fsanitize=thread tsan: debug .PHONY: asan diff --git a/tests/legacy.c b/tests/legacy.c index 5d93c68fa..e84e31273 100644 --- a/tests/legacy.c +++ b/tests/legacy.c @@ -65,6 +65,7 @@ int testSimpleAPI(void) return 1; } + free(output); DISPLAY("Simple API OK\n"); return 0; } @@ -118,6 +119,8 @@ int testStreamingAPI(void) } } + free(outBuff); + ZSTD_freeDStream(stream); DISPLAY("Streaming API OK\n"); return 0; } From 97cfec5e12b64bb0494f658143080b1735b8df0e Mon Sep 17 00:00:00 2001 From: Przemyslaw Skibinski Date: Tue, 21 Feb 2017 
20:44:35 +0100 Subject: [PATCH 08/22] travis.yml: reduce number of jobs --- .travis.yml | 24 ++++++++++-------------- 1 file changed, 10 insertions(+), 14 deletions(-) diff --git a/.travis.yml b/.travis.yml index 38ed23431..1020df8e8 100644 --- a/.travis.yml +++ b/.travis.yml @@ -7,38 +7,34 @@ matrix: os: osx - # Container-based Ubuntu 12.04 LTS Server Edition 64 bit (doesn't support 32-bit includes) - - env: Ubu=12.04cont Cmd="make uasan" - os: linux - sudo: false - - - env: Ubu=14.04 Cmd='make test CC=clang-4.0 MOREFLAGS="-g -fsanitize=address -fsanitize=undefined"' + # Ubuntu 14.04 LTS Server Edition 64 bit + - env: Ubu=14.04 Cmd='make test CC=gcc-6 MOREFLAGS="-g -fsanitize=address -fsanitize=undefined"' os: linux dist: trusty sudo: required addons: apt: sources: - - llvm-toolchain-trusty-4.0 + - ubuntu-toolchain-r-test packages: - - clang-4.0 + - gcc-6 + - gcc-6-multilib - - env: Ubu=14.04 Cmd='make -C tests test32 CC=clang-4.0 MOREFLAGS="-g -fsanitize=address"' + - env: Ubu=14.04 Cmd='make -C tests test32 CC=gcc-6 MOREFLAGS="-g -fsanitize=address"' os: linux dist: trusty sudo: required addons: apt: sources: - - llvm-toolchain-trusty-4.0 + - ubuntu-toolchain-r-test packages: - libc6-dev-i386 - gcc-multilib - - clang-4.0 + - gcc-6 + - gcc-6-multilib - - # Ubuntu 14.04 LTS Server Edition 64 bit - - env: Ubu=14.04 Cmd='cd contrib/pzstd && make test-pzstd && make test-pzstd32 && make test-pzstd-tsan && test-pzstd-asan' + - env: Ubu=14.04 Cmd='cd contrib/pzstd && make test-pzstd && make test-pzstd32 && make test-pzstd-tsan && make test-pzstd-asan' os: linux dist: trusty sudo: required From 4d7a24328b3312c4531b68cb75d3d7e8e411cb4c Mon Sep 17 00:00:00 2001 From: Przemyslaw Skibinski Date: Tue, 21 Feb 2017 21:12:09 +0100 Subject: [PATCH 09/22] travis.yml: added LDFLAGS=-fuse-ld=gold --- .travis.yml | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/.travis.yml b/.travis.yml index 1020df8e8..6d5f22bb1 100644 --- a/.travis.yml +++ b/.travis.yml @@ 
-8,7 +8,7 @@ matrix: # Ubuntu 14.04 LTS Server Edition 64 bit - - env: Ubu=14.04 Cmd='make test CC=gcc-6 MOREFLAGS="-g -fsanitize=address -fsanitize=undefined"' + - env: Ubu=14.04 Cmd='make test CC=gcc-6 MOREFLAGS="-g -fsanitize=address -fsanitize=undefined" LDFLAGS=-fuse-ld=gold' os: linux dist: trusty sudo: required @@ -18,9 +18,8 @@ matrix: - ubuntu-toolchain-r-test packages: - gcc-6 - - gcc-6-multilib - - env: Ubu=14.04 Cmd='make -C tests test32 CC=gcc-6 MOREFLAGS="-g -fsanitize=address"' + - env: Ubu=14.04 Cmd='make -C tests test32 CC=gcc-6 MOREFLAGS="-g -fsanitize=address" LDFLAGS=-fuse-ld=gold' os: linux dist: trusty sudo: required From 7704c3ca1ae07b76c4593418d22ebf89b64b9710 Mon Sep 17 00:00:00 2001 From: Przemyslaw Skibinski Date: Tue, 21 Feb 2017 21:48:14 +0100 Subject: [PATCH 10/22] travis.yml: use CFLAGS=-Og with -fsanitize --- .travis.yml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/.travis.yml b/.travis.yml index 6d5f22bb1..cf3bbb5bf 100644 --- a/.travis.yml +++ b/.travis.yml @@ -8,7 +8,7 @@ matrix: # Ubuntu 14.04 LTS Server Edition 64 bit - - env: Ubu=14.04 Cmd='make test CC=gcc-6 MOREFLAGS="-g -fsanitize=address -fsanitize=undefined" LDFLAGS=-fuse-ld=gold' + - env: Ubu=14.04 Cmd='LDFLAGS=-fuse-ld=gold CFLAGS=-Og make test CC=gcc-6 MOREFLAGS="-fsanitize=address -fsanitize=undefined"' os: linux dist: trusty sudo: required @@ -19,7 +19,7 @@ matrix: packages: - gcc-6 - - env: Ubu=14.04 Cmd='make -C tests test32 CC=gcc-6 MOREFLAGS="-g -fsanitize=address" LDFLAGS=-fuse-ld=gold' + - env: Ubu=14.04 Cmd='LDFLAGS=-fuse-ld=gold CFLAGS=-Og make -C tests test32 CC=gcc-6 MOREFLAGS="-fsanitize=address -fsanitize=undefined"' os: linux dist: trusty sudo: required From 8a51c692184e524d1f3bb167750c0e378f7c8f35 Mon Sep 17 00:00:00 2001 From: Przemyslaw Skibinski Date: Tue, 21 Feb 2017 22:48:04 +0100 Subject: [PATCH 11/22] travis.yml: added uasan-test and uasan-test32 --- .travis.yml | 46 ++++++++++++++-------------------------------- Makefile 
| 3 +++ 2 files changed, 17 insertions(+), 32 deletions(-) diff --git a/.travis.yml b/.travis.yml index cf3bbb5bf..dba15d7ea 100644 --- a/.travis.yml +++ b/.travis.yml @@ -8,33 +8,7 @@ matrix: # Ubuntu 14.04 LTS Server Edition 64 bit - - env: Ubu=14.04 Cmd='LDFLAGS=-fuse-ld=gold CFLAGS=-Og make test CC=gcc-6 MOREFLAGS="-fsanitize=address -fsanitize=undefined"' - os: linux - dist: trusty - sudo: required - addons: - apt: - sources: - - ubuntu-toolchain-r-test - packages: - - gcc-6 - - - env: Ubu=14.04 Cmd='LDFLAGS=-fuse-ld=gold CFLAGS=-Og make -C tests test32 CC=gcc-6 MOREFLAGS="-fsanitize=address -fsanitize=undefined"' - os: linux - dist: trusty - sudo: required - addons: - apt: - sources: - - ubuntu-toolchain-r-test - packages: - - libc6-dev-i386 - - gcc-multilib - - gcc-6 - - gcc-6-multilib - - - env: Ubu=14.04 Cmd='cd contrib/pzstd && make test-pzstd && make test-pzstd32 && make test-pzstd-tsan && make test-pzstd-asan' - os: linux + - env: Ubu=14.04 Cmd='make uasan-test && cd contrib/pzstd && make test-pzstd && make test-pzstd32 && make test-pzstd-tsan && make test-pzstd-asan' dist: trusty sudo: required install: @@ -47,9 +21,19 @@ matrix: - libc6-dev-i386 - g++-multilib - gcc-6 - - gcc-6-multilib - g++-6 - - g++-6-multilib + + - env: Ubu=14.04 Cmd='CC=gcc-6 make uasan-test32 && make clean zlibwrapper && make -C tests clean test-zstd-nolegacy versionsTest' + dist: trusty + sudo: required + addons: + apt: + sources: + - ubuntu-toolchain-r-test + packages: + - libc6-dev-i386 + - gcc-multilib + - gcc-6 - env: Ubu=14.04 Cmd="make armtest && make clean && make aarch64test" dist: trusty @@ -76,7 +60,6 @@ matrix: # other feature branches => short tests - env: Ubu=14.04 Cmd='make lib && CFLAGS="-O1 -g" make -C zlibWrapper valgrindTest && make -C tests valgrindTest' - os: linux dist: trusty sudo: required addons: @@ -84,8 +67,7 @@ matrix: packages: - valgrind - - env: Ubu=14.04 Cmd="make zlibwrapper && make clean && make -C tests test-zstd-nolegacy && make clean && make 
-C tests test32 versionsTest" - os: linux + - env: Ubu=14.04 Cmd="make -C tests test32" dist: trusty sudo: required addons: diff --git a/Makefile b/Makefile index d86db7cb3..128c72bb0 100644 --- a/Makefile +++ b/Makefile @@ -143,6 +143,9 @@ asan32: clean uasan: clean $(MAKE) test CC=clang MOREFLAGS="-g -fsanitize=address -fsanitize=undefined" +uasan-%: clean + LDFLAGS=-fuse-ld=gold CFLAGS="-Og -fsanitize=address -fsanitize=undefined" $(MAKE) -C $(TESTDIR) $* + endif From f58ac79f513cda1acd468de1853dde6ecfe793aa Mon Sep 17 00:00:00 2001 From: Przemyslaw Skibinski Date: Tue, 21 Feb 2017 23:40:21 +0100 Subject: [PATCH 12/22] fix uasan-test32 --- .travis.yml | 7 ++++--- Makefile | 2 +- 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/.travis.yml b/.travis.yml index dba15d7ea..8688035e2 100644 --- a/.travis.yml +++ b/.travis.yml @@ -8,7 +8,7 @@ matrix: # Ubuntu 14.04 LTS Server Edition 64 bit - - env: Ubu=14.04 Cmd='make uasan-test && cd contrib/pzstd && make test-pzstd && make test-pzstd32 && make test-pzstd-tsan && make test-pzstd-asan' + - env: Ubu=14.04 Cmd='LDFLAGS=-fuse-ld=gold make uasan-test && cd contrib/pzstd && make test-pzstd && make test-pzstd32 && make test-pzstd-tsan && make test-pzstd-asan' dist: trusty sudo: required install: @@ -22,6 +22,7 @@ matrix: - g++-multilib - gcc-6 - g++-6 + - g++-6-multilib - env: Ubu=14.04 Cmd='CC=gcc-6 make uasan-test32 && make clean zlibwrapper && make -C tests clean test-zstd-nolegacy versionsTest' dist: trusty @@ -83,11 +84,11 @@ script: # cron & master => long tests, as this is the final step towards a Release # dev => normal tests - # other feature branches => short tests (number > 10) + # other feature branches => short tests (number > 5) - if [ "$TRAVIS_EVENT_TYPE" = "cron" ] || [ "$TRAVIS_BRANCH" = "asan" ]; then FUZZERTEST=-T10mn sh -c "$Cmd" || travis_terminate 1; else - if [ "$TRAVIS_PULL_REQUEST" = "true" ] || [ $JOB_NUMBER -gt 10 ] || [ "$TRAVIS_BRANCH" = "dev" ]; then + if [ "$TRAVIS_PULL_REQUEST" 
= "true" ] || [ $JOB_NUMBER -gt 5 ] || [ "$TRAVIS_BRANCH" = "dev" ]; then sh -c "$Cmd" || travis_terminate 1; fi fi diff --git a/Makefile b/Makefile index 128c72bb0..ff624e907 100644 --- a/Makefile +++ b/Makefile @@ -144,7 +144,7 @@ uasan: clean $(MAKE) test CC=clang MOREFLAGS="-g -fsanitize=address -fsanitize=undefined" uasan-%: clean - LDFLAGS=-fuse-ld=gold CFLAGS="-Og -fsanitize=address -fsanitize=undefined" $(MAKE) -C $(TESTDIR) $* + CFLAGS="-Og -fsanitize=address -fsanitize=undefined" $(MAKE) -C $(TESTDIR) $* endif From 971c1613189f8305c558287656f88d559430a2f8 Mon Sep 17 00:00:00 2001 From: Soojin Nam Date: Wed, 22 Feb 2017 16:04:48 +0900 Subject: [PATCH 13/22] test for fail to decompress --- examples/dictionary_decompression.c | 6 +++++- examples/simple_decompression.c | 19 ++++++++++++------- 2 files changed, 17 insertions(+), 8 deletions(-) diff --git a/examples/dictionary_decompression.c b/examples/dictionary_decompression.c index deaf3888e..ef739c189 100644 --- a/examples/dictionary_decompression.c +++ b/examples/dictionary_decompression.c @@ -78,10 +78,14 @@ static void decompress(const char* fname, const ZSTD_DDict* ddict) size_t cSize; void* const cBuff = loadFile_orDie(fname, &cSize); unsigned long long const rSize = ZSTD_findDecompressedSize(cBuff, cSize); - if (rSize==ZSTD_CONTENTSIZE_UNKNOWN) { + if (rSize==ZSTD_CONTENTSIZE_ERROR) { + fprintf(stderr, "%s : it was not compressed by zstd.\n", fname); + exit(5); + } else if (rSize==ZSTD_CONTENTSIZE_UNKNOWN) { fprintf(stderr, "%s : original size unknown \n", fname); exit(6); } + void* const rBuff = malloc_orDie((size_t)rSize); ZSTD_DCtx* const dctx = ZSTD_createDCtx(); diff --git a/examples/simple_decompression.c b/examples/simple_decompression.c index e23f14887..fa4e3e680 100644 --- a/examples/simple_decompression.c +++ b/examples/simple_decompression.c @@ -20,7 +20,7 @@ static off_t fsize_orDie(const char *filename) struct stat st; if (stat(filename, &st) == 0) return st.st_size; /* error */ - 
printf("stat: %s : %s \n", filename, strerror(errno)); + fprintf(stderr, "stat: %s : %s \n", filename, strerror(errno)); exit(1); } @@ -29,7 +29,7 @@ static FILE* fopen_orDie(const char *filename, const char *instruction) FILE* const inFile = fopen(filename, instruction); if (inFile) return inFile; /* error */ - printf("fopen: %s : %s \n", filename, strerror(errno)); + fprintf(stderr, "fopen: %s : %s \n", filename, strerror(errno)); exit(2); } @@ -38,7 +38,7 @@ static void* malloc_orDie(size_t size) void* const buff = malloc(size); if (buff) return buff; /* error */ - printf("malloc: %s \n", strerror(errno)); + fprintf(stderr, "malloc: %s \n", strerror(errno)); exit(3); } @@ -49,7 +49,7 @@ static void* loadFile_orDie(const char* fileName, size_t* size) void* const buffer = malloc_orDie(buffSize); size_t const readSize = fread(buffer, 1, buffSize, inFile); if (readSize != (size_t)buffSize) { - printf("fread: %s : %s \n", fileName, strerror(errno)); + fprintf(stderr, "fread: %s : %s \n", fileName, strerror(errno)); exit(4); } fclose(inFile); /* can't fail (read only) */ @@ -63,16 +63,21 @@ static void decompress(const char* fname) size_t cSize; void* const cBuff = loadFile_orDie(fname, &cSize); unsigned long long const rSize = ZSTD_findDecompressedSize(cBuff, cSize); - if (rSize==ZSTD_CONTENTSIZE_UNKNOWN) { - printf("%s : original size unknown. Use streaming decompression instead. \n", fname); + if (rSize==ZSTD_CONTENTSIZE_ERROR) { + fprintf(stderr, "%s : it was not compressed by zstd.\n", fname); exit(5); + } else if (rSize==ZSTD_CONTENTSIZE_UNKNOWN) { + fprintf(stderr, + "%s : original size unknown. 
Use streaming decompression instead.\n", fname); + exit(6); } + void* const rBuff = malloc_orDie((size_t)rSize); size_t const dSize = ZSTD_decompress(rBuff, rSize, cBuff, cSize); if (dSize != rSize) { - printf("error decoding %s : %s \n", fname, ZSTD_getErrorName(dSize)); + fprintf(stderr, "error decoding %s : %s \n", fname, ZSTD_getErrorName(dSize)); exit(7); } From 5dd18b314b482020be0014f6a0257d20a183b630 Mon Sep 17 00:00:00 2001 From: Przemyslaw Skibinski Date: Wed, 22 Feb 2017 08:15:17 +0100 Subject: [PATCH 14/22] travis.yml: reduce number of jobs to 7 --- .travis.yml | 3 ++- Makefile | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/.travis.yml b/.travis.yml index 8688035e2..41d90f380 100644 --- a/.travis.yml +++ b/.travis.yml @@ -8,7 +8,7 @@ matrix: # Ubuntu 14.04 LTS Server Edition 64 bit - - env: Ubu=14.04 Cmd='LDFLAGS=-fuse-ld=gold make uasan-test && cd contrib/pzstd && make test-pzstd && make test-pzstd32 && make test-pzstd-tsan && make test-pzstd-asan' + - env: Ubu=14.04 Cmd='make uasan-test && cd contrib/pzstd && make test-pzstd && make test-pzstd32 && make test-pzstd-tsan && make test-pzstd-asan' dist: trusty sudo: required install: @@ -35,6 +35,7 @@ matrix: - libc6-dev-i386 - gcc-multilib - gcc-6 + - gcc-6-multilib - env: Ubu=14.04 Cmd="make armtest && make clean && make aarch64test" dist: trusty diff --git a/Makefile b/Makefile index ff624e907..128c72bb0 100644 --- a/Makefile +++ b/Makefile @@ -144,7 +144,7 @@ uasan: clean $(MAKE) test CC=clang MOREFLAGS="-g -fsanitize=address -fsanitize=undefined" uasan-%: clean - CFLAGS="-Og -fsanitize=address -fsanitize=undefined" $(MAKE) -C $(TESTDIR) $* + LDFLAGS=-fuse-ld=gold CFLAGS="-Og -fsanitize=address -fsanitize=undefined" $(MAKE) -C $(TESTDIR) $* endif From 21911ad6cbc76a4aeab23fa0ecc64fec6c4da2c9 Mon Sep 17 00:00:00 2001 From: Przemyslaw Skibinski Date: Wed, 22 Feb 2017 08:54:56 +0100 Subject: [PATCH 15/22] move Ubuntu packages install to Makefile --- .travis.yml | 64 
++++++----------------------------------------------- Makefile | 25 +++++++++++++++++++++ 2 files changed, 32 insertions(+), 57 deletions(-) diff --git a/.travis.yml b/.travis.yml index 41d90f380..6ea9d31dc 100644 --- a/.travis.yml +++ b/.travis.yml @@ -6,77 +6,27 @@ matrix: - env: Ubu=OS_X_Mavericks Cmd="make gnu90test && make clean && make test && make clean && make travis-install" os: osx - # Ubuntu 14.04 LTS Server Edition 64 bit - - env: Ubu=14.04 Cmd='make uasan-test && cd contrib/pzstd && make test-pzstd && make test-pzstd32 && make test-pzstd-tsan && make test-pzstd-asan' + - env: Ubu=14.04 Cmd='make gpp6install uasan-test && cd contrib/pzstd && make test-pzstd && make test-pzstd32 && make test-pzstd-tsan && make test-pzstd-asan' dist: trusty - sudo: required install: - export CXX="g++-6" CC="gcc-6" - addons: - apt: - sources: - - ubuntu-toolchain-r-test - packages: - - libc6-dev-i386 - - g++-multilib - - gcc-6 - - g++-6 - - g++-6-multilib - - env: Ubu=14.04 Cmd='CC=gcc-6 make uasan-test32 && make clean zlibwrapper && make -C tests clean test-zstd-nolegacy versionsTest' + - env: Ubu=14.04 Cmd='CC=gcc-6 make gcc6install uasan-test32 && make clean zlibwrapper && make -C tests clean test-zstd-nolegacy versionsTest' dist: trusty - sudo: required - addons: - apt: - sources: - - ubuntu-toolchain-r-test - packages: - - libc6-dev-i386 - - gcc-multilib - - gcc-6 - - gcc-6-multilib - - env: Ubu=14.04 Cmd="make armtest && make clean && make aarch64test" + - env: Ubu=14.04 Cmd="make arminstall armtest && make clean && make aarch64test" dist: trusty - sudo: required - addons: - apt: - packages: - - qemu-system-arm - - qemu-user-static - - gcc-arm-linux-gnueabi - - libc6-dev-armel-cross - - gcc-aarch64-linux-gnu - - libc6-dev-arm64-cross - - env: Ubu=14.04 Cmd='make ppctest && make clean && make ppc64test' + - env: Ubu=14.04 Cmd='make ppcinstall ppctest && make clean && make ppc64test' dist: trusty - sudo: required - addons: - apt: - packages: - - qemu-system-ppc - - 
qemu-user-static - - gcc-powerpc-linux-gnu # other feature branches => short tests - - env: Ubu=14.04 Cmd='make lib && CFLAGS="-O1 -g" make -C zlibWrapper valgrindTest && make -C tests valgrindTest' + - env: Ubu=14.04 Cmd='make valgrindinstall lib && CFLAGS="-O1 -g" make -C zlibWrapper valgrindTest && make -C tests valgrindTest' dist: trusty - sudo: required - addons: - apt: - packages: - - valgrind - - env: Ubu=14.04 Cmd="make -C tests test32" + - env: Ubu=14.04 Cmd="make libc6install && make -C tests test32" dist: trusty - sudo: required - addons: - apt: - packages: - - libc6-dev-i386 - - gcc-multilib @@ -87,7 +37,7 @@ script: # dev => normal tests # other feature branches => short tests (number > 5) - if [ "$TRAVIS_EVENT_TYPE" = "cron" ] || [ "$TRAVIS_BRANCH" = "asan" ]; then - FUZZERTEST=-T10mn sh -c "$Cmd" || travis_terminate 1; + FUZZERTEST=-T5mn sh -c "$Cmd" || travis_terminate 1; else if [ "$TRAVIS_PULL_REQUEST" = "true" ] || [ $JOB_NUMBER -gt 5 ] || [ "$TRAVIS_BRANCH" = "dev" ]; then sh -c "$Cmd" || travis_terminate 1; diff --git a/Makefile b/Makefile index 128c72bb0..709d2f0ea 100644 --- a/Makefile +++ b/Makefile @@ -146,6 +146,31 @@ uasan: clean uasan-%: clean LDFLAGS=-fuse-ld=gold CFLAGS="-Og -fsanitize=address -fsanitize=undefined" $(MAKE) -C $(TESTDIR) $* +apt-install: + sudo apt-get -yq --no-install-suggests --no-install-recommends --force-yes install $(APT_PACKAGES) + +apt-add-repo: + sudo add-apt-repository -y ppa:ubuntu-toolchain-r/test + sudo apt-get update -y -qq + +ppcinstall: + APT_PACKAGES="qemu-system-ppc qemu-user-static gcc-powerpc-linux-gnu" $(MAKE) apt-install + +arminstall: + APT_PACKAGES="qemu-system-arm qemu-user-static gcc-powerpc-linux-gnu gcc-arm-linux-gnueabi libc6-dev-armel-cross gcc-aarch64-linux-gnu libc6-dev-arm64-cross" $(MAKE) apt-install + +valgrindinstall: + APT_PACKAGES="valgrind" $(MAKE) apt-install + +libc6install: + APT_PACKAGES="libc6-dev-i386 gcc-multilib" $(MAKE) apt-install + +gcc6install: apt-add-repo + 
APT_PACKAGES="libc6-dev-i386 gcc-multilib gcc-6 gcc-6-multilib" $(MAKE) apt-install + +gpp6install: apt-add-repo + APT_PACKAGES="libc6-dev-i386 g++-multilib gcc-6 g++-6 g++-6-multilib" $(MAKE) apt-install + endif From 2e8ae51f8cf32ce22ec3caebd6c153c6248dfab9 Mon Sep 17 00:00:00 2001 From: Przemyslaw Skibinski Date: Wed, 22 Feb 2017 09:21:04 +0100 Subject: [PATCH 16/22] travis.yml: set "dist: trusty" as default --- .travis.yml | 14 ++------------ 1 file changed, 2 insertions(+), 12 deletions(-) diff --git a/.travis.yml b/.travis.yml index 6ea9d31dc..e6b011003 100644 --- a/.travis.yml +++ b/.travis.yml @@ -1,4 +1,6 @@ language: c +sudo: required +dist: trusty matrix: fast_finish: true include: @@ -8,27 +10,15 @@ matrix: # Ubuntu 14.04 LTS Server Edition 64 bit - env: Ubu=14.04 Cmd='make gpp6install uasan-test && cd contrib/pzstd && make test-pzstd && make test-pzstd32 && make test-pzstd-tsan && make test-pzstd-asan' - dist: trusty install: - export CXX="g++-6" CC="gcc-6" - - env: Ubu=14.04 Cmd='CC=gcc-6 make gcc6install uasan-test32 && make clean zlibwrapper && make -C tests clean test-zstd-nolegacy versionsTest' - dist: trusty - - env: Ubu=14.04 Cmd="make arminstall armtest && make clean && make aarch64test" - dist: trusty - - env: Ubu=14.04 Cmd='make ppcinstall ppctest && make clean && make ppc64test' - dist: trusty # other feature branches => short tests - env: Ubu=14.04 Cmd='make valgrindinstall lib && CFLAGS="-O1 -g" make -C zlibWrapper valgrindTest && make -C tests valgrindTest' - dist: trusty - - env: Ubu=14.04 Cmd="make libc6install && make -C tests test32" - dist: trusty - - script: - JOB_NUMBER=$(echo $TRAVIS_JOB_NUMBER | sed -e 's:[0-9][0-9]*\.\(.*\):\1:') From 3d836bfd18e67f406024e874e97aedddb3ea0355 Mon Sep 17 00:00:00 2001 From: Przemyslaw Skibinski Date: Wed, 22 Feb 2017 09:36:42 +0100 Subject: [PATCH 17/22] travis.yml: fix versionsTest target --- .travis.yml | 7 +++---- tests/Makefile | 2 +- 2 files changed, 4 insertions(+), 5 deletions(-) diff --git 
a/.travis.yml b/.travis.yml index e6b011003..958633de7 100644 --- a/.travis.yml +++ b/.travis.yml @@ -22,12 +22,11 @@ matrix: script: - JOB_NUMBER=$(echo $TRAVIS_JOB_NUMBER | sed -e 's:[0-9][0-9]*\.\(.*\):\1:') - # cron & master => long tests, as this is the final step towards a Release - # dev => normal tests + # dev && pull requests => normal tests # other feature branches => short tests (number > 5) - - if [ "$TRAVIS_EVENT_TYPE" = "cron" ] || [ "$TRAVIS_BRANCH" = "asan" ]; then - FUZZERTEST=-T5mn sh -c "$Cmd" || travis_terminate 1; + - if [ "$TRAVIS_EVENT_TYPE" = "cron" ] || [ "$TRAVIS_BRANCH" = "master" ]; then + FUZZERTEST=-T7mn sh -c "$Cmd" || travis_terminate 1; else if [ "$TRAVIS_PULL_REQUEST" = "true" ] || [ $JOB_NUMBER -gt 5 ] || [ "$TRAVIS_BRANCH" = "dev" ]; then sh -c "$Cmd" || travis_terminate 1; diff --git a/tests/Makefile b/tests/Makefile index c5b8bdfa7..937ec96d3 100644 --- a/tests/Makefile +++ b/tests/Makefile @@ -170,7 +170,7 @@ namespaceTest: if $(CC) namespaceTest.c ../lib/common/xxhash.c -o $@ ; then echo compilation should fail; exit 1 ; fi $(RM) $@ -versionsTest: +versionsTest: clean $(PYTHON) test-zstd-versions.py clean: From 337ec875b61f56cc98bf297887da80029741b8d4 Mon Sep 17 00:00:00 2001 From: Przemyslaw Skibinski Date: Wed, 22 Feb 2017 10:31:30 +0100 Subject: [PATCH 18/22] minor tweaks --- .travis.yml | 2 +- programs/Makefile | 2 +- tests/Makefile | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/.travis.yml b/.travis.yml index 958633de7..c1985d785 100644 --- a/.travis.yml +++ b/.travis.yml @@ -12,7 +12,7 @@ matrix: - env: Ubu=14.04 Cmd='make gpp6install uasan-test && cd contrib/pzstd && make test-pzstd && make test-pzstd32 && make test-pzstd-tsan && make test-pzstd-asan' install: - export CXX="g++-6" CC="gcc-6" - - env: Ubu=14.04 Cmd='CC=gcc-6 make gcc6install uasan-test32 && make clean zlibwrapper && make -C tests clean test-zstd-nolegacy versionsTest' + - env: Ubu=14.04 Cmd='CC=gcc-6 make gcc6install uasan-test32 
&& make clean zlibwrapper && make -C tests clean test-zstd-nolegacy && make -C tests versionsTest' - env: Ubu=14.04 Cmd="make arminstall armtest && make clean && make aarch64test" - env: Ubu=14.04 Cmd='make ppcinstall ppctest && make clean && make ppc64test' diff --git a/programs/Makefile b/programs/Makefile index 0a9ab5a79..db718d14c 100644 --- a/programs/Makefile +++ b/programs/Makefile @@ -148,7 +148,7 @@ generate_res: windres/generate_res.bat clean: - $(MAKE) -C ../lib clean + $(MAKE) -C $(ZSTDDIR) clean @$(RM) $(ZSTDDIR)/decompress/*.o $(ZSTDDIR)/decompress/zstd_decompress.gcda @$(RM) core *.o tmp* result* *.gcda dictionary *.zst \ zstd$(EXT) zstd32$(EXT) zstd-compress$(EXT) zstd-decompress$(EXT) \ diff --git a/tests/Makefile b/tests/Makefile index 937ec96d3..17286a022 100644 --- a/tests/Makefile +++ b/tests/Makefile @@ -174,7 +174,7 @@ versionsTest: clean $(PYTHON) test-zstd-versions.py clean: - $(MAKE) -C ../lib clean + $(MAKE) -C $(ZSTDDIR) clean @$(RM) -fR $(TESTARTEFACT) @$(RM) -f core *.o tmp* result* *.gcda dictionary *.zst \ $(PRGDIR)/zstd$(EXT) $(PRGDIR)/zstd32$(EXT) \ From d41c048394fb4123d567560013e72424bea5a920 Mon Sep 17 00:00:00 2001 From: Przemyslaw Skibinski Date: Wed, 22 Feb 2017 11:07:28 +0100 Subject: [PATCH 19/22] added arm-ppc-compilation Makefile target --- .travis.yml | 2 +- Makefile | 6 ++++++ 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/.travis.yml b/.travis.yml index c1985d785..a4d4a50ab 100644 --- a/.travis.yml +++ b/.travis.yml @@ -17,7 +17,7 @@ matrix: - env: Ubu=14.04 Cmd='make ppcinstall ppctest && make clean && make ppc64test' # other feature branches => short tests - - env: Ubu=14.04 Cmd='make valgrindinstall lib && CFLAGS="-O1 -g" make -C zlibWrapper valgrindTest && make -C tests valgrindTest' + - env: Ubu=14.04 Cmd='make arminstall ppcinstall arm-ppc-compilation && make valgrindinstall lib && CFLAGS="-O1 -g" make -C zlibWrapper valgrindTest && make -C tests valgrindTest' - env: Ubu=14.04 Cmd="make libc6install 
&& make -C tests test32" script: diff --git a/Makefile b/Makefile index 709d2f0ea..ed8f16107 100644 --- a/Makefile +++ b/Makefile @@ -128,6 +128,12 @@ ppc64test: clean $(MAKE) -C $(TESTDIR) datagen # use native, faster $(MAKE) -C $(TESTDIR) test CC=powerpc-linux-gnu-gcc QEMU_SYS=qemu-ppc64-static ZSTDRTTEST= MOREFLAGS="-m64 -static" +arm-ppc-compilation: + $(MAKE) -C $(PRGDIR) clean zstd CC=arm-linux-gnueabi-gcc QEMU_SYS=qemu-arm-static ZSTDRTTEST= MOREFLAGS="-Werror -static" + $(MAKE) -C $(PRGDIR) clean zstd CC=aarch64-linux-gnu-gcc QEMU_SYS=qemu-aarch64-static ZSTDRTTEST= MOREFLAGS="-Werror -static" + $(MAKE) -C $(PRGDIR) clean zstd CC=powerpc-linux-gnu-gcc QEMU_SYS=qemu-ppc-static ZSTDRTTEST= MOREFLAGS="-Werror -Wno-attributes -static" + $(MAKE) -C $(PRGDIR) clean zstd CC=powerpc-linux-gnu-gcc QEMU_SYS=qemu-ppc64-static ZSTDRTTEST= MOREFLAGS="-m64 -static" + usan: clean $(MAKE) test CC=clang MOREFLAGS="-g -fsanitize=undefined" From bbbd43509950701d076cfdc3f0ef325a80f5d55a Mon Sep 17 00:00:00 2001 From: Przemyslaw Skibinski Date: Wed, 22 Feb 2017 11:21:34 +0100 Subject: [PATCH 20/22] travis.yml: test arm-ppc-compilation target --- .travis.yml | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/.travis.yml b/.travis.yml index a4d4a50ab..b20c43329 100644 --- a/.travis.yml +++ b/.travis.yml @@ -5,20 +5,20 @@ matrix: fast_finish: true include: # OS X Mavericks - - env: Ubu=OS_X_Mavericks Cmd="make gnu90test && make clean && make test && make clean && make travis-install" + - env: Cmd="make gnu90test && make clean && make test && make clean && make travis-install" os: osx # Ubuntu 14.04 LTS Server Edition 64 bit - - env: Ubu=14.04 Cmd='make gpp6install uasan-test && cd contrib/pzstd && make test-pzstd && make test-pzstd32 && make test-pzstd-tsan && make test-pzstd-asan' + - env: Cmd='make gpp6install uasan-test && cd contrib/pzstd && make test-pzstd && make test-pzstd32 && make test-pzstd-tsan && make test-pzstd-asan' install: - export 
CXX="g++-6" CC="gcc-6" - - env: Ubu=14.04 Cmd='CC=gcc-6 make gcc6install uasan-test32 && make clean zlibwrapper && make -C tests clean test-zstd-nolegacy && make -C tests versionsTest' - - env: Ubu=14.04 Cmd="make arminstall armtest && make clean && make aarch64test" - - env: Ubu=14.04 Cmd='make ppcinstall ppctest && make clean && make ppc64test' + - env: Cmd='CC=gcc-6 make gcc6install uasan-test32 && make clean zlibwrapper && make -C tests clean test-zstd-nolegacy && make -C tests versionsTest' + - env: Cmd="make arminstall armtest && make clean && make aarch64test" + - env: Cmd='make ppcinstall ppctest && make clean && make ppc64test' # other feature branches => short tests - - env: Ubu=14.04 Cmd='make arminstall ppcinstall arm-ppc-compilation && make valgrindinstall lib && CFLAGS="-O1 -g" make -C zlibWrapper valgrindTest && make -C tests valgrindTest' - - env: Ubu=14.04 Cmd="make libc6install && make -C tests test32" + - env: Cmd='make valgrindinstall arminstall ppcinstall arm-ppc-compilation && make clean lib && CFLAGS="-O1 -g" make -C zlibWrapper valgrindTest && make -C tests valgrindTest' + - env: Cmd="make libc6install && make -C tests test32" script: - JOB_NUMBER=$(echo $TRAVIS_JOB_NUMBER | sed -e 's:[0-9][0-9]*\.\(.*\):\1:') From 1f3d54ddb44c2834fbda6ef2b13eb354b7e82db4 Mon Sep 17 00:00:00 2001 From: Yann Collet Date: Wed, 22 Feb 2017 11:08:00 -0800 Subject: [PATCH 21/22] fixed malloc(0) potential issue Added test cases to cover #556 patch --- examples/Makefile | 17 ++++++++++++----- examples/simple_decompression.c | 2 +- 2 files changed, 13 insertions(+), 6 deletions(-) diff --git a/examples/Makefile b/examples/Makefile index 741022869..b84983f08 100644 --- a/examples/Makefile +++ b/examples/Makefile @@ -9,7 +9,7 @@ # This Makefile presumes libzstd is installed, using `sudo make install` -LDFLAGS+= -lzstd +LDFLAGS += -lzstd .PHONY: default all clean test @@ -52,16 +52,23 @@ clean: test: all cp README.md tmp cp Makefile tmp2 - @echo starting simple 
compression + @echo -- Simple compression tests ./simple_compression tmp ./simple_decompression tmp.zst ./streaming_decompression tmp.zst > /dev/null - @echo starting streaming compression + @echo -- Streaming compression tests ./streaming_compression tmp ./streaming_decompression tmp.zst > /dev/null - @echo starting multiple streaming compression + @echo -- Edge cases detection + ! ./streaming_decompression tmp # invalid input, must fail + ! ./simple_decompression tmp # invalid input, must fail + ! ./simple_decompression tmp.zst # unknown input size, must fail + touch tmpNull # create 0-size file + ./simple_compression tmpNull + ./simple_decompression tmpNull.zst # 0-size frame : must work + @echo -- Multiple streaming tests ./multiple_streaming_compression *.c - @echo starting dictionary compression + @echo -- Dictionary compression tests ./dictionary_compression tmp2 tmp README.md ./dictionary_decompression tmp2.zst tmp.zst README.md $(RM) tmp* *.zst diff --git a/examples/simple_decompression.c b/examples/simple_decompression.c index fa4e3e680..4b7ea59e5 100644 --- a/examples/simple_decompression.c +++ b/examples/simple_decompression.c @@ -35,7 +35,7 @@ static FILE* fopen_orDie(const char *filename, const char *instruction) static void* malloc_orDie(size_t size) { - void* const buff = malloc(size); + void* const buff = malloc(size + !size); /* avoid allocating size of 0 : may return NULL (implementation dependent) */ if (buff) return buff; /* error */ fprintf(stderr, "malloc: %s \n", strerror(errno)); From 83038d236acd5a77bc4655c1e6efc723f716d4a9 Mon Sep 17 00:00:00 2001 From: Sean Purcell Date: Wed, 22 Feb 2017 13:52:48 -0800 Subject: [PATCH 22/22] Fix bug in FSE distribution normalization --- lib/compress/fse_compress.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/lib/compress/fse_compress.c b/lib/compress/fse_compress.c index 337b7a6ff..6708fb9d7 100644 --- a/lib/compress/fse_compress.c +++ b/lib/compress/fse_compress.c 
@@ -506,6 +506,7 @@ unsigned FSE_optimalTableLog(unsigned maxTableLog, size_t srcSize, unsigned maxS static size_t FSE_normalizeM2(short* norm, U32 tableLog, const unsigned* count, size_t total, U32 maxSymbolValue) { + short const NOT_YET_ASSIGNED = -2; U32 s; U32 distributed = 0; U32 ToDistribute; @@ -531,7 +532,8 @@ static size_t FSE_normalizeM2(short* norm, U32 tableLog, const unsigned* count, total -= count[s]; continue; } - norm[s]=-2; + + norm[s]=NOT_YET_ASSIGNED; } ToDistribute = (1 << tableLog) - distributed; @@ -539,7 +541,7 @@ static size_t FSE_normalizeM2(short* norm, U32 tableLog, const unsigned* count, /* risk of rounding to zero */ lowOne = (U32)((total * 3) / (ToDistribute * 2)); for (s=0; s<=maxSymbolValue; s++) { - if ((norm[s] == -2) && (count[s] <= lowOne)) { + if ((norm[s] == NOT_YET_ASSIGNED) && (count[s] <= lowOne)) { norm[s] = 1; distributed++; total -= count[s]; @@ -559,12 +561,19 @@ static size_t FSE_normalizeM2(short* norm, U32 tableLog, const unsigned* count, return 0; } + if (total == 0) { + /* all of the symbols were low enough for the lowOne or lowThreshold */ + for (s=0; ToDistribute > 0; s = (s+1)%(maxSymbolValue+1)) + if (norm[s] > 0) ToDistribute--, norm[s]++; + return 0; + } + { U64 const vStepLog = 62 - tableLog; U64 const mid = (1ULL << (vStepLog-1)) - 1; U64 const rStep = ((((U64)1<<vStepLog) * ToDistribute) + mid) / total; /* scale on remaining */ U64 tmpTotal = mid; for (s=0; s<=maxSymbolValue; s++) { - if (norm[s]==-2) { + if (norm[s]==NOT_YET_ASSIGNED) { U64 const end = tmpTotal + (count[s] * rStep); U32 const sStart = (U32)(tmpTotal >> vStepLog); U32 const sEnd = (U32)(end >> vStepLog);