updated spec on dictID==0

Specified decoder behavior on receiving a frame with dictID=0.

Moved the paragraph on reserved DictID ranges into the Dictionary Format section.
Yann Collet 2020-05-25 08:15:09 -07:00
parent 9eb2ccc9fb
commit bb3c9bf43a


@@ -3,7 +3,7 @@ Zstandard Compression Format
 ### Notices
-Copyright (c) 2016-present Yann Collet, Facebook, Inc.
+Copyright (c) 2016-2020 Yann Collet, Facebook, Inc.
 Permission is granted to copy and distribute this document
 for any purpose and without charge,
@@ -16,7 +16,7 @@ Distribution of this document is unlimited.
 ### Version
-0.3.5 (13/11/19)
+0.3.6 (25/05/20)
 Introduction
@@ -291,21 +291,10 @@ Format is __little-endian__.
 It's allowed to represent a small ID (for example `13`)
 with a large 4-bytes dictionary ID, even if it is less efficient.
-_Reserved ranges :_
-Within private environments, any `Dictionary_ID` can be used.
-However, for frames and dictionaries distributed in public space,
-`Dictionary_ID` must be attributed carefully.
-Rules for public environment are not yet decided,
-but the following ranges are reserved for some future registrar :
-- low range : `<= 32767`
-- high range : `>= (1 << 31)`
-Outside of these ranges, any value of `Dictionary_ID`
-which is both `>= 32768` and `< (1<<31)` can be used freely,
-even in public environment.
+A value of `0` has same meaning as no `Dictionary_ID`,
+in which case the frame may or may not need a dictionary to be decoded,
+and the ID of such a dictionary is not specified.
+The decoder must know this information by other means.
 #### `Frame_Content_Size`
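As an illustration of the `Dictionary_ID` rule added in this hunk, the following Python sketch treats a value of `0` the same as an absent field. The function name `decode_dictionary_id` and the convention of passing the raw field bytes are assumptions made for this example; they are not part of the spec or of the reference implementation. The accepted field sizes (0, 1, 2 or 4 bytes) follow the frame header description elsewhere in the spec.

```python
from typing import Optional


def decode_dictionary_id(field: bytes) -> Optional[int]:
    """Interpret the raw Dictionary_ID field of a frame header.

    Returns the ID as an integer, or None when the frame does not
    specify one (field absent, or field present with value 0).
    Hypothetical helper, for illustration only.
    """
    if len(field) not in (0, 1, 2, 4):
        raise ValueError("Dictionary_ID field must be 0, 1, 2 or 4 bytes long")
    # The field is little-endian; an empty field decodes to 0.
    value = int.from_bytes(field, "little")
    # A value of 0 has the same meaning as no Dictionary_ID.
    return value if value != 0 else None
```

A `None` result does not mean that no dictionary is needed. It only means the frame does not name one, so the caller has to decide by other means whether a dictionary applies.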
@@ -1429,14 +1418,18 @@ __`Dictionary_ID`__ : 4 bytes, stored in __little-endian__ format.
 It's used by decoders to check if they use the correct dictionary.
 _Reserved ranges :_
-If the frame is going to be distributed in a private environment,
-any `Dictionary_ID` can be used.
-However, for public distribution of compressed frames,
-the following ranges are reserved and shall not be used :
+If the dictionary is going to be distributed in a public environment,
+the following ranges of `Dictionary_ID` are reserved for some future registrar
+and shall not be used :
 - low range : <= 32767
 - high range : >= (2^31)
+Outside of these ranges, any value of `Dictionary_ID`
+which is both `>= 32768` and `< (1<<31)` can be used freely,
+even in public environment.
 __`Entropy_Tables`__ : follow the same format as tables in [compressed blocks].
 See the relevant [FSE](#fse-table-description)
 and [Huffman](#huffman-tree-description) sections for how to decode these tables.
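The reserved ranges stated in this hunk can be checked with a few lines of Python. This is a minimal sketch: the constant names and the function `is_reserved_dictionary_id` are hypothetical and chosen for this example only, and the 32-bit bound follows from the 4-byte `Dictionary_ID` field.

```python
# Bounds taken from the hunk above; the names are illustrative.
LOW_RESERVED_MAX = 32767       # low range  : <= 32767
HIGH_RESERVED_MIN = 1 << 31    # high range : >= (2^31)


def is_reserved_dictionary_id(dict_id: int) -> bool:
    """Return True if dict_id falls in a range reserved for a future registrar."""
    if not 0 <= dict_id <= 0xFFFFFFFF:
        raise ValueError("Dictionary_ID must fit in 4 bytes")
    return dict_id <= LOW_RESERVED_MAX or dict_id >= HIGH_RESERVED_MIN
```

A tool that generates dictionaries for public distribution could call such a check before assigning an ID, and pick a value that is both `>= 32768` and `< (1<<31)` when it returns `True`.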
@@ -1673,6 +1666,7 @@ or at least provide a meaningful error code explaining for which reason it canno
 Version changes
 ---------------
+- 0.3.6 : clarifications for Dictionary_ID
 - 0.3.5 : clarifications for Block_Maximum_Size
 - 0.3.4 : clarifications for FSE decoding table
 - 0.3.3 : clarifications for field Block_Size