Release notes for 0.9.2-rc1

Run copy_from_upstream
Checkout post-0.9.0 copy_from_upstream fixes
2025-06-23 00:01:22 -04:00 · 2024-01-11 17:54:58 +01:00 · 2024-01-08 11:51:32 -05:00
30 changed files with 578 additions and 136 deletions
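For context on the fix this change pulls in: the release notes in this diff describe it as removing potential non-constant-time behaviour in Kyber, and the changelog entry calls it a division fix. A common way to avoid a possibly variable-time division by the modulus is to multiply by a precomputed reciprocal and shift. The following is a minimal sketch of that general technique for rounding a coefficient to one bit. It is not the upstream code: the function names and the constants 1665 and 80635 (roughly 2^28 / 3329) are assumptions chosen for illustration, and `count_mismatches` checks them exhaustively against the dividing version.

```c
#include <stdint.h>

#define KYBER_Q 3329 /* Kyber modulus */

/* Reference rounding that divides by KYBER_Q. On some targets the latency
 * of a division instruction may depend on its operands. */
static uint32_t round_bit_div(uint32_t t) {
    return (((t << 1) + KYBER_Q / 2) / KYBER_Q) & 1;
}

/* Division-free variant: multiply by an approximate reciprocal, then shift.
 * 1665 and 80635 are illustrative constants (an assumption, not taken from
 * this diff); they are validated by count_mismatches() below. */
static uint32_t round_bit_mul(uint32_t t) {
    t <<= 1;
    t += 1665;
    t *= 80635;
    t >>= 28;
    return t & 1;
}

/* Exhaustively compare both variants for every input 0 <= t < KYBER_Q. */
static int count_mismatches(void) {
    int mismatches = 0;
    for (uint32_t t = 0; t < KYBER_Q; t++) {
        if (round_bit_div(t) != round_bit_mul(t)) {
            mismatches++;
        }
    }
    return mismatches;
}
```

A compiler may itself replace division by a constant with a multiplication, but that is not guaranteed for every target and optimization level, which is why removing the division at source level is the more robust choice.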
--- a/.github/workflows/linux.yml
+++ b/.github/workflows/linux.yml
@@ -30,6 +30,7 @@ jobs:
          git config --global user.name "ciuser" && \
          git config --global user.email "ci@openquantumsafe.org" && \
          export LIBOQS_DIR=`pwd` && \
+          git config --global --add safe.directory $LIBOQS_DIR && \
          cd scripts/copy_from_upstream && \
          ! pip3 install -r requirements.txt 2>&1 | grep ERROR && \
          python3 copy_from_upstream.py copy && \
--- a/.travis.yml
+++ b/.travis.yml
@@ -1,6 +1,6 @@
 language: c
 before_script:
-  - sudo apt -y install astyle cmake gcc ninja-build libssl-dev python3-pytest python3-pytest-xdist unzip xsltproc doxygen graphviz valgrind
+  - sudo apt update && sudo apt -y install astyle cmake gcc ninja-build libssl-dev python3-pytest python3-pytest-xdist unzip xsltproc doxygen graphviz valgrind
 jobs:
  include:
    - arch: ppc64le         # The IBM Power LXD container based build for OSS only
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -33,7 +33,7 @@ set(CMAKE_C_STANDARD 11)
 set(CMAKE_C_STANDARD_REQUIRED ON)
 set(CMAKE_POSITION_INDEPENDENT_CODE ON)
 set(CMAKE_C_VISIBILITY_PRESET hidden)
-set(OQS_VERSION_TEXT "0.9.0")
+set(OQS_VERSION_TEXT "0.9.2-rc1")
 set(OQS_COMPILE_BUILD_TARGET "${CMAKE_SYSTEM_PROCESSOR}-${CMAKE_HOST_SYSTEM}")
 set(OQS_MINIMAL_GCC_VERSION "7.1.0")
 set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
--- a/RELEASE.md
+++ b/RELEASE.md
@@ -1,5 +1,5 @@
-liboqs version 0.9.0
+liboqs version 0.9.2-rc1
-====================
+========================
 About
 -----
@@ -28,78 +28,22 @@ liboqs can also be used in the following programming languages via language-spec
 Release notes
 =============
-This is version 0.9.0 of liboqs. It was released on October 12, 2023.
+This is release candidate 1 of version 0.9.2 of liboqs. It was released on January 11, 2024.
-This release features an update to the Classic McEliece KEM, bringing it in line with NIST Round 4. It also adds or updates ARM implementations for Kyber, Dilithium, and Falcon.
+This release is a security release which fixes potential non-constant-time behaviour in Kyber based on https://github.com/pq-crystals/kyber/commit/272125f6acc8e8b6850fd68ceb901a660ff48196
 What's New
 ----------
-This release continues from the 0.8.0 release of liboqs.
+This release continues from the 0.9.1 release of liboqs.
 ### Key encapsulation mechanisms
-- Classic McEliece: updated to Round 4 version.
+- Kyber: C, AVX2, and aarch64 implementation updated
 - Kyber: aarch64 implementation updated.
 ### Digital signature schemes
 - Dilithium: aarch64 implementation updated.
 - Falcon: aarch64 implementation added.
 ### Other changes
 - Update algorithm documentation
 - Support compilation for Windows on ARM64, Apple mobile, and Android platforms
 - Improve resilience of randombytes on Apple systems
 Release call
 ============
 Users of liboqs are invited to join a webinar on Thursday, November 2, 2023, from 12-1pm US Eastern time for information on this release, plans for the next release cycle, and to provide feedback on OQS usage and features.  
 The Zoom link for the webinar is: https://uwaterloo.zoom.us/j/98288698086
 ---
 Detailed changelog
 ------------------
-* Fix libdir value in liboqs.pc by @vt-alt in https://github.com/open-quantum-safe/liboqs/pull/1496
+* Pull Kyber division fixes from PQ-Crystals into dev-092 by @praveksharma in https://github.com/open-quantum-safe/liboqs/pull/1652
 * update version and remove CCI triggers by @baentsch in https://github.com/open-quantum-safe/liboqs/pull/1498
 * create deb package and retain as artifact by @baentsch in https://github.com/open-quantum-safe/liboqs/pull/1501
 * README correction to docs path & additional gitignore to macos + vscode by @planetf1 in https://github.com/open-quantum-safe/liboqs/pull/1503
 * Trigger liboqs-python CI via GitHub API by @SWilson4 in https://github.com/open-quantum-safe/liboqs/pull/1507
 * Update Classic McEliece by @praveksharma in https://github.com/open-quantum-safe/liboqs/pull/1470
 * update BIKE documentation by @baentsch in https://github.com/open-quantum-safe/liboqs/pull/1509
 * kyber/dilithium aarch64 pull from pqclean + patches by @bhess in https://github.com/open-quantum-safe/liboqs/pull/1512
 * Pull Falcon updates from PQClean by @dstebila in https://github.com/open-quantum-safe/liboqs/pull/1523
 * Bump XCode by @baentsch in https://github.com/open-quantum-safe/liboqs/pull/1526
 * Update Classic McEliece supression files by @praveksharma in https://github.com/open-quantum-safe/liboqs/pull/1527
 * Bump gitpython from 3.1.30 to 3.1.32 in /scripts/copy_from_upstream by @dependabot in https://github.com/open-quantum-safe/liboqs/pull/1524
 * ci: add CI for android by @res0nance in https://github.com/open-quantum-safe/liboqs/pull/1531
 * re-enable armhf speed testing by @baentsch in https://github.com/open-quantum-safe/liboqs/pull/1535
 * Bump gitpython from 3.1.32 to 3.1.34 in /scripts/copy_from_upstream by @dependabot in https://github.com/open-quantum-safe/liboqs/pull/1538
 * Prefer arc4random on Apple platforms by @res0nance in https://github.com/open-quantum-safe/liboqs/pull/1544
 * Bump gitpython from 3.1.34 to 3.1.35 in /scripts/copy_from_upstream by @dependabot in https://github.com/open-quantum-safe/liboqs/pull/1551
 * Update Classic McEliece suppression files by @praveksharma in https://github.com/open-quantum-safe/liboqs/pull/1541
 * Pull Neon implementation of Falcon from PQClean by @SWilson4 in https://github.com/open-quantum-safe/liboqs/pull/1547
 * ci: add CI for apple mobile platforms by @res0nance in https://github.com/open-quantum-safe/liboqs/pull/1546
 * Add Windows ARM64 support by @res0nance in https://github.com/open-quantum-safe/liboqs/pull/1545
 * Document Falcon constant time errors by @praveksharma in https://github.com/open-quantum-safe/liboqs/pull/1552
 * ci: github actions CI for Windows x86 and x64 by @res0nance in https://github.com/open-quantum-safe/liboqs/pull/1554
 * build: Align VS test folder with all other Generators by @res0nance in https://github.com/open-quantum-safe/liboqs/pull/1557
 * Fix weekly.yml to skip McEliece by @praveksharma in https://github.com/open-quantum-safe/liboqs/pull/1562
 * Enable extensions in constant-time tests by @SWilson4 in https://github.com/open-quantum-safe/liboqs/pull/1567
 * Update Classic McEliece supression files by @praveksharma in https://github.com/open-quantum-safe/liboqs/pull/1568
 * liboqs 0.9.0 release candidate 1 by @SWilson4 in https://github.com/open-quantum-safe/liboqs/pull/1570
 * add community standard documentation [skip ci] by @baentsch in https://github.com/open-quantum-safe/liboqs/pull/1565
 * Bump gitpython from 3.1.35 to 3.1.37 in /scripts/copy_from_upstream by @dependabot in https://github.com/open-quantum-safe/liboqs/pull/1575
-## New Contributors
+**Full Changelog**: https://github.com/open-quantum-safe/liboqs/compare/0.9.1...0.9.2-rc1
 * @planetf1 made their first contribution in https://github.com/open-quantum-safe/liboqs/pull/1503
 * @SWilson4 made their first contribution in https://github.com/open-quantum-safe/liboqs/pull/1507
 * @praveksharma made their first contribution in https://github.com/open-quantum-safe/liboqs/pull/1470
 * @res0nance made their first contribution in https://github.com/open-quantum-safe/liboqs/pull/1531
 **Full Changelog**: https://github.com/open-quantum-safe/liboqs/compare/0.8.0...0.9.0
--- a/docs/algorithms/kem/classic_mceliece.md
+++ b/docs/algorithms/kem/classic_mceliece.md
@@ -35,8 +35,8 @@
 |       Implementation source       | Identifier in upstream   | Supported architecture(s)   | Supported operating system(s)   | CPU extension(s) used   | No branching-on-secrets claimed?   | No branching-on-secrets checked by valgrind?   | Large stack usage?‡   |
 |:---------------------------------:|:-------------------------|:----------------------------|:--------------------------------|:------------------------|:-----------------------------------|:-----------------------------------------------|:----------------------|
-| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | True                               | True                                           | True                  |
+| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | False                              | False                                          | True                  |
-| [Primary Source](#primary-source) | avx2                     | x86\_64                     | Linux,Darwin                    | AVX2,POPCNT             | False                              | True                                           | True                  |
+| [Primary Source](#primary-source) | avx2                     | x86\_64                     | Linux,Darwin                    | AVX2,POPCNT             | False                              | False                                          | True                  |
 Are implementations chosen based on runtime CPU feature detection? **Yes**.
@@ -46,8 +46,8 @@ Are implementations chosen based on runtime CPU feature detection? **Yes**.
 |       Implementation source       | Identifier in upstream   | Supported architecture(s)   | Supported operating system(s)   | CPU extension(s) used   | No branching-on-secrets claimed?   | No branching-on-secrets checked by valgrind?   | Large stack usage?   |
 |:---------------------------------:|:-------------------------|:----------------------------|:--------------------------------|:------------------------|:-----------------------------------|:-----------------------------------------------|:---------------------|
-| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | True                               | True                                           | True                 |
+| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | False                              | False                                          | True                 |
-| [Primary Source](#primary-source) | avx2                     | x86\_64                     | Linux,Darwin                    | AVX2,POPCNT,BMI1        | False                              | True                                           | True                 |
+| [Primary Source](#primary-source) | avx2                     | x86\_64                     | Linux,Darwin                    | AVX2,POPCNT,BMI1        | False                              | False                                          | True                 |
 Are implementations chosen based on runtime CPU feature detection? **Yes**.
@@ -55,8 +55,8 @@ Are implementations chosen based on runtime CPU feature detection? **Yes**.
 |       Implementation source       | Identifier in upstream   | Supported architecture(s)   | Supported operating system(s)   | CPU extension(s) used   | No branching-on-secrets claimed?   | No branching-on-secrets checked by valgrind?   | Large stack usage?   |
 |:---------------------------------:|:-------------------------|:----------------------------|:--------------------------------|:------------------------|:-----------------------------------|:-----------------------------------------------|:---------------------|
-| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | True                               | True                                           | True                 |
+| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | False                              | False                                          | True                 |
-| [Primary Source](#primary-source) | avx2                     | x86\_64                     | Linux,Darwin                    | AVX2,POPCNT             | False                              | True                                           | True                 |
+| [Primary Source](#primary-source) | avx2                     | x86\_64                     | Linux,Darwin                    | AVX2,POPCNT             | False                              | False                                          | True                 |
 Are implementations chosen based on runtime CPU feature detection? **Yes**.
@@ -64,8 +64,8 @@ Are implementations chosen based on runtime CPU feature detection? **Yes**.
 |       Implementation source       | Identifier in upstream   | Supported architecture(s)   | Supported operating system(s)   | CPU extension(s) used   | No branching-on-secrets claimed?   | No branching-on-secrets checked by valgrind?   | Large stack usage?   |
 |:---------------------------------:|:-------------------------|:----------------------------|:--------------------------------|:------------------------|:-----------------------------------|:-----------------------------------------------|:---------------------|
-| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | True                               | True                                           | True                 |
+| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | False                              | False                                          | True                 |
-| [Primary Source](#primary-source) | avx2                     | x86\_64                     | Linux,Darwin                    | AVX2,POPCNT,BMI1        | False                              | True                                           | True                 |
+| [Primary Source](#primary-source) | avx2                     | x86\_64                     | Linux,Darwin                    | AVX2,POPCNT,BMI1        | False                              | False                                          | True                 |
 Are implementations chosen based on runtime CPU feature detection? **Yes**.
@@ -73,8 +73,8 @@ Are implementations chosen based on runtime CPU feature detection? **Yes**.
 |       Implementation source       | Identifier in upstream   | Supported architecture(s)   | Supported operating system(s)   | CPU extension(s) used   | No branching-on-secrets claimed?   | No branching-on-secrets checked by valgrind?   | Large stack usage?   |
 |:---------------------------------:|:-------------------------|:----------------------------|:--------------------------------|:------------------------|:-----------------------------------|:-----------------------------------------------|:---------------------|
-| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | True                               | True                                           | True                 |
+| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | False                              | False                                          | True                 |
-| [Primary Source](#primary-source) | avx2                     | x86\_64                     | Linux,Darwin                    | AVX2,POPCNT             | False                              | True                                           | True                 |
+| [Primary Source](#primary-source) | avx2                     | x86\_64                     | Linux,Darwin                    | AVX2,POPCNT             | False                              | False                                          | True                 |
 Are implementations chosen based on runtime CPU feature detection? **Yes**.
@@ -82,8 +82,8 @@ Are implementations chosen based on runtime CPU feature detection? **Yes**.
 |       Implementation source       | Identifier in upstream   | Supported architecture(s)   | Supported operating system(s)   | CPU extension(s) used   | No branching-on-secrets claimed?   | No branching-on-secrets checked by valgrind?   | Large stack usage?   |
 |:---------------------------------:|:-------------------------|:----------------------------|:--------------------------------|:------------------------|:-----------------------------------|:-----------------------------------------------|:---------------------|
-| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | True                               | True                                           | True                 |
+| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | False                              | False                                          | True                 |
-| [Primary Source](#primary-source) | avx2                     | x86\_64                     | Linux,Darwin                    | AVX2,POPCNT,BMI1        | False                              | True                                           | True                 |
+| [Primary Source](#primary-source) | avx2                     | x86\_64                     | Linux,Darwin                    | AVX2,POPCNT,BMI1        | False                              | False                                          | True                 |
 Are implementations chosen based on runtime CPU feature detection? **Yes**.
@@ -91,8 +91,8 @@ Are implementations chosen based on runtime CPU feature detection? **Yes**.
 |       Implementation source       | Identifier in upstream   | Supported architecture(s)   | Supported operating system(s)   | CPU extension(s) used   | No branching-on-secrets claimed?   | No branching-on-secrets checked by valgrind?   | Large stack usage?   |
 |:---------------------------------:|:-------------------------|:----------------------------|:--------------------------------|:------------------------|:-----------------------------------|:-----------------------------------------------|:---------------------|
-| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | True                               | True                                           | True                 |
+| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | False                              | False                                          | True                 |
-| [Primary Source](#primary-source) | avx2                     | x86\_64                     | Linux,Darwin                    | AVX2,POPCNT             | False                              | True                                           | True                 |
+| [Primary Source](#primary-source) | avx2                     | x86\_64                     | Linux,Darwin                    | AVX2,POPCNT             | False                              | False                                          | True                 |
 Are implementations chosen based on runtime CPU feature detection? **Yes**.
@@ -100,8 +100,8 @@ Are implementations chosen based on runtime CPU feature detection? **Yes**.
 |       Implementation source       | Identifier in upstream   | Supported architecture(s)   | Supported operating system(s)   | CPU extension(s) used   | No branching-on-secrets claimed?   | No branching-on-secrets checked by valgrind?   | Large stack usage?   |
 |:---------------------------------:|:-------------------------|:----------------------------|:--------------------------------|:------------------------|:-----------------------------------|:-----------------------------------------------|:---------------------|
-| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | True                               | True                                           | True                 |
+| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | False                              | False                                          | True                 |
-| [Primary Source](#primary-source) | avx2                     | x86\_64                     | Linux,Darwin                    | AVX2,POPCNT,BMI1        | False                              | True                                           | True                 |
+| [Primary Source](#primary-source) | avx2                     | x86\_64                     | Linux,Darwin                    | AVX2,POPCNT,BMI1        | False                              | False                                          | True                 |
 Are implementations chosen based on runtime CPU feature detection? **Yes**.
@@ -109,8 +109,8 @@ Are implementations chosen based on runtime CPU feature detection? **Yes**.
 |       Implementation source       | Identifier in upstream   | Supported architecture(s)   | Supported operating system(s)   | CPU extension(s) used   | No branching-on-secrets claimed?   | No branching-on-secrets checked by valgrind?   | Large stack usage?   |
 |:---------------------------------:|:-------------------------|:----------------------------|:--------------------------------|:------------------------|:-----------------------------------|:-----------------------------------------------|:---------------------|
-| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | True                               | True                                           | True                 |
+| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | False                              | False                                          | True                 |
-| [Primary Source](#primary-source) | avx2                     | x86\_64                     | Linux,Darwin                    | AVX2,POPCNT             | False                              | True                                           | True                 |
+| [Primary Source](#primary-source) | avx2                     | x86\_64                     | Linux,Darwin                    | AVX2,POPCNT             | False                              | False                                          | True                 |
 Are implementations chosen based on runtime CPU feature detection? **Yes**.
@@ -118,8 +118,8 @@ Are implementations chosen based on runtime CPU feature detection? **Yes**.
 |       Implementation source       | Identifier in upstream   | Supported architecture(s)   | Supported operating system(s)   | CPU extension(s) used   | No branching-on-secrets claimed?   | No branching-on-secrets checked by valgrind?   | Large stack usage?   |
 |:---------------------------------:|:-------------------------|:----------------------------|:--------------------------------|:------------------------|:-----------------------------------|:-----------------------------------------------|:---------------------|
-| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | True                               | True                                           | True                 |
+| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | False                              | False                                          | True                 |
-| [Primary Source](#primary-source) | avx2                     | x86\_64                     | Linux,Darwin                    | AVX2,POPCNT,BMI1        | False                              | True                                           | True                 |
+| [Primary Source](#primary-source) | avx2                     | x86\_64                     | Linux,Darwin                    | AVX2,POPCNT,BMI1        | False                              | False                                          | True                 |
 Are implementations chosen based on runtime CPU feature detection? **Yes**.
--- a/docs/algorithms/kem/classic_mceliece.yml
+++ b/docs/algorithms/kem/classic_mceliece.yml
@@ -26,7 +26,9 @@ advisories:
  building with ``clang`` using optimization level ``-O2`` and ``-O3``. Care is advised
  when using the algorithm at higher optimization levels, and any other compiler and
  architecture.
-- Current implementation of the algorithm may not be constant-time. Additionally, environment specific constant-time leaks may not be documented; please report potential constant-time leaks when found.
+- Current implementation of the algorithm may not be constant-time. Additionally,
+  environment specific constant-time leaks may not be documented; please report potential
+  constant-time leaks when found.
 parameter-sets:
 - name: Classic-McEliece-348864
  claimed-nist-level: 1
--- a/docs/algorithms/kem/kyber.md
+++ b/docs/algorithms/kem/kyber.md
@@ -7,9 +7,9 @@
 - **Authors' website**: https://pq-crystals.org/
 - **Specification version**: NIST Round 3 submission.
 - **Primary Source**<a name="primary-source"></a>:
-  - **Source**: https://github.com/pq-crystals/kyber/commit/518de2414a85052bb91349bcbcc347f391292d5b with copy_from_upstream patches
+  - **Source**: https://github.com/pq-crystals/kyber/commit/b628ba78711bc28327dc7d2d5c074a00f061884e with copy_from_upstream patches
  - **Implementation license (SPDX-Identifier)**: CC0-1.0 or Apache-2.0
-- **Optimized Implementation sources**: https://github.com/pq-crystals/kyber/commit/518de2414a85052bb91349bcbcc347f391292d5b with copy_from_upstream patches
+- **Optimized Implementation sources**: https://github.com/pq-crystals/kyber/commit/b628ba78711bc28327dc7d2d5c074a00f061884e with copy_from_upstream patches
  - **pqclean-aarch64**:<a name="pqclean-aarch64"></a>
      - **Source**: https://github.com/PQClean/PQClean/commit/8e220a87308154d48fdfac40abbb191ac7fce06a with copy_from_upstream patches
      - **Implementation license (SPDX-Identifier)**: CC0-1.0 and (CC0-1.0 or Apache-2.0) and (CC0-1.0 or MIT) and MIT
--- a/docs/algorithms/kem/kyber.yml
+++ b/docs/algorithms/kem/kyber.yml
@@ -17,7 +17,7 @@ website: https://pq-crystals.org/
 nist-round: 3
 spec-version: NIST Round 3 submission
 primary-upstream:
-  source: https://github.com/pq-crystals/kyber/commit/518de2414a85052bb91349bcbcc347f391292d5b
+  source: https://github.com/pq-crystals/kyber/commit/b628ba78711bc28327dc7d2d5c074a00f061884e
    with copy_from_upstream patches
  spdx-license-identifier: CC0-1.0 or Apache-2.0
 optimized-upstreams:
--- a/docs/algorithms/sig/falcon.md
+++ b/docs/algorithms/sig/falcon.md
@@ -22,7 +22,7 @@
 |       Implementation source       | Identifier in upstream   | Supported architecture(s)   | Supported operating system(s)   | CPU extension(s) used   | No branching-on-secrets claimed?   | No branching-on-secrets checked by valgrind?   | Large stack usage?‡   |
 |:---------------------------------:|:-------------------------|:----------------------------|:--------------------------------|:------------------------|:-----------------------------------|:-----------------------------------------------|:----------------------|
-| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | False                              | False                                          | False                 |
+| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | True                               | True                                           | False                 |
 | [Primary Source](#primary-source) | avx2                     | x86\_64                     | All                             | AVX2                    | False                              | False                                          | False                 |
 | [Primary Source](#primary-source) | aarch64                  | ARM64\_V8                   | Linux,Darwin                    | None                    | False                              | False                                          | False                 |
@@ -34,7 +34,7 @@ Are implementations chosen based on runtime CPU feature detection? **Yes**.
 |       Implementation source       | Identifier in upstream   | Supported architecture(s)   | Supported operating system(s)   | CPU extension(s) used   | No branching-on-secrets claimed?   | No branching-on-secrets checked by valgrind?   | Large stack usage?   |
 |:---------------------------------:|:-------------------------|:----------------------------|:--------------------------------|:------------------------|:-----------------------------------|:-----------------------------------------------|:---------------------|
-| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | False                              | False                                          | False                |
+| [Primary Source](#primary-source) | clean                    | All                         | All                             | None                    | True                               | True                                           | False                |
 | [Primary Source](#primary-source) | avx2                     | x86\_64                     | All                             | AVX2                    | False                              | False                                          | False                |
 | [Primary Source](#primary-source) | aarch64                  | ARM64\_V8                   | Linux,Darwin                    | None                    | False                              | False                                          | False                |
--- a/scripts/copy_from_upstream/copy_from_upstream.py
+++ b/scripts/copy_from_upstream/copy_from_upstream.py
@@ -611,8 +611,6 @@ def copy_from_upstream():
    for t in ["kem", "sig"]:
        with open(os.path.join(os.environ['LIBOQS_DIR'], 'tests', 'KATs', t, 'kats.json'), "w") as f:
            json.dump(kats[t], f, indent=2, sort_keys=True)
-    if not keepdata:
-        shutil.rmtree('repos')
    update_upstream_alg_docs.do_it(os.environ['LIBOQS_DIR'])
@@ -622,6 +620,10 @@ def copy_from_upstream():
    update_docs_from_yaml.do_it(os.environ['LIBOQS_DIR'])
    update_cbom.update_cbom_if_algs_not_changed(os.environ['LIBOQS_DIR'], "git")
+    if not keepdata:
+        shutil.rmtree('repos')
 def verify_from_upstream():
    instructions = load_instructions()
    basedir = "verify_from_upstream"
--- a/scripts/copy_from_upstream/copy_from_upstream.yml
+++ b/scripts/copy_from_upstream/copy_from_upstream.yml
@ -8,13 +8,13 @@ upstreams:
    sig_meta_path: 'crypto_sign/{pqclean_scheme}/META.yml'
    kem_scheme_path: 'crypto_kem/{pqclean_scheme}'
    sig_scheme_path: 'crypto_sign/{pqclean_scheme}'
-    patches: [pqclean-sphincs.patch, pqclean-dilithium-arm-randomized-signing.patch, pqclean-kyber-armneon-shake-fixes.patch, pqclean-kyber-armneon-768-1024-fixes.patch]
+    patches: [pqclean-sphincs.patch, pqclean-dilithium-arm-randomized-signing.patch, pqclean-kyber-armneon-shake-fixes.patch, pqclean-kyber-armneon-768-1024-fixes.patch, pqclean-kyber-armneon-variable-timing-fix.patch]
    ignore: pqclean_sphincs-shake-256s-simple_aarch64, pqclean_sphincs-shake-256s-simple_aarch64, pqclean_sphincs-shake-256f-simple_aarch64, pqclean_sphincs-shake-192s-simple_aarch64, pqclean_sphincs-shake-192f-simple_aarch64, pqclean_sphincs-shake-128s-simple_aarch64, pqclean_sphincs-shake-128f-simple_aarch64
  -
    name: pqcrystals-kyber
    git_url: https://github.com/pq-crystals/kyber.git
    git_branch: master
-    git_commit: 518de2414a85052bb91349bcbcc347f391292d5b
+    git_commit: b628ba78711bc28327dc7d2d5c074a00f061884e
    kem_meta_path: '{pretty_name_full}_META.yml'
    kem_scheme_path: '.'
    patches: [pqcrystals-kyber-yml.patch, pqcrystals-kyber-ref-shake-aes.patch, pqcrystals-kyber-avx2-shake-aes.patch]
--- a/scripts/copy_from_upstream/patches/pqclean-kyber-armneon-variable-timing-fix.patch
+++ b/scripts/copy_from_upstream/patches/pqclean-kyber-armneon-variable-timing-fix.patch
@ -0,0 +1,274 @@
 927a0eff4a45781218062953002001af4e6a5c8a
 diff --git a/crypto_kem/kyber1024/aarch64/poly.c b/crypto_kem/kyber1024/aarch64/poly.c
 index 1dfa52c..3115d1c 100644
 --- a/crypto_kem/kyber1024/aarch64/poly.c
 +++ b/crypto_kem/kyber1024/aarch64/poly.c
@@ -51,6 +51,7 @@
 void poly_compress(uint8_t r[KYBER_POLYCOMPRESSEDBYTES], const int16_t a[KYBER_N]) {
     unsigned int i, j;
     int16_t u;
 +    uint32_t d0;
     uint8_t t[8];
     for (i = 0; i < KYBER_N / 8; i++) {
@@ -58,7 +59,12 @@ void poly_compress(uint8_t r[KYBER_POLYCOMPRESSEDBYTES], const int16_t a[KYBER_N
             // map to positive standard representatives
             u  = a[8 * i + j];
             u += (u >> 15) & KYBER_Q;
 -            t[j] = ((((uint32_t)u << 5) + KYBER_Q / 2) / KYBER_Q) & 31;
 +            // t[j] = ((((uint32_t)u << 5) + KYBER_Q / 2) / KYBER_Q) & 31;
 +            d0 = u << 5;
 +            d0 += 1664;
 +            d0 *= 40318;
 +            d0 >>= 27;
 +            t[j] = d0 & 0x1f;
         }
         r[0] = (t[0] >> 0) | (t[1] << 5);
@@ -207,14 +213,19 @@ void poly_frommsg(int16_t r[KYBER_N], const uint8_t msg[KYBER_INDCPA_MSGBYTES])
 **************************************************/
 void poly_tomsg(uint8_t msg[KYBER_INDCPA_MSGBYTES], const int16_t a[KYBER_N]) {
     unsigned int i, j;
 -    uint16_t t;
 +    uint32_t t;
     for (i = 0; i < KYBER_N / 8; i++) {
         msg[i] = 0;
         for (j = 0; j < 8; j++) {
             t  = a[8 * i + j];
 -            t += ((int16_t)t >> 15) & KYBER_Q;
 -            t  = (((t << 1) + KYBER_Q / 2) / KYBER_Q) & 1;
 +            // t += ((int16_t)t >> 15) & KYBER_Q;
 +            // t  = (((t << 1) + KYBER_Q/2)/KYBER_Q) & 1;
 +            t <<= 1;
 +            t += 1665;
 +            t *= 80635;
 +            t >>= 28;
 +            t &= 1;
             msg[i] |= t << j;
         }
     }
 diff --git a/crypto_kem/kyber1024/aarch64/polyvec.c b/crypto_kem/kyber1024/aarch64/polyvec.c
 index d400348..f9a1ebf 100644
 --- a/crypto_kem/kyber1024/aarch64/polyvec.c
 +++ b/crypto_kem/kyber1024/aarch64/polyvec.c
@@ -21,6 +21,7 @@
 **************************************************/
 void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], int16_t a[KYBER_K][KYBER_N]) {
     unsigned int i, j, k;
 +    uint64_t d0;
     #if (KYBER_POLYVECCOMPRESSEDBYTES == (KYBER_K * 352))
     uint16_t t[8];
@@ -29,7 +30,13 @@ void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], int16_t a[KYBER_K
             for (k = 0; k < 8; k++) {
                 t[k]  = a[i][8 * j + k];
                 t[k] += ((int16_t)t[k] >> 15) & KYBER_Q;
 -                t[k]  = ((((uint32_t)t[k] << 11) + KYBER_Q / 2) / KYBER_Q) & 0x7ff;
 +                // t[k]  = ((((uint32_t)t[k] << 11) + KYBER_Q / 2) / KYBER_Q) & 0x7ff;
 +                d0 = t[k];
 +                d0 <<= 11;
 +                d0 += 1664;
 +                d0 *= 645084;
 +                d0 >>= 31;
 +                t[k] = d0 & 0x7ff;
             }
             r[ 0] = (t[0] >>  0);
@@ -53,7 +60,13 @@ void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], int16_t a[KYBER_K
             for (k = 0; k < 4; k++) {
                 t[k]  = a[i][4 * j + k];
                 t[k] += ((int16_t)t[k] >> 15) & KYBER_Q;
 -                t[k]  = ((((uint32_t)t[k] << 10) + KYBER_Q / 2) / KYBER_Q) & 0x3ff;
 +                // t[k]  = ((((uint32_t)t[k] << 10) + KYBER_Q / 2) / KYBER_Q) & 0x3ff;
 +                d0 = t[k];
 +                d0 <<= 10;
 +                d0 += 1665;
 +                d0 *= 1290167;
 +                d0 >>= 32;
 +                t[k] = d0 & 0x3ff;
             }
             r[0] = (t[0] >> 0);
 diff --git a/crypto_kem/kyber512/aarch64/poly.c b/crypto_kem/kyber512/aarch64/poly.c
 index dffc655..361ce89 100644
 --- a/crypto_kem/kyber512/aarch64/poly.c
 +++ b/crypto_kem/kyber512/aarch64/poly.c
@@ -51,6 +51,7 @@
 void poly_compress(uint8_t r[KYBER_POLYCOMPRESSEDBYTES], const int16_t a[KYBER_N]) {
     unsigned int i, j;
     int16_t u;
 +    uint32_t d0;
     uint8_t t[8];
     for (i = 0; i < KYBER_N / 8; i++) {
@@ -58,7 +59,12 @@ void poly_compress(uint8_t r[KYBER_POLYCOMPRESSEDBYTES], const int16_t a[KYBER_N
             // map to positive standard representatives
             u  = a[8 * i + j];
             u += (u >> 15) & KYBER_Q;
 -            t[j] = ((((uint16_t)u << 4) + KYBER_Q / 2) / KYBER_Q) & 15;
 +            // t[j] = ((((uint16_t)u << 4) + KYBER_Q / 2) / KYBER_Q) & 15;
 +            d0 = u << 4;
 +            d0 += 1665;
 +            d0 *= 80635;
 +            d0 >>= 28;
 +            t[j] = d0 & 0xf;
         }
         r[0] = t[0] | (t[1] << 4);
@@ -194,14 +200,19 @@ void poly_frommsg(int16_t r[KYBER_N], const uint8_t msg[KYBER_INDCPA_MSGBYTES])
 **************************************************/
 void poly_tomsg(uint8_t msg[KYBER_INDCPA_MSGBYTES], const int16_t a[KYBER_N]) {
     unsigned int i, j;
 -    uint16_t t;
 +    uint32_t t;
     for (i = 0; i < KYBER_N / 8; i++) {
         msg[i] = 0;
         for (j = 0; j < 8; j++) {
             t  = a[8 * i + j];
 -            t += ((int16_t)t >> 15) & KYBER_Q;
 -            t  = (((t << 1) + KYBER_Q / 2) / KYBER_Q) & 1;
 +            // t += ((int16_t)t >> 15) & KYBER_Q;
 +            // t  = (((t << 1) + KYBER_Q/2)/KYBER_Q) & 1;
 +            t <<= 1;
 +            t += 1665;
 +            t *= 80635;
 +            t >>= 28;
 +            t &= 1;
             msg[i] |= t << j;
         }
     }
 diff --git a/crypto_kem/kyber512/aarch64/polyvec.c b/crypto_kem/kyber512/aarch64/polyvec.c
 index d400348..f9a1ebf 100644
 --- a/crypto_kem/kyber512/aarch64/polyvec.c
 +++ b/crypto_kem/kyber512/aarch64/polyvec.c
@@ -21,6 +21,7 @@
 **************************************************/
 void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], int16_t a[KYBER_K][KYBER_N]) {
     unsigned int i, j, k;
 +    uint64_t d0;
     #if (KYBER_POLYVECCOMPRESSEDBYTES == (KYBER_K * 352))
     uint16_t t[8];
@@ -29,7 +30,13 @@ void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], int16_t a[KYBER_K
             for (k = 0; k < 8; k++) {
                 t[k]  = a[i][8 * j + k];
                 t[k] += ((int16_t)t[k] >> 15) & KYBER_Q;
 -                t[k]  = ((((uint32_t)t[k] << 11) + KYBER_Q / 2) / KYBER_Q) & 0x7ff;
 +                // t[k]  = ((((uint32_t)t[k] << 11) + KYBER_Q / 2) / KYBER_Q) & 0x7ff;
 +                d0 = t[k];
 +                d0 <<= 11;
 +                d0 += 1664;
 +                d0 *= 645084;
 +                d0 >>= 31;
 +                t[k] = d0 & 0x7ff;
             }
             r[ 0] = (t[0] >>  0);
@@ -53,7 +60,13 @@ void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], int16_t a[KYBER_K
             for (k = 0; k < 4; k++) {
                 t[k]  = a[i][4 * j + k];
                 t[k] += ((int16_t)t[k] >> 15) & KYBER_Q;
 -                t[k]  = ((((uint32_t)t[k] << 10) + KYBER_Q / 2) / KYBER_Q) & 0x3ff;
 +                // t[k]  = ((((uint32_t)t[k] << 10) + KYBER_Q / 2) / KYBER_Q) & 0x3ff;
 +                d0 = t[k];
 +                d0 <<= 10;
 +                d0 += 1665;
 +                d0 *= 1290167;
 +                d0 >>= 32;
 +                t[k] = d0 & 0x3ff;
             }
             r[0] = (t[0] >> 0);
 diff --git a/crypto_kem/kyber768/aarch64/poly.c b/crypto_kem/kyber768/aarch64/poly.c
 index dffc655..361ce89 100644
 --- a/crypto_kem/kyber768/aarch64/poly.c
 +++ b/crypto_kem/kyber768/aarch64/poly.c
@@ -51,6 +51,7 @@
 void poly_compress(uint8_t r[KYBER_POLYCOMPRESSEDBYTES], const int16_t a[KYBER_N]) {
     unsigned int i, j;
     int16_t u;
 +    uint32_t d0;
     uint8_t t[8];
     for (i = 0; i < KYBER_N / 8; i++) {
@@ -58,7 +59,12 @@ void poly_compress(uint8_t r[KYBER_POLYCOMPRESSEDBYTES], const int16_t a[KYBER_N
             // map to positive standard representatives
             u  = a[8 * i + j];
             u += (u >> 15) & KYBER_Q;
 -            t[j] = ((((uint16_t)u << 4) + KYBER_Q / 2) / KYBER_Q) & 15;
 +            // t[j] = ((((uint16_t)u << 4) + KYBER_Q / 2) / KYBER_Q) & 15;
 +            d0 = u << 4;
 +            d0 += 1665;
 +            d0 *= 80635;
 +            d0 >>= 28;
 +            t[j] = d0 & 0xf;
         }
         r[0] = t[0] | (t[1] << 4);
@@ -194,14 +200,19 @@ void poly_frommsg(int16_t r[KYBER_N], const uint8_t msg[KYBER_INDCPA_MSGBYTES])
 **************************************************/
 void poly_tomsg(uint8_t msg[KYBER_INDCPA_MSGBYTES], const int16_t a[KYBER_N]) {
     unsigned int i, j;
 -    uint16_t t;
 +    uint32_t t;
     for (i = 0; i < KYBER_N / 8; i++) {
         msg[i] = 0;
         for (j = 0; j < 8; j++) {
             t  = a[8 * i + j];
 -            t += ((int16_t)t >> 15) & KYBER_Q;
 -            t  = (((t << 1) + KYBER_Q / 2) / KYBER_Q) & 1;
 +            // t += ((int16_t)t >> 15) & KYBER_Q;
 +            // t  = (((t << 1) + KYBER_Q/2)/KYBER_Q) & 1;
 +            t <<= 1;
 +            t += 1665;
 +            t *= 80635;
 +            t >>= 28;
 +            t &= 1;
             msg[i] |= t << j;
         }
     }
 diff --git a/crypto_kem/kyber768/aarch64/polyvec.c b/crypto_kem/kyber768/aarch64/polyvec.c
 index d400348..f9a1ebf 100644
 --- a/crypto_kem/kyber768/aarch64/polyvec.c
 +++ b/crypto_kem/kyber768/aarch64/polyvec.c
@@ -21,6 +21,7 @@
 **************************************************/
 void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], int16_t a[KYBER_K][KYBER_N]) {
     unsigned int i, j, k;
 +    uint64_t d0;
     #if (KYBER_POLYVECCOMPRESSEDBYTES == (KYBER_K * 352))
     uint16_t t[8];
@@ -29,7 +30,13 @@ void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], int16_t a[KYBER_K
             for (k = 0; k < 8; k++) {
                 t[k]  = a[i][8 * j + k];
                 t[k] += ((int16_t)t[k] >> 15) & KYBER_Q;
 -                t[k]  = ((((uint32_t)t[k] << 11) + KYBER_Q / 2) / KYBER_Q) & 0x7ff;
 +                // t[k]  = ((((uint32_t)t[k] << 11) + KYBER_Q / 2) / KYBER_Q) & 0x7ff;
 +                d0 = t[k];
 +                d0 <<= 11;
 +                d0 += 1664;
 +                d0 *= 645084;
 +                d0 >>= 31;
 +                t[k] = d0 & 0x7ff;
             }
             r[ 0] = (t[0] >>  0);
@@ -53,7 +60,13 @@ void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], int16_t a[KYBER_K
             for (k = 0; k < 4; k++) {
                 t[k]  = a[i][4 * j + k];
                 t[k] += ((int16_t)t[k] >> 15) & KYBER_Q;
 -                t[k]  = ((((uint32_t)t[k] << 10) + KYBER_Q / 2) / KYBER_Q) & 0x3ff;
 +                // t[k]  = ((((uint32_t)t[k] << 10) + KYBER_Q / 2) / KYBER_Q) & 0x3ff;
 +                d0 = t[k];
 +                d0 <<= 10;
 +                d0 += 1665;
 +                d0 *= 1290167;
 +                d0 >>= 32;
 +                t[k] = d0 & 0x3ff;
             }
             r[0] = (t[0] >> 0);
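
Note on the patch above: each `/ KYBER_Q` is replaced by a multiply and shift with a precomputed reciprocal, so the compiler has no reason to emit a division instruction whose latency may depend on the operand. The following is a minimal sketch, not part of the commit, that compares the old and new expressions for every coefficient already mapped to the range 0..KYBER_Q-1. The function names (`compress_div`, `compress4`, `compress5`, `compress10`, `compress11`, `kyber_compress_mismatches`, `kyber_compress_selftest`) are hypothetical; only the constants are taken from the diff.

```c
#include <assert.h>
#include <stdint.h>

#define KYBER_Q 3329

/* Reference form from the removed lines: rounding division by KYBER_Q. */
static uint32_t compress_div(uint32_t u, unsigned bits) {
    return (((u << bits) + KYBER_Q / 2) / KYBER_Q) & ((1u << bits) - 1);
}

/* Division-free forms using the constants from the added lines. */
static uint32_t compress4(uint32_t u) {
    uint32_t d0 = u << 4;
    d0 += 1665;
    d0 *= 80635;      /* floor(2^28 / KYBER_Q) */
    d0 >>= 28;
    return d0 & 0xf;
}

static uint32_t compress5(uint32_t u) {
    uint32_t d0 = u << 5;
    d0 += 1664;
    d0 *= 40318;      /* ceil(2^27 / KYBER_Q) */
    d0 >>= 27;
    return d0 & 0x1f;
}

static uint32_t compress10(uint32_t u) {
    uint64_t d0 = u;
    d0 <<= 10;
    d0 += 1665;
    d0 *= 1290167;    /* floor(2^32 / KYBER_Q) */
    d0 >>= 32;
    return (uint32_t)(d0 & 0x3ff);
}

static uint32_t compress11(uint32_t u) {
    uint64_t d0 = u;
    d0 <<= 11;
    d0 += 1664;
    d0 *= 645084;     /* ceil(2^31 / KYBER_Q) */
    d0 >>= 31;
    return (uint32_t)(d0 & 0x7ff);
}

/* Hypothetical helper: counts inputs where old and new forms disagree. */
unsigned kyber_compress_mismatches(void) {
    unsigned bad = 0;
    for (uint32_t u = 0; u < KYBER_Q; u++) {
        bad += compress4(u)  != compress_div(u, 4);
        bad += compress5(u)  != compress_div(u, 5);
        bad += compress10(u) != compress_div(u, 10);
        bad += compress11(u) != compress_div(u, 11);
    }
    return bad;
}

/* Hypothetical self-test entry point for a test harness. */
void kyber_compress_selftest(void) {
    assert(kyber_compress_mismatches() == 0);
}
```

The 32-bit products in `compress4` and `compress5` can wrap modulo 2^32 for the largest inputs. The sketch keeps the same types as the patch because the mask only reads bits that the wraparound leaves unchanged.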
--- a/src/kem/kyber/pqclean_kyber1024_aarch64/poly.c
+++ b/src/kem/kyber/pqclean_kyber1024_aarch64/poly.c
@ -51,6 +51,7 @@
 void poly_compress(uint8_t r[KYBER_POLYCOMPRESSEDBYTES], const int16_t a[KYBER_N]) {
    unsigned int i, j;
    int16_t u;
    uint32_t d0;
    uint8_t t[8];
    for (i = 0; i < KYBER_N / 8; i++) {
@ -58,7 +59,12 @@ void poly_compress(uint8_t r[KYBER_POLYCOMPRESSEDBYTES], const int16_t a[KYBER_N
            // map to positive standard representatives
            u  = a[8 * i + j];
            u += (u >> 15) & KYBER_Q;
-            t[j] = ((((uint32_t)u << 5) + KYBER_Q / 2) / KYBER_Q) & 31;
+            // t[j] = ((((uint32_t)u << 5) + KYBER_Q / 2) / KYBER_Q) & 31;
            d0 = u << 5;
            d0 += 1664;
            d0 *= 40318;
            d0 >>= 27;
            t[j] = d0 & 0x1f;
        }
        r[0] = (t[0] >> 0) | (t[1] << 5);
@ -207,14 +213,19 @@ void poly_frommsg(int16_t r[KYBER_N], const uint8_t msg[KYBER_INDCPA_MSGBYTES])
 **************************************************/
 void poly_tomsg(uint8_t msg[KYBER_INDCPA_MSGBYTES], const int16_t a[KYBER_N]) {
    unsigned int i, j;
-    uint16_t t;
+    uint32_t t;
    for (i = 0; i < KYBER_N / 8; i++) {
        msg[i] = 0;
        for (j = 0; j < 8; j++) {
            t  = a[8 * i + j];
-            t += ((int16_t)t >> 15) & KYBER_Q;
+            // t += ((int16_t)t >> 15) & KYBER_Q;
-            t  = (((t << 1) + KYBER_Q / 2) / KYBER_Q) & 1;
+            // t  = (((t << 1) + KYBER_Q/2)/KYBER_Q) & 1;
            t <<= 1;
            t += 1665;
            t *= 80635;
            t >>= 28;
            t &= 1;
            msg[i] |= t << j;
        }
    }
--- a/src/kem/kyber/pqclean_kyber1024_aarch64/polyvec.c
+++ b/src/kem/kyber/pqclean_kyber1024_aarch64/polyvec.c
@ -21,6 +21,7 @@
 **************************************************/
 void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], int16_t a[KYBER_K][KYBER_N]) {
    unsigned int i, j, k;
    uint64_t d0;
    #if (KYBER_POLYVECCOMPRESSEDBYTES == (KYBER_K * 352))
    uint16_t t[8];
@ -29,7 +30,13 @@ void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], int16_t a[KYBER_K
            for (k = 0; k < 8; k++) {
                t[k]  = a[i][8 * j + k];
                t[k] += ((int16_t)t[k] >> 15) & KYBER_Q;
-                t[k]  = ((((uint32_t)t[k] << 11) + KYBER_Q / 2) / KYBER_Q) & 0x7ff;
+                // t[k]  = ((((uint32_t)t[k] << 11) + KYBER_Q / 2) / KYBER_Q) & 0x7ff;
                d0 = t[k];
                d0 <<= 11;
                d0 += 1664;
                d0 *= 645084;
                d0 >>= 31;
                t[k] = d0 & 0x7ff;
            }
            r[ 0] = (t[0] >>  0);
@ -53,7 +60,13 @@ void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], int16_t a[KYBER_K
            for (k = 0; k < 4; k++) {
                t[k]  = a[i][4 * j + k];
                t[k] += ((int16_t)t[k] >> 15) & KYBER_Q;
-                t[k]  = ((((uint32_t)t[k] << 10) + KYBER_Q / 2) / KYBER_Q) & 0x3ff;
+                // t[k]  = ((((uint32_t)t[k] << 10) + KYBER_Q / 2) / KYBER_Q) & 0x3ff;
                d0 = t[k];
                d0 <<= 10;
                d0 += 1665;
                d0 *= 1290167;
                d0 >>= 32;
                t[k] = d0 & 0x3ff;
            }
            r[0] = (t[0] >> 0);
--- a/src/kem/kyber/pqclean_kyber512_aarch64/poly.c
+++ b/src/kem/kyber/pqclean_kyber512_aarch64/poly.c
@ -51,6 +51,7 @@
 void poly_compress(uint8_t r[KYBER_POLYCOMPRESSEDBYTES], const int16_t a[KYBER_N]) {
    unsigned int i, j;
    int16_t u;
    uint32_t d0;
    uint8_t t[8];
    for (i = 0; i < KYBER_N / 8; i++) {
@ -58,7 +59,12 @@ void poly_compress(uint8_t r[KYBER_POLYCOMPRESSEDBYTES], const int16_t a[KYBER_N
            // map to positive standard representatives
            u  = a[8 * i + j];
            u += (u >> 15) & KYBER_Q;
-            t[j] = ((((uint16_t)u << 4) + KYBER_Q / 2) / KYBER_Q) & 15;
+            // t[j] = ((((uint16_t)u << 4) + KYBER_Q / 2) / KYBER_Q) & 15;
            d0 = u << 4;
            d0 += 1665;
            d0 *= 80635;
            d0 >>= 28;
            t[j] = d0 & 0xf;
        }
        r[0] = t[0] | (t[1] << 4);
@ -194,14 +200,19 @@ void poly_frommsg(int16_t r[KYBER_N], const uint8_t msg[KYBER_INDCPA_MSGBYTES])
 **************************************************/
 void poly_tomsg(uint8_t msg[KYBER_INDCPA_MSGBYTES], const int16_t a[KYBER_N]) {
    unsigned int i, j;
-    uint16_t t;
+    uint32_t t;
    for (i = 0; i < KYBER_N / 8; i++) {
        msg[i] = 0;
        for (j = 0; j < 8; j++) {
            t  = a[8 * i + j];
-            t += ((int16_t)t >> 15) & KYBER_Q;
+            // t += ((int16_t)t >> 15) & KYBER_Q;
-            t  = (((t << 1) + KYBER_Q / 2) / KYBER_Q) & 1;
+            // t  = (((t << 1) + KYBER_Q/2)/KYBER_Q) & 1;
            t <<= 1;
            t += 1665;
            t *= 80635;
            t >>= 28;
            t &= 1;
            msg[i] |= t << j;
        }
    }
--- a/src/kem/kyber/pqclean_kyber512_aarch64/polyvec.c
+++ b/src/kem/kyber/pqclean_kyber512_aarch64/polyvec.c
@ -21,6 +21,7 @@
 **************************************************/
 void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], int16_t a[KYBER_K][KYBER_N]) {
    unsigned int i, j, k;
    uint64_t d0;
    #if (KYBER_POLYVECCOMPRESSEDBYTES == (KYBER_K * 352))
    uint16_t t[8];
@ -29,7 +30,13 @@ void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], int16_t a[KYBER_K
            for (k = 0; k < 8; k++) {
                t[k]  = a[i][8 * j + k];
                t[k] += ((int16_t)t[k] >> 15) & KYBER_Q;
-                t[k]  = ((((uint32_t)t[k] << 11) + KYBER_Q / 2) / KYBER_Q) & 0x7ff;
+                // t[k]  = ((((uint32_t)t[k] << 11) + KYBER_Q / 2) / KYBER_Q) & 0x7ff;
                d0 = t[k];
                d0 <<= 11;
                d0 += 1664;
                d0 *= 645084;
                d0 >>= 31;
                t[k] = d0 & 0x7ff;
            }
            r[ 0] = (t[0] >>  0);
@ -53,7 +60,13 @@ void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], int16_t a[KYBER_K
            for (k = 0; k < 4; k++) {
                t[k]  = a[i][4 * j + k];
                t[k] += ((int16_t)t[k] >> 15) & KYBER_Q;
-                t[k]  = ((((uint32_t)t[k] << 10) + KYBER_Q / 2) / KYBER_Q) & 0x3ff;
+                // t[k]  = ((((uint32_t)t[k] << 10) + KYBER_Q / 2) / KYBER_Q) & 0x3ff;
                d0 = t[k];
                d0 <<= 10;
                d0 += 1665;
                d0 *= 1290167;
                d0 >>= 32;
                t[k] = d0 & 0x3ff;
            }
            r[0] = (t[0] >> 0);
--- a/src/kem/kyber/pqclean_kyber768_aarch64/poly.c
+++ b/src/kem/kyber/pqclean_kyber768_aarch64/poly.c
@ -51,6 +51,7 @@
 void poly_compress(uint8_t r[KYBER_POLYCOMPRESSEDBYTES], const int16_t a[KYBER_N]) {
    unsigned int i, j;
    int16_t u;
    uint32_t d0;
    uint8_t t[8];
    for (i = 0; i < KYBER_N / 8; i++) {
@ -58,7 +59,12 @@ void poly_compress(uint8_t r[KYBER_POLYCOMPRESSEDBYTES], const int16_t a[KYBER_N
            // map to positive standard representatives
            u  = a[8 * i + j];
            u += (u >> 15) & KYBER_Q;
-            t[j] = ((((uint16_t)u << 4) + KYBER_Q / 2) / KYBER_Q) & 15;
+            // t[j] = ((((uint16_t)u << 4) + KYBER_Q / 2) / KYBER_Q) & 15;
            d0 = u << 4;
            d0 += 1665;
            d0 *= 80635;
            d0 >>= 28;
            t[j] = d0 & 0xf;
        }
        r[0] = t[0] | (t[1] << 4);
@ -194,14 +200,19 @@ void poly_frommsg(int16_t r[KYBER_N], const uint8_t msg[KYBER_INDCPA_MSGBYTES])
 **************************************************/
 void poly_tomsg(uint8_t msg[KYBER_INDCPA_MSGBYTES], const int16_t a[KYBER_N]) {
    unsigned int i, j;
-    uint16_t t;
+    uint32_t t;
    for (i = 0; i < KYBER_N / 8; i++) {
        msg[i] = 0;
        for (j = 0; j < 8; j++) {
            t  = a[8 * i + j];
-            t += ((int16_t)t >> 15) & KYBER_Q;
+            // t += ((int16_t)t >> 15) & KYBER_Q;
-            t  = (((t << 1) + KYBER_Q / 2) / KYBER_Q) & 1;
+            // t  = (((t << 1) + KYBER_Q/2)/KYBER_Q) & 1;
            t <<= 1;
            t += 1665;
            t *= 80635;
            t >>= 28;
            t &= 1;
            msg[i] |= t << j;
        }
    }
--- a/src/kem/kyber/pqclean_kyber768_aarch64/polyvec.c
+++ b/src/kem/kyber/pqclean_kyber768_aarch64/polyvec.c
@ -21,6 +21,7 @@
 **************************************************/
 void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], int16_t a[KYBER_K][KYBER_N]) {
    unsigned int i, j, k;
    uint64_t d0;
    #if (KYBER_POLYVECCOMPRESSEDBYTES == (KYBER_K * 352))
    uint16_t t[8];
@ -29,7 +30,13 @@ void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], int16_t a[KYBER_K
            for (k = 0; k < 8; k++) {
                t[k]  = a[i][8 * j + k];
                t[k] += ((int16_t)t[k] >> 15) & KYBER_Q;
-                t[k]  = ((((uint32_t)t[k] << 11) + KYBER_Q / 2) / KYBER_Q) & 0x7ff;
+                // t[k]  = ((((uint32_t)t[k] << 11) + KYBER_Q / 2) / KYBER_Q) & 0x7ff;
                d0 = t[k];
                d0 <<= 11;
                d0 += 1664;
                d0 *= 645084;
                d0 >>= 31;
                t[k] = d0 & 0x7ff;
            }
            r[ 0] = (t[0] >>  0);
@ -53,7 +60,13 @@ void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], int16_t a[KYBER_K
            for (k = 0; k < 4; k++) {
                t[k]  = a[i][4 * j + k];
                t[k] += ((int16_t)t[k] >> 15) & KYBER_Q;
-                t[k]  = ((((uint32_t)t[k] << 10) + KYBER_Q / 2) / KYBER_Q) & 0x3ff;
+                // t[k]  = ((((uint32_t)t[k] << 10) + KYBER_Q / 2) / KYBER_Q) & 0x3ff;
                d0 = t[k];
                d0 <<= 10;
                d0 += 1665;
                d0 *= 1290167;
                d0 >>= 32;
                t[k] = d0 & 0x3ff;
            }
            r[0] = (t[0] >> 0);
--- a/src/kem/kyber/pqcrystals-kyber_kyber1024_avx2/verify.c
+++ b/src/kem/kyber/pqcrystals-kyber_kyber1024_avx2/verify.c
@ -57,6 +57,16 @@ void cmov(uint8_t * restrict r, const uint8_t *x, size_t len, uint8_t b)
  size_t i;
  __m256i xvec, rvec, bvec;
 #if defined(__GNUC__) || defined(__clang__)
  // Prevent the compiler from
  //    1) inferring that b is 0/1-valued, and
  //    2) handling the two cases with a branch.
  // This is not necessary when verify.c and kem.c are separate translation
  // units, but we expect that downstream consumers will copy this code and/or
  // change how it is built.
  __asm__("" : "+r"(b) : /* no inputs */);
 #endif
  bvec = _mm256_set1_epi64x(-(uint64_t)b);
  for(i=0;i<len/32;i++) {
    rvec = _mm256_loadu_si256((__m256i *)&r[32*i]);
--- a/src/kem/kyber/pqcrystals-kyber_kyber1024_ref/poly.c
+++ b/src/kem/kyber/pqcrystals-kyber_kyber1024_ref/poly.c
@ -19,6 +19,7 @@ void poly_compress(uint8_t r[KYBER_POLYCOMPRESSEDBYTES], const poly *a)
 {
  unsigned int i,j;
  int16_t u;
  uint32_t d0;
  uint8_t t[8];
 #if (KYBER_POLYCOMPRESSEDBYTES == 128)
@ -27,7 +28,12 @@ void poly_compress(uint8_t r[KYBER_POLYCOMPRESSEDBYTES], const poly *a)
      // map to positive standard representatives
      u  = a->coeffs[8*i+j];
      u += (u >> 15) & KYBER_Q;
-      t[j] = ((((uint16_t)u << 4) + KYBER_Q/2)/KYBER_Q) & 15;
+/*    t[j] = ((((uint16_t)u << 4) + KYBER_Q/2)/KYBER_Q) & 15; */
      d0 = u << 4;
      d0 += 1665;
      d0 *= 80635;
      d0 >>= 28;
      t[j] = d0 & 0xf;
    }
    r[0] = t[0] | (t[1] << 4);
@ -42,7 +48,12 @@ void poly_compress(uint8_t r[KYBER_POLYCOMPRESSEDBYTES], const poly *a)
      // map to positive standard representatives
      u  = a->coeffs[8*i+j];
      u += (u >> 15) & KYBER_Q;
-      t[j] = ((((uint32_t)u << 5) + KYBER_Q/2)/KYBER_Q) & 31;
+/*      t[j] = ((((uint32_t)u << 5) + KYBER_Q/2)/KYBER_Q) & 31; */
      d0 = u << 5;
      d0 += 1664;
      d0 *= 40318;
      d0 >>= 27;
      t[j] = d0 & 0x1f;
    }
    r[0] = (t[0] >> 0) | (t[1] << 5);
@ -180,14 +191,19 @@ void poly_frommsg(poly *r, const uint8_t msg[KYBER_INDCPA_MSGBYTES])
 void poly_tomsg(uint8_t msg[KYBER_INDCPA_MSGBYTES], const poly *a)
 {
  unsigned int i,j;
-  uint16_t t;
+  uint32_t t;
  for(i=0;i<KYBER_N/8;i++) {
    msg[i] = 0;
    for(j=0;j<8;j++) {
      t  = a->coeffs[8*i+j];
-      t += ((int16_t)t >> 15) & KYBER_Q;
+      // t += ((int16_t)t >> 15) & KYBER_Q;
-      t  = (((t << 1) + KYBER_Q/2)/KYBER_Q) & 1;
+      // t  = (((t << 1) + KYBER_Q/2)/KYBER_Q) & 1;
      t <<= 1;
      t += 1665;
      t *= 80635;
      t >>= 28;
      t &= 1;
      msg[i] |= t << j;
    }
  }
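
Note on `poly_tomsg` above: the new code comments out the sign-correction line and relies on unsigned 32-bit wraparound instead. The sketch below is not part of the commit. It assumes coefficients lie in the range -1664..3328 (centered or non-negative representatives) and checks that the multiply-and-shift form yields the same message bit as the old division form over that range. The names `tomsg_bit_div`, `tomsg_bit_mul` and `kyber_tomsg_mismatches` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define KYBER_Q 3329

/* Old form: map to a non-negative representative, then divide.
 * Written with a branch for clarity; this reference is not constant-time. */
static uint32_t tomsg_bit_div(int16_t a) {
    int32_t v = a;
    if (v < 0) {
        v += KYBER_Q;
    }
    return ((((uint32_t)v << 1) + KYBER_Q / 2) / KYBER_Q) & 1;
}

/* New form from the added lines: no sign correction, no division. */
static uint32_t tomsg_bit_mul(int16_t a) {
    uint32_t t = (uint32_t)a;   /* negative values wrap modulo 2^32 */
    t <<= 1;
    t += 1665;
    t *= 80635;
    t >>= 28;
    t &= 1;
    return t;
}

/* Hypothetical helper: counts disagreements over the assumed input range. */
unsigned kyber_tomsg_mismatches(void) {
    unsigned bad = 0;
    for (int32_t a = -1664; a <= 3328; a++) {
        bad += tomsg_bit_mul((int16_t)a) != tomsg_bit_div((int16_t)a);
    }
    return bad;
}

/* Hypothetical self-test entry point for a test harness. */
void kyber_tomsg_selftest(void) {
    assert(kyber_tomsg_mismatches() == 0);
}
```

Outside the assumed range the two forms can differ (for example at a = -2497), so the sketch says nothing about callers that pass unreduced coefficients.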
--- a/src/kem/kyber/pqcrystals-kyber_kyber1024_ref/polyvec.c
+++ b/src/kem/kyber/pqcrystals-kyber_kyber1024_ref/polyvec.c
@ -15,6 +15,7 @@
 void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], const polyvec *a)
 {
  unsigned int i,j,k;
  uint64_t d0;
 #if (KYBER_POLYVECCOMPRESSEDBYTES == (KYBER_K * 352))
  uint16_t t[8];
@ -23,7 +24,13 @@ void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], const polyvec *a)
      for(k=0;k<8;k++) {
        t[k]  = a->vec[i].coeffs[8*j+k];
        t[k] += ((int16_t)t[k] >> 15) & KYBER_Q;
-        t[k]  = ((((uint32_t)t[k] << 11) + KYBER_Q/2)/KYBER_Q) & 0x7ff;
+/*      t[k]  = ((((uint32_t)t[k] << 11) + KYBER_Q/2)/KYBER_Q) & 0x7ff; */
        d0 = t[k];
        d0 <<= 11;
        d0 += 1664;
        d0 *= 645084;
        d0 >>= 31;
        t[k] = d0 & 0x7ff;
      }
      r[ 0] = (t[0] >>  0);
@ -47,7 +54,13 @@ void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], const polyvec *a)
      for(k=0;k<4;k++) {
        t[k]  = a->vec[i].coeffs[4*j+k];
        t[k] += ((int16_t)t[k] >> 15) & KYBER_Q;
-        t[k]  = ((((uint32_t)t[k] << 10) + KYBER_Q/2)/ KYBER_Q) & 0x3ff;
+/*      t[k]  = ((((uint32_t)t[k] << 10) + KYBER_Q/2)/ KYBER_Q) & 0x3ff; */
        d0 = t[k];
        d0 <<= 10;
        d0 += 1665;
        d0 *= 1290167;
        d0 >>= 32;
        t[k] = d0 & 0x3ff;
      }
      r[0] = (t[0] >> 0);
--- a/src/kem/kyber/pqcrystals-kyber_kyber1024_ref/verify.c
+++ b/src/kem/kyber/pqcrystals-kyber_kyber1024_ref/verify.c
@ -41,6 +41,16 @@ void cmov(uint8_t *r, const uint8_t *x, size_t len, uint8_t b)
 {
  size_t i;
 #if defined(__GNUC__) || defined(__clang__)
  // Prevent the compiler from
  //    1) inferring that b is 0/1-valued, and
  //    2) handling the two cases with a branch.
  // This is not necessary when verify.c and kem.c are separate translation
  // units, but we expect that downstream consumers will copy this code and/or
  // change how it is built.
  __asm__("" : "+r"(b) : /* no inputs */);
 #endif
  b = -b;
  for(i=0;i<len;i++)
    r[i] ^= b & (r[i] ^ x[i]);
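
Note on `cmov` above: the empty `__asm__` statement marks `b` as possibly modified, which keeps the compiler from reasoning that `b` is 0 or 1 and turning the masked XOR into a branch. The sketch below is a standalone illustration, not part of the commit. It reproduces the pattern under the hypothetical name `cmov_sketch` so it can be compiled and exercised in isolation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies len bytes from x to r if b == 1; leaves r untouched if b == 0.
 * Any other value of b is outside the contract, as in the original. */
void cmov_sketch(uint8_t *r, const uint8_t *x, size_t len, uint8_t b) {
    size_t i;

#if defined(__GNUC__) || defined(__clang__)
    /* Value barrier: the compiler must treat b as an unknown byte here. */
    __asm__("" : "+r"(b) : /* no inputs */);
#endif

    b = -b;                       /* 0x00 stays 0x00, 0x01 becomes 0xFF */
    for (i = 0; i < len; i++) {
        r[i] ^= b & (r[i] ^ x[i]);
    }
}

/* Hypothetical self-test: exercises both values of the condition byte. */
void cmov_sketch_selftest(void) {
    uint8_t r[4] = {1, 2, 3, 4};
    const uint8_t x[4] = {9, 8, 7, 6};
    const uint8_t before[4] = {1, 2, 3, 4};

    cmov_sketch(r, x, sizeof r, 0);
    assert(memcmp(r, before, sizeof r) == 0);   /* b == 0: no change */

    cmov_sketch(r, x, sizeof r, 1);
    assert(memcmp(r, x, sizeof r) == 0);        /* b == 1: r now equals x */
}
```

The self-test checks functional behavior only. Whether the generated code is branch-free depends on the compiler and flags and has to be confirmed by inspecting the assembly.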
--- a/src/kem/kyber/pqcrystals-kyber_kyber512_avx2/verify.c
+++ b/src/kem/kyber/pqcrystals-kyber_kyber512_avx2/verify.c
@ -57,6 +57,16 @@ void cmov(uint8_t * restrict r, const uint8_t *x, size_t len, uint8_t b)
  size_t i;
  __m256i xvec, rvec, bvec;
 #if defined(__GNUC__) || defined(__clang__)
  // Prevent the compiler from
  //    1) inferring that b is 0/1-valued, and
  //    2) handling the two cases with a branch.
  // This is not necessary when verify.c and kem.c are separate translation
  // units, but we expect that downstream consumers will copy this code and/or
  // change how it is built.
  __asm__("" : "+r"(b) : /* no inputs */);
 #endif
  bvec = _mm256_set1_epi64x(-(uint64_t)b);
  for(i=0;i<len/32;i++) {
    rvec = _mm256_loadu_si256((__m256i *)&r[32*i]);
--- a/src/kem/kyber/pqcrystals-kyber_kyber512_ref/poly.c
+++ b/src/kem/kyber/pqcrystals-kyber_kyber512_ref/poly.c
@ -19,6 +19,7 @@ void poly_compress(uint8_t r[KYBER_POLYCOMPRESSEDBYTES], const poly *a)
 {
  unsigned int i,j;
  int16_t u;
  uint32_t d0;
  uint8_t t[8];
 #if (KYBER_POLYCOMPRESSEDBYTES == 128)
@ -27,7 +28,12 @@ void poly_compress(uint8_t r[KYBER_POLYCOMPRESSEDBYTES], const poly *a)
      // map to positive standard representatives
      u  = a->coeffs[8*i+j];
      u += (u >> 15) & KYBER_Q;
-      t[j] = ((((uint16_t)u << 4) + KYBER_Q/2)/KYBER_Q) & 15;
+/*    t[j] = ((((uint16_t)u << 4) + KYBER_Q/2)/KYBER_Q) & 15; */
      d0 = u << 4;
      d0 += 1665;
      d0 *= 80635;
      d0 >>= 28;
      t[j] = d0 & 0xf;
    }
    r[0] = t[0] | (t[1] << 4);
@ -42,7 +48,12 @@ void poly_compress(uint8_t r[KYBER_POLYCOMPRESSEDBYTES], const poly *a)
      // map to positive standard representatives
      u  = a->coeffs[8*i+j];
      u += (u >> 15) & KYBER_Q;
-      t[j] = ((((uint32_t)u << 5) + KYBER_Q/2)/KYBER_Q) & 31;
+/*      t[j] = ((((uint32_t)u << 5) + KYBER_Q/2)/KYBER_Q) & 31; */
      d0 = u << 5;
      d0 += 1664;
      d0 *= 40318;
      d0 >>= 27;
      t[j] = d0 & 0x1f;
    }
    r[0] = (t[0] >> 0) | (t[1] << 5);
@ -180,14 +191,19 @@ void poly_frommsg(poly *r, const uint8_t msg[KYBER_INDCPA_MSGBYTES])
 void poly_tomsg(uint8_t msg[KYBER_INDCPA_MSGBYTES], const poly *a)
 {
  unsigned int i,j;
-  uint16_t t;
+  uint32_t t;
  for(i=0;i<KYBER_N/8;i++) {
    msg[i] = 0;
    for(j=0;j<8;j++) {
      t  = a->coeffs[8*i+j];
-      t += ((int16_t)t >> 15) & KYBER_Q;
+      // t += ((int16_t)t >> 15) & KYBER_Q;
-      t  = (((t << 1) + KYBER_Q/2)/KYBER_Q) & 1;
+      // t  = (((t << 1) + KYBER_Q/2)/KYBER_Q) & 1;
      t <<= 1;
      t += 1665;
      t *= 80635;
      t >>= 28;
      t &= 1;
      msg[i] |= t << j;
    }
  }
--- a/src/kem/kyber/pqcrystals-kyber_kyber512_ref/polyvec.c
+++ b/src/kem/kyber/pqcrystals-kyber_kyber512_ref/polyvec.c
@ -15,6 +15,7 @@
 void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], const polyvec *a)
 {
  unsigned int i,j,k;
  uint64_t d0;
 #if (KYBER_POLYVECCOMPRESSEDBYTES == (KYBER_K * 352))
  uint16_t t[8];
@ -23,7 +24,13 @@ void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], const polyvec *a)
      for(k=0;k<8;k++) {
        t[k]  = a->vec[i].coeffs[8*j+k];
        t[k] += ((int16_t)t[k] >> 15) & KYBER_Q;
-        t[k]  = ((((uint32_t)t[k] << 11) + KYBER_Q/2)/KYBER_Q) & 0x7ff;
+/*      t[k]  = ((((uint32_t)t[k] << 11) + KYBER_Q/2)/KYBER_Q) & 0x7ff; */
        d0 = t[k];
        d0 <<= 11;
        d0 += 1664;
        d0 *= 645084;
        d0 >>= 31;
        t[k] = d0 & 0x7ff;
      }
      r[ 0] = (t[0] >>  0);
@ -47,7 +54,13 @@ void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], const polyvec *a)
      for(k=0;k<4;k++) {
        t[k]  = a->vec[i].coeffs[4*j+k];
        t[k] += ((int16_t)t[k] >> 15) & KYBER_Q;
-        t[k]  = ((((uint32_t)t[k] << 10) + KYBER_Q/2)/ KYBER_Q) & 0x3ff;
+/*      t[k]  = ((((uint32_t)t[k] << 10) + KYBER_Q/2)/ KYBER_Q) & 0x3ff; */
        d0 = t[k];
        d0 <<= 10;
        d0 += 1665;
        d0 *= 1290167;
        d0 >>= 32;
        t[k] = d0 & 0x3ff;
      }
      r[0] = (t[0] >> 0);
--- a/src/kem/kyber/pqcrystals-kyber_kyber512_ref/verify.c
+++ b/src/kem/kyber/pqcrystals-kyber_kyber512_ref/verify.c
@ -41,6 +41,16 @@ void cmov(uint8_t *r, const uint8_t *x, size_t len, uint8_t b)
 {
  size_t i;
 #if defined(__GNUC__) || defined(__clang__)
  // Prevent the compiler from
  //    1) inferring that b is 0/1-valued, and
  //    2) handling the two cases with a branch.
  // This is not necessary when verify.c and kem.c are separate translation
  // units, but we expect that downstream consumers will copy this code and/or
  // change how it is built.
  __asm__("" : "+r"(b) : /* no inputs */);
 #endif
  b = -b;
  for(i=0;i<len;i++)
    r[i] ^= b & (r[i] ^ x[i]);
--- a/src/kem/kyber/pqcrystals-kyber_kyber768_avx2/verify.c
+++ b/src/kem/kyber/pqcrystals-kyber_kyber768_avx2/verify.c
@ -57,6 +57,16 @@ void cmov(uint8_t * restrict r, const uint8_t *x, size_t len, uint8_t b)
  size_t i;
  __m256i xvec, rvec, bvec;
 #if defined(__GNUC__) || defined(__clang__)
  // Prevent the compiler from
  //    1) inferring that b is 0/1-valued, and
  //    2) handling the two cases with a branch.
  // This is not necessary when verify.c and kem.c are separate translation
  // units, but we expect that downstream consumers will copy this code and/or
  // change how it is built.
  __asm__("" : "+r"(b) : /* no inputs */);
 #endif
  bvec = _mm256_set1_epi64x(-(uint64_t)b);
  for(i=0;i<len/32;i++) {
    rvec = _mm256_loadu_si256((__m256i *)&r[32*i]);
--- a/src/kem/kyber/pqcrystals-kyber_kyber768_ref/poly.c
+++ b/src/kem/kyber/pqcrystals-kyber_kyber768_ref/poly.c
@ -19,6 +19,7 @@ void poly_compress(uint8_t r[KYBER_POLYCOMPRESSEDBYTES], const poly *a)
 {
  unsigned int i,j;
  int16_t u;
  uint32_t d0;
  uint8_t t[8];
 #if (KYBER_POLYCOMPRESSEDBYTES == 128)
@ -27,7 +28,12 @@ void poly_compress(uint8_t r[KYBER_POLYCOMPRESSEDBYTES], const poly *a)
      // map to positive standard representatives
      u  = a->coeffs[8*i+j];
      u += (u >> 15) & KYBER_Q;
-      t[j] = ((((uint16_t)u << 4) + KYBER_Q/2)/KYBER_Q) & 15;
+/*    t[j] = ((((uint16_t)u << 4) + KYBER_Q/2)/KYBER_Q) & 15; */
      d0 = u << 4;
      d0 += 1665;
      d0 *= 80635;
      d0 >>= 28;
      t[j] = d0 & 0xf;
    }
    r[0] = t[0] | (t[1] << 4);
@ -42,7 +48,12 @@ void poly_compress(uint8_t r[KYBER_POLYCOMPRESSEDBYTES], const poly *a)
      // map to positive standard representatives
      u  = a->coeffs[8*i+j];
      u += (u >> 15) & KYBER_Q;
-      t[j] = ((((uint32_t)u << 5) + KYBER_Q/2)/KYBER_Q) & 31;
+/*      t[j] = ((((uint32_t)u << 5) + KYBER_Q/2)/KYBER_Q) & 31; */
      d0 = u << 5;
      d0 += 1664;
      d0 *= 40318;
      d0 >>= 27;
      t[j] = d0 & 0x1f;
    }
    r[0] = (t[0] >> 0) | (t[1] << 5);
@ -180,14 +191,19 @@ void poly_frommsg(poly *r, const uint8_t msg[KYBER_INDCPA_MSGBYTES])
 void poly_tomsg(uint8_t msg[KYBER_INDCPA_MSGBYTES], const poly *a)
 {
  unsigned int i,j;
-  uint16_t t;
+  uint32_t t;
  for(i=0;i<KYBER_N/8;i++) {
    msg[i] = 0;
    for(j=0;j<8;j++) {
      t  = a->coeffs[8*i+j];
-      t += ((int16_t)t >> 15) & KYBER_Q;
+      // t += ((int16_t)t >> 15) & KYBER_Q;
-      t  = (((t << 1) + KYBER_Q/2)/KYBER_Q) & 1;
+      // t  = (((t << 1) + KYBER_Q/2)/KYBER_Q) & 1;
      t <<= 1;
      t += 1665;
      t *= 80635;
      t >>= 28;
      t &= 1;
      msg[i] |= t << j;
    }
  }
--- a/src/kem/kyber/pqcrystals-kyber_kyber768_ref/polyvec.c
+++ b/src/kem/kyber/pqcrystals-kyber_kyber768_ref/polyvec.c
@ -15,6 +15,7 @@
 void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], const polyvec *a)
 {
  unsigned int i,j,k;
  uint64_t d0;
 #if (KYBER_POLYVECCOMPRESSEDBYTES == (KYBER_K * 352))
  uint16_t t[8];
@ -23,7 +24,13 @@ void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], const polyvec *a)
      for(k=0;k<8;k++) {
        t[k]  = a->vec[i].coeffs[8*j+k];
        t[k] += ((int16_t)t[k] >> 15) & KYBER_Q;
-        t[k]  = ((((uint32_t)t[k] << 11) + KYBER_Q/2)/KYBER_Q) & 0x7ff;
+/*      t[k]  = ((((uint32_t)t[k] << 11) + KYBER_Q/2)/KYBER_Q) & 0x7ff; */
        d0 = t[k];
        d0 <<= 11;
        d0 += 1664;
        d0 *= 645084;
        d0 >>= 31;
        t[k] = d0 & 0x7ff;
      }
      r[ 0] = (t[0] >>  0);
@ -47,7 +54,13 @@ void polyvec_compress(uint8_t r[KYBER_POLYVECCOMPRESSEDBYTES], const polyvec *a)
      for(k=0;k<4;k++) {
        t[k]  = a->vec[i].coeffs[4*j+k];
        t[k] += ((int16_t)t[k] >> 15) & KYBER_Q;
-        t[k]  = ((((uint32_t)t[k] << 10) + KYBER_Q/2)/ KYBER_Q) & 0x3ff;
+/*      t[k]  = ((((uint32_t)t[k] << 10) + KYBER_Q/2)/ KYBER_Q) & 0x3ff; */
        d0 = t[k];
        d0 <<= 10;
        d0 += 1665;
        d0 *= 1290167;
        d0 >>= 32;
        t[k] = d0 & 0x3ff;
      }
      r[0] = (t[0] >> 0);
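
The hunks above replace each rounded division by `KYBER_Q` with an add, a multiply by a precomputed constant, and a right shift. The following is a minimal sketch, not part of the patch, for checking that the two forms agree on every canonical coefficient in `0..KYBER_Q-1`. The helper names `compress_div`, `compress_mul` and `count_mismatches` are hypothetical, and the sketch assumes a 64-bit accumulator for all widths, whereas the patch uses `uint32_t` in `poly_compress`.

```c
#include <stdint.h>

#define KYBER_Q 3329

/* Rounded division as in the removed lines:
 * round(x * 2^d / q) reduced mod 2^d. (Hypothetical helper.) */
static uint32_t compress_div(uint32_t x, unsigned d)
{
    return (((x << d) + KYBER_Q / 2) / KYBER_Q) & ((1u << d) - 1);
}

/* Division-free form as in the added lines, parameterised by the
 * addend, multiplier and shift that appear in the diff.
 * Assumption: a 64-bit accumulator is used for every width. */
static uint32_t compress_mul(uint32_t x, unsigned d,
                             uint64_t add, uint64_t mul, unsigned shift)
{
    uint64_t d0 = x;
    d0 <<= d;
    d0 += add;
    d0 *= mul;
    d0 >>= shift;
    return (uint32_t)(d0 & ((1u << d) - 1));
}

/* Count inputs in [0, KYBER_Q) on which the two forms disagree. */
static unsigned count_mismatches(unsigned d, uint64_t add,
                                 uint64_t mul, unsigned shift)
{
    unsigned bad = 0;
    for (uint32_t x = 0; x < KYBER_Q; x++) {
        if (compress_div(x, d) != compress_mul(x, d, add, mul, shift))
            bad++;
    }
    return bad;
}
```

With the constants taken from the diff, the parameter sets to check are `(4, 1665, 80635, 28)`, `(5, 1664, 40318, 27)`, `(10, 1665, 1290167, 32)` and `(11, 1664, 645084, 31)`. Each call to `count_mismatches` is expected to return 0.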
--- a/src/kem/kyber/pqcrystals-kyber_kyber768_ref/verify.c
+++ b/src/kem/kyber/pqcrystals-kyber_kyber768_ref/verify.c
@ -41,6 +41,16 @@ void cmov(uint8_t *r, const uint8_t *x, size_t len, uint8_t b)
 {
  size_t i;
 #if defined(__GNUC__) || defined(__clang__)
  // Prevent the compiler from
  //    1) inferring that b is 0/1-valued, and
  //    2) handling the two cases with a branch.
  // This is not necessary when verify.c and kem.c are separate translation
  // units, but we expect that downstream consumers will copy this code and/or
  // change how it is built.
  __asm__("" : "+r"(b) : /* no inputs */);
 #endif
  b = -b;
  for(i=0;i<len;i++)
    r[i] ^= b & (r[i] ^ x[i]);
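
To illustrate how the patched `cmov` behaves from a caller's point of view, here is a small standalone sketch modelled on the reference hunk above. The name `cmov_sketch` is hypothetical. The empty `__asm__` statement is only compiled for GCC and Clang, as in the patch, and whether a given compiler would otherwise emit a branch is an assumption that depends on version and flags.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy x into r when b == 1, leave r unchanged when b == 0,
 * without branching on b. (Hypothetical standalone copy.) */
static void cmov_sketch(uint8_t *r, const uint8_t *x, size_t len, uint8_t b)
{
    size_t i;

#if defined(__GNUC__) || defined(__clang__)
    /* Value barrier: the compiler must treat b as an unknown register
     * value from here on, so it cannot assume b is 0 or 1. */
    __asm__("" : "+r"(b) : /* no inputs */);
#endif

    b = (uint8_t)-b;            /* 0x00 stays 0x00, 0x01 becomes 0xFF */
    for (i = 0; i < len; i++)
        r[i] ^= b & (r[i] ^ x[i]);
}
```

Calling `cmov_sketch(r, x, len, 0)` leaves `r` untouched, while `cmov_sketch(r, x, len, 1)` overwrites `r` with `x`. Callers are expected to pass only 0 or 1 for `b`.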
Author	SHA1	Message	Date
Douglas Stebila	9922f7cd13	Release notes for 0.9.2-rc1	2024-01-11 17:54:58 +01:00
Spencer Wilson	4522fae9a4	Run copy_from_upstream	2024-01-08 11:51:32 -05:00
Spencer Wilson	6252372d47	Checkout post-0.9.0 copy_from_upstream fixes	2024-01-08 11:51:32 -05:00
Spencer Wilson	b0c20b9fce	Update ARM patch	2024-01-08 11:51:32 -05:00
Pravek Sharma	9ffad26326	Run copy_from_upstream.py	2024-01-08 11:51:32 -05:00
Pravek Sharma	58eae24cce	Update to latest Kyber commit	2024-01-08 11:51:32 -05:00
Pravek Sharma	0c0675d180	Run copy_from_upstream.py -k	2024-01-08 11:51:32 -05:00
Pravek Sharma	9c42d64705	Update copy_from_upstream.yml	2024-01-08 11:51:32 -05:00
Douglas Stebila	31f570b553	0.9.2 dev branch	2024-01-02 13:09:51 -05:00
Douglas Stebila	7a680dff97	Release notes for 0.9.1	2023-12-22 15:27:57 -05:00
Douglas Stebila	0ab83c8fe4	Detailed changelog [skip ci]	2023-12-19 15:17:06 -05:00
Douglas Stebila	d9a34c93d3	Release notes for 0.9.1-rc1 [skip-ci]	2023-12-19 15:13:20 -05:00
Basil Hess	f22c8316f9	Adds patch to aarch64 Kyber pulled from PQClean for variable-time division in poly_tomsg.	2023-12-19 14:58:37 -05:00
Basil Hess	e68dbc6f6e	update .travis.yml (#1629)	2023-12-19 11:25:34 -05:00
Basil Hess	5197b9e125	pull kyber from upstream: dda29cc63af721981ee2c831cf00822e69be3220 (#1631)	2023-12-19 11:25:34 -05:00