---
slug: c11-fips203ipd-v0.6
title: "C11 FIPS 203 IPD v0.6"
date: "2024-05-15T04:16:06-04:00"
pics:
  bench-x1-svg:
    css: "image"
    tip: "Median cycles by backend, Lenovo ThinkPad X1 Carbon, 6th Gen (i7-1185G7)."
    sources:
      - src: "/files/posts/c11-fips203ipd-v0.6/x1-results.svg"
        width: 960
        height: 480
  bench-pi5-svg:
    css: "image"
    tip: "Median cycles by backend, Raspberry Pi 5 (Cortex-A76)."
    sources:
      - src: "/files/posts/c11-fips203ipd-v0.6/pi5-results.svg"
        width: 960
        height: 480
  bench-n2l-svg:
    css: "image"
    tip: "Median cycles by backend, Odroid N2L (Cortex-A73)."
    sources:
      - src: "/files/posts/c11-fips203ipd-v0.6/n2l-results.svg"
        width: 960
        height: 480
tables:
  bench-x1:
    # table columns (required)
    cols:
      - id: "set"
        name: "Set"
        tip: "Parameter set."
      - id: "function"
        name: "Function"
        tip: "Function."
      - id: "scalar-gcc"
        name: "Scalar (GCC)"
        tip: "Median number of CPU cycles when using the scalar backend compiled with GCC."
        align: "right"
      - id: "scalar-clang"
        name: "Scalar (Clang)"
        tip: "Median number of CPU cycles when using the scalar backend compiled with Clang."
        align: "right"
      - id: "simd-gcc"
        name: "AVX-512 (GCC)"
        tip: "Median number of CPU cycles when using the AVX-512 backend compiled with GCC."
        align: "right"
    # table rows (required)
    rows:
      - set: "kem512"
        function: "keygen"
        scalar-gcc: "118733"
        scalar-clang: "70770"
        simd-gcc: "17448"
      - set: "kem512"
        function: "encaps"
        scalar-gcc: "126159"
        scalar-clang: "82713"
        simd-gcc: "21474"
      - set: "kem512"
        function: "decaps"
        scalar-gcc: "185426"
        scalar-clang: "97722"
        simd-gcc: "25685"
      - set: "kem768"
        function: "keygen"
        scalar-gcc: "172446"
        scalar-clang: "110192"
        simd-gcc: "29334"
      - set: "kem768"
        function: "encaps"
        scalar-gcc: "184614"
        scalar-clang: "132385"
        simd-gcc: "32528"
      - set: "kem768"
        function: "decaps"
        scalar-gcc: "234564"
        scalar-clang: "148425"
        simd-gcc: "38184"
      - set: "kem1024"
        function: "keygen"
        scalar-gcc: "268327"
        scalar-clang: "176256"
        simd-gcc: "39914"
      - set: "kem1024"
        function: "encaps"
        scalar-gcc: "270793"
        scalar-clang: "206497"
        simd-gcc: "45268"
      - set: "kem1024"
        function: "decaps"
        scalar-gcc: "370533"
        scalar-clang: "224686"
        simd-gcc: "52523"
  bench-pi5:
    # table columns (required)
    cols:
      - id: "set"
        name: "Set"
        tip: "Parameter set."
      - id: "function"
        name: "Function"
        tip: "Function."
      - id: "scalar-gcc"
        name: "Scalar (GCC)"
        tip: "Median number of CPU cycles when using the scalar backend compiled with GCC."
        align: "right"
      - id: "scalar-clang"
        name: "Scalar (Clang)"
        tip: "Median number of CPU cycles when using the scalar backend compiled with Clang."
        align: "right"
      - id: "simd-gcc"
        name: "Neon (GCC)"
        tip: "Median number of CPU cycles when using the Neon backend compiled with GCC."
        align: "right"
    # table rows (required)
    rows:
      - set: "kem512"
        function: "keygen"
        scalar-gcc: "127403"
        scalar-clang: "77030"
        simd-gcc: "53667"
      - set: "kem512"
        function: "encaps"
        scalar-gcc: "132432"
        scalar-clang: "90335"
        simd-gcc: "61321"
      - set: "kem512"
        function: "decaps"
        scalar-gcc: "176620"
        scalar-clang: "107868"
        simd-gcc: "73647"
      - set: "kem768"
        function: "keygen"
        scalar-gcc: "197268"
        scalar-clang: "114009"
        simd-gcc: "92471"
      - set: "kem768"
        function: "encaps"
        scalar-gcc: "205189"
        scalar-clang: "140042"
        simd-gcc: "104842"
      - set: "kem768"
        function: "decaps"
        scalar-gcc: "265442"
        scalar-clang: "162514"
        simd-gcc: "121529"
      - set: "kem1024"
        function: "keygen"
        scalar-gcc: "292543"
        scalar-clang: "180492"
        simd-gcc: "140220"
      - set: "kem1024"
        function: "encaps"
        scalar-gcc: "298150"
        scalar-clang: "212488"
        simd-gcc: "155127"
      - set: "kem1024"
        function: "decaps"
        scalar-gcc: "376114"
        scalar-clang: "242303"
        simd-gcc: "176042"
  bench-n2l:
    # table columns (required)
    cols:
      - id: "set"
        name: "Set"
        tip: "Parameter set."
      - id: "function"
        name: "Function"
        tip: "Function."
      - id: "scalar-gcc"
        name: "Scalar (GCC)"
        tip: "Median number of CPU cycles when using the scalar backend compiled with GCC."
        align: "right"
      - id: "scalar-clang"
        name: "Scalar (Clang)"
        tip: "Median number of CPU cycles when using the scalar backend compiled with Clang."
        align: "right"
      - id: "simd-gcc"
        name: "Neon (GCC)"
        tip: "Median number of CPU cycles when using the Neon backend compiled with GCC."
        align: "right"
    # table rows (required)
    rows:
      - set: "kem512"
        function: "keygen"
        scalar-gcc: "210900"
        scalar-clang: "123675"
        simd-gcc: "89625"
      - set: "kem512"
        function: "encaps"
        scalar-gcc: "216750"
        scalar-clang: "143325"
        simd-gcc: "101550"
      - set: "kem512"
        function: "decaps"
        scalar-gcc: "298050"
        scalar-clang: "173550"
        simd-gcc: "122475"
      - set: "kem768"
        function: "keygen"
        scalar-gcc: "325050"
        scalar-clang: "179025"
        simd-gcc: "153525"
      - set: "kem768"
        function: "encaps"
        scalar-gcc: "331725"
        scalar-clang: "219900"
        simd-gcc: "173325"
      - set: "kem768"
        function: "decaps"
        scalar-gcc: "444600"
        scalar-clang: "259350"
        simd-gcc: "201900"
      - set: "kem1024"
        function: "keygen"
        scalar-gcc: "482625"
        scalar-clang: "285375"
        simd-gcc: "234075"
      - set: "kem1024"
        function: "encaps"
        scalar-gcc: "475500"
        scalar-clang: "335025"
        simd-gcc: "256650"
      - set: "kem1024"
        function: "decaps"
        scalar-gcc: "619725"
        scalar-clang: "384825"
        simd-gcc: "293250"
---
I just released v0.6 of [fips203ipd][fips203ipd-git].

[fips203ipd][fips203ipd-git] is an embeddable, dependency-free,
[MIT-0][]-licensed, [C11][] implementation of the [FIPS 203 initial
public draft (IPD)][fips203ipd] with scalar, [AVX-512][], and [Neon][]
backends. The final version of [FIPS 203][fips203ipd] will become
ML-KEM, [NIST's][nist] standardized post-quantum [key encapsulation
mechanism (KEM)][kem].

[Git Repository][fips203ipd-git],
[API Documentation][fips203ipd-api-docs],
[Original Announcement][fips203ipd-announce],
[pqc-forum Announcement][pqc-forum-announce]

### Changes in v0.6

- Add [Neon][] backend
- Add macOS support to test suite (thanks [Rod][rod-chapman]!)
- Add backend auto-detection, `BACKEND` command-line build parameter, and `fips203ipd_backend()` function
- Add [Raspberry Pi 5 (Cortex-A76)][pi5] benchmarks
- Add "Backends" documentation section with brief notes about each backend

### Benchmarks

Here are the median cycle counts, as measured by the included `bench`
tool, for each parameter set, function, compiler, and backend on several
of my systems.

For context, the results below are competitive with the [eBATS][]
results ([kyber512][], [kyber768][], [kyber1024][]), although the
comparison is inexact because the results were measured with different
tools and because [Kyber][] and ML-KEM differ slightly.

#### Lenovo ThinkPad X1 Carbon, 6th Gen (x86-64 i7-1185G7)

[{{< pe-figure "bench-x1-svg" >}}][bench-x1-svg]

{{< table "bench-x1" >}}

[Download CSV][bench-x1-csv]

#### Raspberry Pi 5 (ARM Cortex-A76)

[{{< pe-figure "bench-pi5-svg" >}}][bench-pi5-svg]

{{< table "bench-pi5" >}}

[Download CSV][bench-pi5-csv]

#### Odroid N2L (ARM Cortex-A73)

[{{< pe-figure "bench-n2l-svg" >}}][bench-n2l-svg]

{{< table "bench-n2l" >}}

[Download CSV][bench-n2l-csv]

**Update (2024-05-16):** Added cycle counts for the scalar backend
(Clang and GCC), added bar charts, and added downloadable [CSVs][csv].
The [CSVs][csv] and [SVGs][svg] were generated by the [Python][] scripts
in the [`scripts/bench-chart/` directory of the Git
repository][bench-chart].

[fips203ipd-git]: https://github.com/pablotron/fips203ipd
  "Embeddable, dependency-free, MIT-0 licensed, C11 implementation of the FIPS 203 initial public draft (IPD)."
[fips203ipd-api-docs]: https://pmdn.org/api-docs/fips203ipd/
  "Online API documentation"
[fips203ipd-announce]: {{< relref "posts/2023-10-07-c11-fips203ipd.md" >}}
  "Original release announcement."
[mit-0]: https://opensource.org/license/mit-0/
  "MIT No Attribution License"
[C11]: https://en.wikipedia.org/wiki/C11_(C_standard_revision)
  "ISO/IEC 9899:2011"
[FIPS 202]: https://csrc.nist.gov/pubs/fips/202/final
  "SHA-3 Standard: Permutation-Based Hash and Extendable-Output Functions"
[800-185]: https://csrc.nist.gov/pubs/sp/800/185/final
  "SHA-3 Derived Functions: cSHAKE, KMAC, TupleHash, and ParallelHash"
[cavp]: https://csrc.nist.gov/Projects/Cryptographic-Algorithm-Validation-Program/Secure-Hashing
  "NIST Cryptographic Algorithm Validation Program (CAVP)"
[turboshake]: https://eprint.iacr.org/2023/342.pdf
  "TurboSHAKE"
[turboshake-ietf]: https://www.ietf.org/archive/id/draft-irtf-cfrg-kangarootwelve-10.html
  "KangarooTwelve and TurboSHAKE"
[turboshake-ietf-test-vectors]: https://www.ietf.org/archive/id/draft-irtf-cfrg-kangarootwelve-10.html#name-test-vectors
  "KangarooTwelve and TurboSHAKE test vectors"
[csrc-examples]: https://csrc.nist.gov/projects/cryptographic-standards-and-guidelines/example-values
  "NIST CSRC: Cryptographic Standards and Guidelines: Examples with Intermediate Values"
[fips203ipd]: https://csrc.nist.gov/pubs/fips/203/ipd
  "FIPS 203 (Initial Public Draft): Module-Lattice-Based Key-Encapsulation Mechanism Standard"
[kem]: https://en.wikipedia.org/wiki/Key_encapsulation_mechanism
  "Key encapsulation mechanism."
[nist]: https://nist.gov/
  "National Institute of Standards and Technology"
[avx512]: https://en.wikipedia.org/wiki/AVX-512
  "Advanced Vector Extensions (AVX) SIMD instructions."
[barrett reduction]: https://en.wikipedia.org/wiki/Barrett_reduction
  "Barrett modular reduction"
[nist-tests]: https://csrc.nist.gov/Projects/post-quantum-cryptography/post-quantum-cryptography-standardization/example-files
  "NIST: Intermediate Values for draft ML-KEM and draft ML-DSA"
[avx-512]: https://en.wikipedia.org/wiki/AVX-512
  "AVX-512: 512-bit extensions to the Advanced Vector Extensions (AVX) instruction set."
[intrinsics]: https://en.wikipedia.org/wiki/Intrinsic_function
  "Built-in compiler functions"
[libcpucycles]: https://cpucycles.cr.yp.to/
  "CPU cycle counting library."
[csv]: https://en.wikipedia.org/wiki/Comma-separated_values
  "Comma-separated values (CSV)"
[neon]: https://en.wikipedia.org/wiki/ARM_architecture_family#Advanced_SIMD_(Neon)
  "Advanced SIMD extension for ARM CPUs"
[pi5]: https://en.wikipedia.org/wiki/Raspberry_Pi
  "Raspberry Pi"
[rod-chapman]: https://github.com/rod-chapman
  "Rod Chapman"
[pqc-forum-announce]: https://groups.google.com/a/list.nist.gov/g/pqc-forum/c/mxWWySY9rB4
  "fips203ipd v0.5 release announcement on the pqc-forum mailing list"
[ebats]: http://bench.cr.yp.to/ebats.html
  "eBATS: ECRYPT Benchmarking of Asymmetric Systems"
[kyber512]: http://bench.cr.yp.to/impl-kem/kyber512.html
  "eBATS: kyber512"
[kyber768]: http://bench.cr.yp.to/impl-kem/kyber768.html
  "eBATS: kyber768"
[kyber1024]: http://bench.cr.yp.to/impl-kem/kyber1024.html
  "eBATS: kyber1024"
[kyber]: https://pq-crystals.org/kyber/
  "Kyber"
[bench-x1-svg]: /files/posts/c11-fips203ipd-v0.6/x1-results.svg
  "View SVG of median CPU cycles by backend, Lenovo ThinkPad X1 Carbon, 6th Gen (x86-64 i7-1185G7)."
[bench-x1-csv]: /files/posts/c11-fips203ipd-v0.6/x1-results.csv
  "Download CSV of median CPU cycles by backend, Lenovo ThinkPad X1 Carbon, 6th Gen (x86-64 i7-1185G7)."
[bench-pi5-svg]: /files/posts/c11-fips203ipd-v0.6/pi5-results.svg
  "View SVG of median CPU cycles by backend, Raspberry Pi 5 (ARM Cortex-A76)."
[bench-pi5-csv]: /files/posts/c11-fips203ipd-v0.6/pi5-results.csv
  "Download CSV of median CPU cycles by backend, Raspberry Pi 5 (ARM Cortex-A76)."
[bench-n2l-svg]: /files/posts/c11-fips203ipd-v0.6/n2l-results.svg
  "View SVG of median CPU cycles by backend, Odroid N2L (ARM Cortex-A73)."
[bench-n2l-csv]: /files/posts/c11-fips203ipd-v0.6/n2l-results.csv
  "Download CSV of median CPU cycles by backend, Odroid N2L (ARM Cortex-A73)."
[bench-chart]: https://github.com/pablotron/fips203ipd/tree/main/scripts/bench-chart
  "Python scripts used to generate the bar charts in this post."
[python]: https://www.python.org/
  "Python programming language."
[svg]: https://en.wikipedia.org/wiki/SVG
  "Scalable Vector Graphics vector image format."
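
One way to read the benchmark tables above is as speedup ratios between backends. The following is a minimal sketch and is not part of fips203ipd or its `bench` tool: the `MEDIAN_CYCLES` dictionary and the `speedup()` helper are hypothetical names chosen for illustration, and the numbers are the `kem512` `keygen` medians copied from the three tables.

```python
# Hypothetical illustration only; not part of fips203ipd or its scripts.
# Median cycle counts for kem512 keygen, copied from the tables above.
MEDIAN_CYCLES = {
    "x1": {"scalar-gcc": 118733, "scalar-clang": 70770, "simd-gcc": 17448},
    "pi5": {"scalar-gcc": 127403, "scalar-clang": 77030, "simd-gcc": 53667},
    "n2l": {"scalar-gcc": 210900, "scalar-clang": 123675, "simd-gcc": 89625},
}


def speedup(row, baseline="scalar-gcc", candidate="simd-gcc"):
    """Return how many times fewer cycles `candidate` needs than `baseline`."""
    return row[baseline] / row[candidate]


if __name__ == "__main__":
    for system, row in MEDIAN_CYCLES.items():
        vs_gcc = speedup(row)
        vs_clang = speedup(row, baseline="scalar-clang")
        print(f"{system}: {vs_gcc:.2f}x vs scalar GCC, "
              f"{vs_clang:.2f}x vs scalar Clang")
```

These ratios only compare medians within one system, so they are best treated as a rough guide rather than a precise measurement.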