Classic McEliece

simon@josefsson.org

Internet-Draft This document specifies Classic McEliece, a Key Encapsulation Method (KEM) designed for IND-CCA2 security, even against quantum computers. Status information for this document may be found at . Source for this draft and an issue tracker can be found at .

Foreword This document is a transcribed version of the proposed ISO McEliece standard (the "20230419" version). The Classic McEliece team is to be considered the author and owner of the text in this document, and consists of (in alphabetical order): Daniel J. Bernstein, University of Illinois at Chicago and Ruhr University Bochum Tung Chou, Academia Sinica Carlos Cid, Simula UiB and Okinawa Institute of Science and Technology Jan Gilcher, ETH Zurich Tanja Lange, Eindhoven University of Technology Varun Maram, ETH Zurich Ingo von Maurich, self Rafael Misoczki, Google Ruben Niederhagen, Academia Sinica and University of Southern Denmark Edoardo Persichetti, Florida Atlantic University Christiane Peters, self Nicolas Sendrier, Inria Jakub Szefer, Yale University Cen Jung Tjhai, PQ Solutions Ltd. Martin Tomlinson, PQ Solutions Ltd. and University of Plymouth Wen Wang, Yale University The editor would like to thank John Mattsson and Peter C for comments that improved the document.

Introduction The first code-based public-key encryption system (PKE) was introduced in 1978 . The public key specifies a random binary Goppa code. A ciphertext is a codeword plus random errors. The private key allows efficient decoding: extracting the codeword from the ciphertext, identifying and removing the errors. The McEliece system was designed to be one-way (OW-CPA), meaning that an attacker cannot efficiently find the codeword from a ciphertext and public key, when the codeword is chosen randomly. The security level of the McEliece system has remained remarkably stable, despite dozens of attack papers over 45 years. The original McEliece parameters were designed for only 2^64 security, but the system easily scales up to "overkill" parameters that provide ample security margin against advances in computer technology, including quantum computers. The McEliece system has prompted a tremendous amount of followup work. Some of this work improves efficiency while clearly preserving security: this includes a "dual" PKE proposed by Niederreiter, software speedups, and hardware speedups. Furthermore, it is now well known how to efficiently convert an OW-CPA PKE into a KEM that is IND-CCA2 secure against all ROM attacks. This conversion is tight, preserving the security level, under two assumptions that are satisfied by the McEliece PKE: first, the PKE is deterministic (i.e., decryption recovers all randomness that was used); second, the PKE has no decryption failures for valid ciphertexts. Even better, recent work achieves similar tightness for a broader class of attacks, namely QROM attacks. The risk that a hash-function- specific attack could be faster than a ROM or QROM attack is addressed by the standard practice of selecting a well-studied, high-security, "unstructured" hash function. Classic McEliece brings all of this together. It is a KEM designed for IND-CCA2 security at a very high security level, even against quantum computers. The KEM is built conservatively from a PKE designed for OW-CPA security, namely Niederreiter's dual version of McEliece's PKE using binary Goppa codes. Every level of the construction is designed so that future cryptographic auditors can be confident in the long-term security of post-quantum public-key encryption.

Terms and definitions For the purposes of this document, the following terms and definitions apply. SHAKE256: see , the sole symmetric primitive used in Classic McEliece with the selected parameters IND-CCA2: indistinguishability against adaptive chosen-ciphertext attacks KEM: key-encapsulation mechanism OW-CPA: one-wayness against chosen-plaintext attacks PKE: public-key encryption system ROM: random-oracle model QROM: quantum random-oracle model F_q: finite field of q :=: member of a set A_b, A_{b}: entity A subscripted with expression b A^b, A^{b}: entity A superscripted with expression b A_b^c, A_{b}^{c}, A^{c}_{b}: entity A subscripted with expression b and superscripted with expression c =>: larger than or equal <=: less than or equal CEILING(a): function mapping a to the least integer greater than or equal to a p * q: matrix multiplication

Symbols and abbreviated terms

Guide to notation The list below introduces the notation used in this specification. It is meant as a reference guide only; for complete definitions of the terms listed, refer to the appropriate text. Some other symbols are also used occasionally; they are introduced in the text where appropriate. n: The code length (part of the CM parameters) k: The code dimension (part of the CM parameters) t: The guaranteed error-correction capability (part of the CM parameters) q: The size of the field used (part of the CM parameters) m: logarithm base 2 of q (part of the CM parameters) u: A nonnegative integer (part of the CM parameters) v: A nonnegative integer (part of the CM parameters) Hash: A cryptographic hash function (symmetric-cryptography parameter) HashLen: Length of an output of Hash (symmetric-cryptography parameter) Sigma_1 : A nonnegative integer (symmetric-cryptography parameter) Sigma_2 : A nonnegative integer (symmetric-cryptography parameter) PRG: A pseudorandom bit generator (symmetric-cryptography parameter) g: A polynomial in F_q[x] (part of the private key) alpha_i: An element of the finite field F_q (part of the private key) Gamma: (g, alpha_0, ..., alpha_{n-1}) (part of the private key) s: A bit string of length n (part of the private key) T: An mt * k matrix over F_2 (the CM public key) e: A bit string of length n and Hamming weight t C: A ciphertext encapsulating a session key

Column vectors vs. row vectors Elements of F_2^n, such as codewords and error vectors, are always viewed as column vectors. This convention avoids all transpositions. Beware that this differs from a common convention in coding theory, namely to write codewords as row vectors but to transpose the codewords for applying parity checks.

0-numbering vs. 1-numbering To simplify comparisons to software in most programming languages, this specification consistently uses indices numbered from 0, including row indices, column indices, and alpha indices. Beware that conventions in the mathematical literature sometimes agree with this but sometimes do not: for example, polynomial exponents are conventionally numbered from 0, while most vectors not related to polynomial exponents are conventionally numbered from 1.

Requirements This document defines the Classic McEliece KEM. The KEM consists of three mathematical functions, namely KeyGen, Encap, and Decap, for each of the "selected parameter sets" listed in Clause 10. The definitions for each selected parameter set are unified into a single definition for a broader parameter space specified in Clause 6. For each parameter set in that parameter space, subsequent clauses in this document define exactly which public key and private key are output by KeyGen given random bits; exactly which ciphertext and session key are output by Encap given a public key and random bits; and exactly which session key is output by Decap given a ciphertext and a private key. This document defines each mathematical function F by presenting an algorithm to compute F. Basic algorithms such as Gaussian elimination are not repeated here, but MatGen, Encode, Decode, Irreducible, FieldOrdering, SeededKeyGen, FixedWeight, KeyGen, Encap, and Decap are specified below as numbered lists of steps. Three of these algorithms, namely FixedWeight, KeyGen, and Encap, are randomized, generating random bits at specified moments. The set of strings of random bits allowed as input for the corresponding mathematical functions is defined as the set of strings of random bits consumed by these algorithms. For example, the KeyGen algorithm reads exactly HashLen random bits, so the domain of the mathematical function KeyGen is the set of HashLen-bit strings. Here HashLen, one of the Classic McEliece parameters, is 256 for each of the selected parameter sets. To claim conformance to this document, an algorithm shall (1) name either KeyGen or Encap or Decap; (2) identify a parameter set listed in Clause 10 (not another parameter set from Clause 6); and (3) compute exactly the corresponding mathematical function defined in this document for that parameter set. For example, a KeyGen implementation claimed to conform to this document for the mceliece6960119 parameter set shall compute the specified KeyGen function for that parameter set: i.e., the implementation shall read exactly HashLen = 256 bits of randomness, and shall produce the same output that the KeyGen algorithm specified below produces given the same 256-bit string. Conformance to this document for a tuple of three algorithms, one for each of KeyGen and Encap and Decap, is defined as conformance to this document for each algorithm, and again shall identify a parameter set listed in Clause 10. Users sometimes place further constraints on algorithms, for example to include various side-channel countermeasures (which could use their own random bits) or to achieve particular levels of performance. Such constraints are out of scope for this document. This document defines the mathematical functions that shall be computed by any conformant algorithms; this document does not constrain how these functions are computed.

Parameters The CM parameters are implicit inputs to the CM algorithms defined below. A CM parameter set specifies the following: A positive integer m. This also defines a parameter q = 2^m. A positive integer n with n <= q. A positive integer t => 2 with mt < n. This also defines a parameter k = n - mt. A monic irreducible polynomial f(z) := F_2[z] of degree m. This defines a representation F_2[z]/f(z) of the field F_q. A monic irreducible polynomial F(y) := F_q[y] of degree t. This defines a representation F_q[y]/F(y) of the field F_{q^t} = F_{2^mt}. Integers v => u => 0 with v <= k + u. Parameter sets that do not mention these parameters define them as (0,0) by default. The symmetric-cryptography parameters listed below. The symmetric-cryptography parameters are the following: A positive integer HashLen. A cryptographic hash function Hash that outputs HashLen bits. An integer Sigma_1 => m. An integer Sigma_2 => 2m. A pseudorandom bit generator PRG mapping a string of HashLen bits to a string of n + Sigma_2 q + Sigma_1 t + HashLen bits.

The one-way function

Matrix reduction

Reduced row-echelon form Given a matrix X, Gaussian elimination computes the unique matrix R in reduced row-echelon form having the same number of rows as X and the same row space as X. Being in reduced row-echelon form means that there is a sequence c_0 < c_1 < ... < c_{r-1} such that row 0 of R begins with a 1 in column c_0, and this is the only nonzero entry in column c_0; row 1 of R begins with a 1 in column c_1, the only nonzero entry in column c_1; row 2 of R begins with a 1 in column c_2, the only nonzero entry in column c_2; etc.; row r - 1 of R begins with a 1 in column c_{r-1}, the only nonzero entry in column c_{r-1}; and all subsequent rows of R are 0. Note that the rank of R is r.

Systematic form As a special case, R is in systematic form if R has exactly r rows, i.e., there are no zero rows; and c_i = i for 0 <= i < r. (This second condition is equivalent to simply saying c_{r-1} = r - 1, except in the degenerate case r = 0.) In other words, R has the form (I_r|T), where I is an r * r identity matrix. Reducing a matrix X to systematic form means computing the unique systematic-form matrix having the same row space as X, if such a matrix exists.

Semi-systematic form The following generalization of the concept of systematic form uses two integer parameters u,v satisfying v => u => 0. Let R be a rank-r matrix in reduced row-echelon form. Assume that r => u, and that there are at least r - u + v columns. We say that R is in (u,v)-semi-systematic form if R has r rows (i.e., no zero rows); c_i = i for 0 <= i < r - u; and c_i <= i - u + v for 0 <= i < r. (The c_i conditions are equivalent to simply c_{r-u-1} = r - u - 1 and c_{r-1} <= r - u + v - 1 except in the degenerate case r = u.) As a special case, (u,v)-semi-systematic form is equivalent to systematic form if u = v. However, if v > u then (u,v)-semi-systematic form allows more matrices than systematic form. This specification gives various definitions first for the simpler case (u,v) = (0,0) and then for the general case. The list of selected parameter sets provides, for each key size, one parameter set with (u,v) = (0,0), and one parameter set labeled "f" with (u,v) = (32,64).

Matrix generation for Goppa codes

General The following algorithm MatGen takes as input Gamma = (g, alpha_0, alpha_1, ..., alpha_{n-1}) where g is a monic irreducible polynomial in F_q[x] of degree t and alpha_0, alpha_1, ..., alpha_{n-1} are distinct elements of F_q. The algorithm output MatGen(Gamma) is defined first in the simpler case of systematic form, and then in the general case of semi-systematic form. The output is either NIL or of the form (T, ...), where T is the CM public key, an mt * k matrix over F_2.

Systematic form For (u,v) = (0,0), the algorithm output MatGen(Gamma) is either NIL or of the form (T, Gamma), where T is an mt * k matrix over F_2. Here is the algorithm: Compute the t * n matrix M = {h_{i,j}} over F_q, where h_{i,j} = alpha_j^i / g(alpha_j) for i = 0, ..., t - 1 and j = 0, ..., n - 1. Form an mt * n matrix N over F_2 by replacing each entry u_0 + u_1 z + ... + u_{m-1} z^{m-1} of M with a column of m bits u_0, u_1, ..., u_{m-1}. Reduce N to systematic form (I_{mt}|T), where I_{mt} is an mt * mt identity matrix. If this fails, return NIL. Return (T, Gamma).

Semi-systematic form For general u,v, the algorithm output MatGen(Gamma) is either NIL or of the form (T, c_{mt-u}, ..., c_{mt-1}, Gamma'), where T is an mt * k matrix over F_2; c_{mt-u}, ..., c_{mt-1} are integers with mt - u <= c_{mt-u} < c_{mt-u+1} < ... < c_{mt-1} < mt - u + v; Gamma' = (g, alpha'_0, alpha'_1, ..., alpha'_{n-1}); g is the same as in the input; and alpha'_0, alpha'_1, ..., alpha'_{n-1} are distinct elements of F_q. Here is the algorithm: Compute the t * n matrix M = {h_{i,j}} over F_q, where h_{i,j} = alpha_j^i / g(alpha_j) for i = 0, ..., t - 1 and j = 0, ..., n - 1. Form an mt * n matrix N over F_2 by replacing each entry u_0 + u_1 z + ... + u_{m-1} z^{m-1} of M with a column of m bits u_0, u_1, ..., u_{m-1}. Reduce N to (u,v)-semi-systematic form, obtaining a matrix H. If this fails, return NIL. (Now row i has its leading 1 in column c_i. By definition of semi-systematic form, c_i = i for 0 <= i < mt - u; and mt - u <= c_{mt-u} < c_{mt-u+1} < ... < c_{mt-1} < mt - u + v. The matrix H is a variable that can change later.) Set (alpha'_0, alpha'_1, ..., alpha'_{n-1}) = (alpha_0, alpha_1, ..., alpha_{n-1}). (Each alpha'_i is a variable that can change later.) For i = mt - u, then i = mt - u + 1, and so on through i = mt - 1, in this order: swap column i with column c_i in H, while swapping alpha'_i with alpha'_{c_i}. (After the swap, row i has its leading 1 in column i. The swap does nothing if c_i = i.) The matrix H now has systematic form (I_{mt}|T), where I_{mt} is an mt * mt identity matrix. Return (T, c_{mt-u}, ..., c_{mt-1}, Gamma') where Gamma' = (g, alpha'_0, alpha'_1, ..., alpha'_{n-1}). In the special case (u,v) = (0,0), the c_{mt-u}, ..., c_{mt-1} portion of the output is empty, and the i loop is empty, so Gamma' is guaranteed to be the same as Gamma. The reduction to (0,0)-semi-systematic form is exactly reduction to systematic form. The general algorithm definition thus matches the (0,0) algorithm definition.

Encoding subroutine The following algorithm Encode takes two inputs: a weight-t column vector e := F_2^n; and a public key T, i.e., an mt * k matrix over F_2. The algorithm output Encode(e, T) is a vector C := F_2^{mt}. Here is the algorithm: Define H = (I_{mt}|T). Compute and return C = He := F_2^{mt}.

Decoding subroutine The following algorithm Decode decodes C := F_2^{mt} to a word e of Hamming weight wt(e) = t with C = He if such a word exists; otherwise it returns failure. Formally, Decode takes two inputs: a vector C := F_2^{mt}; and Gamma', the last component of MatGen(Gamma) for some Gamma such that MatGen(Gamma) != NIL. Write T for the first component of MatGen(Gamma). By definition of MatGen, T is an mt * k matrix over F_2; Gamma' has the form (g, alpha'_0, alpha'_1, ..., alpha'_{n-1}); g is a monic irreducible polynomial in F_q[x] of degree t; and alpha'_0, alpha'_1, ..., alpha'_{n-1} are distinct elements of F_q. There are two possibilities for Decode(C, Gamma'): If C = Encode(e, T) then Decode(C, Gamma') = e. In other words, if there exists a weight-t vector e := F_2^n such that C = He with H = (I_{mt}|T), then Decode(C, Gamma') = e. If C does not have the form He for any weight-t vector e := F_2^n, then Decode(C, Gamma') = NIL. Here is the algorithm: Extend C to v = (C, 0, ..., 0) := F_2^n by appending k zeros. Find the unique c := F_2^n such that (1) Hc = 0 and (2) c has Hamming distance <= t from v. If there is no such c, return NIL. Set e = v + c. If wt(e) = t and C = He, return e. Otherwise return NIL.

The Classic McEliece KEM

Irreducible-polynomial generation The following algorithm Irreducible takes a string of Sigma_1 t input bits d_0, d_1, ..., d_{Sigma_1 t - 1}. It outputs either NIL or a monic irreducible degree-t polynomial g := F_q[x]. Here is the algorithm: Define beta_j = SUM^{m-1}_{i=0}(d_{Sigma_1 j + i} z^i) for each j := {0, 1, ..., t - 1}. (Within each group of Sigma_1 input bits, this uses only the first m bits. The algorithm ignores the remaining bits.) Define beta = beta_0 + beta_1 y + ... + beta_{t-1} y^{t-1} := F_q[y] / F(y). Compute the minimal polynomial g of beta over F_q. (By definition g is monic and irreducible, and g(beta) = 0.) Return g if g has degree t. Otherwise return NIL.

Field-ordering generation The following algorithm FieldOrdering takes a string of Sigma_2 q input bits. It outputs either NIL or a sequence (alpha_0, alpha_1, ..., alpha_{q-1}) of q distinct elements of F_q. Here is the algorithm: Take the first Sigma_2 input bits b_0, b_1, ..., b_{Sigma_2-1} as a Sigma_2-bit integer a_0 = b_0 + 2b_1 + ... + 2^{Sigma_2 -1} b_{Sigma_2-1}, take the next Sigma_2 bits as a Sigma_2-bit integer a_1, and so on through a_{q-1}. If a_0, a_1, ..., a_q-1 are not distinct, return NIL. Sort the pairs (a_i, i) in lexicographic order to obtain pairs (a_pi(i), pi(i)) where pi is a permutation of {0, 1, ..., q - 1}. Define alpha_i = SUM^{m-1}_{j=0}(pi(i)_j z^{m-1-j}) where pi(i)_j denotes the j:th least significant bit of pi(i). (Recall that the finite field F_q is constructed as F_2[z] / f(z).) Output (alpha_0, alpha_1, ..., alpha_{q-1}).

Key generation The following randomized algorithm KeyGen takes no input (beyond the parameters). It outputs a public key and private key. Here is the algorithm, using a subroutine SeededKeyGen defined below: Generate a uniform random HashLen-bit string Delta. (This is called a seed.) Output SeededKeyGen(Delta). The following algorithm SeededKeyGen takes an HashLen-bit input Delta. It outputs a public key and private key. Here is the algorithm: Compute E = PRG(Delta), a string of n + Sigma_2 q + Sigma_1 t + HashLen bits. Define Delta' as the last HashLen bits of E. Define s as the first n bits of E. Compute alpha_0, ..., alpha_{q-1} from the next Sigma_2 q bits of E by the FieldOrdering algorithm. If this fails, set Delta = Delta' and restart the algorithm. Compute g from the next Sigma_1 t bits of E by the Irreducible algorithm. If this fails, set Delta = Delta' and restart the algorithm. Define Gamma = (g, alpha_0, alpha_1, ..., alpha_{n-1}). (Note that alpha_n, ..., alpha_{q-1} are not used in Gamma.) Compute (T, c_{mt-u}, ..., c_{mt-1}, Gamma') = MatGen(Gamma). If this fails, set Delta = Delta' and restart the algorithm. Write Gamma' as (g, alpha'_0, alpha'_1, ..., alpha'_{n-1}). Output T as public key and (Delta, c, g, alpha, s) as private key, where c = (c_{mt-u}, ..., c_{mt-1}) and alpha = (alpha'_0, ..., alpha'_{n-1}, alpha_n, ..., alpha_{q-1}).

Fixed-weight-vector generation The following randomized algorithm FixedWeight takes no input. It outputs a vector e := F_2^n of weight t. The algorithm uses a precomputed integer tau => t defined below. Here is the algorithm: Generate Sigma_1 tau uniform random bits b_0, b_1, ..., b_{Sigma_1 tau-1}. Define d_j = SUM^{m-1}_{i=0}(b_{Sigma_1 j + i} 2^i) for each j := {0, 1, ..., tau - 1}. (Within each group of Sigma_1 random bits, this uses only the first m bits. The algorithm ignores the remaining bits.) Define a_0, a_1, ..., a_{t-1} as the first t entries in d_0, d_1, ..., d_{tau-1} in the range {0, 1, ..., n - 1}. If there are fewer than t such entries, restart the algorithm. If a_0, a_1, ..., a_{t-1} are not all distinct, restart the algorithm. Define e = (e_0, e_1, ..., e_{n-1}) := F_2^n as the weight-t vector such that e_{a_i} = 1 for each i. Return e. The integer tau is defined as t if n = q; as 2t if q/2 <= n < q; as 4t if q/4 <= n < q/2; etc. All of the selected parameter sets have q / 2 <= n <= q, so tau := {t, 2t}.

Encapsulation The following randomized algorithm Encap takes as input a public key T. It outputs a ciphertext C and a session key K. Here is the algorithm for non-pc parameter sets: Use FixedWeight to generate a vector e := F_2^n of weight t. Compute C = Encode(e, T). Compute K = Hash(1, e, C); see Clause 9.2 for Hash input encodings. Output ciphertext C and session key K. Here is the algorithm for pc parameter sets: Use FixedWeight to generate a vector e := F_2^n of weight t. Compute C_0 = Encode(e, T). Compute C_1 = Hash(2, e). Put C = (C_0, C_1). Compute K = Hash(1, e, C). Output ciphertext C and session key K.

Decapsulation The following algorithm Decap takes as input a ciphertext C and a private key, and outputs a session key K. Here is the algorithm for non-pc parameter sets: Set b = 1. Extract s := F_2^n and Gamma' = (g, alpha'_0, alpha'_1, ..., alpha'_{n-1}) from the private key. Compute e = Decode(C, Gamma'). If e = NIL, set e = s and b = 0. Compute K = Hash(b, e, C); see Clause 9.2 for Hash input encodings. Output session key K. Here is the algorithm for pc parameter sets: Split the ciphertext C as (C_0, C_1) with C_0 := F_2^{mt} and C_1 := F_2^HashLen. Set b = 1. Extract s := F_2^n and Gamma' = (g, alpha'_0, alpha'_1, ..., alpha'_{n-1}) from the private key. Compute e = Decode(C_0, Gamma'). If e = NIL, set e = s and b = 0. Compute C'_1 = Hash(2, e). If C'_1 != C_1, set e = s and b = 0. Compute K = Hash(b, e, C). Output session key K.

Bits and bytes

Choices of symmetric-cryptography parameters All of the selected parameter sets use the following symmetric-cryptography parameters: The integer HashLen is 256. The HashLen-bit string Hash(x) is defined as the first HashLen bits of output of SHAKE256(x). Byte strings here are viewed as bit strings in little-endian form; see Clause 9.2. The set of bytes is defined as {0, 1, ..., 255}. The integer Sigma_1 is 16. (All of the selected parameter sets have m <= 16, so Sigma_1 => m.) The integer Sigma_2 is 32. The (n + Sigma_2 q + Sigma_1 t + HashLen)-bit string PRG(Delta) is defined as the first n + Sigma_2 q + Sigma_1 t + HashLen bits of output of SHAKE256(64, Delta). Here 64, Delta means the 33-byte string that begins with byte 64 and continues with Delta. All Hash inputs used in Classic McEliece begin with byte 0 or 1 (or 2 for pc) (see Clause 9.2), and thus do not overlap the SHAKE256 inputs used in PRG.

Representation of objects as byte strings

Bit vectors If r is a multiple of 8 then an r-bit vector v = (v_0, v_1, ..., v_{r-1}) := F_2^r is represented as the following sequence of r/8 bytes: (v0 + 2v_1 + 4v_2 + ... + 128v_7, v_8 + 2v_9 + 4v_10 + ... + 128v_15, ..., v_{r-8} + 2v_{r-7} + 4v_{r-6} + ... + 128v_{r-1}). If r is not a multiple of 8 then an r-bit vector v = (v_0, v_1, ..., v_{r-1}) := F_2^r is zero-padded on the right to length between r + 1 and r + 7, whichever is a multiple of 8, and then represented as above. By definition, Simply Decoded Classic McEliece ignores padding bits on input, while Narrowly Decoded Classic McEliece rejects inputs (ciphertexts and public keys) where padding bits are nonzero; rejection means returning NIL. For some parameter sets (but not all), r is always a multiple of 8, so there are no padding bits, so Simply Decoded Classic McEliece and Narrowly Decoded Classic McEliece are identical. The definitions of Simply Decoded and Narrowly Decoded are provided for convenience in discussions of situations where the distinction is potentially relevant. Applications should avoid relying on the distinction. Conformance to this document does not require a Simply Decoded or Narrowly Decoded label.

Session keys A session key K is an element of F_2^HashLen. It is represented as a CEILING(HashLen/8)-byte string.

Ciphertexts for non-pc parameter sets For non-pc parameter sets: A ciphertext C is an element of F_2^{mt}. It is represented as a CEILING(mt/8)-byte string.

Ciphertexts for pc parameter sets For pc parameter sets, a ciphertext C has two components: C_0 := F_2^mt and C_1 := F_2^HashLen. The ciphertext is represented as the concatenation of the CEILING(mt/8)-byte string representing C_0 and the CEILING(HashLen/8)-byte string representing C_1.

Hash inputs for non-pc parameter sets For non-pc parameter sets, there are two types of hash inputs: (1, v, C), and (0, v, C). Here v := F_2^n, and C is a ciphertext. The initial 0 or 1 is represented as a byte. The vector v is represented as the next CEILING(n/8) bytes. The ciphertext is represented as the next CEILING(mt/8) bytes. All hash inputs thus begin with byte 0 or 1, as mentioned earlier.

Hash inputs for pc parameter sets For pc parameter sets, there are three types of hash inputs: (2, v); (1, v, C); and (0, v, C). Here v := F_2^n, and C is a ciphertext. The initial 0, 1, or 2 is represented as a byte. The vector v is represented as the next CEILING(n/8) bytes. The ciphertext, if present, is represented as the next CEILING(mt/8) + CEILING(HashLen/8) bytes. All hash inputs thus begin with byte 0, 1, or 2, as mentioned earlier.

Public keys The public key T, which is an mt * k matrix, is represented in a row-major fashion. Each row of T is represented as a CEILING(k/8)-byte string, and the public key is represented as the mt CEILING(k/8)-byte concatenation of these strings.

Field elements Each element of F_q congruent F_2[z] / f(z) has the form SUM^{m-1}_{i=0}(c_i z) where c_i := F_2. The representation of the field element is the representation of the vector (c_0, c_1, ..., c_{m-1}) := F_2^m.

Monic irreducible polynomials The monic irreducible degree-t polynomial g = g_0 + g_1 x + ... + g_{t-1} x^{t-1} + x t is represented as t CEILING(m/8) bytes, namely the concatenation of the representations of the field elements g_0, g_1, ..., g_{t-1}.

Field orderings The obvious representation of a sequence (alpha_0, ..., alpha_{q-1}) of q distinct elements of F_q would be as a sequence of q field elements. This document instead specifies the following representation. An "in-place Benes network" is a series of 2m - 1 stages of swaps applied to an array of q = 2^m objects (a_0, a_1, ..., a_{q-1}). The first stage conditionally swaps a_0 and a_1, conditionally swaps a_2 and a_3, conditionally swaps a_4 and a_5, etc., as specified by a sequence of q/2 control bits (1 meaning swap, 0 meaning leave in place). The second stage conditionally swaps a_0 and a_2, conditionally swaps a_1 and a_3, conditionally swaps a_4 and a_6, etc., as specified by the next q/2 control bits. This continues through the m:th stage, which conditionally swaps a_0 and a_{q/2}, conditionally swaps a_1 and a_{q/2+1}, etc. The (m + 1):st stage is just like the (m - 1):st stage (with new control bits), the (m + 2):nd stage is just like the (m - 2):nd stage, and so on through the (2m - 1):st stage. Define pi as the permutation of {0, 1, ..., q - 1} such that alpha_i = SUM^{m-1}_{j=0}(pi(i)_j z^{m-1-j} for all i := {0, 1, ..., q - 1}. The ordering (alpha_0, ..., alpha_{q-1}) is represented as a sequence of (2m - 1)2^{m-1} control bits for an in-place Benes network for pi. This vector is represented as CEILING((2m - 1)2^{m-4}) bytes as above. Mathemtically, each permutation has multiple choices of control-bit vectors. For conformance to this document, a permutation pi shall be converted to specifically the control bits defined by controlbits in the following Python script. This is not a requirement for the decapsulation algorithm reading control bits to check uniqueness.

Column selections Part of the private key generated by KeyGen is a sequence c = (c_{mt-u}, ..., c_{mt-1}) of u integers in increasing order between mt - u and mt - u + v - 1. This sequence c is represented as a CEILING(v/8)-byte string, the little-endian format of the integer SUM^{u-1}_{i=0}(2^{c_{mt-u+i}-(mt-u). However, for (u,v) = (0,0), the sequence c is instead represented as the 8-byte string which is the little- endian format of 2^32 - 1, i.e., 4 bytes of value 255 followed by 4 bytes of value 0.

Private keys A private key (Delta, c, g, alpha, s) is represented as the concatenation of five parts: The CEILING(HashLen/8)-byte string representing Delta := F_2^HashLen. The string representing the column selections c. This string has CEILING(v/8) bytes, or 8 bytes if (u,v) = (0,0). The tCEILING(m/8)-byte string representing the polynomial g. The CEILING((2m - 1)2^{m-4}) bytes representing the field ordering alpha. The CEILING(n/8)-byte string representing s := F_2^n.

Selected parameter sets

Parameter set mceliece6688128 KEM with m = 13, n = 6688, t = 128. Field polynomials f(z) = z^13 + z^4 + z^3 + z + 1 and F(y) = y^128 + y^7 + y^2 + y + 1. This is a non-pc parameter set.

Parameter set mceliece6688128f KEM with m = 13, n = 6688, t = 128. Field polynomials f(z) = z^13 + z^4 + z^3 + z + 1 and F(y) = y^128 + y^7 + y^2 + y + 1. Semi-systematic parameters (u,v) = (32,64). This is a non-pc parameter set.

Parameter set mceliece6688128pc KEM with m = 13, n = 6688, t = 128. Field polynomials f(z) = z^13 + z^4 + z^3 + z + 1 and F(y) = y^128 + y^7 + y^2 + y + 1. This is a pc parameter set.

Parameter set mceliece6688128pcf KEM with m = 13, n = 6688, t = 128. Field polynomials f(z) = z^13 + z^4 + z^3 + z + 1 and F(y) = y^128 + y^7 + y^2 + y + 1. Semi-systematic parameters (u,v) = (32,64). This is a pc parameter set.

Parameter set mceliece6960119 KEM with m = 13, n = 6960, t = 119. Field polynomials f(z) = z^13 + z^4 + z^3 + z + 1 and F(y) = y 119 + y^8 + 1. This is a non-pc parameter set.

Parameter set mceliece6960119f KEM with m = 13, n = 6960, t = 119. Field polynomials f(z) = z^13 + z^4 + z^3 + z + 1 and F(y) = y 119 + y^8 + 1. Semi-systematic parameters (u,v) = (32,64). This is a non-pc parameter set.

Parameter set mceliece6960119pc KEM with m = 13, n = 6960, t = 119. Field polynomials f(z) = z^13 + z^4 + z^3 + z + 1 and F(y) = y 119 + y^8 + 1. This is a pc parameter set.

Parameter set mceliece6960119pcf KEM with m = 13, n = 6960, t = 119. Field polynomials f(z) = z^13 + z^4 + z^3 + z + 1 and F(y) = y 119 + y^8 + 1. Semi-systematic parameters (u,v) = (32,64). This is a pc parameter set.

Parameter set mceliece8192128 KEM with m = 13, n = 8192, t = 128. Field polynomials f(z) = z^13 + z^4 + z^3 + z + 1 and F(y) = y^128 + y^7 + y^2 + y + 1. This is a non-pc parameter set.

Parameter set mceliece8192128f KEM with m = 13, n = 8192, t = 128. Field polynomials f(z) = z^13 + z^4 + z^3 + z + 1 and F(y) = y^128 + y^7 + y^2 + y + 1. Semi-systematic parameters (u,v) = (32,64). This is a non-pc parameter set.

Parameter set mceliece8192128pc KEM with m = 13, n = 8192, t = 128. Field polynomials f(z) = z^13 + z^4 + z^3 + z + 1 and F(y) = y^128 + y^7 + y^2 + y + 1. This is a pc parameter set.

Parameter set mceliece8192128pcf KEM with m = 13, n = 8192, t = 128. Field polynomials f(z) = z^13 + z^4 + z^3 + z + 1 and F(y) = y^128 + y^7 + y^2 + y + 1. Semi-systematic parameters (u,v) = (32,64). This is a pc parameter set.

Security Considerations Classic McEliece is a Key Encapsulation Mechanism designed to achieve IND-CCA2 security at a very high security level, against conventional and quantum computers. The quality of the random data is critical for security of Classic McEliece, see for additional discussion and recommendations. Implementation should be designed to minize leaking of security sensitive material, including protecting against side-channel attacks. New research results on the security of Classic McEliece may be published at any time that may warrant implementation or deployment reconsiderations. To hedge against new research findings, Classic McEliece may be combined with other algorithms (e.g., Curve25519) in a "hybrid" mode intended to be no weaker than any one of the individual algorithms used and the way the algorithms are combined.

IANA Considerations This document has no IANA actions.

SHA-3 Standard: Permutation-Based Hash and Extendable-Output Functions National Institute of Standards and Technology Information Technology Laboratory National Institute of Standards and Technology

US Gaithersburg

Randomness Requirements for Security Security systems are built on strong cryptographic algorithms that foil pattern analysis attempts. However, the security of these systems is dependent on generating secret quantities for passwords, cryptographic keys, and similar quantities. The use of pseudo-random processes to generate secret quantities can result in pseudo-security. A sophisticated attacker may find it easier to reproduce the environment that produced the secret quantities and to search the resulting small set of possibilities than to locate the quantities in the whole of the potential number space. Choosing random quantities to foil a resourceful and motivated adversary is surprisingly difficult. This document points out many pitfalls in using poor entropy sources or traditional pseudo-random number generation techniques for generating such quantities. It recommends the use of truly random hardware techniques and shows that the existing hardware on many systems can be used for this purpose. It provides suggestions to ameliorate the problem when a hardware solution is not available, and it gives examples of how large such quantities need to be for some applications. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements. Classic McEliece: conservative code-based cryptography: cryptosystem specification Classic McEliece Team Classic McEliece: conservative code-based cryptography: design rationale Classic McEliece Team Classic McEliece: conservative code-based cryptography: guide for implementors Classic McEliece Team Classic McEliece: conservative code-based cryptography: guide for security reviewers Classic McEliece Team Classic McEliece: conservative code-based cryptography: what plaintext confirmation means Classic McEliece Team Classic McEliece: papers Classic McEliece Team Classic McEliece: Sage package Classic McEliece Team Information security — Encryption algorithms — Part 1978: Classic McEliece Classic McEliece Team A public-key cryptosystem based on algebraic coding theory

Overview of Classic McEliece resources (informative) Classic McEliece is specified in and, for the pc options, . The specification in this document is compatible with and . For the design rationale, see . presents algorithms for the Classic McEliece functions in the Sage language. Subject to being computer-executable, this package is designed for the algorithms to be as readable as possible, including detailed comments matching the algorithms to (and ). provides guidance to implementors. For example, it covers security against side-channel attacks, considerations in picking a parameter set, engineering cryptographic network applications for efficiency, existing implementations, and how to build new implementations. provides guidance to security reviewers. As a preliminary matter, covers correctness of the cryptosystem: for example, c in Step 2 of Decode is unique if it exists, and c always exists when C is output by Encap. then reviews the stability of attacks against the original 1978 McEliece cryptosystem introduced in , and reviews the tight relationship between the OW-CPA security of that cryptosystem and the QROM IND-CCA2 security of Classic McEliece. Given the analysis in , all of the parameters selected in this document meet ISO's requirement of 2^128 post-quantum security against known attacks. This is true even if one counts merely qubit operations, ignoring (1) qubit overheads and (2) the costs of memory access inside attacks. (This document does not comment on whether parameters not listed here also meet this requirement.) For comparison, 128-bit ciphers such as AES-128 provide only slightly more than 2^64 security in this metric. Many further references can be found in the documents cited above and in .