HKDF

Key derivation function based on HMAC

HKDF is a multi-purpose key derivation function (KDF) based on the HMAC message authentication code. HKDF follows the "extract-then-expand" paradigm, in which the KDF logically consists of two modules: the first stage takes the input keying material and "extracts" from it a fixed-length pseudorandom key, and the second stage "expands" this key into several additional, independent pseudorandom keys as the output of the KDF.

Mechanism

HKDF is the composition of two functions, HKDF-Extract and HKDF-Expand:

HKDF(salt, IKM, info, length) = HKDF-Expand(HKDF-Extract(salt, IKM), info, length)^[1]^: 11

HKDF-Extract

HKDF-Extract (XTR) takes "input key material" or "source key material" (IKM or SKM), such as a shared secret generated using Diffie-Hellman, together with an optional, non-secret, random or pseudorandom salt (r), and generates a cryptographic key called the PRK ("pseudorandom key"). HKDF-Extract acts as a "randomness extractor",^[a]^[1]^: 1 specifically a "computational extractor", taking a potentially non-uniform value of sufficient min-entropy and generating a value indistinguishable from a uniform random value (pseudorandom).^[1]^: 9–11^[2] Computational extractors assume that attackers are computationally bounded and that source entropy may exist only in a computational sense. Such extractors can be built using cryptographic functions under suitable assumptions, modeled as a universal hash function (in the generic case) or a random oracle (in constrained scenarios like sources with weak entropy).^[1]^: 1–3

Salt (r) acts as a "source-independent extractor",^[b] strengthening HKDF's security guarantees.^[2] Using a fixed public r is safe for multiple invocations of HKDF (on "independent" but secret IKMs, which may or may not be derived from the same source),^[1]^: 4–6^[3]^: 6 provided r is not chosen or manipulated by an attacker.^[3]^: 6 Ideally, r is a random string of the hash function's output length. Even a low-quality r (with weak entropy or shorter length) is recommended, as it contributes "significantly" to the security of the OKM.^[3]^: 4–5 Without r, or with a low-entropy, non-secret r, XTR provides no protection if an attacker can influence the IKM's source in a way that specifically exploits HKDF-Extract's underlying hash function (by finding a collision or a specific bias). A random r, even if fixed by the application (for example, random number generators using r as seed), would strengthen protections for that specific extractor session.^[1]^: 9, 24–26 In such a setting, sufficiently long IKMs also provide better entropy extraction.^[1]^: 24 However, allowing the attacker to influence enough of the IKM after seeing r may result in a completely insecure KDF.^[1]^: 8–9

HKDF-Extract is the result of HMAC with r as the key (a string of zero bytes as long as the output of the underlying extractor hash function, if r is not provided) and the IKM as the message.^[3]^: 3 The underlying hash function used for the HKDF-Extract step may differ from the one used by HKDF-Expand. It is recommended that HKDF-Extract use the strongest hash function available to the application,^[1]^: 27 as it "concentrates" the entropy already present in the IKM but may not necessarily "add" to it.^[3]^: 2 Truncated output from a stronger underlying hash function for XTR (for example, SHA512/256) offers stronger extraction properties.^[1]^: 17 The attacker is assumed to have partial knowledge of the IKM (publicly known values in the case of Diffie-Hellman) or partial control over it (entropy pools).^[3]^: 2
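The extract step described above can be sketched with Python's standard hmac module. This is a minimal illustration rather than a reference implementation: SHA-256 is an assumed choice of hash, and the function name hkdf_extract and the sample inputs are chosen only for this example.

```python
import hashlib
import hmac

HASH = hashlib.sha256  # assumption: SHA-256 as the underlying hash


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """PRK = HMAC(key=salt, message=IKM); a missing salt becomes zero bytes."""
    if not salt:
        salt = bytes(HASH().digest_size)  # HashLen zero bytes
    return hmac.new(salt, ikm, HASH).digest()


ikm = bytes.fromhex("0b" * 22)  # illustrative input keying material
prk_default = hkdf_extract(b"", ikm)          # no salt supplied
prk_zero_salt = hkdf_extract(bytes(32), ikm)  # explicit all-zero salt
prk_salted = hkdf_extract(bytes.fromhex("000102030405060708090a0b0c"), ikm)
```

In this sketch, omitting the salt and passing 32 zero bytes produce the same PRK, while a different salt produces an unrelated PRK of the same length.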

HKDF-Extract may be skipped if the IKM is itself a cryptographically strong key (and hence can assume the role of PRK), though it is recommended that HKDF-Extract be applied for the sake of compatibility with the general case,^[3]^: 5 especially if r is available to the application.

HKDF-Expand

HKDF-Expand (PRF*) takes the PRK^[1]^: 9–11 (or any random key-derivation key if the HKDF-Extract step is skipped),^[3]^: 5 optional info (CTXinfo), and a length (L), to generate output key material (OKM) of length L.^[1]^: 9–11 Multiple OKMs can be generated from a single PRK by using different values for CTXinfo, which must be "independent" of the IKM passed to HKDF-Extract.^[3]^: 5 Even if an attacker who knows r and some auxiliary information about the secret IKM can force the use of the same IKM (and, by extension, the same PRK) in two or more HKDF-Expand contexts (represented by CTXinfo), the resulting OKMs are computationally independent (they leak no useful information about each other).^[1]^: 7–8
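As an illustration of deriving several keys from one PRK, the following sketch calls an expand function twice with different context strings. The hash (SHA-256), the placeholder PRK, and the two labels are assumptions made for this example only; real applications define their own CTXinfo values.

```python
import hashlib
import hmac


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand with HMAC-SHA-256 (assumed hash for this sketch)."""
    okm, block, counter = b"", b"", 0
    while len(okm) < length:
        counter += 1
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
    return okm[:length]


# Placeholder PRK for illustration; in practice it comes from HKDF-Extract
# (or is an existing cryptographically strong key).
prk = bytes(range(32))

# Hypothetical context labels, one per purpose.
enc_key = hkdf_expand(prk, b"example: encryption key", 32)
mac_key = hkdf_expand(prk, b"example: MAC key", 32)
```

The two outputs differ because CTXinfo is part of every HMAC input, and each derivation is deterministic for a given PRK and label.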

HKDF-Expand, acting as a variable-output-length pseudorandom function (PRF*) keyed on PRK,^[1]^: 15 calls HMAC with a message consisting of CTXinfo (the empty string, if unspecified) followed by an 8-bit counter i initialized to 1.^[1]^: 18 Subsequent calls to HMAC are chained in "feedback mode" by prepending the previous HMAC output to CTXinfo and incrementing i.^[3] The OKM is produced in segments whose size is the output size (k bits) of HMAC's underlying hash function; for example, SHA-256 yields OKM in segments of k = 256 bits, up to a maximum length of 255 × k bits (255 × 256 bits = 8160 bytes), truncated to the desired length L.^[1]^: 11
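The feedback chaining and the length limit can be made explicit by returning the individual blocks. The sketch below assumes SHA-256; the helper name expand_blocks and the placeholder PRK are hypothetical and used only for illustration.

```python
import hashlib
import hmac
import math

HASH_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256
MAX_OKM_LEN = 255 * HASH_LEN             # 8160 bytes


def expand_blocks(prk: bytes, info: bytes, length: int) -> list[bytes]:
    """Return the chained blocks T(1)..T(N) that make up the OKM."""
    if not 0 < length <= MAX_OKM_LEN:
        raise ValueError(f"length must be between 1 and {MAX_OKM_LEN} bytes")
    n_blocks = math.ceil(length / HASH_LEN)
    blocks, previous = [], b""  # T(0) is the empty string
    for counter in range(1, n_blocks + 1):
        # T(i) = HMAC(PRK, T(i-1) || info || i), with i encoded as one byte
        previous = hmac.new(
            prk, previous + info + bytes([counter]), hashlib.sha256
        ).digest()
        blocks.append(previous)
    return blocks


prk = bytes(range(32))  # placeholder PRK for illustration
blocks = expand_blocks(prk, b"", 42)
okm = b"".join(blocks)[:42]  # two 32-byte blocks, truncated to 42 bytes
```

Because the counter occupies a single byte, a request for more than 255 blocks cannot be served, which is where the 8160-byte ceiling for SHA-256 comes from.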

HKDF-Expand may be skipped if the PRK is at least as long as the desired length L, though it is recommended that HKDF-Expand be applied for additional "smoothing" of the OKM.^[1]^: 27^[3]^: 6

Standardization

HKDF was proposed by its authors as a building block for various protocols and applications, and as a way to discourage the proliferation of multiple KDF mechanisms.^[3]^: 1

It is formally described in RFC 5869,^[3] with a detailed analysis in a paper published in 2010.^[1] NIST SP800-56Cr2^[4] specifies a parameterizable extract-then-expand scheme, noting that RFC 5869 HKDF is a version of it and citing the HKDF paper for the rationale behind the recommendation's extract-then-expand mechanisms.

Applications

HKDF is used in the Signal Protocol for end-to-end encrypted messaging, where it generates the message keys in conjunction with the Extended Triple Diffie-Hellman (X3DH) key agreement protocol.^[5] Signal's "Secure Value Recovery"^[6] and "Sealed Sender" are based on HKDF.^[7] HKDF is a main component of the Noise Protocol Framework and Messaging Layer Security, and is used in widely deployed protocols such as IPsec Internet Key Exchange and TLS 1.3.^[5]

The "multi-purpose" nature of HKDF is meant to serve applications that require key extraction, key expansion, and key hierarchies in key wrapping, key exchange, PRNG, and password-based key derivation schemes.^[8]

Implementations

There are implementations of HKDF for C#, Go,^[9] Java, JavaScript, Perl, PHP,^[10] Python, Ruby, Rust, and other programming languages. RFC 6234 provides a reference C implementation of HKDF based on the Secure Hash Standard.^[11]

Example in Python

#!/usr/bin/env python3

import hashlib
import hmac

# Note: using SHA512/256 in the extract phase provides stronger extraction
# guarantees when the expand phase uses SHA256. That option is unused here:
# hash_function_extract = hashlib.sha512
# RFC 5869 also includes SHA-1 test vectors.
hash_function = hashlib.sha256

def hmac_digest(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hash_function).digest()


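def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    # An absent salt defaults to HashLen zero bytes (RFC 5869, section 2.2).
    if len(salt) == 0:
        salt = bytes(hash_function().digest_size)
    return hmac_digest(salt, ikm)


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    # The block counter is a single byte, so at most 255 blocks can be produced.
    if length > 255 * hash_function().digest_size:
        raise ValueError("requested length exceeds 255 * HashLen")
    t = b""
    okm = b""
    i = 0
    while len(okm) < length:
        i += 1
        # T(i) = HMAC(PRK, T(i-1) || info || i)
        t = hmac_digest(prk, t + info + bytes([i]))
        okm += t
    return okm[:length]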


def hkdf(salt: bytes, ikm: bytes, info: bytes, length: int) -> bytes:
    prk = hkdf_extract(salt, ikm)
    return hkdf_expand(prk, info, length)


okm = hkdf(
    salt=bytes.fromhex("000102030405060708090a0b0c"),
    ikm=bytes.fromhex("0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b"),
    info=bytes.fromhex("f0f1f2f3f4f5f6f7f8f9"),
    length=42,
)
assert okm == bytes.fromhex(
    "3cb25f25faacd57a90434f64d0362f2a"
    "2d2d0a90cf1a5a4c5db02d56ecc4c5bf"
    "34007208d5b887185865"
)

# Zero-length salt
assert hkdf(
    salt=b"",
    ikm=bytes.fromhex("0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b"),
    info=b"",
    length=42,
) == bytes.fromhex(
    "8da4e775a563c18f715f802a063c5a31"
    "b8a11f5c5ee1879ec3454e5f3c738d2d"
    "9d201395faa4b61a96c8"
)

Notes

[a]
In complexity theory, informally, an extractor maps input probability distributions with sufficient entropy into output distributions that are statistically close to uniform.^[1]^: 4
[b]
A random salt enforces independence between the source distribution (of the initial key material) and the extractor itself, and between different uses of the same extractor scheme (for example, a specific instantiation of HMAC) by an application.^[1]^: 25–26

References

[1]
Krawczyk, Hugo (2010). "Cryptographic Extraction and Key Derivation: The HKDF Scheme". Cryptology ePrint Archive. International Association for Cryptologic Research.
[2]
Krawczyk, Hugo (2012). Cryptographic Extraction. Isaac Newton Institute for Mathematical Sciences. Event occurs at 16m – via YouTube. Randomness extractors are algorithms that map sources of sufficient min-entropy to outputs that are statistically close to uniform. Randomness extraction has become a central and ubiquitous notion in complexity theory and theoretical computer science with innumerable applications and surprising and unifying connections to other notions. Cryptography, too, has greatly benefited from this notion. Cryptographic applications of randomness extractors range from the construction of pseudorandom generators from one-way functions to the design of cryptographic functionality from noisy and weak sources (including applications to quantum cryptography) to the more recent advances in areas such as leakage- and exposure-resilient cryptography, circular encryption, fully homomorphic encryption, etc. Randomness extractors have also found important cryptographic uses in practical applications, particularly for the construction of key derivation functions. In many of these applications, the defining property of randomness extractors, namely, statistical closeness of their output to a uniform distribution, can be relaxed and replaced with computational indistinguishability. Extractors that provide this form of relaxed guarantee are called 'computational extractors'. In this talk I will cover some recent advances in the understanding and applicability of computational extractors with particular focus on their role in building key derivation functions.
[3]
Krawczyk, H.; Eronen, P. (May 2010). "RFC 5869". Internet Engineering Task Force. doi:10.17487/RFC5869.
[4]
Elaine Barker; Lily Chen; Richard Davis (August 2020). "NIST Special Publication 800-56C: Recommendation for Key-Derivation Methods in Key-Establishment Schemes". National Institute of Standards and Technology. doi:10.6028/NIST.SP.800-56Cr2.
[5]
Bhati, Amit Singh; Dufka, Antonín; Andreeva, Elena; Roy, Arnab; Preneel, Bart (2024). "Skye: An Expanding PRF based Fast KDF and its Applications". Proceedings of the 19th ACM Asia Conference on Computer and Communications Security. Singapore. pp. 1082–1098. doi:10.1145/3634737.3637673. ISBN 979-8-4007-0482-6.
[6]
Lund, Joshua (2019). "Technology Preview for secure value recovery".
[7]
Lund, Joshua (2018). "Technology Preview: Sealed sender for Signal". signal.org.
[8]
Krawczyk, Hugo (16 Nov 2016). COST/IACR School on Randomness: Extraction and KDFs I. Barcelona: Pompeu Fabra University. Event occurs at 6m – via youtube.com.
[9]
"package hkdf". pkg.go.dev.
[10]
"hash_hkdf — Generate a HKDF key derivation of a supplied key input". php.net.
[11]
Eastlake, Donald; Hansen, Tony (2011). "US Secure Hash Algorithms (SHA and SHA-based HMAC and HKDF)". IETF.