Skip to content

fix: use runtime probe to select AES-GCM context strategy under FIPS#14819

Open
abbra wants to merge 1 commit into
pyca:mainfrom
abbra:fix/aead-fips-provider-ctx-copy
Open

fix: use runtime probe to select AES-GCM context strategy under FIPS#14819
abbra wants to merge 1 commit into
pyca:mainfrom
abbra:fix/aead-fips-provider-ctx-copy

Conversation

@abbra
Copy link
Copy Markdown
Contributor

@abbra abbra commented May 6, 2026

EvpCipherAead pre-initialises a cipher context with the key and then clones it per operation via EVP_CIPHER_CTX_copy. When an older FIPS provider (e.g. OpenSSL 3.0.7 FIPS module) is loaded alongside a newer main library (>= 3.2), EVP_CIPHER_CTX_copy may fail because that provider version does not support copying a pre-keyed context.

Instead of relying on FIPS as an imperfect proxy, introduce aes_gcm_ctx_copy_supported(), which probes the capability directly by initialising a dummy AES-128-GCM context and attempting to copy it. The result is cached in a OnceLock so the probe runs at most once per process. On OpenSSL >= 3.2, AesGcm::new() consults the cache and falls back to LazyEvpCipherAead (fresh context per operation) when the copy is not supported.

The new AesGcmCtx enum holds either implementation so the choice can be made at runtime rather than baked in at compile time. AES_GCM_TAG_LEN replaces the magic 16 literals throughout the AesGcm methods.

Fixes: #14795

Assisted-by: Claude Code (Sonnet 4.6)

@abbra abbra force-pushed the fix/aead-fips-provider-ctx-copy branch from d864feb to 35f8755 Compare May 6, 2026 14:47
@schwabe
Copy link
Copy Markdown
Contributor

schwabe commented May 6, 2026

I tested this on my RHEL9 distribution running in FIPS mode and this PR fixes the problem for me.

Copy link
Copy Markdown
Member

@alex alex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this whole approach seems overly fragile to me. If we want to support this (and honestly I still think this is a dumb thing to support), it seems much simpler to have the EvpCipherAead simply store sufficient parameters that if copy() fails it just allocates a new EvpCipher for it. That'd make the code much simpler, and also remove the needs for these hacky tests

Comment thread src/rust/src/backend/aead.rs Outdated
}
let key = [0u8; 16];
let Ok(mut ctx) = openssl::cipher_ctx::CipherCtx::new() else {
// NO-COVERAGE-START
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We never use coverage ignroes for real code -- they are only acceptable for macro-code that's uncoverable.

Comment thread src/rust/src/backend/aead.rs Outdated
Ok(pyo3::types::PyBytes::new_with(
py,
data_bytes.len() + self.ctx.tag_len,
data_bytes.len() + AES_GCM_TAG_LEN,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this change at all related?

Comment thread src/rust/src/backend/aead.rs Outdated
use std::sync::OnceLock;
static COPY_SUPPORTED: OnceLock<bool> = OnceLock::new();
*COPY_SUPPORTED.get_or_init(|| {
// Test hook: force the lazy-context fallback without a real FIPS
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't an acceptable testing strategy.

@abbra abbra force-pushed the fix/aead-fips-provider-ctx-copy branch 4 times, most recently from 1ac4588 to 29df19a Compare May 9, 2026 14:54
EvpCipherAead pre-initialises a cipher context with the key and
clones it per operation via EVP_CIPHER_CTX_copy.  When an older FIPS
provider (e.g. OpenSSL 3.0.7 FIPS module) is loaded alongside a newer
main library (>= 3.2), EVP_CIPHER_CTX_copy may fail because that
provider version does not support copying a pre-keyed context.

Add a copy_fallback field to EvpCipherAead that stores the cipher type
and key.  If the per-operation copy() fails and a fallback is present,
encrypt_into/decrypt_into re-initialise a fresh context from the stored
parameters instead of propagating the error.

AesGcm on OpenSSL >= 3.2 uses new_with_copy_fallback() to populate the
field.  Other EvpCipherAead users (AESSIV, AESOCB3, AESGCMSIV,
ChaCha20Poly1305) are unaffected: they are never used under FIPS so
copy() never fails for them, and copy_fallback remains None.

AES_GCM_TAG_LEN replaces the magic 16 literals throughout the AesGcm
methods.

Fixes: pyca#14795

Assisted-by: Claude Code (Sonnet 4.6)
Signed-off-by: Alexander Bokovoy <abokovoy@redhat.com>
@abbra abbra force-pushed the fix/aead-fips-provider-ctx-copy branch from 29df19a to 744e967 Compare May 9, 2026 18:14
CRYPTOGRAPHY_IS_LIBRESSL,
CRYPTOGRAPHY_IS_AWSLC
))]
pub(crate) fn new_with_copy_fallback(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this optional behavior instead of just being the default?

If this behavior was comprehensive, we shouldn't need the Lazy type at all, right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lazy type is not used with OpenSSL.

FIPS validation is a long process, so for some time old validated tokens will be available along with upgraded OpenSSL version. Some downstreams have their own FIPS validated tokens available before the token passes the validation, so we have to deal with presence of both old and new at the same time. However, it is not something that would change over the runtime of a single application -- multiple app executions could have different tokens in their environments, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

AESGCM fails on RHEL9 in FIPS mode

3 participants