Fix/cross sign cert expired recovery #860

Open
mrnovalles wants to merge 4 commits into benoitc:master from mrnovalles:fix/cross-sign-cert-expired-recovery

@mrnovalles

Problem

This fix came out of a production incident. Our push notification service recently migrated to Let's Encrypt certificates. After the migration, all HTTPS calls made through hackney started failing with:

  {tls_alert, {certificate_expired, "TLS client: ... CLIENT ALERT: Fatal - Certificate Expired"}}

The leaf certificate was not expired. The issue is structural: Let's Encrypt chains include an ISRG Root X2 certificate cross-signed by ISRG Root X1. That cross-signed certificate was valid from 2020-09-04 to 2025-09-15 and has now expired.

Root cause

OTP ships recovery logic for exactly this scenario in ssl_certificate:find_cross_sign_root_paths/4. It triggers when path validation reports root_cert_expired: OTP searches the trust store for a cert with the same public key as the expired root and, if found, re-validates the chain anchored at the still-valid self-signed copy.

The recovery never runs with hackney due to the following sequence:

  1. OTP calls the verify_fun with {bad_cert, cert_expired} for the expired cross-signed anchor.
  2. Hackney's verify_fun is a direct reference to ssl_verify_hostname:verify_fun/3, which returns {fail, {bad_cert, cert_expired}} without any special treatment.
  3. OTP receives {fail, …} and aborts the handshake immediately.
  4. find_cross_sign_root_paths/4 is never reached.

The OTP recovery only triggers on root_cert_expired, not cert_expired.

Fix

In check_hostname_opts/1, wrap ssl_verify_hostname:verify_fun/3 with a one-clause prefix that intercepts the {bad_cert, cert_expired} event and returns {fail, {bad_cert, root_cert_expired}} instead of {fail, {bad_cert, cert_expired}}. All other events are delegated to ssl_verify_hostname unchanged, so hostname checking is unaffected.
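
For illustration, the wrapper described above could look roughly like the sketch below. This is not the code from the patch: the module name cross_sign_verify_sketch is hypothetical, and the real change lives inside hackney's check_hostname_opts/1 rather than in a separate module. The only external call is ssl_verify_hostname:verify_fun/3, which hackney already uses.

```erlang
%% Hypothetical module name, used for illustration only.
-module(cross_sign_verify_sketch).
-export([verify_fun/3]).

%% Prefix clause: report an expired certificate as root_cert_expired
%% instead of cert_expired, so that OTP's cross-sign recovery
%% (ssl_certificate:find_cross_sign_root_paths/4) gets a chance to run.
%% Note: this clause matches every cert_expired event, whichever
%% certificate in the chain it was raised for.
verify_fun(_Cert, {bad_cert, cert_expired}, _UserState) ->
    {fail, {bad_cert, root_cert_expired}};
%% Every other event (including valid_peer, where the hostname check
%% happens) is delegated to ssl_verify_hostname unchanged.
verify_fun(Cert, Event, UserState) ->
    ssl_verify_hostname:verify_fun(Cert, Event, UserState).
```

Assuming the same user state that ssl_verify_hostname expects, such a wrapper would be passed in the SSL options as {verify_fun, {fun cross_sign_verify_sketch:verify_fun/3, [{check_hostname, Host}]}}, where Host is the hostname being connected to.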

OTP's ssl_certificate:find_cross_sign_root_paths/4 recovers from an
expired cross-signed root by locating an alternative valid root with
the same public key in the trust store. It only triggers when path
validation reports root_cert_expired.

ssl_verify_hostname:verify_fun/3 returns {fail, {bad_cert, cert_expired}}
verbatim, which terminates the handshake before OTP's recovery can run.

Wrap the verify_fun in check_hostname_opts/1 to intercept cert_expired
and rewrite it to root_cert_expired. All other events are delegated to
ssl_verify_hostname unchanged, so hostname checking is unaffected.

Confirmed against rest.fra-01.braze.eu (Let's Encrypt chain containing
the ISRG Root X2 cross-signed by ISRG Root X1, expired 2025-09-15)
using hackney 1.25.0, certifi 2.15.0, OTP 27.