Add TLS session resumption via SSLSessionCache #789
sylwiaszunejko wants to merge 4 commits into scylladb:master
Such claims would ideally be supported by benchmarks. Could you try to create some?
That's the goal, but you're right: I don't have any tests to prove it, so I removed this claim from the PR description. If I manage to create proper benchmarks, I will post an update.
We could, if it helps, only support this for TLS 1.3.
Introduce SSLSessionCache in connection.py: a thread-safe OrderedDict-based cache with LRU eviction (max_size, default 100) and TTL expiration (default 3600s), keyed by endpoint tls_session_cache_key.
Add tls_session_cache_key property to all EndPoint subclasses:
- DefaultEndPoint: (address, port)
- SniEndPoint: (address, port, server_name), prevents proxy collisions
- UnixSocketEndPoint: (unix_socket_path,)
- ClientRoutesEndPoint: (host_id, address, port)
Includes unit tests for basic ops, key isolation, SNI keys, overwrite, thread safety, TTL expiration, LRU eviction, clear/clear_expired, automatic cleanup, custom parameters, and endpoint cache key tests.
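For readers unfamiliar with the pattern, the cache described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the class name SessionCacheSketch, the _Entry tuple, and the injectable clock argument are assumptions made for the example, and the periodic cleanup mentioned elsewhere in the PR is left out.

```python
import threading
import time
from collections import OrderedDict, namedtuple

# (session, timestamp) pair; the timestamp drives TTL expiration.
_Entry = namedtuple("_Entry", ["session", "timestamp"])


class SessionCacheSketch:
    """Hypothetical stand-in for SSLSessionCache (LRU eviction + TTL)."""

    def __init__(self, max_size=100, ttl=3600, clock=time.monotonic):
        self._max_size = max_size
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._lock = threading.Lock()
        self._entries = OrderedDict()

    def get(self, key):
        with self._lock:
            entry = self._entries.get(key)
            if entry is None:
                return None
            if self._clock() - entry.timestamp > self._ttl:
                del self._entries[key]  # expired: drop it and report a miss
                return None
            self._entries.move_to_end(key)  # mark as most recently used
            return entry.session

    def set(self, key, session):
        if session is None:
            return  # nothing worth caching
        with self._lock:
            self._entries[key] = _Entry(session, self._clock())
            self._entries.move_to_end(key)
            while len(self._entries) > self._max_size:
                self._entries.popitem(last=False)  # evict least recently used
```

Keys would be the tuples returned by tls_session_cache_key, for example ("10.0.0.1", 9042) for a DefaultEndPoint.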
Force-pushed from 7281340 to 4500773.
@dkropachev @Lorak-mmk I pushed changes with improvements from Dmitry's older PR; I will update the PR description soon.
dkropachev left a comment
I rechecked the TLS session-resumption path against the current branch. The ssl_options configuration still builds a fresh SSLContext per Connection, and a cached stdlib session from the previous connection is incompatible with that new context. I reproduced the failure locally on Python 3.10.12; the session restore path raises ValueError: Session refers to a different SSLContext. Since the new code only catches AttributeError and ssl.SSLError, reconnects fail instead of falling back to a full handshake, and the regression is enabled by default because Cluster auto-creates SSLSessionCache for ssl_options.
cassandra/connection.py
    if cached_session is not None:
        try:
            ssl_sock.session = cached_session
        except (AttributeError, ssl.SSLError):
When TLS is configured through ssl_options, each Connection still builds a fresh SSLContext in Connection.__init__, so the cached stdlib ssl session from the previous connection is not compatible with the next one. I rechecked this against the current branch and reproduced it locally on Python 3.10.12: after a successful TLS 1.2 handshake with one context, assigning the cached session to an SSLSocket created from a different context fails with ValueError: Session refers to a different SSLContext. This block only falls back on AttributeError and ssl.SSLError, so reconnects now fail instead of doing a full handshake. Because Cluster auto-enables SSLSessionCache for ssl_options, this regresses an existing supported TLS configuration by default.
Good catch, thanks. Fixed in the latest push:
- ValueError is now caught in _wrap_socket_from_context(), so even if a stale session somehow reaches that part of the code, the connection falls back to a full handshake instead of crashing.
- Auto-creation of SSLSessionCache is now limited to the ssl_context path. Since providing only ssl_options without ssl_context is deprecated, I believe this is the right solution.

Added unit tests to make sure it works correctly.
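To make the fallback concrete, here is a minimal sketch of the pattern discussed in this thread. It is not the code from the PR: restore_session is a hypothetical helper name, and the only facts taken from the discussion are that assigning to ssl_sock.session can raise AttributeError, ssl.SSLError, or ValueError (the latter with "Session refers to a different SSLContext").

```python
import logging
import ssl

log = logging.getLogger(__name__)


def restore_session(ssl_sock, cached_session):
    """Try to attach a cached TLS session; return True if it was attached.

    Hypothetical helper for illustration. On any expected failure the
    caller simply proceeds with a full handshake.
    """
    if cached_session is None:
        return False
    try:
        ssl_sock.session = cached_session
        return True
    except (AttributeError, ssl.SSLError, ValueError) as exc:
        # ValueError covers a session created by a different SSLContext.
        log.debug("Could not restore TLS session, doing a full handshake: %s", exc)
        return False
```

The point of the design is that a stale or incompatible session only costs a debug log line, never a failed reconnect.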
- Add _ssl_session_cache attribute on Connection, set via ssl_session_cache param
- Restore cached TLS sessions in _wrap_socket_from_context with error tolerance
- Add _cache_tls_session_if_needed helper (delegates to endpoint.tls_session_cache_key)
- Cache sessions at 3 points: after connect, ReadyMessage, AuthSuccessMessage (handles TLS 1.3 async ticket delivery)
- Add TestConnectionSSLSessionRestore and TestConnectionCacheTLSSession tests

- Import SSLSessionCache in cluster.py
- Add ssl_session_cache attribute with comprehensive docstring
- Add ssl_session_cache parameter to Cluster.__init__ (default _NOT_SET)
- Auto-create SSLSessionCache when ssl_context or ssl_options are set
- Pass ssl_session_cache to connection factory via _make_connection_kwargs
- Add TestSSLSessionCacheAutoCreation tests (6 tests)
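
The _NOT_SET default in the commit above is a sentinel that lets the constructor tell "argument omitted" apart from an explicit None. A minimal sketch of that three-way decision, assuming a hypothetical helper resolve_session_cache and using dict as a stand-in for SSLSessionCache; the tls_configured flag deliberately leaves open which TLS options count as "configured":

```python
_NOT_SET = object()  # sentinel: distinguishes "omitted" from an explicit None


def resolve_session_cache(tls_configured, ssl_session_cache=_NOT_SET, factory=dict):
    """Hypothetical helper; `factory` stands in for SSLSessionCache."""
    if ssl_session_cache is _NOT_SET:
        # Caller expressed no preference: create a cache only when TLS is in use.
        return factory() if tls_configured else None
    # Explicit None (opt out) or a caller-supplied cache is passed through as-is.
    return ssl_session_cache
```

With this shape, omitting the argument gives the default behaviour, passing None opts out, and passing a custom cache instance overrides the default.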

- EventletConnection: restore cached session before handshake via set_session(), store session after do_handshake() via _cache_pyopenssl_session()
- TwistedConnection: pass ssl_session_cache to _SSLCreator, restore cached session in clientConnectionForTLS(), store after handshake in info_callback()
- All operations wrapped in try/except for error tolerance
- Debug logging for session reuse and restore/store failures
Force-pushed from 4500773 to d12db4a.
Summary
This PR implements TLS session resumption for the Python driver. After the first
successful TLS handshake with a node, the negotiated session is stored in a
thread-safe cache and reused on subsequent connections, skipping the full
handshake.
Both TLS 1.2 (session IDs) and TLS 1.3 (session tickets / PSK) are supported.
Changes
cassandra/connection.py: SSLSessionCache class & endpoint keys
- _SessionCacheEntry namedtuple stores (session, timestamp) for TTL tracking.
- SSLSessionCache: a thread-safe OrderedDict-based cache with LRU eviction, TTL expiration, and periodic cleanup (every 100 set() calls), keyed by endpoint tls_session_cache_key.
- max_size (default 100) and ttl (default 3600 s).
- EndPoint class provides a default tls_session_cache_key property returning (address, port). Subclasses override for context-specific keys:
  - DefaultEndPoint: (address, port), inherits default
  - SniEndPoint: (address, port, server_name), prevents proxy collisions
  - UnixSocketEndPoint: (unix_socket_path,)
  - ClientRoutesEndPoint: (host_id, address, port)

cassandra/connection.py: Connection wiring
- Connection gains _ssl_session_cache attribute, set via ssl_session_cache kwarg in __init__.
- _wrap_socket_from_context() restores a cached session via ssl_sock.session = ... after wrap_socket(); gracefully handles ssl.SSLError / AttributeError if the server rejects the session.
- _ssl_session_cache_key() helper delegates to endpoint.tls_session_cache_key.
- _cache_tls_session_if_needed() stores socket.session in the cache when ssl_context is set and the session is non-None.
- Sessions are cached at three points:
  - _initiate_connection() in _connect_socket(): TLS 1.2 sessions are available immediately after connect.
  - ReadyMessage in _handle_startup_response(): TLS 1.3 tickets arrive asynchronously after the first application-data exchange.
  - AuthSuccessMessage in _handle_auth_response(): same TLS 1.3 coverage for authenticated connections.

cassandra/cluster.py: Cluster integration
- Imports SSLSessionCache.
- Adds ssl_session_cache class attribute with docstring.
- __init__ accepts ssl_session_cache=_NOT_SET parameter.
- Auto-creates SSLSessionCache() when ssl_context or ssl_options are set; no configuration required for the common case.
- Pass ssl_session_cache=None explicitly to opt out.
- A custom SSLSessionCache(max_size=…, ttl=…) can be supplied.
- _make_connection_kwargs() passes the cache to every Connection via kwargs_dict.setdefault('ssl_session_cache', self.ssl_session_cache).

cassandra/io/eventletreactor.py: Eventlet (PyOpenSSL) support
- _wrap_socket_from_context() restores cached PyOpenSSL sessions via set_session() before the handshake.
- _initiate_connection() calls _cache_pyopenssl_session() after do_handshake().
- _cache_pyopenssl_session() helper stores the session via get_session(), logs whether the session was reused (session_reused()), and catches all exceptions silently.

cassandra/io/twistedreactor.py: Twisted (PyOpenSSL) support
- _SSLCreator.__init__ accepts an optional ssl_session_cache parameter.
- clientConnectionForTLS() restores cached sessions via set_session().
- info_callback() stores sessions after SSL_CB_HANDSHAKE_DONE via get_session(), logs reuse status.
- TwistedConnection.add_connection() passes ssl_session_cache=self._ssl_session_cache to _SSLCreator.

Tests
tests/unit/test_connection.py
- TestSSLSessionCache: empty lookup, set/get, key isolation by address/port/SNI, overwrite, thread safety, TTL expiration, LRU eviction, max_size enforcement, clear(), clear_expired(), automatic periodic cleanup, None session handling.
- TestEndPointTLSSessionCacheKey: cache key correctness for DefaultEndPoint, SniEndPoint, UnixSocketEndPoint, ClientRoutesEndPoint, plus isolation between different paths/addresses.
- TestConnectionSSLSessionRestore: session restore from cache, tolerance when cache is None, ssl.SSLError on session setter, SNI-specific cached session lookup.
- TestConnectionCacheTLSSession: session stored after connect, no-op when session=None, no-op when cache=None, no-op when ssl_context=None, SNI-specific key used for storage.

tests/unit/test_cluster.py
- TestSSLSessionCacheAutoCreation: auto-create with ssl_context, auto-create with ssl_options, no cache without TLS, explicit None opt-out, custom cache injection, cache passed to connection_factory.

Fixes: https://scylladb.atlassian.net/browse/DRIVER-165
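
As an illustration of the endpoint cache keys listed under Changes, the sketch below shows why including server_name in the SNI key avoids collisions when several nodes sit behind one proxy address. The two classes are hypothetical simplifications written for this example; only the key shapes, (address, port) and (address, port, server_name), come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DefaultEndPointSketch:
    """Hypothetical simplification of DefaultEndPoint."""
    address: str
    port: int

    @property
    def tls_session_cache_key(self):
        return (self.address, self.port)


@dataclass(frozen=True)
class SniEndPointSketch:
    """Hypothetical simplification of SniEndPoint (proxy address + SNI name)."""
    address: str
    port: int
    server_name: str

    @property
    def tls_session_cache_key(self):
        # server_name keeps nodes behind the same proxy from sharing a session.
        return (self.address, self.port, self.server_name)


# Two nodes reached through the same proxy get distinct cache keys.
node_a = SniEndPointSketch("proxy.example.com", 9142, "node-a")
node_b = SniEndPointSketch("proxy.example.com", 9142, "node-b")
assert node_a.tls_session_cache_key != node_b.tls_session_cache_key
```

Without the server_name component, both nodes would map to ("proxy.example.com", 9142) and a session negotiated with one node could be offered to the other.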
Pre-review checklist
- … ./docs/source/.
- … Fixes: annotations to PR description.