ConnectionError from _cancel() during CancelledError not caught, crashes callers #1310

## Environment

- **asyncpg version**: 0.31.0 (also reproduced on 0.30.x)
- **PostgreSQL version**: 16
- **Python version**: 3.11.14
- **Platform**: Linux (Kubernetes)
- **pgbouncer**: No
- **SQLAlchemy**: 2.0.23

## Summary

When an asyncpg operation is cancelled via `asyncio.CancelledError` while mid-query, the cancellation mechanism in `connect_utils._cancel` can raise a **built-in `ConnectionError`** that escapes to the caller. This is problematic because:

1. Callers (e.g. SQLAlchemy) expect asyncpg-specific exception types and don't handle built-in `ConnectionError`
2. The cancel operation is inherently best-effort — if the cancel connection fails, the error should be suppressed or wrapped, not propagated

This is related to #1211 but occurs on **non-`direct_tls`** connections via the cancel request code path.

## Reproduction flow

1. An asyncpg connection is executing a query (e.g. inside SQLAlchemy's `session.execute()`)
2. The asyncio task is cancelled (`task.cancel()`)
3. `CancelledError` propagates into `protocol.query()` / `bind_execute`
4. asyncpg's cancellation handler tries to send a PostgreSQL cancel request by opening a **new** SSL connection via `connect_utils._cancel` → `_create_ssl_connection`
5. The new connection fails (server already closed the original, or network issue)
6. `TLSUpgradeProto.connection_lost()` sets built-in `ConnectionError('unexpected connection_lost() call')` on the `on_data` future, which is raised at `await pr.on_data`
7. This escapes through `connect_utils._cancel` (which has **no error handling** around `_create_ssl_connection`)
8. Caller receives `ConnectionError` instead of `CancelledError`
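
The flow above can be imitated without a database. The following is a minimal simulation, not asyncpg's actual code: `fake_query` and `fake_cancel_request` are hypothetical stand-ins for the in-flight query and for `connect_utils._cancel`, and the sketch assumes the cancel request is awaited while `CancelledError` is being handled (as the traceback below suggests).

```python
import asyncio


async def fake_cancel_request():
    # Hypothetical stand-in for connect_utils._cancel(): the new
    # connection drops before the cancel request can be sent.
    raise ConnectionError("unexpected connection_lost() call")


async def fake_query():
    # Hypothetical stand-in for a query that is in flight when the
    # task is cancelled.
    try:
        await asyncio.sleep(3600)
    except asyncio.CancelledError:
        # The best-effort cancel runs while CancelledError is being
        # handled; if it fails, its exception replaces CancelledError.
        await fake_cancel_request()
        raise


async def run_and_cancel():
    task = asyncio.create_task(fake_query())
    await asyncio.sleep(0)  # yield once so the query starts waiting
    task.cancel()
    try:
        await task
    except BaseException as exc:
        return exc
    return None


def observed_exception():
    return asyncio.run(run_and_cancel())
```

In this sketch the awaiting caller sees `ConnectionError`, with the original `CancelledError` only reachable through `__context__`.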

## Traceback

```
asyncio.exceptions.CancelledError  (original exception)

During handling of the above exception, another exception occurred:

  File "asyncpg/transaction.py", line 206, in __rollback
    await self._connection.execute(query)
  File "asyncpg/connection.py", line 350, in execute
    result = await self._protocol.query(query, timeout)
  File "asyncpg/connection.py", line 1584, in _cancel
    await connect_utils._cancel(
  File "asyncpg/connect_utils.py", line 1040, in _cancel
    tr, pr = await _create_ssl_connection(
  File "asyncpg/connect_utils.py", line 752, in _create_ssl_connection
    do_ssl_upgrade = await pr.on_data
                     ^^^^^^^^^^^^^^^^
ConnectionError: unexpected connection_lost() call
```

## Root cause

Two issues in `connect_utils.py`:

### 1. `_cancel()` has no error handling around `_create_ssl_connection`

```python
async def _cancel(*, loop, addr, params, backend_pid, backend_secret):
    ...
    if params.ssl and params.sslmode != SSLMode.allow:
        tr, pr = await _create_ssl_connection(...)  # ← no try/except!
    ...
```

The cancel request is best-effort (we're telling PostgreSQL to cancel a query on a connection that may already be dead). If opening the cancel connection fails, the error should be suppressed or wrapped in `asyncpg.InterfaceError`, not propagated as a raw `ConnectionError`.

### 2. `TLSUpgradeProto.connection_lost()` raises built-in `ConnectionError`

```python
def connection_lost(self, exc):
    if not self.on_data.done():
        if exc is None:
            exc = ConnectionError('unexpected connection_lost() call')
        self.on_data.set_exception(exc)
```

This sets a **built-in Python `ConnectionError`** on the `on_data` future, not an asyncpg exception type, and the error is raised wherever that future is awaited. Callers like SQLAlchemy check for `asyncpg.InterfaceError` or `asyncpg.PostgresError` to detect disconnects. A built-in `ConnectionError` bypasses all those checks, which means:
- SQLAlchemy's `is_disconnect()` doesn't recognize it
- SQLAlchemy's pool pre-ping handler (`_do_ping_w_event`) only catches `self.loaded_dbapi.Error`, so `ConnectionError` escapes
- The pool's retry logic (which would create a fresh connection) never triggers
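
The reason these checks miss the error is the exception hierarchy: `ConnectionError` derives from `OSError`, not from any driver-level base class. A small illustration, where `DriverError` is a hypothetical stand-in for whatever base error class a caller catches (it is not a real asyncpg or SQLAlchemy name):

```python
class DriverError(Exception):
    """Hypothetical stand-in for a driver-level base error class."""


def classify(exc):
    # Mimics a caller that only knows how to handle driver errors.
    try:
        raise exc
    except DriverError:
        return "handled as driver error"
    except OSError:
        # Built-in ConnectionError lands here, outside the driver
        # hierarchy, so disconnect handling is never reached.
        return "escaped as built-in OSError"
```

Any handler written as `except DriverError` lets the built-in exception pass straight through.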

## Suggested fix

**Option A** (minimal): Catch `OSError` (parent of `ConnectionError`) in `connect_utils._cancel()` and suppress it — cancel is best-effort:

```python
async def _cancel(*, loop, addr, params, backend_pid, backend_secret):
    ...
    try:
        if params.ssl and params.sslmode != SSLMode.allow:
            tr, pr = await _create_ssl_connection(...)
        ...
    except OSError:
        # Cancel is best-effort. If we can't reach the server, the
        # connection is dead anyway.
        return
```
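
To show the intended effect of Option A, here is a self-contained sketch under the same assumption that the cancel request runs while `CancelledError` is being handled. All function names are hypothetical stand-ins, not asyncpg internals.

```python
import asyncio


async def failing_cancel_request():
    # Hypothetical stand-in for the cancel connection failing.
    raise ConnectionError("unexpected connection_lost() call")


async def best_effort_cancel():
    # Option A in miniature: swallow OSError (which includes
    # ConnectionError) because the cancel request is best-effort.
    try:
        await failing_cancel_request()
    except OSError:
        return


async def query():
    try:
        await asyncio.sleep(3600)
    except asyncio.CancelledError:
        await best_effort_cancel()
        raise  # the original CancelledError is preserved


async def run_and_cancel():
    task = asyncio.create_task(query())
    await asyncio.sleep(0)  # yield once so the query starts waiting
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        return "CancelledError"
    except Exception as exc:
        return type(exc).__name__
    return "no exception"
```

With the failure suppressed, the caller sees the `CancelledError` it asked for.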

**Option B** (comprehensive): Also change `TLSUpgradeProto.connection_lost()` to raise `asyncpg.InterfaceError` instead of built-in `ConnectionError`, so callers can handle it consistently:

```python
def connection_lost(self, exc):
    if not self.on_data.done():
        if exc is None:
            exc = InterfaceError('unexpected connection_lost() call')
        self.on_data.set_exception(exc)
```
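
Option B could be covered by a regression test along these lines. This is only a sketch: `UpgradeProtoSketch` mirrors the `connection_lost()` snippet above but is not the real `TLSUpgradeProto`, and the local `InterfaceError` class is a hypothetical stand-in for `asyncpg.InterfaceError`.

```python
import asyncio


class InterfaceError(Exception):
    """Hypothetical stand-in for asyncpg.InterfaceError."""


class UpgradeProtoSketch(asyncio.Protocol):
    # Minimal protocol carrying only the on_data future used above.
    def __init__(self, loop):
        self.on_data = loop.create_future()

    def connection_lost(self, exc):
        if not self.on_data.done():
            if exc is None:
                exc = InterfaceError('unexpected connection_lost() call')
            self.on_data.set_exception(exc)


async def wait_for_upgrade(lost_exc=None):
    proto = UpgradeProtoSketch(asyncio.get_running_loop())
    proto.connection_lost(lost_exc)  # simulate the transport dropping
    try:
        await proto.on_data
    except Exception as exc:
        return type(exc).__name__
```

Note that when the transport reports a real error (`exc` is not `None`), that error is still forwarded unchanged, so Option B alone would not cover every `OSError`; this is why it is proposed in addition to Option A.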

## Impact

This causes process crashes in production services. When a task is cancelled during a DB query, the `ConnectionError` escapes all exception handlers (which expect either `CancelledError` or asyncpg-specific exceptions) and terminates the process.

In our logs this is 100% correlated with `CancelledError`: every `ConnectionError: unexpected connection_lost()` we've seen was triggered by task cancellation.

## Additional context

We use Google CloudSQL with SSL connections. The PostgreSQL server is accessed over SSL (non-`direct_tls`), which means the cancel code path goes through `_create_ssl_connection` to establish a new SSL connection for sending the cancel request.
