Summary
Confirmed production occurrence of the bug described in #2644 (self-closed by the original reporter before they could follow through). Filing to provide a second validated repro and draw attention to the existing draft fix in #2660.
Environment
- mcp==1.26.0
- anyio==4.13.0
- httpx==0.28.1
- Python 3.11.14, macOS
What happens
When a gateway session starts and connects to multiple OAuth-authenticated MCP servers concurrently (Notion + TinyFish, both OAuth 2.1 PKCE), Notion fails intermittently:
ERROR asyncio: Task exception was never retrieved
RuntimeError: The current task is not holding this lock
File ".../mcp/client/auth/oauth2.py", line 503, in async_auth_flow
File ".../mcp/client/auth/oauth2.py", line 484, in async_auth_flow
raise RuntimeError("The current task is not holding this lock")
WARNING tools.mcp_tool: MCP server 'notion' connection lost (attempt 1/5), reconnecting in 1s
WARNING tools.mcp_tool: Failed to connect to MCP server 'notion': CancelledError
INFO tools.mcp_tool: MCP: registered 114 tool(s) from 4 server(s) (1 failed)
Notion is affected more often than TinyFish because its OAuth token refreshes frequently (~every 15–60 min), consistently triggering the refresh yield path in async_auth_flow. TinyFish tokens expire less often and typically take the happy path (add header, yield once).
Root cause
OAuthContext.lock is an anyio.Lock, which records the owning task at acquire() and requires release() to be called from that same task. async_auth_flow is an async generator that holds this lock across yield points. When httpx resumes the generator from a different task during concurrent connections, anyio.Lock.release() raises the RuntimeError shown above.
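A minimal sketch of the failure mode, for anyone who wants to see it in isolation. It does not use anyio or the SDK: OwnerCheckedLock is a hypothetical stand-in that only mimics the ownership check described above, and auth_flow and step are illustrative names, not the real async_auth_flow.

```python
import asyncio


class OwnerCheckedLock:
    """Hypothetical stand-in mimicking an owner-checked lock (not anyio's code)."""

    def __init__(self):
        self._lock = asyncio.Lock()
        self._owner = None

    async def __aenter__(self):
        await self._lock.acquire()
        # Remember which task acquired the lock.
        self._owner = asyncio.current_task()

    async def __aexit__(self, *exc_info):
        # Refuse to release from any task other than the one that acquired.
        if self._owner is not asyncio.current_task():
            raise RuntimeError("The current task is not holding this lock")
        self._owner = None
        self._lock.release()


async def auth_flow(lock):
    # Illustrative generator: the lock stays held while suspended at the yield.
    async with lock:
        yield "request"


async def step(gen):
    # Advance the generator by one step from whichever task runs this coroutine.
    try:
        return await gen.__anext__()
    except StopAsyncIteration:
        return None


async def drive_from_two_tasks():
    gen = auth_flow(OwnerCheckedLock())
    await asyncio.create_task(step(gen))  # task A acquires, generator yields
    try:
        await asyncio.create_task(step(gen))  # task B resumes and tries to release
    except RuntimeError as exc:
        return str(exc)
    return "no error"


result = asyncio.run(drive_from_two_tasks())
print(result)
```

The second step runs the generator's exit path in a different task than the one that entered it, which is the same shape as the traceback in this report.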
Existing draft fix
PR #2660 addresses this correctly by narrowing the lock scope so that no lock is held across a yield: GET SSE long-polls and token refresh yields both run outside any lock. It has full test coverage (100% on oauth2.py, 1177 passed) and no breaking changes, but has been sitting as a draft without maintainer review since May 22.
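To illustrate the general idea of narrowing lock scope, here is a rough sketch. It is not the code from PR #2660; narrowed_flow, the state dict, and the token strings are all made up for the example. The lock guards only the reads and writes of shared state, and every yield happens with the lock released.

```python
import asyncio


async def narrowed_flow(lock, state):
    # Hypothetical flow: check shared state under the lock, then release it.
    async with lock:
        needs_refresh = state["token"] is None

    if needs_refresh:
        # Yield the refresh request with no lock held.
        new_token = yield "refresh-request"
        # Re-acquire only for the short state update.
        async with lock:
            state["token"] = new_token

    # Yield the real request, again with no lock held.
    yield f"request with {state['token']}"


async def drive():
    lock = asyncio.Lock()
    state = {"token": None}
    gen = narrowed_flow(lock, state)
    held_at_yield = []

    first = await gen.__anext__()
    held_at_yield.append(lock.locked())  # is the lock held while suspended?

    second = await gen.asend("token-123")
    held_at_yield.append(lock.locked())

    await gen.aclose()
    return first, second, held_at_yield


first, second, held_at_yield = asyncio.run(drive())
print(first, second, held_at_yield)
```

Because each acquire and release pair completes within a single resumption of the generator, it no longer matters which task resumes it next.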
Workaround applied locally
Replacing anyio.Lock with asyncio.Lock in OAuthContext stops the error, since asyncio.Lock does not enforce task identity on release. This is a band-aid (it loses trio portability), but it unblocks asyncio deployments. Note: this is a mechanical fix and has not been load-tested to exhaustion; because the bug is intermittent, full verification requires sustained concurrent load.
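A small sketch of why the swap stops the error, using only the standard library. It is not the actual patch to OAuthContext; auth_flow and step are illustrative names. The same "acquire in one task, release in another" pattern passes with asyncio.Lock because release() only checks that the lock is currently locked.

```python
import asyncio


async def auth_flow(lock):
    # Illustrative generator that holds the lock across a yield.
    async with lock:
        yield "request"


async def step(gen):
    # Advance the generator by one step from whichever task runs this coroutine.
    try:
        return await gen.__anext__()
    except StopAsyncIteration:
        return None


async def main():
    lock = asyncio.Lock()
    gen = auth_flow(lock)

    first = await asyncio.create_task(step(gen))  # task A acquires
    held_after_first = lock.locked()

    await asyncio.create_task(step(gen))  # task B releases without error
    held_after_second = lock.locked()

    return first, held_after_first, held_after_second


outcome = asyncio.run(main())
print(outcome)
```

Note that this only shows the absence of the ownership check; the lock is still held across the yield, so the workaround does not change how long other callers may wait for it.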
Request
Could a maintainer review and merge PR #2660? The fix is principled and well-tested.