4.0.3 pooled request race can still leak exit:{normal,_} to caller #861

@mashton

On 4.0.3 we are seeing regular caller crashes: a normal exit (exit:{normal,_}) propagates out of gen_statem:call to the caller, where we would expect an error tuple to be returned instead.

This appears to be related to the pooled connection race handling described here:

hackney/src/hackney_conn.erl

Lines 1448 to 1453 in c93ce0a

%% Pooled connections used to stop immediately here, but that made
%% late-arriving {call, From, {request, _}} messages from workers that
%% raced the pool checkout race a terminating gen_statem — which
%% surfaces as `exit:{normal, _}` in the caller (issue #836). Stay
%% alive briefly so those late calls get a proper `{error, {closed, _}}`
%% reply via handle_common's invalid_state fallback, then stop.

This became visible immediately after upgrading from 4.0.2 -> 4.0.3. As a mitigation, we disabled pooling for the affected requests, and the crashes appear to have stopped.
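
The failure mode can be reproduced in isolation with plain OTP. The following is a minimal sketch, not hackney code: the module name `exit_normal_demo` and the function `safe_call/2` are hypothetical, and a bare process that exits normally on its first message stands in for the terminating connection process. It also shows one way a caller could guard against the leak by converting the exit into an error tuple; the `{error, closed}` shape is an assumption for illustration, not hackney's documented return value.

```erlang
-module(exit_normal_demo).
-export([run/0, safe_call/2]).

%% Hypothetical caller-side guard: turn an exit raised by
%% gen_statem:call/3 into an error tuple instead of crashing the caller.
safe_call(Pid, Request) ->
    try
        gen_statem:call(Pid, Request, 5000)
    catch
        %% The callee terminated with reason `normal` before replying.
        exit:{normal, _} -> {error, closed};
        %% The callee was already gone when the call was made.
        exit:{noproc, _} -> {error, closed}
    end.

%% Stand-in for the race: the spawned process receives the call message
%% and then exits normally without ever replying to it.
run() ->
    Pid = spawn(fun() -> receive _ -> ok end end),
    safe_call(Pid, {request, dummy}).
```

A guard like this only hides the symptom at the call site; it does not address the race in the pool itself.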

Could 4.0.3 timing or dependency changes make this race more visible?

hackney 4.0.3
erlang 27.3
