On 4.0.3 we are seeing regular caller crashes caused by normal EXITs bubbling from gen_statem:call paths, instead of receiving an error tuple.
This appears to be related to the pooled connection race handling described here:
    %% Pooled connections used to stop immediately here, but that made
    %% late-arriving {call, From, {request, _}} messages from workers that
    %% raced the pool checkout race a terminating gen_statem — which
    %% surfaces as `exit:{normal, _}` in the caller (issue #836). Stay
    %% alive briefly so those late calls get a proper `{error, {closed, _}}`
    %% reply via handle_common's invalid_state fallback, then stop.
This became visible immediately after upgrading from 4.0.2 to 4.0.3. As a mitigation, we disabled pooling for the affected requests, and so far this seems to have stopped the crashes.
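For anyone who cannot disable pooling, a caller-side guard is another possible stopgap. The following is only a minimal sketch, not something hackney provides: the module `safe_call` and the function `with_closed_guard/1` are hypothetical names, and mapping the exit to `{error, closed}` is our own assumption about a reasonable return value. It relies on the fact that `gen_statem:call` exits the caller with `{Reason, Location}` when the server terminates before replying, which matches the `exit:{normal, _}` shape mentioned in the comment above.

```erlang
-module(safe_call).
-export([with_closed_guard/1]).

%% Hypothetical helper: runs Fun and converts the `exit:{normal, _}` raised
%% by a call into a terminating gen_statem into an error tuple.
%% Any other exit, error, or throw is left to propagate unchanged.
with_closed_guard(Fun) when is_function(Fun, 0) ->
    try
        Fun()
    catch
        %% Assumption: a normal exit here means the connection process
        %% stopped before it could reply, so we report it as closed.
        exit:{normal, _Location} ->
            {error, closed}
    end.
```

Usage would look like `safe_call:with_closed_guard(fun() -> hackney:request(get, Url, [], <<>>, []) end)`. This hides the symptom in the caller only; it does not address the race itself.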
Could 4.0.3 timing or dependency changes make this race more visible?
hackney 4.0.3
erlang 27.3
Quoted comment source: hackney/src/hackney_conn.erl, lines 1448 to 1453 in c93ce0a.