bytes.join()/bytearray.join() crash (use-after-free) when an item's __buffer__ mutates the joined sequence #151295

@tonghuaroot

Bug report

Bug description

bytes.join() / bytearray.join() (sharing STRINGLIB(bytes_join) in
Objects/stringlib/join.h) can crash with a use-after-free when a non-bytes
item's __buffer__() runs Python code that drops the last reference to that
item.

In the pre-pass that collects buffers, the fast path for exact bytes
takes a new strong reference with Py_NewRef(item), but the general buffer path calls
PyObject_GetBuffer(item, ...) on a borrowed item without holding a
reference of its own. PyObject_GetBuffer() invokes item.__buffer__()
(PEP 688, overridable in pure Python since 3.12). If that callback mutates the
sequence being joined so that item's last reference is dropped (e.g.
list.clear()), item is freed while the buffer machinery is still using it.
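
The lifetime problem can be modeled in pure Python as a rough sketch. Python code always owns the objects it touches, so this cannot crash; it only shows that the list's reference is the last one unless the caller holds its own. The `owner` attribute, the `probe` weak reference, and the helper name are illustrative assumptions, and the immediate deallocation relies on CPython reference counting.

```python
import weakref


class Item:
    def __init__(self, owner):
        self.owner = owner            # the list that will be joined

    def __buffer__(self, flags):
        self.owner.clear()            # drop the list's reference to this Item
        return memoryview(b"x")


def item_survives_callback():
    """Return (alive_while_held, alive_after_release) for one Item."""
    seq = []
    seq.append(Item(seq))
    probe = weakref.ref(seq[0])       # observes the Item without owning it

    held = seq[0]                     # the caller's own strong reference
    held.__buffer__(0)                # callback empties seq
    alive_while_held = probe() is not None

    del held                          # release the caller's reference
    alive_after_release = probe() is not None
    return alive_while_held, alive_after_release


print(item_survives_callback())       # (True, False) on CPython
```

The buffer path in join.h has no equivalent of `held`, so in C the object is already gone at the point where this model still reports it alive.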

The existing "sequence changed size during iteration" recheck at the end of
the loop body does not help: the use-after-free happens inside the current
iteration's PyObject_GetBuffer() call, before the recheck runs.

Reproducer

class Item:
    def __buffer__(self, flags):
        L.clear()                 # drops the only reference to this Item()
        return memoryview(b'x')

L = [Item()]
b"".join(L)                       # crash (SIGSEGV) on an affected build

bytearray.join() crashes the same way when the item's __buffer__ clears the
list being joined, e.g. L = [b'a', Item(), b'c'] followed by bytearray().join(L).

On a debug build the faulting access lands on freed-memory fill bytes
(0xdd...) in bufferwrapper_releasebuf, confirming a use-after-free rather
than a benign out-of-bounds read.
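
Because the reproducer can take down the interpreter, it may be convenient to run it in a child process and classify the outcome. The following is a minimal sketch of such a harness; `classify_reproducer` and the printed labels are hypothetical names chosen for illustration, not part of CPython's test suite.

```python
import subprocess
import sys
import textwrap

# Same reproducer as above, with the possible non-crash outcomes labeled.
REPRODUCER = textwrap.dedent('''
    class Item:
        def __buffer__(self, flags):
            L.clear()
            return memoryview(b"x")

    L = [Item()]
    try:
        b"".join(L)
    except RuntimeError:
        print("RuntimeError")      # expected once the fix is applied
    except TypeError:
        print("TypeError")         # Python < 3.12: __buffer__ is not honored
    else:
        print("joined")            # freed memory happened not to fault
''')


def classify_reproducer():
    """Run the reproducer in a subprocess and describe what happened."""
    proc = subprocess.run(
        [sys.executable, "-c", REPRODUCER],
        capture_output=True,
        text=True,
    )
    if proc.returncode < 0:
        # POSIX: a negative return code means the child died from a signal.
        return f"killed by signal {-proc.returncode}"
    if proc.returncode != 0:
        return f"exited with status {proc.returncode}"
    return proc.stdout.strip()


if __name__ == "__main__":
    print(classify_reproducer())
```

A use-after-free does not always fault on a release build, so a "joined" result does not by itself show that the interpreter is unaffected.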

Impact

Crash-robustness only. The caller has to feed an object with a malicious
__buffer__ into its own join() call, so this is a self-inflicted crash of
the caller's own process, not a security/trust-boundary issue. It is the same
class of re-entrant __buffer__ mutation bug as gh-143988 (sendmsg/recvmsg).

Fix

Hold a reference to item across PyObject_GetBuffer() in the buffer path
(Py_INCREF before, Py_DECREF on both success and failure), mirroring the
Py_NewRef(item) already used by the exact-bytes fast path. After the fix
the reproducer raises RuntimeError: sequence changed size during iteration
instead of crashing, and benign __buffer__ items still join correctly.
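
As a sanity check for the "benign items still join correctly" claim, a small sketch like the following could accompany the fix. The `Benign` class and helper name are hypothetical; the check assumes Python 3.12 or later, where a pure-Python `__buffer__` is honored by the buffer protocol.

```python
import sys

# A pure-Python __buffer__ is only consulted on 3.12+ (PEP 688).
SUPPORTS_PY_BUFFER = sys.version_info >= (3, 12)


class Benign:
    """A well-behaved exporter: __buffer__ has no side effects."""

    def __init__(self, payload):
        self._payload = payload

    def __buffer__(self, flags):
        return memoryview(self._payload)


def join_with_benign_item():
    # The middle item goes through the general buffer path, not the
    # exact-bytes fast path.
    return b"-".join([b"a", Benign(b"x"), b"c"])


if SUPPORTS_PY_BUFFER:
    print(join_with_benign_item())    # b'a-x-c'
```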
