Bug report
marshal.dumps() can crash the interpreter when serializing a list, dict
or set containing an item that supports the buffer protocol, if the item's
__buffer__() (PEP 688) mutates the container while it is being serialized.
w_complex_object() hands each item to w_object(), which for a buffer item
reaches PyObject_GetBuffer() and runs the item's __buffer__() — arbitrary
Python that can clear, shrink, grow, or drop the last reference to the
container (or to a borrowed key/value) while it is still being iterated.
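The hazard can be modelled in pure Python without touching marshal. The sketch below is not the marshal.c code, and the names dump_presized, dump_checked and clearing_visit are hypothetical. It assumes only what is described above: the loop captures the container's size once and then calls into code that may mutate the container. Python's bounds check turns the stale index into an IndexError, where the C loop would read freed or out-of-range memory. The second function shows one possible guard (re-checking the size after each callback), which is not necessarily what the eventual fix does.

```python
def dump_presized(items, visit):
    """Model of a loop that reads the length once, then indexes."""
    n = len(items)  # size captured before any callback runs
    out = []
    for i in range(n):
        # visit() stands in for w_object(): it may run arbitrary Python.
        out.append(visit(items[i]))
    return out


def dump_checked(items, visit):
    """Same loop, but it re-validates the size after every callback."""
    n = len(items)
    out = []
    for i in range(n):
        item = items[i]  # keep our own reference across the callback
        out.append(visit(item))
        if len(items) != n:
            raise RuntimeError("container changed size during serialization")
    return out


def clearing_visit(target):
    """Build a callback that empties `target`, like the misbehaving __buffer__()."""
    def visit(item):
        target.clear()
        return item
    return visit


first = [1, 2, 3]
try:
    dump_presized(first, clearing_visit(first))
    presized_outcome = "no error"
except IndexError:
    presized_outcome = "IndexError"  # the C equivalent is an out-of-bounds read

second = [1, 2, 3]
try:
    dump_checked(second, clearing_visit(second))
    checked_outcome = "no error"
except RuntimeError:
    checked_outcome = "RuntimeError"  # mutation detected and reported cleanly

print(presized_outcome, checked_outcome)
```

Holding a local reference to each item across the callback mirrors the other half of the problem, where a borrowed reference can be dropped while it is still in use.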
Reproducer
import marshal

class Evil:
    def __buffer__(self, flags):
        container.clear()  # mutate the container mid-serialization
        return memoryview(bytearray(4))

container = {Evil(), 1, 2, 3}  # also reproduces with list and dict
marshal.dumps(container)
On a debug build the set case aborts at assert(i == n); the dict and
list cases segfault through a use-after-free or an out-of-bounds read, and a
set whose element instead grows the set writes past the pairs buffer that
was pre-sized to the original length.
Notes
This is a robustness issue, not a security vulnerability: triggering it
requires a custom __buffer__() (i.e. the ability to run arbitrary in-process
Python), and marshal is documented as not intended for serializing
untrusted data.
Same family as the recently fixed bytes.join crash in gh-151295.
Linked PRs