marshal.dumps() crashes when an item's __buffer__ concurrently mutates the container #151370

@tonghuaroot

Bug report

marshal.dumps() can crash the interpreter when serializing a list, dict
or set containing an item that supports the buffer protocol, if the item's
__buffer__() (PEP 688) concurrently mutates the container being serialized.

w_complex_object() hands each item to w_object(), which for a buffer item
reaches PyObject_GetBuffer() and runs the item's __buffer__() — arbitrary
Python that can clear, shrink, grow, or drop the last reference to the
container (or to a borrowed key/value) while it is still being iterated.
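The failure mode can be illustrated without crashing anything. The following is a minimal pure-Python sketch, not the actual marshal.c code: model_dump_list is a hypothetical stand-in for a writer loop that reads the container length once, and the direct __buffer__(0) call stands in for PyObject_GetBuffer(). The explicit bounds check marks the point where unchecked C code would read past the end of the list.

```python
class Evil:
    """Item whose __buffer__() empties the list that holds it."""

    def __init__(self, victim):
        self.victim = victim

    def __buffer__(self, flags):
        self.victim.clear()              # side effect during "serialization"
        return memoryview(bytearray(4))


def model_dump_list(items):
    """Hypothetical model of a writer loop that reads the length only once."""
    n = len(items)                       # cached before any item is visited
    visited = []
    for i in range(n):
        if i >= len(items):              # unchecked C code would read out of bounds here
            return visited, f"stale index {i}: list now has {len(items)} items"
        item = items[i]
        hook = getattr(item, "__buffer__", None)
        if hook is not None:
            hook(0)                      # stands in for PyObject_GetBuffer()
        visited.append(type(item).__name__)
    return visited, None


victim = []
victim.extend([Evil(victim), 1, 2, 3])
print(model_dump_list(victim))
# -> (['Evil'], 'stale index 1: list now has 0 items')
```

The cached length n still says 4 after the first item has emptied the list, which is the mismatch behind the out-of-bounds read described for the list case.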

Reproducer

import marshal

class Evil:
    def __buffer__(self, flags):
        container.clear()        # mutate the container mid-serialization
        return memoryview(bytearray(4))

container = {Evil(), 1, 2, 3}    # also reproduces with list and dict
marshal.dumps(container)

On a debug build the set case aborts at assert(i == n); the dict and
list cases segfault through a use-after-free or an out-of-bounds read, and a
set whose element instead grows the set writes past the pairs buffer that
was pre-sized to the original length.
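To compare the three container types without taking down the calling interpreter, each variant can be run in a child process. This is a sketch under stated assumptions: the list and dict literals are assumed forms of the "also reproduces with list and dict" variants, and run_variant and describe are hypothetical helper names. The outcome depends on the interpreter build; on an interpreter that predates PEP 688 or already contains a fix, the child is expected to fail with an ordinary exception or complete instead of crashing.

```python
import subprocess
import sys
import textwrap

# Child-process source; CONTAINER is replaced with one of the literals below.
TEMPLATE = textwrap.dedent("""
    import marshal

    class Evil:
        def __buffer__(self, flags):
            container.clear()        # mutate the container mid-serialization
            return memoryview(bytearray(4))

    container = CONTAINER
    marshal.dumps(container)
""")

# The set literal is from the reproducer; list and dict forms are assumptions.
VARIANTS = {
    "set": "{Evil(), 1, 2, 3}",
    "list": "[Evil(), 1, 2, 3]",
    "dict": "{'k': Evil(), 'a': 1}",
}


def run_variant(literal):
    """Run one variant in a fresh interpreter and return its exit status."""
    code = TEMPLATE.replace("CONTAINER", literal)
    proc = subprocess.run([sys.executable, "-c", code],
                          capture_output=True, text=True)
    return proc.returncode


def describe(returncode):
    """Turn a subprocess return code into a short description."""
    if returncode == 0:
        return "completed normally"
    if returncode < 0:               # POSIX: child was terminated by a signal
        return f"killed by signal {-returncode}"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    for name, literal in VARIANTS.items():
        print(name, "->", describe(run_variant(literal)))
```

On POSIX a negative return code means the child was terminated by a signal, so a segfault or an assertion abort shows up as "killed by signal N". On Windows a crash is reported as a large positive exit status instead.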

Notes

This is a robustness issue, not a security vulnerability: triggering it
requires a custom __buffer__(), i.e. the ability to run arbitrary in-process
Python, and marshal is documented as not intended for serializing untrusted
data.

Same family as the recently fixed bytes.join crash in gh-151295.
