Skip to content

MemoryError silently swallowed in 20 sites: 18 PyObject_HasAttrString + 2 unguarded PyErr_Clear after PyObject_GetBuffer (silent data loss in flush() / close()) #299

@devdanzin

Description

@devdanzin

Summary

PyObject_HasAttrString returns 0 for both "attribute absent" and "an exception was raised during lookup" — making it unsafe in the presence of MemoryError, KeyboardInterrupt, or a user-defined __getattr__ that raises. zstandard uses it at 18 sites. Two observable consequences:

  • flush() / close() on the underlying writer's methods are silently skipped on MemoryError → data silently lost.
  • copy_stream argument validation replaces MemoryError with ValueError("first argument must have a read() method").

Two additional sites in the buffer-protocol path call PyErr_Clear() indiscriminately after PyObject_GetBuffer, replacing MemoryError from __buffer__ with a generic TypeError("item N not a bytes like object").

pythoncapi_compat.h (already included in the codebase!) provides PyObject_HasAttrStringWithError, which returns -1 on error — exactly the missing piece.

Impact

  • Severity: Silent data loss (skipped flush() / close()); clobbered exception class/message elsewhere.
  • Reachability: Any program that can raise MemoryError during attribute access or buffer-protocol operations — OOM, memory-constrained environments, code that explicitly raises via __getattr__ / __buffer__.
  • Version: 0.25.0 (commit 7a77a75).

Pattern 1: 18 PyObject_HasAttrString sites

Reproducer (silent data loss):

import zstandard, io

class EvilWriter:
    def __init__(self):
        self._data = io.BytesIO()
    def write(self, data):
        return self._data.write(data)
    def __getattr__(self, name):
        if name in ('flush', 'close'):
            raise MemoryError("OOM in __getattr__")
        raise AttributeError(name)

comp = zstandard.ZstdCompressor()
writer = comp.stream_writer(EvilWriter())
writer.write(b'hello world' * 100)
writer.flush()
# Silently skips EvilWriter.flush (which would raise).
# CPython 3.14 prints "Exception ignored in PyObject_HasAttrString()" to stderr.

Fix:

int r = PyObject_HasAttrStringWithError(writer, "flush");
if (r < 0) {
    return NULL;              /* propagate the exception */
}
if (r) {
    /* has flush — call it */
} else {
    /* no flush — skip */
}

18 sites across compressor, decompressor, reader, writer, and copy_stream validation. I'll enumerate them explicitly in the PR; a grep PyObject_HasAttrString c-ext/*.c gives the full list.

Pattern 2: Unguarded PyErr_Clear after PyObject_GetBuffer — 2 sites

Clears any error from the buffer-protocol call indiscriminately, including MemoryError from __buffer__.

Sites: c-ext/compressor.c:1454, c-ext/decompressor.c:1660.

Reproducer:

import zstandard

class OOMBuffer:
    def __buffer__(self, flags):
        raise MemoryError("OOM in __buffer__")

zstandard.ZstdCompressor().multi_compress_to_buffer([OOMBuffer()])
# TypeError: item 0 not a bytes like object
# Expected: MemoryError: OOM in __buffer__

Fix — only clear if the exception is a genuine buffer-protocol error:

if (PyObject_GetBuffer(obj, &buf, PyBUF_SIMPLE) != 0) {
    if (PyErr_ExceptionMatches(PyExc_TypeError) ||
        PyErr_ExceptionMatches(PyExc_BufferError)) {
        PyErr_Clear();
    } else {
        goto except;         /* MemoryError etc. — propagate */
    }
    /* fall through to set a specific TypeError with the item index */
}

Suggested PR shape

One PR for Pattern 1 (18 mechanical HasAttrStringHasAttrStringWithError migrations) + one PR for Pattern 2 (2 guarded PyErr_Clear calls). They can also land together — the common thread is "don't let the error-handling infrastructure swallow genuine exceptions".

Methodology

Found via cext-review-toolkit (Tree-sitter-based static analysis with structured naive/informed review passes). Pattern 1 reproducer verified live on CPython 3.14.3 debug — writer.flush() silently skips the EvilWriter.flush call; CPython's own "Exception ignored" warning is the only user-visible hint that something went wrong. Pattern 2 reproducer also verified live — MemoryError is replaced by TypeError. Happy to open a PR.

Discovery, root-cause analysis, and issue drafting were performed by Claude Code and reviewed by a human before filing.

Full report

Complete multi-agent analysis (48 FIX findings across 13 categories, plus a reproducer appendix): https://gist.github.com/devdanzin/b86039ac097141579590c1a0f3a43605

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions