Skip to content

Heap-type refcount imbalance: 19 dealloc sites miss Py_DECREF(Py_TYPE(self)) + 4 factories bypass tp_alloc (type-object leak, latent crash) #294

@devdanzin

Description

@devdanzin

Summary

Three interlocking heap-type refcount bugs, all in the same family and all requiring a single atomic fix:

  1. All 19 heap-type deallocs miss Py_DECREF(Py_TYPE(self)). tp_alloc implicitly Py_INCREFs the type when creating an instance; the mirror Py_DECREF in dealloc is absent everywhere. Every instance destruction leaks one type reference. The type object and its slot/method tables are never freed.
  2. All 19 deallocs call PyObject_Del(self) instead of Py_TYPE(self)->tp_free(self). PyObject_Del is a raw memory-free that bypasses any tp_free a subclass might install, breaking subclass correctness.
  3. 4 factory functions allocate with PyObject_New(T, type) instead of type->tp_alloc(type, 0). PyObject_New skips the implicit Py_INCREF(type) that tp_alloc performs. Currently masked by (1) — the two bugs cancel out for factory-produced instances. Fixing only (1) will crash those factories with a negative type refcount.

Fixing (1) + (2) + (3) together is the only safe shape.

Impact

  • Severity: Heap-type object leak — the type object, its tp_methods, tp_slots, and any captured descriptors stay alive for the lifetime of the interpreter. Not a crash today, but becomes a crash (negative refcount abort) if only (1) is patched.
  • Reachability: Any program that creates and destroys one of the 19 types — essentially every use of zstandard. The leak-per-instance is small (one PyObject *), but the type object itself includes large pre-computed structures (tp_cache, method tables), and heap types are per-interpreter in multi-interpreter setups.
  • Version: 0.25.0 (commit 7a77a75).
  • Platform: Language-level bug, all platforms.

Reproducer

import zstandard, sys

T = zstandard.ZstdCompressionParameters
rc0 = sys.getrefcount(T)
for _ in range(1000):
    p = zstandard.ZstdCompressionParameters.from_level(3)
    del p
rc1 = sys.getrefcount(T)
print(f"type refcount growth: {rc1 - rc0}")   # 1000

The refcount grows by 1 per instance creation. from_level() is chosen because it goes through one of the 4 PyObject_New factories — directly-constructed types show the same leak through their tp_alloc path.

Root cause

Heap types (created via PyType_FromSpec) are themselves reference-counted Python objects, unlike static types. The ownership protocol is:

  • tp_alloc does Py_INCREF(type) implicitly (instance holds a strong ref to its type).
  • dealloc must do Py_DECREF(Py_TYPE(self)) after freeing the instance.

The canonical heap-type dealloc:

static void T_dealloc(T *self) {
    PyTypeObject *tp = Py_TYPE(self);
    /* Py_CLEAR members, free library state ... */
    tp->tp_free(self);
    Py_DECREF(tp);
}

zstandard's 19 deallocs follow the pattern:

static void T_dealloc(T *self) {
    /* Py_CLEAR members, free library state ... */
    PyObject_Del(self);          /* bug 2: bypasses tp_free */
                                 /* bug 1: no Py_DECREF(tp) */
}

And the 4 factories do:

T *obj = PyObject_New(T, type);   /* bug 3: skips Py_INCREF(type) */

Bug (1) leaks the type; bug (3) silently under-refs the type, and they happen to cancel for factory-produced instances only. Directly-constructed instances (via __call__) leak one type ref each with no compensation.

Affected sites

19 dealloc sites (bugs 1 + 2)

File Line Function
c-ext/compressor.c 248 ZstdCompressor_dealloc
c-ext/decompressor.c 117 Decompressor_dealloc
c-ext/compressionwriter.c 13 ZstdCompressionWriter_dealloc
c-ext/decompressionwriter.c 13 ZstdDecompressionWriter_dealloc
c-ext/compressionreader.c 13 compressionreader_dealloc
c-ext/decompressionreader.c 13 decompressionreader_dealloc
c-ext/compressoriterator.c 15 ZstdCompressorIterator_dealloc
c-ext/decompressoriterator.c 15 ZstdDecompressorIterator_dealloc
c-ext/compressobj.c 13 ZstdCompressionObj_dealloc
c-ext/decompressobj.c 13 DecompressionObj_dealloc
c-ext/compressionchunker.c 14 ZstdCompressionChunkerIterator_dealloc
c-ext/compressionchunker.c 156 ZstdCompressionChunker_dealloc
c-ext/compressiondict.c 209 ZstdCompressionDict_dealloc
c-ext/compressionparams.c 390 ZstdCompressionParameters_dealloc
c-ext/frameparams.c 58 FrameParameters_dealloc
c-ext/bufferutil.c 13 BufferWithSegments_dealloc
c-ext/bufferutil.c 271 BufferSegments_dealloc
c-ext/bufferutil.c 307 BufferSegment_dealloc
c-ext/bufferutil.c 362 BufferWithSegmentsCollection_dealloc

4 factory sites (bug 3)

File Line Factory
c-ext/compressionparams.c 353 ZstdCompressionParameters.from_level
c-ext/compressiondict.c 127 train_dictionary
c-ext/frameparams.c 43 get_frame_parameters
c-ext/bufferutil.c 134 BufferWithSegments.FromMemory

Suggested fix

Both sides must land in the same commit.

Fix 1 + 2 — every dealloc

static void
ZstdCompressor_dealloc(ZstdCompressor *self)
{
    PyTypeObject *tp = Py_TYPE(self);
    /* existing cleanup: Py_CLEAR(...), ZSTD_freeCCtx(...), etc. */
    tp->tp_free(self);    /* was: PyObject_Del(self) */
    Py_DECREF(tp);        /* NEW — mirrors tp_alloc's implicit INCREF */
}

Apply to all 19 sites.

Fix 3 — every PyObject_New factory

/* was: */
    ZstdCompressionParameters *obj = PyObject_New(ZstdCompressionParameters, type);
/* use: */
    ZstdCompressionParameters *obj =
        (ZstdCompressionParameters *)type->tp_alloc(type, 0);

tp_alloc handles the Py_INCREF(type), zero-initialization, and (if the type later adopts GC) the GC registration. An alternative is keeping PyObject_New and adding an explicit Py_INCREF(type) immediately after — slightly less idiomatic but behaviorally equivalent.

Verification

A one-liner test can be added to the suite to prevent regression:

def test_type_refcount_stable():
    import sys, zstandard
    T = zstandard.ZstdCompressionParameters
    rc0 = sys.getrefcount(T)
    for _ in range(1000):
        zstandard.ZstdCompressionParameters.from_level(3)
    assert sys.getrefcount(T) - rc0 < 10   # allow some slack for caches

Related / follow-ups

  • If zstandard eventually adopts GC support for these types (Py_TPFLAGS_HAVE_GC + tp_traverse / tp_clear), tp_alloc handles GC registration whereas PyObject_New does not — another reason to prefer the tp_alloc shape.

Methodology

Found via cext-review-toolkit (Tree-sitter-based static analysis with structured naive/informed review passes). Leak verified live on CPython 3.14.3 debug build — sys.getrefcount(ZstdCompressionParameters) grows by exactly 1 per instance, regardless of whether the instance is created via __call__ or via the from_level factory. Happy to open a PR — I'd propose a single PR combining all 23 sites so the fix lands atomically (a split would introduce a window where one half of the fix exists without the other, triggering the negative-refcount crash).

Discovery, root-cause analysis, and issue drafting were performed by Claude Code and reviewed by a human before filing.

Full report

Complete multi-agent analysis (48 FIX findings across 13 categories, plus a reproducer appendix): https://gist.github.com/devdanzin/b86039ac097141579590c1a0f3a43605

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions