-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Fix memory leak: clear managed dict in pybind11_object_dealloc on Python 3.13+ #5999
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 2 commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -11,6 +11,12 @@ | |||||||||||||||||
| #include "constructor_stats.h" | ||||||||||||||||||
| #include "pybind11_tests.h" | ||||||||||||||||||
|
|
||||||||||||||||||
| #if !defined(PYPY_VERSION) | ||||||||||||||||||
| // Flag set by the capsule destructor in test_dynamic_attr_dealloc_frees_dict_contents. | ||||||||||||||||||
| // File scope so the captureless capsule destructor (void(*)(void*)) can access it. | ||||||||||||||||||
| static bool s_dynamic_attr_capsule_freed = false; | ||||||||||||||||||
| #endif | ||||||||||||||||||
|
|
||||||||||||||||||
| #if !defined(PYBIND11_OVERLOAD_CAST) | ||||||||||||||||||
| template <typename... Args> | ||||||||||||||||||
| using overload_cast_ = pybind11::detail::overload_cast_impl<Args...>; | ||||||||||||||||||
|
|
@@ -388,6 +394,24 @@ TEST_SUBMODULE(methods_and_attributes, m) { | |||||||||||||||||
|
|
||||||||||||||||||
| class CppDerivedDynamicClass : public DynamicClass {}; | ||||||||||||||||||
| py::class_<CppDerivedDynamicClass, DynamicClass>(m, "CppDerivedDynamicClass").def(py::init()); | ||||||||||||||||||
|
|
||||||||||||||||||
| // test_dynamic_attr_dealloc_frees_dict_contents | ||||||||||||||||||
| // Regression test: pybind11_object_dealloc() must call PyObject_ClearManagedDict() | ||||||||||||||||||
| // before tp_free() so that objects stored in a py::dynamic_attr() instance __dict__ | ||||||||||||||||||
| // have their refcounts decremented when the pybind11 object is freed. On Python 3.14+ | ||||||||||||||||||
| // tp_free no longer implicitly clears the managed dict, causing permanent leaks. | ||||||||||||||||||
| m.def("make_dynamic_attr_with_capsule", []() -> py::object { | ||||||||||||||||||
| s_dynamic_attr_capsule_freed = false; | ||||||||||||||||||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I was wondering if there is an easy way to avoid the global variable (and the subtle pitfalls that come with globals). Cursor then pointed out this:
def test_dynamic_attr_dealloc_frees_dict_contents():
freed = [False]
class Flag:
def __del__(self):
freed[0] = True
obj = m.DynamicClass()
obj.flag = Flag()
del obj
pytest.gc_collect()
assert freed[0]This would remove the need for It should work because Could you please try this out?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nope, this won't work, this test always passes, event if I comment out fix. Here is claude summary: Here's what happens with the commented-out fix on Python 3.14:After del obj → pybind11_object_dealloc:
Now pytest.gc_collect() runs the GC. The GC initializes gc_refs = ob_refcnt for all tracked objects and traverses them to subtract internal references:
After this phase:
The dict stays perpetually "reachable" (gc_refs=1 from the dangling reference in freed memory), so it leaks permanently along with the capsule. The Flag gets rescued as a side effect of the GC's traversal decrementing its to 0 — which masks the exact bug we're testing for. The pure Python test is therefore not a valid regression test for this bug. We need to revert to the capsule approach.
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Confirmed (Cursor Opus 4.6 1M Thinking): |
||||||||||||||||||
| auto *dummy = new int(0); | ||||||||||||||||||
| py::capsule cap(dummy, [](void *ptr) { | ||||||||||||||||||
| delete static_cast<int *>(ptr); | ||||||||||||||||||
| s_dynamic_attr_capsule_freed = true; | ||||||||||||||||||
| }); | ||||||||||||||||||
| py::object obj = py::cast(new DynamicClass(), py::return_value_policy::take_ownership); | ||||||||||||||||||
| obj.attr("data") = cap; | ||||||||||||||||||
| return obj; | ||||||||||||||||||
| }); | ||||||||||||||||||
| m.def("is_dynamic_attr_capsule_freed", []() { return s_dynamic_attr_capsule_freed; }); | ||||||||||||||||||
| #endif | ||||||||||||||||||
|
|
||||||||||||||||||
| // test_bad_arg_default | ||||||||||||||||||
|
|
||||||||||||||||||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -383,6 +383,23 @@ def test_cyclic_gc(): | |
| assert cstats.alive() == 0 | ||
|
|
||
|
|
||
| @pytest.mark.xfail("env.PYPY", strict=False) | ||
| @pytest.mark.skipif("env.GRAALPY", reason="Cannot reliably trigger GC") | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Shouldn't this be xfail strict=False for flakey tests?
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Skipping it is fine if passing vs. failing doesn't mean anything. If you hope it always passes some day, then yes.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So, should I modify it or not?
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For GRAALPY, it's fine. If the comment was that the PYPY xfail needed a strict=False, that would be fine, though I think due to the way the GC works in PyPy this will always be failed. And PyPy is dead now so we don't need to worry about that changing in the future.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK, done, |
||
| def test_dynamic_attr_dealloc_frees_dict_contents(): | ||
| """Regression: py::dynamic_attr() objects must free __dict__ contents on dealloc. | ||
|
|
||
| pybind11_object_dealloc() did not call PyObject_ClearManagedDict() before tp_free(), | ||
| causing objects stored in __dict__ to have their refcounts permanently abandoned on | ||
| Python 3.14+ (where tp_free no longer implicitly clears the managed dict). | ||
| This caused capsule destructors to never run, leaking the underlying C++ data. | ||
| """ | ||
| instance = m.make_dynamic_attr_with_capsule() | ||
| assert not m.is_dynamic_attr_capsule_freed() | ||
| del instance | ||
| pytest.gc_collect() | ||
| assert m.is_dynamic_attr_capsule_freed() | ||
|
|
||
|
|
||
| def test_bad_arg_default(msg): | ||
| from pybind11_tests import detailed_error_messages_enabled | ||
|
|
||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cursor Opus 4.6 1M Thinking prompt and response:
Question
There is a small asymmetry between:
include/pybind11/detail/class.hline 507:tests/test_methods_and_attributes.cppline 401+:Should it be Python 3.13+ also in
test_methods_and_attributes.cpp?Analysis
The code fix in
class.his gated on 3.13+ (#if PY_VERSION_HEX >= 0x030D0000), which is correct becausePyObject_ClearManagedDictandPy_TPFLAGS_MANAGED_DICTwere introduced in 3.13, and calling it is safe (idempotent) even if 3.13 still does the implicit clearing. The PR description itself says "On Python <= 3.13,PyObject_GC_Del... cleared the managed dict as a side effect. On Python 3.14 this implicit clearing was removed."So the comment in the test is more precisely correct about when the bug manifests (3.14+), while the code fix in
class.his correct about when the guard is needed (3.13+, since that's when the API exists).The test comment should say "3.13+" to be consistent with the fix's comment and the
#ifguard. The leak behavior might only be observable on 3.14+, but the comment is describing the fix rationale, not the exact Python version where the bug appears. Saying "3.13+" keeps both comments aligned and avoids confusing a reader who sees0x030D0000(3.13) in the code but "3.14+" in the test comment.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I made the comments aligned. I do not know your policy: should I just add new commits or squash them and force-push?