Skip to content

Fix memory leak in OTEL traces#1639

Open
michael-rubel wants to merge 3 commits intolangfuse:mainfrom
michael-rubel:feat/fix-memory-leak-in-otel-traces
Open

Fix memory leak in OTEL traces#1639
michael-rubel wants to merge 3 commits intolangfuse:mainfrom
michael-rubel:feat/fix-memory-leak-in-otel-traces

Conversation

@michael-rubel
Copy link
Copy Markdown

@michael-rubel michael-rubel commented Apr 23, 2026

Fixes #12164

Disclaimer: Experimental PR review

Greptile Summary

This PR fixes a memory issue in _find_and_process_media by switching from a "permanently mark all visited nodes" strategy to a DFS path-based tracking strategy: each container (list, dict, Pydantic model) now adds its id to seen before recursing and discards it in a finally block on return. This keeps seen bounded to the current call-stack depth rather than growing monotonically. The PR also adds explicit Pydantic v1 and v2 model handling, using __pydantic_fields__ (Pydantic v2-specific) as a precise guard.

Confidence Score: 5/5

PR is safe to merge — the memory fix is logically correct and prior review concerns have been addressed.

The DFS backtracking approach (add on enter, discard on exit via try/finally) is the correct fix for keeping the seen set bounded. Pydantic v1/v2 detection is precise. No P0 or P1 issues remain; the previous review concerns about cycle protection and the model_dump heuristic have both been addressed in this revision.

No files require special attention.

Important Files Changed

Filename Overview
langfuse/_task_manager/media_manager.py Switches seen-set tracking to DFS backtracking (add on enter, discard on exit via try/finally) for list, dict, and Pydantic models; adds explicit Pydantic v1/v2 serialisation paths; overall logic is sound

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[_process_data_recursively] --> B{level > max_levels?}
    B -- yes --> Z[return data]
    B -- no --> C{LangfuseMedia?}
    C -- yes --> D[process_media + return]
    C -- no --> E{base64 string?}
    E -- yes --> F[create LangfuseMedia + return]
    E -- no --> G{Anthropic dict pattern?}
    G -- yes --> H[copy dict, replace data field + return]
    G -- no --> I{Vertex dict pattern?}
    I -- yes --> J[copy dict, replace data field + return]
    I -- no --> K{isinstance list?}
    K -- yes --> L{id in seen?}
    L -- yes --> Z
    L -- no --> M[seen.add, recurse items, seen.discard in finally]
    M --> Z
    K -- no --> O{isinstance dict?}
    O -- yes --> P{id in seen?}
    P -- yes --> Z
    P -- no --> Q[seen.add, recurse values, seen.discard in finally]
    Q --> Z
    O -- no --> S{Pydantic v2 model?}
    S -- yes --> T{id in seen?}
    T -- yes --> Z
    T -- no --> U[seen.add, model_dump, recurse, seen.discard in finally]
    U --> Z
    S -- no --> W{Pydantic v1 model?}
    W -- yes --> X{id in seen?}
    X -- yes --> Z
    X -- no --> Y[seen.add, model.dict, recurse, seen.discard in finally]
    Y --> Z
    W -- no --> Z
Loading

Reviews (2): Last reviewed commit: "fix: memory leak in OTEL traces" | Re-trigger Greptile

Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Apr 23, 2026

CLA assistant check
All committers have signed the CLA.

Comment thread langfuse/_task_manager/media_manager.py
Comment thread langfuse/_task_manager/media_manager.py Outdated
- address AI review
@hassiebp
Copy link
Copy Markdown
Contributor

@claude review

Comment thread langfuse/_task_manager/media_manager.py
- apply fix after claude review
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Python SDK: Not all base64 inputs are processed by media manager

3 participants