fix(polardb): escape user inputs in get_children_with_embeddings to prevent Cypher/SQL injection (CWE-89) #1638

Open
sebastiondev wants to merge 1 commit into MemTensor:main from sebastiondev:fix/cwe89-polardb-get-e626


Summary

Fixes a Cypher/SQL injection vulnerability (CWE-89) in PolarDBGraphDB.get_children_with_embeddings in src/memos/graph_dbs/polardb.py. The id and user_name parameters were interpolated directly into a Cypher query string via f-strings without escaping, allowing an attacker who controls either value to break out of the quoted literal and inject arbitrary Cypher.

Vulnerable code (before)

where_user = f"AND p.user_name = '{user_name}' AND c.user_name = '{user_name}'"

query = f"""
    ...
    MATCH (p:Memory)-[r:PARENT]->(c:Memory)
    WHERE p.id = '{id}' {where_user}
    ...
"""

A value such as ' OR '1'='1 supplied as id or user_name closes the literal and changes the query semantics. In a multi-tenant deployment this would allow a user to read children of memories belonging to other users (cross-tenant data exposure), or to alter the query in other ways depending on what subsequent SQL the wrapper executes.
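A minimal sketch of the breakout, assuming only the string-building pattern shown above. The function name build_where_unsafe is hypothetical and nothing here touches a database; it just prints the clause that the unescaped f-strings would produce when the payload arrives as user_name.

```python
def build_where_unsafe(id: str, user_name: str) -> str:
    """Hypothetical stand-in that mimics the unescaped interpolation."""
    where_user = f"AND p.user_name = '{user_name}' AND c.user_name = '{user_name}'"
    return f"WHERE p.id = '{id}' {where_user}"


benign = build_where_unsafe("abc-123", "alice")
hostile = build_where_unsafe("abc-123", "' OR '1'='1")

print(benign)
# WHERE p.id = 'abc-123' AND p.user_name = 'alice' AND c.user_name = 'alice'
print(hostile)
# WHERE p.id = 'abc-123' AND p.user_name = '' OR '1'='1' AND c.user_name = '' OR '1'='1'
```

Assuming the usual precedence where AND binds tighter than OR, the trailing OR '1'='1' makes the whole predicate true, so the tenant filter no longer restricts the result.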

Fix

Apply the existing module-level escape_sql_string() helper (single-quote doubling) to both interpolated values before they are placed into the query. This matches the pattern already used in 30+ other call sites in the same file.

safe_id = escape_sql_string(id)
safe_user = escape_sql_string(user_name) if user_name else user_name
where_user = f"AND p.user_name = '{safe_user}' AND c.user_name = '{safe_user}'"
...
WHERE p.id = '{safe_id}' {where_user}

The diff adds 4 lines and removes 2, so the change is minimal and consistent with existing project conventions.
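To illustrate the effect of the fix, here is a sketch with a stand-in helper. escape_sql_string_sketch is an assumption that models only the single-quote doubling described above; the real escape_sql_string in polardb.py may differ in detail. build_where_safe is likewise a hypothetical wrapper around the patched lines.

```python
from typing import Optional


def escape_sql_string_sketch(value: str) -> str:
    """Assumed behavior of the project's helper: double every single quote."""
    return value.replace("'", "''")


def build_where_safe(id: str, user_name: Optional[str]) -> str:
    """Hypothetical wrapper that mirrors the patched interpolation."""
    safe_id = escape_sql_string_sketch(id)
    safe_user = escape_sql_string_sketch(user_name) if user_name else user_name
    where_user = f"AND p.user_name = '{safe_user}' AND c.user_name = '{safe_user}'"
    return f"WHERE p.id = '{safe_id}' {where_user}"


clause = build_where_safe("x' OR '1'='1", "alice")
print(clause)
# WHERE p.id = 'x'' OR ''1''=''1' AND p.user_name = 'alice' AND c.user_name = 'alice'
```

Every quote inside the payload is doubled, so the value stays within one quoted literal and the surrounding predicate keeps its original shape.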

Why this is exploitable

  • get_children_with_embeddings is part of the BaseGraphDB interface and is invoked by higher-level memory APIs where id (a memory identifier) and user_name (a tenant identifier) may be derived from user-influenced input — particularly in multi-tenant setups.
  • No prior validation strips quotes from these values along the call paths I traced; the function trusts its inputs.
  • The other PolarDB methods in this file already use escape_sql_string for the same kinds of values, which both confirms the project considers this the right mitigation and means this function was an inconsistency / oversight.

Adversarial review

Before submitting, I tried to disprove this finding. I checked whether:

  • the input values are constrained by an upstream schema or sanitizer (they are not; they flow through as plain strings);
  • the AGE/Cypher layer would itself reject the injected payload (it does not; quote-breakout payloads parse as valid Cypher);
  • other PolarDB methods already protect themselves, which would make the fix redundant (they do, but this specific method does not, which is the entire problem).

I also considered that escape_sql_string only handles ' and not backslash escapes, so it is not a complete defense; parameterized queries would be stronger. It does close the trivially exploitable single-quote breakout, which is the realistic attack, and it keeps the change consistent with the rest of the file. A follow-up to move the whole module to parameterized AGE queries would be welcome but is out of scope here.
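The backslash caveat can be made concrete with a toy model. The scanner below is not AGE's lexer; literal_end and escape_sql_string_sketch are hypothetical names, and the sketch only shows where a quoted literal would end under two assumed quoting conventions once quote doubling has been applied.

```python
def escape_sql_string_sketch(value: str) -> str:
    """Assumed behavior of the project's helper: double every single quote."""
    return value.replace("'", "''")


def literal_end(text: str, start: int, backslash_escapes: bool) -> int:
    """Toy scanner: index just past the closing quote of the literal at text[start]."""
    assert text[start] == "'"
    i = start + 1
    while i < len(text):
        ch = text[i]
        if backslash_escapes and ch == "\\":
            i += 2  # a backslash consumes the next character
            continue
        if ch == "'":
            if i + 1 < len(text) and text[i + 1] == "'":
                i += 2  # a doubled quote stays inside the literal
                continue
            return i + 1
        i += 1
    raise ValueError("unterminated literal")


payload = "x\\' OR true //"  # the characters: x \ ' space O R ...
wrapped = "'" + escape_sql_string_sketch(payload) + "'"

# Quote doubling only: the whole wrapped value is a single literal.
print(literal_end(wrapped, 0, backslash_escapes=False) == len(wrapped))  # True
# Backslash escapes honored: the literal closes after 5 characters.
print(wrapped[: literal_end(wrapped, 0, backslash_escapes=True)])  # 'x\''
```

If the target parser treats a backslash as an escape character, the text after the early close would be read as query syntax, which is why parameterized queries remain the stronger option.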

Testing

  • Verified the changed function still compiles and imports cleanly.
  • Confirmed escape_sql_string is defined at module scope (line 96) and is the same helper used elsewhere in polardb.py.
  • Manually re-derived the resulting query with a malicious input (id="x' OR '1'='1") and confirmed the escaped form produces 'x'' OR ''1''=''1', which AGE parses as a literal string and not as injected Cypher.
  • The diff touches only this one function; no other behavior changes.
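The manual check above could be turned into an automated one along these lines. This is a sketch, not a test from the repository: escape_sql_string_sketch is an assumed stand-in for the real helper, and the payload list is illustrative.

```python
def escape_sql_string_sketch(value: str) -> str:
    """Assumed behavior of the project's helper: double every single quote."""
    return value.replace("'", "''")


PAYLOADS = [
    "x' OR '1'='1",
    "' OR '1'='1",
    "alice'--",
    "plain-id",
]


def check_escaping() -> None:
    for payload in PAYLOADS:
        escaped = escape_sql_string_sketch(payload)
        # Once doubled quotes are removed, no lone quote may remain.
        assert "'" not in escaped.replace("''", ""), payload
        # Undoing the doubling must recover the original value.
        assert escaped.replace("''", "'") == payload, payload
    # The specific case from the manual check.
    assert f"'{escape_sql_string_sketch(PAYLOADS[0])}'" == "'x'' OR ''1''=''1'"


check_escaping()
```

In the project itself, the same assertions would be better aimed at the query string that get_children_with_embeddings actually builds, for example by capturing the text passed to the database cursor.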

References

  • CWE-89: Improper Neutralization of Special Elements used in an SQL Command
  • File: src/memos/graph_dbs/polardb.py, function get_children_with_embeddings (around line 1171)

cc @lewiswigmore
