[Repo Assist] fix: use string key for column lookup in conditional_MI (fixes KeyError with multi-char column names) by github-actions[bot] · Pull Request #1455 · py-why/dowhy

github-actions · 2026-04-14T13:20:10Z

🤖 This is an automated fix from Repo Assist.

Closes #949

Root Cause

In dowhy/utils/cit.py, conditional_MI used data[list(x)] and data[list(y)] to access columns. When x and y are string column names (as they always are when called from GraphRefuter.conditional_mutual_information), list(x) iterates over individual characters:

list('Foo')  # → ['F', 'o', 'o']
data[['F', 'o', 'o']]  # → KeyError: "None of [Index(['F', 'o', 'o'], ...)] are in the [columns]"

As a result, any column name longer than one character triggers the error, unless every one of its characters happens to be a column name itself.
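The failure can be reproduced outside dowhy with plain pandas. The following is a minimal sketch, not the code from dowhy/utils/cit.py; the DataFrame and its column names are made up for illustration.

```python
import pandas as pd

# Hypothetical data; only the multi-character column names matter here.
data = pd.DataFrame({"Foo": [0, 1, 1, 0], "Bar": [1, 1, 0, 0]})

x = "Foo"
chars = list(x)  # a str is iterable, so list() splits it into characters

try:
    data[chars]  # looks up columns 'F', 'o', 'o', none of which exist
    raised = False
except KeyError:
    raised = True
```

Here `chars` holds the three single-character labels and the lookup raises `KeyError`, which matches the behaviour reported in #949.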

Fix

Change data[list(x)] → data[x] and data[list(y)] → data[y], which performs a standard single-column lookup and returns a Series. This is correct since x and y are scalar string column names.
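For reference, the difference between a scalar key and a list key in pandas can be sketched as follows. This is an illustration with made-up data, not an excerpt from the patch.

```python
import pandas as pd

# Hypothetical data for illustration.
data = pd.DataFrame({"Foo": [0, 1, 1, 0], "Bar": [1, 1, 0, 0]})

series = data["Foo"]    # scalar string key: returns a Series
frame = data[["Foo"]]   # list key: returns a one-column DataFrame
```

A scalar key treats the whole string as one label, so the length of the column name no longer matters.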

Trade-offs

The fix is minimal: the entropy calculation is unchanged, because iterating a Series yields its values, which is what the downstream zip / format-string logic expects.
Single-character column names continue to work correctly.
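To illustrate the point about iteration, the sketch below zips two Series into joint labels and computes a Shannon entropy over them. The helpers `joint_labels` and `shannon_entropy` are hypothetical stand-ins written for this example; they are assumed to resemble, but are not copied from, the logic in conditional_MI.

```python
import math
from collections import Counter

import pandas as pd

# Hypothetical data for illustration.
data = pd.DataFrame({"Foo": [0, 1, 1, 0], "Bar": [1, 1, 0, 0]})


def joint_labels(a, b):
    """Pair up the values of two Series row by row as 'a/b' strings."""
    return ["%s/%s" % pair for pair in zip(a, b)]


def shannon_entropy(labels):
    """Shannon entropy in bits of the empirical label distribution."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Iterating a Series yields its values, so zip pairs them per row.
labels = joint_labels(data["Foo"], data["Bar"])
h = shannon_entropy(labels)
```

With this data the four rows produce four distinct joint labels, so the entropy comes out at 2 bits.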

Test Status

New tests added in tests/causal_refuters/test_graph_refuter.py:

test_conditional_mi_multi_char_column_names — regression test confirming multi-char names no longer raise KeyError
test_conditional_mi_single_char_column_names — ensures single-char names still work
test_graph_refuter_with_multi_char_columns — end-to-end refute_model call with multi-char columns

All new tests pass locally. Formatting (black, isort) and linting (flake8) pass on changed files.

Generated by 🌈 Repo Assist, see workflow run. Learn more.

To install this agentic workflow, run
gh aw add githubnext/agentics/workflows/repo-assist.md@11c9a2c442e519ff2b427bf58679f5a525353f76

When x or y are string column names passed to conditional_MI, calling list(x) iterates over individual characters (e.g. 'Foo' becomes ['F','o','o']) rather than treating the string as a key.

Fix: use data[x] and data[y] (Series lookup) instead of data[list(x)] and data[list(y)] (multi-key DataFrame lookup).

Add tests to cover multi-character column names in GraphRefuter.

Closes #949

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

github-actions bot added automation repo-assist labels Apr 14, 2026

github-actions bot mentioned this pull request Apr 14, 2026

Error handling column names in CIT as used by CausalModel.graph_refute method #949

Open

emrekiciman marked this pull request as ready for review April 14, 2026 23:14

github-actions bot mentioned this pull request Apr 15, 2026

[Repo Assist] Monthly Activity 2026-04 #1433

Open

41 tasks

github-actions[bot] wants to merge 1 commit into main from repo-assist/fix-issue-949-graph-refute-column-names-cit-a62690f05e7d1267
