[Repo Assist] fix: correct categorical groupby KeyError and pandas FutureWarning in dummy_outcome_refuter by github-actions[bot] · Pull Request #1460 · py-why/dowhy

github-actions · 2026-04-16T13:28:38Z

🤖 This is an automated pull request from Repo Assist, an AI assistant. Please review all changes carefully before merging.

Fixes a KeyError bug and a pandas FutureWarning in DummyOutcomeRefuter._get_treatment_groups().

Closes #1316

Root Cause

In dummy_outcome_refuter.py, the _get_treatment_groups() method has two branches for treatment type: continuous/float (uses pd.cut → "bins" column) and categorical (uses the treatment column directly). The categorical branch contained a copy-paste error:

# Before (categorical branch) — BUGGY
groups = data.groupby(treatment_variable_name)   # ✓ correct
groups = data.groupby("bins")                    # ✗ "bins" never exists here → KeyError

This caused an immediate KeyError whenever DummyOutcomeRefuter was used with a categorical treatment variable.
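The failure mode can be reproduced outside dowhy. The following is a minimal sketch, assuming a small hypothetical DataFrame with a categorical "treatment" column (the column names and data are illustrative and not taken from the dowhy code base). It shows that grouping by a column that was never created raises KeyError, while grouping by the treatment column works.

```python
import pandas as pd

# Hypothetical data: a categorical treatment column and an outcome column.
data = pd.DataFrame(
    {
        "treatment": pd.Categorical(["a", "b", "a", "c"]),
        "outcome": [1.0, 2.0, 3.0, 4.0],
    }
)

# No "bins" column exists here (it would only be created by pd.cut for a
# continuous treatment), so grouping by it fails as soon as groupby is called.
try:
    data.groupby("bins", observed=True)
    raised = False
except KeyError:
    raised = True

# Grouping by the treatment column itself succeeds.
group_sizes = data.groupby("treatment", observed=True).size().to_dict()
print(raised, group_sizes)
```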

Additionally, neither groupby call passed observed, so both relied on the default observed=False. pandas deprecated that default in version 2.1, so grouping a CategoricalDtype column (the "bins" column created by pd.cut, or a categorical treatment column) triggers a FutureWarning about the changing default.
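To illustrate what the observed argument controls, here is a small sketch with hypothetical values and bin edges (not the ones used by the refuter). With observed=False every category produced by pd.cut appears in the result, including bins that received no rows; with observed=True only bins that actually contain data are kept.

```python
import pandas as pd

values = pd.Series([0.1, 0.2, 0.9])

# Three bins over (0, 1]; the middle bin (0.4, 0.6] receives no rows.
bins = pd.cut(values, bins=[0.0, 0.4, 0.6, 1.0])
frame = pd.DataFrame({"bins": bins, "y": [1, 2, 3]})

# Passing observed explicitly avoids the FutureWarning in either case.
n_all = len(frame.groupby("bins", observed=False).size())       # all categories
n_observed = len(frame.groupby("bins", observed=True).size())   # non-empty only
print(n_all, n_observed)
```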

Fix

Removed the erroneous second data.groupby("bins") in the categorical branch.
Added observed=True to both groupby calls ("bins" for continuous and treatment column for categorical) to suppress the FutureWarning and adopt the value that pandas will use as the default in a future release. This keeps the refuter's behaviour correct, since only non-empty groups are relevant.

# After — fixed
if treatment_is_continuous:
    groups = data.groupby("bins", observed=True)
else:
    groups = data.groupby(treatment_variable_name, observed=True)

Test Status

A regression test test_refutation_dummy_outcome_refuter_default_categorical_treatment was added to tests/causal_refuters/test_dummy_outcome_refuter.py. This test uses the existing categorical_treatment_testsuite infrastructure and would previously raise KeyError: 'bins'. The test now passes correctly.
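Independently of the dowhy test infrastructure, the same two properties (no KeyError, no FutureWarning) can be checked in isolation. The sketch below is an assumption-laden illustration: group_by_treatment is a hypothetical stand-in for the categorical branch after the fix, not the actual dowhy method, and the data is made up. It promotes FutureWarning to an error so that a missing observed argument would fail the check.

```python
import warnings

import pandas as pd


def group_by_treatment(data, treatment_variable_name):
    # Hypothetical stand-in for the fixed categorical branch.
    return data.groupby(treatment_variable_name, observed=True)


def check_no_future_warning():
    data = pd.DataFrame(
        {
            "treatment": pd.Categorical(["a", "b", "a"]),
            "outcome": [1.0, 2.0, 3.0],
        }
    )
    with warnings.catch_warnings():
        # Any FutureWarning raised inside this block becomes an exception.
        warnings.simplefilter("error", FutureWarning)
        groups = group_by_treatment(data, "treatment")
        return groups.size().to_dict()


result = check_no_future_warning()
print(result)
```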

Format/lint checks:

black --check ✅
flake8 --select=E9,F63,F7,F82 ✅

Full test suite could not be run due to environment constraints (missing system-level graphviz for pygraphviz), but the fix is surgical: the only changes are removing the erroneous groupby("bins") line and adding observed=True to two groupby calls.

Generated by 🌈 Repo Assist, see workflow run. Learn more.

To install this agentic workflow, run
gh aw add githubnext/agentics/workflows/repo-assist.md@11c9a2c442e519ff2b427bf58679f5a525353f76

…g in dummy_outcome_refuter

Two issues fixed in _get_treatment_groups():

1. Bug (categorical branch): line 767 overwrote the correct groupby result with data.groupby('bins') even though 'bins' was never created for categorical treatments. This caused a KeyError whenever the refuter was run with a categorical treatment variable. Fixed by removing the dead line so the function correctly returns groupby on the treatment column.

2. FutureWarning (pandas): groupby calls on columns with CategoricalDtype (pd.cut output 'bins', and categorical treatment) did not set observed=True, triggering a pandas FutureWarning about the changing default. Added observed=True to both groupby calls; this retains existing behaviour because only non-empty bins are meaningful for the dummy-outcome estimator.

Test added: test_refutation_dummy_outcome_refuter_default_categorical_treatment, a regression test that previously raised KeyError.

Refs #1316

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Fixes DummyOutcomeRefuter._get_treatment_groups() for categorical treatments by removing an erroneous groupby("bins") call and explicitly setting observed=... on groupby to address a pandas FutureWarning.

Changes:

Remove incorrect categorical groupby("bins") that could raise KeyError.
Add observed=True to groupby(...) for both continuous ("bins") and categorical (treatment column) branches.
Add regression test covering categorical treatment refutation path (issue #1316).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File	Description
`dowhy/causal_refuters/dummy_outcome_refuter.py`	Fixes categorical grouping `KeyError` and sets `observed=True` on groupby to avoid the pandas warning.
`tests/causal_refuters/test_dummy_outcome_refuter.py`	Adds a regression test to ensure categorical treatments don’t trigger the previous `KeyError`.


Copilot · 2026-04-17T04:05:38Z

        refuter_tester.binary_treatment_testsuite(tests_to_run="atleast-one-common-cause", num_samples=num_samples)

+    @mark.parametrize(
+        ["error_tolerence", "estimator_method", "num_samples"],


The parameter name error_tolerence is misspelled (tolerence vs tolerance). Since this is newly added test code, consider renaming the local test parameter to error_tolerance for readability (keeping the passed value the same).

@copilot apply changes based on this feedback

Copilot · 2026-04-17T04:05:38Z

+        [(0.1, "backdoor.propensity_score_matching", 1000)],
+    )
+    def test_refutation_dummy_outcome_refuter_default_categorical_treatment(
+        self, error_tolerence, estimator_method, num_samples


The parameter name error_tolerence is misspelled (tolerence vs tolerance). Since this is newly added test code, consider renaming the local test parameter to error_tolerance for readability (keeping the passed value the same).

@copilot apply changes based on this feedback

Copilot · 2026-04-17T04:05:38Z

+    ):
+        # Regression test for #1316: the categorical branch had a copy-paste error that tried to
+        # groupby("bins") which doesn't exist for categorical treatments, raising KeyError.
+        refuter_tester = SimpleRefuter(error_tolerence, estimator_method, "dummy_outcome_refuter")


The parameter name error_tolerence is misspelled (tolerence vs tolerance). Since this is newly added test code, consider renaming the local test parameter to error_tolerance for readability (keeping the passed value the same).

@copilot apply changes based on this feedback

Fixed in b7d0e71 — renamed the parameter from error_tolerence to error_tolerance in the newly added categorical treatment test (parametrize list, function signature, and body).

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Emre Kıcıman <emrek@microsoft.com>

…al test

Signed-off-by: Copilot <copilot@github.com>
Agent-Logs-Url: https://github.com/py-why/dowhy/sessions/227e1739-c29d-4256-9664-7e83d209eb69
Co-authored-by: emrekiciman <5982160+emrekiciman@users.noreply.github.com>

github-actions bot added the automation, bug (Something isn't working), and repo-assist labels Apr 16, 2026

github-actions bot mentioned this pull request Apr 17, 2026

[Repo Assist] Monthly Activity 2026-04 #1433

Open

41 tasks

emrekiciman marked this pull request as ready for review April 17, 2026 03:58

emrekiciman requested a review from Copilot April 17, 2026 03:58

Copilot AI reviewed Apr 17, 2026


Copilot started reviewing on behalf of emrekiciman April 17, 2026 04:10

Copilot started work on behalf of emrekiciman April 19, 2026 09:19

emrekiciman and others added 3 commits April 19, 2026 02:23

Update dowhy/causal_refuters/dummy_outcome_refuter.py

9fc4daf

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Emre Kıcıman <emrek@microsoft.com>

Update dowhy/causal_refuters/dummy_outcome_refuter.py

5698184

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Emre Kıcıman <emrek@microsoft.com>

Copilot finished work on behalf of emrekiciman April 19, 2026 10:08

Copilot AI requested a review from emrekiciman April 19, 2026 10:08

github-actions[bot] wants to merge 4 commits into main from repo-assist/fix-issue-1316-pandas-groupby-warnings-0b1ea40ed2c81e89


3 participants
