[Repo Assist] fix: handle list-form treatment_variable in estimate_effect_naive#1461

Open
github-actions[bot] wants to merge 3 commits into main from repo-assist/fix-issue-416-estimate-effect-naive-ca2d9bc3d2269e11

@github-actions
Contributor

🤖 This PR was created by Repo Assist, an automated AI assistant. Please review carefully before merging.

Closes #416


Root Cause

estimate_effect_naive() in CausalEstimator had two related bugs:

  1. List indexing: self._target_estimand.treatment_variable and outcome_variable are always lists (returned by parse_state()). The old code did data[list_of_names] == 1, which produces a 2D boolean DataFrame. Using that as a .loc index key raises:

    ValueError: Cannot index with multidimensional key
    
  2. Hardcoded treatment/control values: The method used == 1 / == 0, ignoring the actual treatment_value / control_value that the caller passed to estimate_effect(). For non-binary or custom-valued treatments this returned a wrong or empty estimate.
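The list-indexing failure can be reproduced outside DoWhy with plain pandas. The following is a minimal sketch on hypothetical toy data (the column names `v0` and `y` are placeholders, not taken from the issue); it only illustrates the difference between a 2D and a 1D boolean mask.

```python
import pandas as pd

# Hypothetical toy data; "v0" and "y" are placeholder column names.
data = pd.DataFrame({"v0": [1, 0, 1, 0], "y": [3.0, 1.0, 5.0, 2.0]})
treatment_variable = ["v0"]  # list form, as described in the PR

# Selecting with a list of names yields a DataFrame, so the comparison is 2D.
mask_2d = data[treatment_variable] == 1
# Selecting with a single name yields a Series, so the comparison is 1D.
mask_1d = data[treatment_variable[0]] == 1

try:
    data.loc[mask_2d]
    raised = False
except ValueError as err:
    raised = True
    print(f"2D mask rejected by .loc: {err}")

# The 1D mask is a valid row selector.
treated = data.loc[mask_1d]
print(treated)
```
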

Fix

  • Extract the scalar variable name from the list (treatment_variable[0]), falling back gracefully if already a scalar.
  • Use self._treatment_value / self._control_value (set by update_input() / estimate_effect()) instead of the hardcoded constants. Falls back to 1/0 if not yet set.

The fix is surgical: only 4 lines change, all within estimate_effect_naive().
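The two changes can be sketched as a standalone function. This is an illustration only, not the code in the diff: `naive_effect` and its parameters are hypothetical names, and the real method reads these values from `self._target_estimand`, `self._treatment_value`, and `self._control_value`.

```python
import pandas as pd


def naive_effect(data, treatment_variable, outcome_variable,
                 treatment_value=None, control_value=None):
    """Hypothetical helper: difference in mean outcome, treated minus control."""
    # Accept either list-form or scalar variable names.
    treatment = (treatment_variable[0]
                 if isinstance(treatment_variable, (list, tuple))
                 else treatment_variable)
    outcome = (outcome_variable[0]
               if isinstance(outcome_variable, (list, tuple))
               else outcome_variable)

    # Fall back to 1/0 only when the caller supplied no values.
    t_val = 1 if treatment_value is None else treatment_value
    c_val = 0 if control_value is None else control_value

    # 1D boolean Series masks are valid row selectors for .loc.
    treated = data.loc[data[treatment] == t_val, outcome]
    control = data.loc[data[treatment] == c_val, outcome]
    return treated.mean() - control.mean()


# Hypothetical multi-valued treatment: "dose" takes the values 10, 20, 30.
data = pd.DataFrame({"dose": [10, 10, 20, 20, 30],
                     "y": [1.0, 3.0, 5.0, 7.0, 100.0]})
effect = naive_effect(data, ["dose"], ["y"], treatment_value=20, control_value=10)
print(effect)
```

With the hardcoded 1/0 values, neither group would match any row of this data, which corresponds to the empty estimate described under Root Cause.
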

Test Status

⚠️ The test runner environment did not have the required Python packages installed (pandas, numpy, etc.), preventing automated test execution. This is an infrastructure limitation of this run, not a code issue.

Code quality checks:

  • ✅ black --check passed (no formatting changes needed)
  • ✅ No new flake8 violations introduced (pre-existing E501 violations in the file are unrelated)

Manual verification: The logic was verified by code inspection:

  • treatment_variable[0] correctly extracts the single column name
  • data[scalar_name] == value produces a 1D boolean Series (correct for .loc)
  • self._treatment_value is set in update_input() which is called by all concrete estimators' estimate_effect() before evaluate_effect_strength is called

Generated by 🌈 Repo Assist, see workflow run.

To install this agentic workflow, run

gh aw add githubnext/agentics/workflows/repo-assist.md@11c9a2c442e519ff2b427bf58679f5a525353f76

estimate_effect_naive() failed with a ValueError when treatment_variable
or outcome_variable were lists (as returned by parse_state), because
data[[list]] == 1 produces a 2D boolean DataFrame that cannot be used
as a .loc index key.

Also fix hardcoded treatment_value=1 / control_value=0: now uses
self._treatment_value and self._control_value (set by update_input())
so non-binary treatments work correctly.

Fixes #416

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@github-actions github-actions bot added the automation, bug, and repo-assist labels Apr 17, 2026
@emrekiciman emrekiciman marked this pull request as ready for review April 17, 2026 03:56
@emrekiciman emrekiciman requested a review from Copilot April 17, 2026 03:56
Contributor

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Fixes estimate_effect_naive() to correctly handle list-form treatment_variable / outcome_variable and to respect caller-provided treatment/control values (per #416).

Changes:

  • Extract scalar treatment/outcome column names from list-form variables to avoid 2D boolean indexing.
  • Use self._treatment_value / self._control_value (with 1/0 fallback) instead of hardcoded values.

Comment thread dowhy/causal_estimator.py Outdated
Comment thread dowhy/causal_estimator.py Outdated
emrekiciman and others added 2 commits April 17, 2026 00:14
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Emre Kıcıman <emrek@microsoft.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Emre Kıcıman <emrek@microsoft.com>

Labels

automation, bug, repo-assist

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Problems with multi-valued treatment and parameter 'evaluate_effect_strength'

2 participants