[Repo Assist] feat: auto-select default estimation method when method_name is omitted by github-actions[bot] · Pull Request #1464 · py-why/dowhy

github-actions · 2026-04-18T01:25:08Z

🤖 This PR was created by Repo Assist, an automated AI assistant. Please review carefully before merging.

Closes #365

Root Cause

CausalModel.estimate_effect() would raise UnboundLocalError: local variable 'identifier_name' referenced before assignment whenever method_name=None (the default) was passed. This happened because the if method_name is None: branch was a bare pass, leaving identifier_name and causal_estimator undefined before they were consumed by the call to the standalone estimate_effect() helper.

A companion bug: when there is no directed path from treatment to outcome and method_name=None, the method would also crash (same unset variables), rather than returning the zero-effect estimate that the downstream helper would produce.

Fix

Implements the design agreed by @amit-sharma in the original issue comment (Jan 2022):

New private helper _select_default_method_name(identified_estimand):

Identified estimand	Treatment	Auto-selected method
backdoor	binary (≤ 2 unique values)	`backdoor.propensity_score_stratification`
backdoor	continuous	`backdoor.linear_regression`
IV	any	`iv.instrumental_variable`
frontdoor	any	`frontdoor.two_stage_regression`
none valid	any	`ValueError` (actionable message)

Early return for no_directed_path: when method_name=None and there is no causal path, the method now returns a CausalEstimate(value=0) directly (matching the existing behaviour when a method_name was provided).

Logging: an INFO-level message records the auto-selected method so users can see what was chosen without having to inspect the source.

The docstring for estimate_effect is updated to document the new auto-selection behaviour.

Tests Added

Three new tests in TestCausalModel:

test_estimate_effect_default_method_binary_treatment – binary treatment → PropensityScoreStratificationEstimator
test_estimate_effect_default_method_continuous_treatment – continuous treatment → LinearRegressionEstimator
test_estimate_effect_default_method_no_valid_estimand_raises – all estimands None → ValueError with "method_name explicitly" in message

Trade-offs

The auto-selection logic is heuristic (binary ≤ 2 unique values). Power users who want a specific estimator should always supply method_name explicitly; the INFO log reminds them.
Frontdoor default (two_stage_regression) is included for completeness but untested here since frontdoor identification is less common. Maintainers may choose to limit auto-selection to backdoor/IV only.

Test Status

⚠️ The test runner could not be executed in this environment (no Poetry/virtual env available). The changes were validated by careful code review.

Code quality:

✅ All new lines ≤ 127 characters (flake8 limit)
✅ No new imports required
✅ No changes to public API signatures — existing callers unaffected
✅ Logic of _select_default_method_name is straightforward and low-risk

Generated by 🌈 Repo Assist, see workflow run. Learn more.

Generated by 🌈 Repo Assist, see workflow run. Learn more.

To install this agentic workflow, run
gh aw add githubnext/agentics/workflows/repo-assist.md@11c9a2c442e519ff2b427bf58679f5a525353f76

@amit-sharma

When method_name is None, choose a sensible default: - binary treatment + backdoor estimand → propensity_score_stratification - continuous treatment + backdoor estimand → linear_regression - IV-only graph → iv.instrumental_variable - frontdoor-only graph → frontdoor.two_stage_regression - nothing identified → raise ValueError with actionable message - no directed path → return zero estimate immediately (avoids NameError) An INFO log message records the auto-selected method name. Previously method_name=None caused an UnboundLocalError on identifier_name because the if/pass block left it unset. This fulfils the feature design agreed in #365 (comment by @amit-sharma, Jan 2022). Adds three tests covering the binary-treatment, continuous-treatment, and no-valid-estimand paths. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

Copilot

Pull request overview

Adds automatic default estimator selection in CausalModel.estimate_effect() when method_name is omitted, fixing a crash (UnboundLocalError) and aligning behavior with the long-standing TODO/issue discussion.

Changes:

Implemented _select_default_method_name() and wired it into CausalModel.estimate_effect() when method_name is None.
Added an early-return zero-effect CausalEstimate for the no_directed_path + method_name is None case.
Added tests covering default selection for binary vs continuous treatment and the “no valid estimand” error path.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
`dowhy/causal_model.py`	Implements default-method auto-selection, early return for no-directed-path, and updates the `estimate_effect` docstring.
`tests/test_causal_model.py`	Adds tests validating default estimator choice and the explicit error when no valid estimand exists.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-18T02:42:49Z

+            treatment_col = self._treatment[0]
+            if self._data[treatment_col].nunique() <= 2:
+                return "backdoor.propensity_score_stratification"
+            return "backdoor.linear_regression"


_select_default_method_name infers binary/continuous treatment using only self._treatment[0]. For multivariate treatments (supported by parse_state and dataset generators), this can misclassify the treatment type and auto-select an estimator that’s inappropriate or fails later. Consider either (a) requiring method_name explicitly when len(self._treatment) > 1, or (b) checking all treatment columns and treating the treatment as binary only if every treatment has ≤2 unique values (and otherwise falling back to a continuous-safe default).

Copilot · 2026-04-18T02:42:49Z

        if method_name is None:
-            # TODO add propensity score as default backdoor method, iv as default iv method, add an informational message to show which method has been selected.
-            pass
-        else:
-            # TODO add dowhy as a prefix to all dowhy estimators
-            num_components = len(method_name.split("."))
-            str_arr = method_name.split(".", maxsplit=1)
-            identifier_name = str_arr[0]
-            estimator_name = str_arr[1]
-            # This is done as all dowhy estimators have two parts and external ones have two or more parts
-            if num_components > 2:
-                estimator_package = estimator_name.split(".")[0]
-                if estimator_package == "dowhy":  # For updated dowhy methods
-                    estimator_method = estimator_name.split(".", maxsplit=1)[
-                        1
-                    ]  # discard dowhy from the full package name
-                    causal_estimator_class = causal_estimators.get_class_object(estimator_method + "_estimator")
-                else:
-                    third_party_estimator_package = estimator_package
-                    causal_estimator_class = causal_estimators.get_class_object(
-                        third_party_estimator_package, estimator_name
-                    )
-                    if method_params is None:
-                        method_params = {}
-                    # Define the third-party estimation method to be used
-                    method_params[third_party_estimator_package + "_estimator"] = estimator_name
-            else:  # For older dowhy methods
-                self.logger.info(estimator_name)
-                # Process the dowhy estimators
-                causal_estimator_class = causal_estimators.get_class_object(estimator_name + "_estimator")
-
-            if method_params is not None and (num_components <= 2 or estimator_package == "dowhy"):
-                extra_args = method_params.get("init_params", {})
-            else:
-                extra_args = {}
-            if method_params is None:
-                method_params = {}
-
-            identified_estimand.set_identifier_method(identifier_name)
-
-            # If not fit_estimator, attempt to retrieve existing estimator.
-            # Keep original behaviour to create new estimator if none found.
-            causal_estimator = None
-            if not fit_estimator:
-                causal_estimator = self.get_estimator(method_name)
-
-            if causal_estimator is None:
-                causal_estimator = causal_estimator_class(
+            if identified_estimand.no_directed_path:
+                self.logger.warning(
+                    "No directed path from %s to %s. Causal effect is zero.", self._treatment, self._outcome
+                )
+                return CausalEstimate(
+                    None,
+                    None,
+                    None,
+                    0,
                    identified_estimand,
-                    test_significance=test_significance,
-                    evaluate_effect_strength=evaluate_effect_strength,
-                    confidence_intervals=confidence_intervals,
-                    **method_params,
-                    **extra_args,
+                    None,
+                    control_value=control_value,
+                    treatment_value=treatment_value,
                )


The new early-return path for method_name is None + identified_estimand.no_directed_path is behaviorally important (it avoids the previous UnboundLocalError), but it isn’t covered by the added tests. Adding a focused test that constructs a graph with no directed path and asserts estimate.value == 0 (and no exception) would prevent regressions in this branch.

Signed-off-by: Emre Kiciman <emrek@microsoft.com>

github-actions bot added automation bug Something isn't working enhancement New feature or request repo-assist labels Apr 18, 2026

github-actions bot mentioned this pull request Apr 18, 2026

[Repo Assist] Monthly Activity 2026-04 #1433

Open

41 tasks

emrekiciman marked this pull request as ready for review April 18, 2026 02:36

emrekiciman requested a review from Copilot April 18, 2026 02:36

Copilot started reviewing on behalf of emrekiciman April 18, 2026 02:37 View session

Copilot AI reviewed Apr 18, 2026

View reviewed changes

fix formatting

5b8dc07

Signed-off-by: Emre Kiciman <emrek@microsoft.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Repo Assist] feat: auto-select default estimation method when method_name is omitted#1464

[Repo Assist] feat: auto-select default estimation method when method_name is omitted#1464
github-actions[bot] wants to merge 2 commits intomainfrom
repo-assist/feature-365-smart-method-defaults-20260418-6cc4ff3716959e80

github-actions bot commented Apr 18, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 18, 2026

Uh oh!

Copilot AI Apr 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

github-actions bot commented Apr 18, 2026

Root Cause

Fix

Tests Added

Trade-offs

Test Status

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants