Causal Effect Estimation

The Neyman-Rubin Potential Outcomes Framework is an approach for estimating causal effects (also known as treatment effects) in causal inference. It defines causality through the potential outcomes $Y$ of a binary intervention $T$. The causal effect, defined as the difference between these potential outcomes, is the core focus of this framework (Rubin [Rub05]).

However, only one of the potential outcomes is observed for each unit, because the unit either receives the intervention or it does not. The difference in potential outcomes is therefore unobservable. This is known as the “Fundamental Problem of Causal Inference”.
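
The following simulation is a minimal sketch of this problem, not part of the library. The variable names (y0, y1, y_obs) and the constant effect of 2 are assumptions chosen for illustration. Both potential outcomes are known here only because the data is simulated.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000

# Hypothetical potential outcomes: y0 without and y1 with the intervention.
y0 = rng.normal(loc=1.0, scale=1.0, size=n)
y1 = y0 + 2.0  # constant unit-level effect of 2, chosen for illustration

# Binary intervention T: each unit reveals only one of its two potential outcomes.
t = rng.integers(0, 2, size=n)
y_obs = np.where(t == 1, y1, y0)

# The true average effect is computable only because both outcomes were simulated.
true_ace = np.mean(y1 - y0)
```

In real data only t and y_obs are available, so the unit-level difference y1 - y0 can never be computed directly.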

Recent work has produced improved statistical estimators for causal effects, each relying on specific causal assumptions. This module combines these estimators with causal identification through Bayesian networks (Pearl [Pea95]). It provides a pipeline for detecting suitable adjustment sets and applying the appropriate estimators to them.

class pyagrum.causal.CausalEffectEstimation(df, causal_model)

Estimates causal effects using a dataset and a causal graph within the Neyman-Rubin Potential Outcomes framework.

This class performs causal identification based on user-specified datasets and causal graphical models. It determines the appropriate adjustment method, such as backdoor, front-door, or instrumental variables (IV), to estimate the causal effect (or treatment effect) between the intervention (treatment assignment) and the outcome.

The class integrates domain-specific statistical estimators and recent advancements in machine learning techniques to estimate various causal effects, including the Average Causal Effect (ACE), Conditional Average Causal Effect (CACE), Individual Causal Effect (ICE), and Local Average Treatment Effect (LATE), among others.

This module is inspired by the works of Wager [Wag20] and Neal [Nea20].

Raises:
- AssertionError – If the input dataframe is empty, indicating that predictions cannot be made.
- ValueError – If the provided estimator_string does not correspond to any supported estimator.
Parameters:
- df (DataFrame)
- causal_model (CausalModel)

estimateCausalEffect(conditional=None, **estimation_params)

Estimate the causal or treatment effect based on the initialized data.

Parameters:
- conditional (pd.DataFrame , str , or None , optional) –
  
  Specifies conditions for estimating causal effects.
  - If pd.DataFrame, estimates the Individual Causal Effect (ICE) for each row.
  - If str, estimates the Conditional Average Causal Effect (CACE). The string must be a valid pandas query.
  - If None, estimates the Average Causal Effect (ACE). Default is None.
- estimation_params (dict of str to Any , optional) – Additional parameters for the estimation method. Keys are parameter names, and values are the corresponding parameter values. Default is None.
Returns:
- If return_ci is False, the estimated causal effect as a float.
- If return_ci is True, a tuple containing the estimated causal effect (float) and the lower and upper bounds of the confidence interval (tuple of floats).
Return type: float or np.ndarray
Raises: ValueError – If no adjustment has been selected before making the estimate.
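
The sketch below illustrates the difference between an average effect and a conditional average effect restricted by a pandas query string. It does not call the library. The column names (T, Y, age), the toy values, and the diff_in_means helper are all hypothetical, and a plain difference in means stands in for whichever estimator has been fitted.

```python
import pandas as pd

# Toy data; column names and values are made up for illustration.
df = pd.DataFrame({
    "T": [0, 1, 0, 1, 0, 1, 0, 1],
    "Y": [1.0, 3.0, 2.0, 4.0, 1.0, 5.0, 2.0, 6.0],
    "age": [30, 30, 40, 40, 60, 60, 70, 70],
})

def diff_in_means(frame):
    """Stand-in estimator: mean outcome of treated minus mean outcome of untreated."""
    treated = frame.loc[frame["T"] == 1, "Y"].mean()
    control = frame.loc[frame["T"] == 0, "Y"].mean()
    return treated - control

# Average effect over the whole dataset (analogous to conditional=None).
ace = diff_in_means(df)

# Conditional average effect on the rows selected by a pandas query string
# (analogous to conditional="age > 50").
cace = diff_in_means(df.query("age > 50"))
```

Any string accepted by DataFrame.query can define the subpopulation in this sketch. How the library applies the condition internally is not described here.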

fitCausalBNEstimator()

Fit the Causal Bayesian Network Estimator.

This estimator uses do-calculus identification and lazy propagation inference, implemented via the pyAgrum library’s causal module, to determine the causal effect within Bayesian Networks.

Note: In the case of instrumental variables, the causal effect is estimated using heuristic methods, as this adjustment is not identifiable through do-calculus.

Return type: Any

fitCustomEstimator(estimator)

Fits the specified estimator object, which must implement .fit(), .predict(), and .estimate_ate() methods consistent with CausalML estimators (Chen et al. [CHL+20]).

Note: Compatibility with the current adjustment is not guaranteed.

Parameters: estimator (Any) – The estimator object to be fitted, adhering to the CausalML method declarations.
Return type: Any
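
As a rough sketch of what such an object could look like, the hypothetical class below exposes the three required methods. The class name, the argument names (X, treatment, y), and the return values are assumptions loosely modelled on CausalML meta-learners. The exact call signatures and return formats that fitCustomEstimator expects are not specified here and should be checked against the library.

```python
import numpy as np

class MeanDifferenceEstimator:
    """Hypothetical duck-typed estimator exposing fit, predict and estimate_ate."""

    def fit(self, X, treatment, y):
        treatment = np.asarray(treatment)
        y = np.asarray(y, dtype=float)
        # Store a single constant effect: treated mean minus untreated mean.
        self.effect_ = y[treatment == 1].mean() - y[treatment == 0].mean()
        return self

    def predict(self, X, treatment=None, y=None):
        # The same constant effect is predicted for every row of X.
        return np.full(len(X), self.effect_)

    def estimate_ate(self, X, treatment, y):
        # Average of the per-row predictions made by the fitted model.
        return float(np.mean(self.predict(X)))

# Tiny usage example on made-up data.
X = np.zeros((4, 1))
t = np.array([0, 0, 1, 1])
y = np.array([1.0, 2.0, 3.0, 4.0])
est = MeanDifferenceEstimator().fit(X, t, y)
ate = est.estimate_ate(X, t, y)
```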

fitDM()

Fits the Difference in Means (DM) Estimator.

The DM estimator computes the Average Causal Effect (ACE) under the ignorability assumption in Randomized Controlled Trials (RCT) by taking the difference between the mean outcomes of the treated and untreated populations.

Return type: Any
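
The computation can be sketched in a few lines of NumPy. This is an illustration on simulated randomized data, not the library's code. The function name difference_in_means and the true effect of 2 are assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

# Randomized assignment: T is independent of everything else.
t = rng.integers(0, 2, size=n)
y = 1.0 + 2.0 * t + rng.normal(size=n)  # simulated outcome, true ACE = 2

def difference_in_means(t, y):
    """Mean outcome of the treated minus mean outcome of the untreated."""
    return y[t == 1].mean() - y[t == 0].mean()

ace_dm = difference_in_means(t, y)
```

Because the assignment is randomized, the two groups are comparable and the difference recovers the simulated effect up to sampling noise.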

fitGeneralizedPlugIn(**estimator_params)

Fit the Generalized plug-in Estimator.

A basic implementation of the second plug-in estimator in Guo et al. [GBN23]; see also Fulcher et al. [FSMTT20]. Covariates must be provided.

Parameters: estimator_params (Any) – The parameters of the estimator.
Return type: Any

fitIPW(**estimator_params)

Fit the Inverse Propensity score Weighting Estimator.

A basic implementation of the Inverse Propensity Score Weighting (IPW) estimator based on Lunceford and Davidian [LD04].

Parameters: propensity_score_learner (str or Any , optional) – Estimator for propensity score. If not provided, defaults to LogisticRegression.
Return type: Any
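
A minimal sketch of the IPW idea on simulated data follows. It is not the library's implementation: the propensity score is estimated from empirical frequencies of a binary confounder rather than with LogisticRegression, and all variable names and coefficients are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Binary confounder x drives both the treatment and the outcome.
x = rng.integers(0, 2, size=n)
t = rng.binomial(1, np.where(x == 1, 0.8, 0.2))
y = 2.0 * t + 3.0 * x + rng.normal(size=n)  # simulated outcome, true ACE = 2

# Propensity score e(x) = P(T=1 | x), estimated here by frequencies per level of x.
e_hat = np.where(x == 1, t[x == 1].mean(), t[x == 0].mean())

# Inverse propensity weighting of treated and untreated outcomes.
ace_ipw = np.mean(t * y / e_hat) - np.mean((1 - t) * y / (1 - e_hat))

# Unadjusted comparison, biased by the confounder.
naive = y[t == 1].mean() - y[t == 0].mean()
```

In this simulation the unadjusted difference overstates the effect, while the weighted estimate is close to the simulated value of 2.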

fitNormalizedWaldIPW(**estimator_params)

Fit the Normalized Wald Inverse Probability Weighting Estimator.

A basic implementation of the normalized Wald estimator with Inverse Propensity Score Weighting, which computes the Local Average Causal Effect (LACE) (Choi [Cho21]). Only supports binary instruments.

Parameters: iv_probability_learner (str or Any , optional) – Estimator for instrumental variable probability. If not provided, defaults to LogisticRegression.
Return type: Any

fitPStratification(**estimator_params)

Fit the Propensity score Stratification Estimator.

A basic implementation of the Propensity Score Stratification estimator based on Lunceford and Davidian [LD04].

Parameters:
- propensity_score_learner (str or Any , optional) – Estimator for propensity score. If not provided, defaults to LogisticRegression.
- num_strata (int , optional) – The number of strata. Default is 100.
Return type: Any
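
The stratification idea can be sketched as follows. This is an illustration only: the propensity score is taken as known instead of being fitted, strata are quantile bins of that score, and the function name stratified_ace and all simulation constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

x = rng.normal(size=n)
e = 1.0 / (1.0 + np.exp(-x))          # propensity score, taken as known in this sketch
t = rng.binomial(1, e)
y = 2.0 * t + x + rng.normal(size=n)  # simulated outcome, true ACE = 2

def stratified_ace(e, t, y, num_strata=20):
    """Size-weighted average of within-stratum differences in means."""
    edges = np.quantile(e, np.linspace(0, 1, num_strata + 1))
    strata = np.clip(np.searchsorted(edges, e, side="right") - 1, 0, num_strata - 1)
    total, weight = 0.0, 0
    for s in range(num_strata):
        mask = strata == s
        treated, control = mask & (t == 1), mask & (t == 0)
        # Skip strata that lack either treated or untreated units.
        if treated.any() and control.any():
            total += mask.sum() * (y[treated].mean() - y[control].mean())
            weight += mask.sum()
    return total / weight

ace_strat = stratified_ace(e, t, y)
naive = y[t == 1].mean() - y[t == 0].mean()
```

More strata reduce the residual confounding inside each bin but leave fewer units per bin, which is the trade-off controlled by num_strata.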

fitSLearner(**estimator_params)

Fit the S-Learner Estimator.

A basic implementation of the S-learner based on Künzel et al. [KunzelSBY19].

Parameters: learner (str or Any , optional) – Base estimator for all learners. If not provided, defaults to LinearRegression.
Return type: Any
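
The S-learner idea, a single outcome model that takes the treatment as an ordinary feature, can be sketched with ordinary least squares in NumPy. This is not the library's code. The design helper and the simulated coefficients are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5_000

x = rng.normal(size=(n, 2))
t = rng.binomial(1, 0.5, size=n)
y = 1.0 + x @ np.array([0.5, -1.0]) + 2.0 * t + rng.normal(scale=0.5, size=n)

def design(x, t):
    """Design matrix with intercept, covariates, and the treatment as one more column."""
    return np.column_stack([np.ones(len(x)), x, t])

# One model fitted on covariates and treatment together.
coef, *_ = np.linalg.lstsq(design(x, t), y, rcond=None)

# Predict every unit under treatment and under control, then average the difference.
mu1 = design(x, np.ones(n)) @ coef
mu0 = design(x, np.zeros(n)) @ coef
ace_s = np.mean(mu1 - mu0)
```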

fitSimplePlugIn(**estimator_params)

Fit the Plug-in Estimator.

Uses the (original) front-door adjustment formula to derive the plug-in estimator. Does not account for covariates. Inspired by Guo et al. [GBN23]; see also Fulcher et al. [FSMTT20].

Parameters:
- learner (str or object , optional) – Estimator for outcome variable. If not provided, defaults to LinearRegression.
- propensity_learner (str or object , optional) – Estimator for treatment probability. If not provided, defaults to LogisticRegression.
Return type: Any
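
For binary variables, the front-door adjustment formula P(y | do(t)) = Σ_m P(m | t) Σ_t' P(y | m, t') P(t') can be evaluated directly from empirical frequencies. The sketch below does this on simulated data with an unobserved confounder u and a mediator m. It replaces the regression learners with frequency counts, and every name and probability in it is an assumption.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200_000

u = rng.binomial(1, 0.5, size=n)              # unobserved confounder of T and Y
t = rng.binomial(1, 0.2 + 0.6 * u)            # treatment
m = rng.binomial(1, 0.1 + 0.7 * t)            # mediator, affected by T only
y = rng.binomial(1, 0.1 + 0.5 * m + 0.3 * u)  # outcome, true ACE = 0.5 * 0.7 = 0.35

def frontdoor_mean(t_value):
    """Plug-in estimate of E[Y | do(T = t_value)] via the front-door formula."""
    total = 0.0
    for m_value in (0, 1):
        p_m_given_t = np.mean(m[t == t_value] == m_value)
        inner = 0.0
        for t_prime in (0, 1):
            cell = (m == m_value) & (t == t_prime)
            inner += y[cell].mean() * np.mean(t == t_prime)
        total += p_m_given_t * inner
    return total

ace_frontdoor = frontdoor_mean(1) - frontdoor_mean(0)
naive = y[t == 1].mean() - y[t == 0].mean()
```

The estimate uses only t, m and y, even though the confounder u is never observed by the estimator.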

fitTLearner(**estimator_params)

Fit the T-Learner Estimator.

A basic implementation of the T-learner based on Künzel et al. [KunzelSBY19].

Parameters:
- learner (str or Any , optional) – Base estimator for all learners. If not provided, defaults to LinearRegression.
- control_learner (str or Any , optional) – Estimator for control group outcome. Overrides learner if specified.
- treatment_learner (str or Any , optional) – Estimator for treatment group outcome. Overrides learner if specified.
Return type: Any
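
The T-learner fits one outcome model per treatment arm and takes the difference of their predictions. A minimal sketch with least-squares models follows. It is not the library's code, and the helpers fit_linear and predict_linear as well as the simulated effect 2 + x are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5_000

x = rng.normal(size=(n, 1))
t = rng.binomial(1, 0.5, size=n)
# Simulated outcome with a heterogeneous effect tau(x) = 2 + x.
y = x[:, 0] + t * (2.0 + x[:, 0]) + rng.normal(scale=0.5, size=n)

def fit_linear(x, y):
    """Least-squares fit with intercept; returns the coefficient vector."""
    X = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict_linear(coef, x):
    return np.column_stack([np.ones(len(x)), x]) @ coef

# Separate outcome models for the control and treatment groups.
coef_control = fit_linear(x[t == 0], y[t == 0])
coef_treated = fit_linear(x[t == 1], y[t == 1])

# Conditional effect for every unit, then its average.
cate = predict_linear(coef_treated, x) - predict_linear(coef_control, x)
ace_t = cate.mean()
```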

fitTSLS(**estimator_params)

Fit the Two Stage Least Squares Estimator.

A basic implementation of the Two-Stage Least Squares (TSLS) estimator (Angrist and Imbens [AI95]). Only linear models are supported; the learners must expose a .coef_ attribute.

Parameters:
- learner (str or Any , optional) – Base estimator for all learners. If not provided, defaults to LinearRegression.
- treatment_learner (str or Any , optional) – Estimator for treatment assignment. Overrides learner if specified.
- outcome_learner (str or Any , optional) – Estimator for outcome. Overrides learner if specified.
Return type: Any
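
The two stages can be sketched with plain least squares. This is an illustration on simulated data with one instrument z and an unobserved confounder u, not the library's implementation. The ols helper and all coefficients are assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 50_000

z = rng.normal(size=n)  # instrument: affects Y only through T
u = rng.normal(size=n)  # unobserved confounder of T and Y
t = (0.8 * z + u + rng.normal(size=n) > 0).astype(float)  # binary treatment
y = 2.0 * t + u + rng.normal(size=n)                      # true effect = 2

def ols(x, y):
    """Least-squares fit with intercept; returns [intercept, slope]."""
    X = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Stage 1: regress the treatment on the instrument and keep the fitted values.
stage1 = ols(z, t)
t_hat = stage1[0] + stage1[1] * z

# Stage 2: regress the outcome on the fitted treatment; the slope is the estimate.
effect_tsls = ols(t_hat, y)[1]

# Direct regression of Y on T, biased by the confounder.
naive = ols(t, y)[1]
```

The effect is read from the slope of the second-stage model, which is why a coefficient attribute is needed on the learners.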

fitWald()

Fit the Wald Estimator.

An implementation of the Wald estimator, which computes the Local Average Causal Effect (LACE), also known as the Local Average Treatment Effect (LATE). Only supports binary instruments.

Return type: Any
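
With a binary instrument Z, the Wald estimator is the ratio (E[Y | Z=1] - E[Y | Z=0]) / (E[T | Z=1] - E[T | Z=0]). The sketch below computes it on simulated data. It is not the library's code, and all names and simulation constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 100_000

z = rng.integers(0, 2, size=n)  # binary instrument
u = rng.normal(size=n)          # unobserved confounder of T and Y
v = rng.uniform(size=n)         # latent draw shared across instrument values

# The instrument raises the probability of treatment by 0.5 for every unit.
p = 0.2 + 0.5 * z + 0.2 * (u > 0)
t = (v < p).astype(int)
y = 2.0 * t + u + rng.normal(size=n)  # true effect = 2

# Effect of Z on Y divided by the effect of Z on T.
reduced_form = y[z == 1].mean() - y[z == 0].mean()
first_stage = t[z == 1].mean() - t[z == 0].mean()
wald = reduced_form / first_stage

# Unadjusted comparison, biased by the confounder.
naive = y[t == 1].mean() - y[t == 0].mean()
```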

fitWaldIPW(**estimator_params)

Fit the Wald Inverse Probability Weighting Estimator.

A basic implementation of the Wald estimand with Inverse Propensity Score Weighting, which computes the Local Average Causal Effect (LACE) (Choi [Cho21]). Only supports binary instruments.

Parameters: iv_probability_learner (str or Any , optional) – Estimator for instrumental variable probability. If not provided, defaults to LogisticRegression.
Return type: Any

fitXLearner(**estimator_params)

Fit the X-Learner Estimator.

A basic implementation of the X-learner based on Künzel et al. [KunzelSBY19].

Parameters:
- learner (str or Any , optional) – Base estimator for all learners. If not provided, defaults to LinearRegression.
- control_outcome_learner (str or Any , optional) – Estimator for control group outcome. Overrides learner if specified.
- treatment_outcome_learner (str or Any , optional) – Estimator for treatment group outcome. Overrides learner if specified.
- control_effect_learner (str or Any , optional) – Estimator for control group effect. Overrides learner if specified.
- treatment_effect_learner (str or Any , optional) – Estimator for treatment group effect. Overrides learner if specified.
- propensity_score_learner (str or Any , optional) – Estimator for propensity score. If not provided, defaults to LogisticRegression.
Return type: Any
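
The roles of the different learners can be seen in the following sketch of the X-learner stages. It uses least-squares models and a constant propensity score, so it is a simplified illustration and not the library's implementation. The helper functions and simulation constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5_000

x = rng.normal(size=(n, 1))
t = rng.binomial(1, 0.3, size=n)
# Simulated outcome with a heterogeneous effect tau(x) = 2 + x.
y = x[:, 0] + t * (2.0 + x[:, 0]) + rng.normal(scale=0.5, size=n)

def fit_linear(x, y):
    """Least-squares fit with intercept; returns the coefficient vector."""
    X = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict_linear(coef, x):
    return np.column_stack([np.ones(len(x)), x]) @ coef

x0, y0 = x[t == 0], y[t == 0]
x1, y1 = x[t == 1], y[t == 1]

# Stage 1: outcome models for the control and treatment groups.
mu0 = fit_linear(x0, y0)
mu1 = fit_linear(x1, y1)

# Stage 2: imputed individual effects, then one effect model per group.
d1 = y1 - predict_linear(mu0, x1)  # treated: observed minus predicted control outcome
d0 = predict_linear(mu1, x0) - y0  # control: predicted treated outcome minus observed
tau1 = fit_linear(x1, d1)
tau0 = fit_linear(x0, d0)

# Stage 3: combine both effect models, weighted by the propensity score g(x).
g = np.full(n, t.mean())  # constant propensity, valid for this randomized simulation
cate = g * predict_linear(tau0, x) + (1 - g) * predict_linear(tau1, x)
ace_x = cate.mean()
```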

identifyAdjustmentSet(intervention, outcome, verbose=True)

Identify the sufficient adjustment set of covariates.

Parameters:
- intervention (str) – Intervention (treatment) variable.
- outcome (str) – Outcome variable.
- verbose (bool) – If True, prints the estimators that are compatible with the adjustment that was found. Default is True.
Raises: ValueError – If the treatment isn’t binary or no adjustment set was found.
Return type: None