zepid.causal.doublyrobust.AIPW.AIPTW
class zepid.causal.doublyrobust.AIPW.AIPTW(df, exposure, outcome, weights=None, alpha=0.05)
Augmented inverse probability of treatment weight estimator. This implementation calculates AIPTW for a time-fixed exposure and a single time-point outcome. AIPTW supports correcting for informative censoring (missing outcome data) through inverse probability of censoring/missingness weights.
AIPTW is a doubly robust estimator. Both the g-formula and IPTW require that their parametric regression models be correctly specified. AIPTW instead gives us two ‘chances’ at getting a model correct: if either the outcome model or the treatment model is correctly specified, then our estimate will be unbiased. This property does not hold for the variance (i.e., the variance is not doubly robust).
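The double robustness property can be illustrated with a small simulation. The sketch below is not zEpid code. It is a minimal pure-Python illustration under assumed conditions: one binary confounder L, a simulated true risk difference of 0.1, and nuisance "models" that are simply stratum means. The helper names `simulate`, `avg`, `stratum_means`, and `aipw_rd` are hypothetical, and a model is "misspecified" here by ignoring L.

```python
import random


def simulate(n, seed=0):
    """Simulate (L, A, Y) rows; L confounds the effect of A on Y (true RD = 0.1)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        l = int(rng.random() < 0.4)
        a = int(rng.random() < 0.3 + 0.4 * l)
        y = int(rng.random() < 0.2 + 0.1 * a + 0.3 * l)
        rows.append((l, a, y))
    return rows


def avg(values):
    values = list(values)
    return sum(values) / len(values)


def stratum_means(rows, value, keep, use_l):
    """Mean of value(row) over kept rows, by level of L (pooled over L if use_l is False)."""
    return {
        l: avg(value(r) for r in rows if keep(r) and (r[0] == l or not use_l))
        for l in (0, 1)
    }


def aipw_rd(rows, ps_uses_l, outcome_uses_l):
    """AIPW risk difference; each flag says whether that nuisance model conditions on L."""
    pr_a1 = stratum_means(rows, lambda r: r[1], lambda r: True, ps_uses_l)
    risks = {}
    for a in (0, 1):
        # Predicted outcome under exposure level a, estimated among those with A = a
        y_hat = stratum_means(
            rows, lambda r: r[2], lambda r, a=a: r[1] == a, outcome_uses_l
        )
        contrib = []
        for l, ai, y in rows:
            pr = pr_a1[l] if a == 1 else 1 - pr_a1[l]
            ind = int(ai == a)  # indicator that the observed exposure equals a
            contrib.append(y * ind / pr - y_hat[l] * (ind - pr) / pr)
        risks[a] = avg(contrib)
    return risks[1] - risks[0]


rows = simulate(20000)
both_right = aipw_rd(rows, ps_uses_l=True, outcome_uses_l=True)
ps_wrong = aipw_rd(rows, ps_uses_l=False, outcome_uses_l=True)
outcome_wrong = aipw_rd(rows, ps_uses_l=True, outcome_uses_l=False)
both_wrong = aipw_rd(rows, ps_uses_l=False, outcome_uses_l=False)
print(both_right, ps_wrong, outcome_wrong, both_wrong)
```

In this setup, the estimate should stay close to the simulated risk difference when only one of the two models ignores L, and move toward the crude (confounded) difference when both do.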
The augmented inverse probability weight estimator is calculated from the following formula
\[\widehat{DR}(a) = \frac{Y \, I(A=a)}{\widehat{\Pr}(A=a|L)} - \frac{\hat{Y}^a \left(I(A=a)-\widehat{\Pr}(A=a|L)\right)}{\widehat{\Pr}(A=a|L)}\]
where \(I(A=a)\) indicates that the observed exposure equals \(a\). The risk difference and risk ratio are calculated using the following formulas, respectively
\[\widehat{RD} = \widehat{DR}(a=1) - \widehat{DR}(a=0)\]
\[\widehat{RR} = \frac{\widehat{DR}(a=1)}{\widehat{DR}(a=0)}\]
Confidence intervals for the risk difference come from the influence curve. Confidence intervals for the risk ratio are less straightforward. To get confidence intervals for the risk ratio, a bootstrap procedure should be used.
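The formulas above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not zEpid's implementation: the data are simulated with one binary confounder L, the nuisance models are stratum means, the exposure term is taken as the indicator I(A=a), the influence-curve variance is taken as the sample variance of the per-subject contributions divided by n, and the risk ratio interval uses a percentile bootstrap. All function names (`simulate`, `avg`, `pseudo_outcomes`, `estimate`, `bootstrap_rr_ci`) are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate(n, seed):
    """Simulate (L, A, Y) rows: binary confounder, exposure, and outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        l = int(rng.random() < 0.4)
        a = int(rng.random() < 0.3 + 0.4 * l)
        y = int(rng.random() < 0.2 + 0.1 * a + 0.3 * l)
        rows.append((l, a, y))
    return rows


def avg(values):
    values = list(values)
    return sum(values) / len(values)


def pseudo_outcomes(rows, a):
    """Per-subject contributions to DR(a), using stratum means as the models."""
    pr_a = {l: avg(int(ai == a) for li, ai, _ in rows if li == l) for l in (0, 1)}
    y_hat = {l: avg(y for li, ai, y in rows if li == l and ai == a) for l in (0, 1)}
    out = []
    for l, ai, y in rows:
        ind = int(ai == a)
        out.append(y * ind / pr_a[l] - y_hat[l] * (ind - pr_a[l]) / pr_a[l])
    return out


def estimate(rows, alpha=0.05):
    """Risk difference, risk ratio, and an influence-curve CI for the difference."""
    dr1, dr0 = pseudo_outcomes(rows, 1), pseudo_outcomes(rows, 0)
    r1, r0 = avg(dr1), avg(dr0)
    rd, rr = r1 - r0, r1 / r0
    n = len(rows)
    ic = [d1 - d0 - rd for d1, d0 in zip(dr1, dr0)]  # centered contributions
    se = sqrt(sum(v * v for v in ic) / (n - 1) / n)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return {"rd": rd, "rr": rr, "rd_ci": (rd - z * se, rd + z * se)}


def bootstrap_rr_ci(rows, n_boot=100, alpha=0.05, seed=1):
    """Percentile bootstrap interval for the risk ratio (resampling subjects)."""
    rng = random.Random(seed)
    n = len(rows)
    rrs = sorted(
        estimate([rows[rng.randrange(n)] for _ in range(n)])["rr"]
        for _ in range(n_boot)
    )
    return rrs[int(n_boot * alpha / 2)], rrs[int(n_boot * (1 - alpha / 2)) - 1]


rows = simulate(3000, seed=0)
result = estimate(rows)
rr_ci = bootstrap_rr_ci(rows)
print(result, rr_ci)
```

The `alpha` argument plays the same role as in the class signature: 0.05 gives 95% limits. In practice more bootstrap resamples than the 100 used here would normally be drawn; the small default only keeps the sketch fast.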
Parameters:
- df (DataFrame) – Pandas DataFrame object containing all variables of interest
- exposure (str) – Column name of the exposure variable. Currently only binary exposures are supported
- outcome (str) – Column name of the outcome variable. Binary outcomes and, as in the examples below, continuous outcomes are supported
- weights (str, optional) – Column name of weights. Weights allow for items like sampling weights to be used to estimate effects
- alpha (float, optional) – Alpha for confidence interval level. Default is 0.05, returning the 95% CL
Examples
Set up the environment and the data set
>>> from zepid import load_sample_data, spline
>>> from zepid.causal.doublyrobust import AIPTW
>>> df = load_sample_data(timevary=False).drop(columns=['cd4_wk45'])
>>> df[['cd4_rs1','cd4_rs2']] = spline(df,'cd40',n_knots=3,term=2,restricted=True)
>>> df[['age_rs1','age_rs2']] = spline(df,'age0',n_knots=3,term=2,restricted=True)
Estimate the base AIPTW model
>>> aipw = AIPTW(df, exposure='art', outcome='dead')
>>> aipw.exposure_model('male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0')
>>> aipw.outcome_model('art + male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0')
>>> aipw.fit()
>>> aipw.summary()
Estimate AIPTW accounting for missing outcome data
>>> aipw = AIPTW(df, exposure='art', outcome='dead')
>>> aipw.exposure_model('male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0')
>>> aipw.missing_model('art + male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0')
>>> aipw.outcome_model('art + male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0')
>>> aipw.fit()
>>> aipw.summary()
AIPTW for continuous outcomes
>>> df = load_sample_data(timevary=False).drop(columns=['dead'])
>>> df[['cd4_rs1','cd4_rs2']] = spline(df,'cd40',n_knots=3,term=2,restricted=True)
>>> df[['age_rs1','age_rs2']] = spline(df,'age0',n_knots=3,term=2,restricted=True)

>>> aipw = AIPTW(df, exposure='art', outcome='cd4_wk45')
>>> aipw.exposure_model('male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0')
>>> aipw.missing_model('art + male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0')
>>> aipw.outcome_model('art + male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0')
>>> aipw.fit()
>>> aipw.summary()

>>> aipw = AIPTW(df, exposure='art', outcome='cd4_wk45')
>>> ymodel = 'art + male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0'
>>> aipw.exposure_model('male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0')
>>> aipw.missing_model(ymodel)
>>> aipw.outcome_model(ymodel, continuous_distribution='poisson')
>>> aipw.fit()
>>> aipw.summary()
References
Funk MJ, Westreich D, Wiesen C, Stürmer T, Brookhart MA, & Davidian M. (2011). Doubly robust estimation of causal effects. American Journal of Epidemiology, 173(7), 761-767.
Lunceford JK, & Davidian M. (2004). Stratification and weighting via the propensity score in estimation of causal treatment effects: a comparative study. Statistics in Medicine, 23(19), 2937-2960.
__init__(df, exposure, outcome, weights=None, alpha=0.05)
Initialize self. See help(type(self)) for accurate signature.
Methods
- __init__(df, exposure, outcome[, weights, alpha]) – Initialize self.
- exposure_model(model[, custom_model, bound, …]) – Specify the propensity score / inverse probability weight model.
- fit() – Calculate the augmented inverse probability weights and effect measures from the predicted exposure probabilities and predicted outcome values.
- missing_model(model[, custom_model, bound, …]) – Estimation of Pr(M=0|A,L), which is the missing data mechanism for the outcome.
- outcome_model(model[, custom_model, …]) – Specify the outcome model.
- plot_kde(to_plot[, bw_method, fill, color, …]) – Generates density plots that can be used to check predictions qualitatively.
- plot_love([color_unweighted, …]) – Generates a Love-plot to detail covariate balance based on the IPTW weights.
- positivity([decimal]) – Use this to assess whether positivity is a valid assumption for the exposure model / calculated IPTW.
- run_diagnostics([decimal]) – Run all currently implemented diagnostics for the exposure and outcome models.
- standardized_mean_differences() – Calculates the standardized mean differences for all variables based on the inverse probability weights.
- summary([decimal]) – Prints a summary of the results for the doubly robust estimator.