# zepid.causal.doublyrobust.AIPW.AIPTW

class zepid.causal.doublyrobust.AIPW.AIPTW(df, exposure, outcome, weights=None, alpha=0.05)

Augmented inverse probability of treatment weight estimator. This implementation calculates AIPTW for a time-fixed exposure and a single time-point outcome. AIPTW supports correcting for informative censoring (missing outcome data) through inverse probability of censoring/missingness weights.

AIPTW is a doubly robust estimator. Both the g-formula and IPTW require that their parametric regression models be correctly specified. AIPTW instead gives us two ‘chances’ at getting a model correct: if either the outcome model or the treatment model is correctly specified, then our estimate will be unbiased. This property does not hold for the variance (i.e., the variance estimator is not doubly robust).

The augmented inverse probability weight estimator is calculated from the following formula

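$\widehat{DR}(a) = \frac{Y \, I(A=a)}{\widehat{\Pr}(A=a|L)} - \frac{\hat{Y}^a \left(I(A=a) - \widehat{\Pr}(A=a|L)\right)}{\widehat{\Pr}(A=a|L)}$

where $I(A=a)$ equals 1 when the observed exposure is $a$ and 0 otherwise (for $a=1$ it is simply $A$), and $\hat{Y}^a$ is the outcome model's prediction with the exposure set to $a$. The estimate is the mean of this quantity over all observations.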
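The formula above can be sketched in a few lines of plain Python, with the exposure term written as an indicator for $A=a$ so the same expression works for both exposure levels. The data set, the helper names (`make_rows`, `aipw_mean`), and the deliberately wrong nuisance models are all hypothetical and are not part of zEpid. The sketch also illustrates the doubly robust property on a toy example with one binary confounder: the estimate recovers the standardized risk when either nuisance model is correct.

```python
def make_rows():
    """Build hypothetical (l, a, y) records: binary confounder, exposure, outcome."""
    counts = {  # (l, a): (number of people, number with y = 1)
        (0, 1): (50, 10), (0, 0): (150, 15),
        (1, 1): (150, 75), (1, 0): (50, 20),
    }
    rows = []
    for (l, a), (n, events) in counts.items():
        rows += [(l, a, 1)] * events + [(l, a, 0)] * (n - events)
    return rows


def aipw_mean(rows, a, ps, y_hat):
    """Mean of DR(a); ps[l] is Pr(A=1|L=l), y_hat[(a, l)] is the predicted outcome."""
    total = 0.0
    for l, a_obs, y in rows:
        pr_a = ps[l] if a == 1 else 1 - ps[l]   # Pr(A=a|L=l)
        ind = 1.0 if a_obs == a else 0.0        # I(A=a)
        total += ind * y / pr_a - y_hat[(a, l)] * (ind - pr_a) / pr_a
    return total / len(rows)


rows = make_rows()
# Correct (stratum-specific) nuisance models for this toy data set
ps_ok = {0: 0.25, 1: 0.75}
y_ok = {(1, 0): 0.2, (0, 0): 0.1, (1, 1): 0.5, (0, 1): 0.4}
# Deliberately misspecified models that ignore the confounder
ps_bad = {0: 0.5, 1: 0.5}
y_bad = {(1, 0): 0.3, (0, 0): 0.3, (1, 1): 0.3, (0, 1): 0.3}

risk1_outcome_ok = aipw_mean(rows, 1, ps_bad, y_ok)    # only outcome model correct
risk1_exposure_ok = aipw_mean(rows, 1, ps_ok, y_bad)   # only exposure model correct
risk1_both_bad = aipw_mean(rows, 1, ps_bad, y_bad)     # neither model correct
print(risk1_outcome_ok, risk1_exposure_ok, risk1_both_bad)
```

In this toy data the standardized risk under $a=1$ is 0.35. The first two estimates recover it (up to floating-point error), while the estimate with both models misspecified is about 0.425.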

The risk difference and risk ratio are calculated using the following formulas, respectively

$\widehat{RD} = \widehat{DR}(a=1) - \widehat{DR}(a=0)$
$\widehat{RR} = \frac{\widehat{DR}(a=1)}{\widehat{DR}(a=0)}$

Confidence intervals for the risk difference come from the influence curve. Confidence intervals for the risk ratio are less straightforward; a bootstrap procedure should be used to obtain them.
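As a rough illustration of both interval types, the sketch below computes a Wald-type interval for the risk difference from the estimated influence curve, and a percentile bootstrap interval for the risk ratio that refits the nuisance models in every resample. Everything here is an assumption made for illustration: the toy data, the saturated (stratum-specific) models, and the helper names are hypothetical, and zEpid's internal calculations may differ in detail.

```python
import random
from statistics import NormalDist, variance


def average(values):
    return sum(values) / len(values)


def fit_saturated(rows):
    """Stratum-specific Pr(A=1|L=l) and E[Y|A=a,L=l] from (l, a, y) records."""
    n_l, n_treated, n_al, events_al = {}, {}, {}, {}
    for l, a, y in rows:
        n_l[l] = n_l.get(l, 0) + 1
        n_treated[l] = n_treated.get(l, 0) + a
        n_al[(a, l)] = n_al.get((a, l), 0) + 1
        events_al[(a, l)] = events_al.get((a, l), 0) + y
    ps = {l: n_treated[l] / n_l[l] for l in n_l}
    y_hat = {key: events_al[key] / n_al[key] for key in n_al}
    return ps, y_hat


def dr_values(rows, a):
    """Per-individual DR(a) contributions, refitting both nuisance models."""
    ps, y_hat = fit_saturated(rows)
    values = []
    for l, a_obs, y in rows:
        pr_a = ps[l] if a == 1 else 1 - ps[l]
        ind = 1.0 if a_obs == a else 0.0
        values.append(ind * y / pr_a - y_hat[(a, l)] * (ind - pr_a) / pr_a)
    return values


def risk_difference_ci(rows, alpha=0.05):
    """Risk difference with a Wald interval from the estimated influence curve."""
    dr1, dr0 = dr_values(rows, 1), dr_values(rows, 0)
    rd = average(dr1) - average(dr0)
    influence = [d1 - d0 - rd for d1, d0 in zip(dr1, dr0)]
    se = (variance(influence) / len(rows)) ** 0.5   # sample variance (n - 1)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return rd, rd - z * se, rd + z * se


def risk_ratio_bootstrap_ci(rows, n_boot=500, alpha=0.05, seed=0):
    """Risk ratio with a percentile bootstrap interval."""
    rr = average(dr_values(rows, 1)) / average(dr_values(rows, 0))
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_boot):
        sample = rng.choices(rows, k=len(rows))     # resample with replacement
        try:
            ratios.append(average(dr_values(sample, 1)) / average(dr_values(sample, 0)))
        except (KeyError, ZeroDivisionError):
            continue  # skip resamples with an empty exposure stratum or zero risk
    ratios.sort()
    lower = ratios[int(alpha / 2 * len(ratios))]
    upper = ratios[int((1 - alpha / 2) * len(ratios)) - 1]
    return rr, lower, upper


# Hypothetical data: (l, a): (number of people, number with y = 1)
counts = {(0, 1): (50, 10), (0, 0): (150, 15), (1, 1): (150, 75), (1, 0): (50, 20)}
rows = []
for (l, a), (n, events) in counts.items():
    rows += [(l, a, 1)] * events + [(l, a, 0)] * (n - events)

rd, rd_lo, rd_hi = risk_difference_ci(rows)
rr, rr_lo, rr_hi = risk_ratio_bootstrap_ci(rows)
print(rd, (rd_lo, rd_hi))
print(rr, (rr_lo, rr_hi))
```

With real data the same resampling idea applies to the DataFrame itself: draw rows with replacement, refit the exposure and outcome models on each resample, and take percentiles of the resulting risk ratios.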

Parameters:

- df (DataFrame) – Pandas DataFrame object containing all variables of interest
- exposure (str) – Column name of the exposure variable. Currently only binary is supported
- outcome (str) – Column name of the outcome variable. Currently only binary is supported
- weights (str, optional) – Column name of weights. Weights allow for items like sampling weights to be used to estimate effects
- alpha (float, optional) – Alpha for confidence interval level. Default is 0.05, returning the 95% CL

Examples

Set up the environment and the data set

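>>> from zepid import load_sample_data, spline
>>> from zepid.causal.doublyrobust import AIPTW
>>> # Load the sample data; the continuous outcome is not needed for the binary examples
>>> df = load_sample_data(timevary=False).drop(columns=['cd4_wk45'])
>>> df[['cd4_rs1','cd4_rs2']] = spline(df,'cd40',n_knots=3,term=2,restricted=True)
>>> df[['age_rs1','age_rs2']] = spline(df,'age0',n_knots=3,term=2,restricted=True)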


Estimate the base AIPTW model

>>> aipw = AIPTW(df, exposure='art', outcome='dead')
>>> aipw.exposure_model('male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0')
>>> aipw.outcome_model('art + male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0')
>>> aipw.fit()
>>> aipw.summary()


Estimate AIPTW accounting for missing outcome data

>>> aipw = AIPTW(df, exposure='art', outcome='dead')
>>> aipw.exposure_model('male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0')
>>> aipw.missing_model('art + male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0')
>>> aipw.outcome_model('art + male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0')
>>> aipw.fit()
>>> aipw.summary()


AIPTW for continuous outcomes

>>> df = load_sample_data(timevary=False).drop(columns=['dead'])
>>> df[['cd4_rs1','cd4_rs2']] = spline(df,'cd40',n_knots=3,term=2,restricted=True)
>>> df[['age_rs1','age_rs2']] = spline(df,'age0',n_knots=3,term=2,restricted=True)

>>> aipw = AIPTW(df, exposure='art', outcome='cd4_wk45')
>>> aipw.exposure_model('male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0')
>>> aipw.missing_model('art + male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0')
>>> aipw.outcome_model('art + male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0')
>>> aipw.fit()
>>> aipw.summary()

Use a Poisson distribution for the continuous outcome model

>>> aipw = AIPTW(df, exposure='art', outcome='cd4_wk45')
>>> ymodel = 'art + male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0'
>>> aipw.exposure_model('male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0')
>>> aipw.missing_model(ymodel)
>>> aipw.outcome_model(ymodel, continuous_distribution='poisson')
>>> aipw.fit()
>>> aipw.summary()


References

Funk MJ, Westreich D, Wiesen C, Stürmer T, Brookhart MA, Davidian M. (2011). Doubly robust estimation of causal effects. American Journal of Epidemiology, 173(7), 761-767.

Lunceford JK, Davidian M. (2004). Stratification and weighting via the propensity score in estimation of causal treatment effects: a comparative study. Statistics in Medicine, 23(19), 2937-2960.

Methods

- __init__(df, exposure, outcome[, weights, alpha]) – Initialize self.
- exposure_model(model[, custom_model, bound, …]) – Specify the propensity score / inverse probability weight model.
- fit() – Calculate the augmented inverse probability weights and effect measures from the predicted exposure probabilities and predicted outcome values.
- missing_model(model[, custom_model, bound, …]) – Estimation of Pr(M=0|A,L), which is the missing data mechanism for the outcome.
- outcome_model(model[, custom_model, …]) – Specify the outcome model.
- plot_kde(to_plot[, bw_method, fill, color, …]) – Generates density plots that can be used to check predictions qualitatively.
- plot_love([color_unweighted, …]) – Generates a Love-plot to detail covariate balance based on the IPTW weights.
- positivity([decimal]) – Use this to assess whether positivity is a valid assumption for the exposure model / calculated IPTW.
- run_diagnostics([decimal]) – Run all currently implemented diagnostics for the exposure and outcome models.
- standardized_mean_differences() – Calculates the standardized mean differences for all variables based on the inverse probability weights.
- summary([decimal]) – Prints a summary of the results for the doubly robust estimator.
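To give a feel for what the balance and positivity diagnostics listed above look at, here is a small stand-alone sketch that computes unstabilized IPTW weights and a standardized mean difference for one binary covariate, before and after weighting. The toy data, the helper names, and the pooled-standard-deviation formula for a binary covariate are assumptions chosen for illustration; they are not taken from zEpid's implementation, which works from the fitted exposure model.

```python
def iptw_weights(rows, ps):
    """Unstabilized IPTW: 1/Pr(A=1|L) if treated, 1/Pr(A=0|L) if untreated."""
    return [1 / ps[l] if a == 1 else 1 / (1 - ps[l]) for l, a, _ in rows]


def smd_binary(rows, weights):
    """Standardized mean difference of binary covariate L between exposure groups."""
    def weighted_prop(arm):
        num = sum(w * l for (l, a, _), w in zip(rows, weights) if a == arm)
        den = sum(w for (_, a, _), w in zip(rows, weights) if a == arm)
        return num / den

    p1, p0 = weighted_prop(1), weighted_prop(0)
    pooled_sd = ((p1 * (1 - p1) + p0 * (1 - p0)) / 2) ** 0.5
    return (p1 - p0) / pooled_sd


# Hypothetical data: (l, a): (number of people, number with y = 1)
counts = {(0, 1): (50, 10), (0, 0): (150, 15), (1, 1): (150, 75), (1, 0): (50, 20)}
rows = []
for (l, a), (n, events) in counts.items():
    rows += [(l, a, 1)] * events + [(l, a, 0)] * (n - events)

ps = {0: 0.25, 1: 0.75}                    # Pr(A=1|L=l) in this toy data
weights = iptw_weights(rows, ps)

smd_unweighted = smd_binary(rows, [1.0] * len(rows))
smd_weighted = smd_binary(rows, weights)
mean_weight = sum(weights) / len(weights)  # near 2 for unstabilized weights
print(smd_unweighted, smd_weighted, mean_weight, max(weights))
```

A weighted difference near zero suggests the weights balance the covariate across exposure groups. A mean unstabilized weight far from 2, or a very large maximum weight, would hint at positivity problems or a misspecified exposure model.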