zepid.causal.ipw.IPTW.IPTW

class zepid.causal.ipw.IPTW.IPTW(df, treatment, outcome, weights=None, standardize='population')

Calculates inverse probability of treatment weights (IPTW). Both stabilized and unstabilized weights are implemented. By default, weights are stabilized by the prevalence of the treatment in the population. IPTW also fits the marginal structural model and, if requested, estimates inverse probability of censoring weights. Confidence intervals are calculated using robust standard errors.

The formula for stabilized IPTW is

\[\pi_i = \frac{\Pr(A=a)}{\Pr(A=a|L=l)}\]

For unstabilized IPTW

\[\pi_i = \frac{1}{\Pr(A=a|L=l)}\]
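
The two formulas above can be sketched in plain Python. This is a minimal illustration and not zepid's implementation: the function name `iptw_weights` and the toy propensity scores are hypothetical, and Pr(A=1|L=l) is assumed to have been estimated already (for example by a logistic model).

```python
def iptw_weights(treatment, propensity, stabilized=True):
    """Return one weight per individual.

    treatment  : list of 0/1 treatment indicators (A)
    propensity : list of estimated Pr(A=1|L=l), assumed given
    """
    prevalence = sum(treatment) / len(treatment)  # Pr(A=1)
    weights = []
    for a, ps in zip(treatment, propensity):
        # Denominator: probability of the treatment actually received
        denom = ps if a == 1 else 1.0 - ps
        if stabilized:
            # Numerator: marginal probability of the treatment received
            numer = prevalence if a == 1 else 1.0 - prevalence
        else:
            numer = 1.0
        weights.append(numer / denom)
    return weights


# Hypothetical toy data: four individuals
treatment = [1, 1, 0, 0]
propensity = [0.8, 0.5, 0.5, 0.2]

stabilized = iptw_weights(treatment, propensity, stabilized=True)
unstabilized = iptw_weights(treatment, propensity, stabilized=False)
print(stabilized)
print(unstabilized)
```

In this toy data each unstabilized weight is exactly twice its stabilized counterpart, because the treatment prevalence is 0.5 in both arms.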

For unstabilized SMR weights standardized to the exposed (A=1)

\[\begin{split}\pi_i &= 1 \;\;\text{ if}\;\; A = 1 \\ &= \frac{\Pr(A=1|L=l)}{\Pr(A=0|L=l)} \;\;\text{if}\;\; A = 0\end{split}\]

For SMR weighted to the unexposed (A=0) the equation becomes

\[\begin{split}\pi_i &= \frac{\Pr(A=0|L=l)}{\Pr(A=1|L=l)} \;\;\text{if}\;\; A=1 \\ &= 1 \;\;\text{ if} \;\;A = 0\end{split}\]
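
The SMR weights for both target groups can be written out directly from the two equations above. The sketch below is illustrative only: `smr_weights` is a hypothetical helper, not part of zepid, and the propensity scores Pr(A=1|L=l) are assumed to be given.

```python
def smr_weights(treatment, propensity, standardize="exposed"):
    """Unstabilized SMR weights from estimated Pr(A=1|L=l)."""
    weights = []
    for a, ps in zip(treatment, propensity):
        if standardize == "exposed":
            # Exposed keep weight 1; unexposed are weighted by the odds of exposure
            w = 1.0 if a == 1 else ps / (1.0 - ps)
        elif standardize == "unexposed":
            # Unexposed keep weight 1; exposed are weighted by the inverse odds
            w = (1.0 - ps) / ps if a == 1 else 1.0
        else:
            raise ValueError("standardize must be 'exposed' or 'unexposed'")
        weights.append(w)
    return weights


# Hypothetical toy data: four individuals
treatment = [1, 1, 0, 0]
propensity = [0.8, 0.5, 0.5, 0.2]

to_exposed = smr_weights(treatment, propensity, standardize="exposed")
to_unexposed = smr_weights(treatment, propensity, standardize="unexposed")
print(to_exposed)
print(to_unexposed)
```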

Diagnostics are also available for the generated IPTW. For a full list of diagnostics, see the documentation of the individual methods below. For an in-depth explanation, see the references listed.

Parameters:
  • df (DataFrame) – Pandas dataframe object containing all variables of interest
  • treatment (str) – Variable name of treatment of interest. Must be coded as binary
  • outcome (str) – Variable name of outcome of interest. Can be either binary or continuous
  • standardize (str, optional) – Target population to standardize the estimate to: the entire population, the exposed, or the unexposed. Weighting to the exposed or unexposed is also referred to as SMR weighting; see Sato & Matsuyama, Epidemiology (2003) for details. Default is ‘population’. Options are:
      • ‘population’ : weight to the entire population
      • ‘exposed’ : weight to exposed individuals
      • ‘unexposed’ : weight to unexposed individuals
  • weights (str, optional) – Optional column of weights. If specified, a weighted regression model is used to estimate the inverse probability of treatment weights. This option is useful when, for example, some confounder information is missing and IPMW was used to correct for the missing data. IPTW should then be estimated with the IPMW so that it standardizes to the correct pseudo-population.

Examples

Setting up environment

>>> import matplotlib.pyplot as plt
>>> from zepid import load_sample_data, spline
>>> from zepid.causal.ipw import IPTW
>>> df = load_sample_data(timevary=False).drop(columns=['cd4_wk45'])
>>> df[['cd4_rs1','cd4_rs2']] = spline(df, 'cd40', n_knots=3, term=2, restricted=True)
>>> df[['age_rs1','age_rs2']] = spline(df, 'age0', n_knots=3, term=2, restricted=True)

Calculate stabilized IPTW

>>> ipt = IPTW(df, treatment='art', outcome='dead')
>>> ipt.treatment_model('male + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0')
>>> ipt.marginal_structural_model('art')
>>> ipt.fit()
>>> ipt.summary()
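
For intuition about what the marginal structural model estimates here: with a binary treatment and a model containing only the treatment term, the weighted point estimates reduce to a contrast of weighted outcome means in each arm. The sketch below illustrates that idea in plain Python. It is not how zepid fits the model (zepid fits a regression and reports robust standard errors), and the helper `weighted_risk`, the data, and the weights are all hypothetical.

```python
def weighted_risk(outcome, treatment, weights, arm):
    """Weighted mean of a binary outcome among those with treatment == arm."""
    num = sum(w * y for y, a, w in zip(outcome, treatment, weights) if a == arm)
    den = sum(w for a, w in zip(treatment, weights) if a == arm)
    return num / den


# Hypothetical toy data: six individuals with made-up weights
treatment = [1, 1, 1, 0, 0, 0]
outcome = [1, 0, 0, 1, 1, 0]
weights = [2.0, 1.0, 1.0, 1.0, 1.0, 2.0]

risk_treated = weighted_risk(outcome, treatment, weights, arm=1)
risk_untreated = weighted_risk(outcome, treatment, weights, arm=0)
risk_difference = risk_treated - risk_untreated
risk_ratio = risk_treated / risk_untreated

# Crude (unweighted) contrast for comparison: every weight set to 1
unit = [1.0] * len(treatment)
crude_rd = (weighted_risk(outcome, treatment, unit, arm=1)
            - weighted_risk(outcome, treatment, unit, arm=0))

print(risk_difference, risk_ratio, crude_rd)
```

In this toy data the crude risk difference is -1/3, while the weighted risk difference is 0, which shows how reweighting can change the estimated contrast.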

Diagnostics:

>>> ipt.run_diagnostics()
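
One of the diagnostics, the standardized mean difference, can be sketched by hand for a single continuous covariate. This is an illustration under stated assumptions, not necessarily the computation zepid performs: the weighted variance uses the simple sum-of-weights denominator, and all function names and data are hypothetical.

```python
import math


def weighted_mean(x, w):
    return sum(wi * xi for xi, wi in zip(x, w)) / sum(w)


def weighted_var(x, w):
    # Simple weighted variance with the sum of weights as denominator (an assumption)
    m = weighted_mean(x, w)
    return sum(wi * (xi - m) ** 2 for xi, wi in zip(x, w)) / sum(w)


def standardized_mean_difference(x, treatment, w):
    """Difference in weighted means divided by the pooled standard deviation."""
    x1 = [xi for xi, a in zip(x, treatment) if a == 1]
    w1 = [wi for wi, a in zip(w, treatment) if a == 1]
    x0 = [xi for xi, a in zip(x, treatment) if a == 0]
    w0 = [wi for wi, a in zip(w, treatment) if a == 0]
    diff = weighted_mean(x1, w1) - weighted_mean(x0, w0)
    pooled_sd = math.sqrt((weighted_var(x1, w1) + weighted_var(x0, w0)) / 2)
    return diff / pooled_sd


# Hypothetical toy data: ages of two treated and two untreated individuals
age = [50, 60, 30, 40]
treatment = [1, 1, 0, 0]

smd_unweighted = standardized_mean_difference(age, treatment, [1, 1, 1, 1])
smd_weighted = standardized_mean_difference(age, treatment, [3, 1, 1, 3])
print(smd_unweighted, smd_weighted)
```

A weighted value closer to zero than the unweighted one indicates that the weights improved balance on that covariate.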

Calculate unstabilized IPTW

>>> ipt = IPTW(df, treatment='art', outcome='dead')
>>> ipt.treatment_model('male + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0', stabilized=False)
>>> ipt.marginal_structural_model('art')
>>> ipt.fit()
>>> ipt.summary()

Calculate SMR weights standardized to the exposed population

>>> ipt = IPTW(df, treatment='art', outcome='dead', standardize='exposed')
>>> ipt.treatment_model('male + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0')
>>> ipt.marginal_structural_model('art')
>>> ipt.fit()
>>> ipt.summary()

Stabilized IPTW with IPCW

>>> ipt = IPTW(df, treatment='art', outcome='dead')
>>> ipt.treatment_model('male + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0')
>>> ipt.missing_model('art + male + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0')
>>> ipt.marginal_structural_model('art')
>>> ipt.fit()
>>> ipt.summary()

Stabilized IPTW with effect measure modifier

>>> ipt = IPTW(df, treatment='art', outcome='dead')
>>> ipt.treatment_model('male + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0', model_numerator='male')
>>> ipt.marginal_structural_model('art + male + art:male')
>>> ipt.fit()
>>> ipt.summary()
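
When the marginal structural model includes an effect measure modifier V, a common choice is to stabilize the weights with Pr(A=a|V=v) in the numerator rather than the marginal Pr(A=a). The sketch below assumes that this is what a numerator model containing only the modifier amounts to, and estimates the numerator as the treatment prevalence within each stratum of V. The helper name and the data are hypothetical, and this is not zepid code.

```python
from collections import defaultdict


def stratified_stabilized_weights(treatment, modifier, propensity):
    """Stabilized weights with numerator Pr(A=a|V=v), denominator Pr(A=a|L=l)."""
    # Count treated and total individuals within each stratum of the modifier
    counts = defaultdict(lambda: [0, 0])  # v -> [number treated, number total]
    for a, v in zip(treatment, modifier):
        counts[v][0] += a
        counts[v][1] += 1

    weights = []
    for a, v, ps in zip(treatment, modifier, propensity):
        p_v = counts[v][0] / counts[v][1]        # Pr(A=1|V=v)
        numer = p_v if a == 1 else 1.0 - p_v     # Pr(A=a|V=v)
        denom = ps if a == 1 else 1.0 - ps       # Pr(A=a|L=l), assumed given
        weights.append(numer / denom)
    return weights


# Hypothetical toy data: eight individuals
treatment = [1, 1, 0, 0, 1, 0, 0, 0]
male = [1, 1, 1, 1, 0, 0, 0, 0]
propensity = [0.5, 0.8, 0.5, 0.2, 0.25, 0.25, 0.5, 0.2]

w = stratified_stabilized_weights(treatment, male, propensity)
print(w)
```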

References

Robins JM, Hernán MA, Brumback B. (2000). Marginal structural models and causal inference in epidemiology. Epidemiology, 11(5), 550-560.

Hernán MA, Brumback B, Robins JM. (2000). Marginal structural models to estimate the causal effect of zidovudine on the survival of HIV-positive men. Epidemiology, 11(5), 561-570.

Bodnar LM, Davidian M, Siega-Riz AM, Tsiatis AA. (2004). Marginal structural models for analyzing causal effects of time-dependent treatments: an application in perinatal epidemiology. American Journal of Epidemiology, 159(10), 926-934.

Cole SR, Hernán MA. (2008). Constructing inverse probability weights for marginal structural models. American Journal of Epidemiology, 168(6), 656-664.

Austin PC, Stuart EA. (2015). Moving towards best practice when using inverse probability of treatment weighting (IPTW) using the propensity score to estimate causal treatment effects in observational studies. Statistics in Medicine, 34(28), 3661-3679.

Sato T, Matsuyama Y. (2003). Marginal structural models as a tool for standardization. Epidemiology, 14(6), 680-686.

Love T. (2004). Graphical Display of Covariate Balance. Presentation. See http://chrp.org/love/JSM2004RoundTableHandout.pdf, 1364.

__init__(df, treatment, outcome, weights=None, standardize='population')

Initialize self. See help(type(self)) for accurate signature.

Methods

  • __init__(df, treatment, outcome[, weights, …]) – Initialize self.
  • fit([continuous_distribution]) – Fit the specified marginal structural model using the calculated inverse probability of treatment weights.
  • marginal_structural_model(model) – Specify the marginal structural model to estimate using the inverse probability of treatment weights.
  • missing_model(model_denominator[, …]) – Estimation of Pr(M=0|A=a,L), which is the missing data mechanism for the outcome.
  • plot_boxplot([measure]) – Generates a stratified boxplot that can be used to check visually (qualitatively) whether positivity may be violated.
  • plot_kde([measure, bw_method, fill, …]) – Generates a density plot that can be used to check qualitatively whether positivity may be violated.
  • plot_love([color_unweighted, …]) – Generates a Love plot to detail covariate balance based on the IPTW.
  • positivity([decimal, iptw_only]) – Use this to assess whether positivity is a valid assumption.
  • run_diagnostics([iptw_only]) – Run all currently implemented diagnostics for inverse probability of treatment weights.
  • standardized_mean_differences([iptw_only]) – Calculates the standardized mean differences for all variables.
  • summary([decimal]) – Prints a summary of the results for the IPTW estimator.
  • treatment_model(model_denominator[, …]) – Logistic regression model(s) for propensity score models.
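
As a rough illustration of what a positivity check on the weights looks at, the sketch below summarizes a set of stabilized weights. Cole & Hernán (2008) note that the mean of stabilized weights should be close to 1, and that large departures or extreme values can indicate positivity problems or a misspecified treatment model. The helper `weight_summary` and the weights are hypothetical; the output of zepid's own positivity method may be organized differently.

```python
import statistics


def weight_summary(weights):
    """Basic descriptive statistics used to eyeball a set of weights."""
    return {
        "mean": statistics.mean(weights),
        "sd": statistics.stdev(weights),
        "min": min(weights),
        "max": max(weights),
    }


# Hypothetical stabilized weights for six individuals
stabilized = [0.625, 1.0, 1.0, 0.625, 1.5, 1.25]

summary = weight_summary(stabilized)
print(summary)
```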