zepid.causal.generalize.estimators.GTransportFormula

class zepid.causal.generalize.estimators.GTransportFormula(df, exposure, outcome, selection, outcome_type='binary', generalize=True, weights=None)
Calculate the g-transport formula using an observed study sample and a sample from the target population. Broadly, the process for fitting the g-transport formula is similar to the g-formula (as implemented in TimeFixedGFormula). Instead of predicting the potential outcomes for only the study sample, the g-transport formula predicts potential outcomes for the full target population.
For generalizability, we first fit a Q-model predicting the outcome as a function of the treatment and any modifiers (along with confounders if the study data are observational). Afterwards, we predict the potential outcomes for the entire population (S=1 and S=0). To obtain the marginal effect measure, we take the mean of the predicted potential outcomes over the entire population (S=1 and S=0).
For transportability, we similarly fit a Q-model in the observed study sample and generate predictions for all individuals (S=1 and S=0). However, for transportability the study sample is not part of the target population. Therefore, we take the marginal mean over the S=0 group only.
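The two averaging rules above can be sketched without any dependencies. The following is a minimal illustration, not zEpid's implementation: it assumes a single binary modifier W, a randomized binary treatment A, and a saturated Q-model (stratum means) in place of the regression model the library fits. The toy `rows` and the functions `fit_q_model` and `g_transport` are hypothetical names introduced here.

```python
# Hypothetical toy records as (S, W, A, Y).
# S=1 rows are the study sample (A and Y observed);
# S=0 rows come from the target population (A and Y unobserved, stored as None).
rows = [
    (1, 0, 1, 1), (1, 0, 1, 0), (1, 0, 0, 1), (1, 0, 0, 0),
    (1, 1, 1, 1), (1, 1, 1, 1), (1, 1, 0, 1), (1, 1, 0, 0),
    (0, 0, None, None), (0, 1, None, None), (0, 1, None, None), (0, 1, None, None),
]


def fit_q_model(rows):
    """Saturated Q-model: mean of Y within each (A, W) stratum of the study sample."""
    totals = {}
    for s, w, a, y in rows:
        if s == 1:  # only the study sample has treatment and outcome data
            n, y_sum = totals.get((a, w), (0, 0))
            totals[(a, w)] = (n + 1, y_sum + y)
    return {key: y_sum / n for key, (n, y_sum) in totals.items()}


def g_transport(rows, generalize=True):
    """Return (risk difference, risk ratio) standardized to the target rows."""
    q = fit_q_model(rows)
    # Generalizability averages over S=1 and S=0; transportability over S=0 only.
    target = [r for r in rows if generalize or r[0] == 0]
    risk_treated = sum(q[(1, w)] for _, w, _, _ in target) / len(target)
    risk_untreated = sum(q[(0, w)] for _, w, _, _ in target) / len(target)
    return risk_treated - risk_untreated, risk_treated / risk_untreated


print("generalize:", g_transport(rows, generalize=True))
print("transport: ", g_transport(rows, generalize=False))
```

In these toy data the treatment effect is larger when W=1, and W=1 is more common among the S=0 rows, so the transported risk difference (0.375) is larger than the generalized one (about 0.29).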
Confidence intervals should be obtained using a nonparametric bootstrap procedure.
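A percentile bootstrap could look roughly like the sketch below. This is a generic standard-library illustration, not a zEpid function: `bootstrap_ci`, `estimator`, and `mean_outcome` are hypothetical names, and `estimator` stands in for whatever refits the outcome model on a resampled data set and returns a single point estimate. Resampling all rows jointly is an assumption made here for simplicity.

```python
import random


def bootstrap_ci(rows, estimator, n_resamples=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for estimator(rows) -> float."""
    rng = random.Random(seed)  # seeded so the interval is reproducible
    estimates = sorted(
        estimator(rng.choices(rows, k=len(rows)))  # resample rows with replacement
        for _ in range(n_resamples)
    )
    lower = estimates[int(n_resamples * alpha / 2)]
    upper = estimates[int(n_resamples * (1 - alpha / 2)) - 1]
    return lower, upper


# Placeholder estimator for illustration only: the mean of a list of 0/1 outcomes.
def mean_outcome(rows):
    return sum(rows) / len(rows)


outcomes = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1]
lower, upper = bootstrap_ci(outcomes, mean_outcome)
print(f"95% percentile interval: ({lower:.2f}, {upper:.2f})")
```

With a pandas DataFrame, each resample could instead be drawn with `df.sample(n=len(df), replace=True)`, and the estimator would construct and fit a new GTransportFormula on that resample before reading off the point estimate.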
Parameters:
 df (DataFrame) – pandas DataFrame containing all variables required for generalization/transportation. Should include all features related to sample selection, an indicator for selection into the sample, and treatment/outcome information for the study sample (selection == 1)
 exposure (str) – Column label for the exposure/treatment of interest. May be NaN for individuals not in the study sample. Only binary exposures are currently supported
 outcome (str) – Column label for the outcome of interest. May be NaN for individuals not in the study sample
 selection (str) – Column label for the indicator of selection into the sample. Should be 1 if the individual comes from the study sample and 0 if the individual is from the random sample of the target population
 outcome_type (str, optional) – Outcome variable type. Currently only ‘binary’, ‘normal’, and ‘poisson’ variable types are supported
 generalize (bool, optional) – Whether the problem is a generalizability (True) problem or a transportability (False) problem. See the note below for further details on the difference between the two estimation methods
 weights (None, str, optional) – Optional argument for weights. Can be used to input inverse probability of missingness weights
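One way to assemble a data set in the layout described above is to stack the study sample and the target-population sample, as in the sketch below. The frames `study` and `target` and their values are made up for illustration; the column names A, Y, W, and S are assumptions chosen to match the examples further down.

```python
import pandas as pd

# Hypothetical study sample: treatment A and outcome Y are observed
study = pd.DataFrame({"W": [0, 1, 1, 0], "A": [1, 0, 1, 0], "Y": [1, 0, 1, 0]})
# Hypothetical sample from the target population: only the covariate W is observed
target = pd.DataFrame({"W": [1, 1, 0]})

study["S"] = 1   # selected into the study sample
target["S"] = 0  # not in the study sample

# Stacking the two frames leaves A and Y as NaN for the S=0 rows
df = pd.concat([study, target], ignore_index=True)
print(df)
```

The resulting `df` has one row per individual, with the selection indicator in S and missing treatment and outcome values only where S is 0.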
Note
There are two related concepts: generalizability and transportability. Generalizability applies when the study sample is part of the target population. For example, we want to generalize results from California to the entire United States. Transportability applies when the study sample is not part of the target population. For example, we want to apply results from California to Canada. The marginal risk difference is calculated slightly differently in each scenario. GTransportFormula supports both problems.
Examples
Setting up the environment
>>> from zepid import load_generalize_data
>>> from zepid.causal.generalize import GTransportFormula
>>> df = load_generalize_data(False)
Generalizability
>>> gtf = GTransportFormula(df, exposure='A', outcome='Y', selection='S', generalize=True)
>>> gtf.outcome_model('A + L + L:A + W + W:A + W:A:L')
>>> gtf.fit()
>>> gtf.summary()
Transportability
>>> gtf = GTransportFormula(df, exposure='A', outcome='Y', selection='S', generalize=False)
>>> gtf.outcome_model('A + L + L:A + W + W:A + W:A:L')
>>> gtf.fit()
>>> gtf.summary()
For observational studies, confounders should also be included in the Q-model.
References
Lesko CR, Buchanan AL, Westreich D, Edwards JK, Hudgens MG, & Cole SR. (2017). Generalizing study results: a potential outcomes perspective. Epidemiology (Cambridge, Mass.), 28(4), 553.
Dahabreh IJ, Robertson SE, Stuart EA, Hernan MA (2018). Transporting inferences from a randomized trial to a new target population. arXiv preprint arXiv:1805.00550.

__init__(df, exposure, outcome, selection, outcome_type='binary', generalize=True, weights=None)
Initialize self. See help(type(self)) for accurate signature.
Methods
__init__(df, exposure, outcome, selection[, …]) – Initialize self.
fit() – Uses the g-transport formula to obtain the risk difference and risk ratio from the sample.
outcome_model(model[, print_results]) – Build the model for the outcome.
summary([decimal]) – Prints a summary of the results for the g-transport estimator.