zepid.datasets.load_sample_data

zepid.datasets.load_sample_data(timevary)

Load a sample data set bundled with the zepid package. The data are simulated and were provided by Jessie Edwards (thanks Jess!). They are used in the examples on zepid.readthedocs.

Parameters: timevary (bool) – Whether to return the time-varying or the time-fixed data set. If True, returns a data set with repeated visits per subject. If False, returns a data set with a single observation per subject, representing the 45-week risk.

Notes

For the time-varying data set, the following variables are returned:
  • id - participant unique ID
  • enter - start of follow-up period
  • out - end of time period
  • male - indicator variable for male (1 = yes)
  • age0 - age at enter = 0
  • cd40 - CD4 T cell count at enter = 0
  • dvl0 - detectable viral load data at enter = 0
  • cd4 - CD4 T cell count at enter = t
  • dvl - viral load at enter = t
  • art - indicator of whether ART was prescribed at enter = t
  • drop - indicator of whether individual dropped out of the study at enter = t (1 = yes)
  • dead - indicator for death at out = t (1 = yes)
For the time-fixed data set, the following variables are returned:
  • id - participant unique ID
  • male - indicator variable for male (1 = yes)
  • age0 - age at enter = 0
  • cd40 - CD4 T cell count at enter = 0
  • dvl0 - detectable viral load data at enter = 0
  • art - indicator of whether ART was prescribed at enter = 0
  • t - total time contributed
Returns: Either a time-varying or a time-fixed pandas DataFrame
Return type: DataFrame

Examples

Load the time-fixed exposure data set

>>> from zepid import load_sample_data
>>> load_sample_data(timevary=False)

Load the time-varying exposure data set

>>> load_sample_data(timevary=True)
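
The difference between the two layouts can be sketched with a small hand-made table. The rows below are hypothetical and are not taken from the zepid sample data; only the column names id, enter, out, dead, and t follow the variable lists above. The collapse step is an illustrative assumption about how a repeated-visit table relates to a one-row-per-subject table, not a description of how zepid constructs its time-fixed data set.

```python
import pandas as pd

# Hypothetical toy rows (NOT the zepid sample data), laid out like the
# time-varying format: one row per participant per follow-up interval.
long_df = pd.DataFrame({
    "id":    [1, 1, 1, 2, 2],
    "enter": [0, 1, 2, 0, 1],
    "out":   [1, 2, 3, 1, 2],
    "dead":  [0, 0, 1, 0, 0],
})

# Collapse to one row per participant: total follow-up time (latest "out")
# and whether death was recorded in any interval.
wide_df = (
    long_df.groupby("id", as_index=False)
           .agg(t=("out", "max"), dead=("dead", "max"))
)

print(long_df.groupby("id").size())  # number of intervals per participant
print(wide_df)                       # one row per participant
```

With real data, the same row-count check (several rows per id versus exactly one) is a quick way to confirm which version of the data set was loaded.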