zepid.datasets.load_gvhd_data
zepid.datasets.load_gvhd_data()
Loads bone marrow transplant recipient data from Keil AP, Edwards JK, Richardson DB, Naimi AI, Cole SR. The parametric g-formula for time-to-event data: intuition and a worked example. Epidemiology. 2014;25(6):889-97. Patients were followed until death or administrative censoring at 5 years.
Notes
- Variables are formatted exactly as described in Keil et al. 2014
- id: unique ID for each participant
- age: participant baseline age
- agesq: squared baseline age
- agecurs1: restricted cubic spline knot 1 for baseline age
- agecurs2: restricted cubic spline knot 2 for baseline age
- male: participant gender (1 is male, 0 is female)
- cmv: cytomegalovirus baseline immune status (1 is yes, 0 is no)
- all: binary indicator (1, 0); the meaning of this variable is not documented here
- wait: wait time from diagnosis to transplantation (months)
- day: day since transplantation
- daysq: squared day since transplantation
- daycu: cubic day since transplantation
- daycurs1: restricted cubic spline knot 1 for days since transplantation
- daycurs2: restricted cubic spline knot 2 for days since transplantation
- yesterday: previous day
- tomorrow: day after
- gvhd: indicator for Graft-versus-Host Disease (1 is yes, 0 is no)
- d: indicator of death (1 is yes, 0 is no)
- relapse: indicator for relapse (1 is yes, 0 is no)
- platnorm: indicator for normal platelet count (1 is yes, 0 is no)
- censlost: indicator for censoring due to loss-to-follow-up (1 is yes, 0 is no)
- gvhdm1: indicator for previous day diagnosis of GvHD (1 is yes, 0 is no)
- relapsem1: indicator for previous day relapse (1 is yes, 0 is no)
- platnormm1: indicator for previous day normal platelet count (1 is yes, 0 is no)
- daysnogvhd: number of consecutive days without a GvHD diagnosis
- daysnorelapse: number of consecutive days without relapse
- daysnoplatnorm: number of consecutive days without normal platelet count
- daysgvhd: number of consecutive days with GvHD
- daysrelapse: number of consecutive days after relapse
- daysplatnorm: number of consecutive days with normal platelet count
Returns: pandas DataFrame
Return type: DataFrame
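Example
The sketch below shows one way the returned DataFrame might be collapsed to a single row per participant. It assumes the data hold one row per participant per day, as the day, yesterday, and tomorrow columns suggest. The helper name summarize_follow_up and the toy rows are hypothetical and exist only for illustration; they are not part of zEpid.

```python
import pandas as pd


def summarize_follow_up(df):
    """Collapse person-day rows to one row per participant.

    Hypothetical helper (not part of zEpid). Assumes the columns
    `id`, `day`, `d`, and `gvhd` described in the Notes above.
    """
    return (
        df.groupby("id")
        .agg(
            days_followed=("day", "max"),  # last observed day since transplantation
            died=("d", "max"),             # 1 if death was recorded on any day
            ever_gvhd=("gvhd", "max"),     # 1 if GvHD was recorded on any day
        )
        .reset_index()
    )


# Toy rows standing in for the real data (values are made up).
toy = pd.DataFrame(
    {
        "id": [1, 1, 1, 2, 2],
        "day": [1, 2, 3, 1, 2],
        "gvhd": [0, 1, 1, 0, 0],
        "d": [0, 0, 1, 0, 0],
    }
)
summary = summarize_follow_up(toy)
print(summary)

# With zEpid installed, the same helper could be applied to the real data:
# from zepid.datasets import load_gvhd_data
# summary = summarize_follow_up(load_gvhd_data())
```

Taking the maximum of each 0/1 indicator within a participant flags whether the event was ever recorded, which is usually the first check worth running before fitting any time-varying model.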