tempor.datasources.mivdp.datagen.data_generation_icu module

class tempor.datasources.mivdp.datagen.data_generation_icu.ICUDataGenerator(cohort_output: str, root_dir: str, if_mort: bool, if_admn: bool, if_los: bool, feat_cond: bool, feat_proc: bool, feat_out: bool, feat_chart: bool, feat_med: bool, impute: Literal[Mean] | Literal[Median] | False, include_time: int = 24, bucket: int = 1, predW: int = 6, silence_warnings: bool = True)[source]

Bases: object

The data generator object that handles the final data processing aspects of the pipeline.

Determines how to process and represent the time-series data.

  • You choose the length of time-series data to include for this study (include_time).

  • You select the bucket size, which sets the size of the time windows the time-series is divided into. For example, with a bucket size of 2, data is aggregated for every 2 hours, so a time-series of length 24 hours is represented as a time-series with 12 time windows, each aggregating 2 hours of the original raw time-series.

  • You can also choose whether to impute chart values. Imputation is done by forward fill followed by mean or median imputation: values are forward filled first, and if no value exists for that admission, the mean or median value for the patient is used.
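The bucketing and imputation steps described above can be illustrated with a small standalone pandas sketch. It does not call this module's code: the column names and values are made up, aggregating each bucket by its mean is an assumption (the documentation only says the data is aggregated), and the fallback uses the mean of the observed windows of a single stay.

```python
import numpy as np
import pandas as pd

include_time = 24  # hours of data to keep
bucket = 2         # window size in hours

# Hypothetical hourly chart values for one stay, with gaps (NaN = not recorded).
raw = pd.DataFrame(
    {
        "heart_rate": [80.0, np.nan, 82.0, 84.0] + [np.nan] * 4
        + [90.0, 92.0] + [np.nan] * 14,
        "temperature": [np.nan] * 4 + [37.0, 37.4] + [np.nan] * 18,
    },
    index=pd.RangeIndex(include_time, name="hour"),
)

# Assign each hour to a window: hours 0-1 -> window 0, hours 2-3 -> window 1, ...
window = np.asarray(raw.index) // bucket

# Aggregate each window (assumption: mean of the recorded values in the window).
bucketed = raw.groupby(window).mean()

# Forward fill first, then fall back to the mean for windows that
# have no earlier value to carry forward.
imputed = bucketed.ffill().fillna(bucketed.mean())

print(len(bucketed))  # number of time windows
print(imputed)
```

With these inputs the 24 hourly rows collapse into 12 windows. The heart-rate gaps are filled from the preceding window, while the leading temperature windows, which have nothing to carry forward, receive the mean fallback.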

Parameters:
cohort_output : str

Cohort output file name.

root_dir : str

Root directory of the MIMIC-IV dataset.

if_mort : bool

Whether the mortality task (target) is selected.

if_admn : bool

Whether the readmission task (target) is selected.

if_los : bool

Whether the length of stay task (target) is selected.

feat_cond : bool

Whether the diagnosis features are selected.

feat_proc : bool

Whether the procedure features are selected.

feat_out : bool

Whether the output event features are selected.

feat_chart : bool

Whether the chart features are selected.

feat_med : bool

Whether the medication features are selected.

impute : ImputeOption

The imputation method to use for missing values. One of "Mean", "Median", or False.

include_time : int, optional

Length of time-series data to include, in hours (for example, 24 hours with a bucket size of 2 yields 12 time windows). Defaults to 24.

bucket : int, optional

Time bucket size (in hours). Defaults to 1.

predW : int, optional

Applicable to mortality task only - the mortality prediction window. Defaults to 6.

silence_warnings : bool, optional

Whether to silence warnings. Defaults to True.
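As a rough illustration of how the parameters above fit together, the following sketch builds a set of keyword arguments matching the documented signature. The cohort file name and root directory are hypothetical placeholders, and the constructor call itself is left commented out because it needs the package installed and the MIMIC-IV data in place. What the constructor does on instantiation is not covered here.

```python
# Keyword arguments mirroring the documented constructor signature.
generator_kwargs = dict(
    cohort_output="cohort_icu_mortality",  # hypothetical file name
    root_dir="./data/mimiciv",             # hypothetical dataset location
    if_mort=True,    # mortality task selected
    if_admn=False,
    if_los=False,
    feat_cond=True,
    feat_proc=True,
    feat_out=True,
    feat_chart=True,
    feat_med=True,
    impute="Mean",   # one of "Mean", "Median", or False
    include_time=24,
    bucket=2,
    predW=6,         # only applicable to the mortality task
    silence_warnings=True,
)

# Number of time windows implied by the chosen length and bucket size.
n_windows = generator_kwargs["include_time"] // generator_kwargs["bucket"]
print(n_windows)

# With the package and data available, the call would presumably look like:
# from tempor.datasources.mivdp.datagen.data_generation_icu import ICUDataGenerator
# generator = ICUDataGenerator(**generator_kwargs)
```

Keeping the arguments in a dictionary makes it easy to log or vary a configuration (for example, switching impute to "Median") before constructing the generator.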

generate_feat()[source]
generate_adm()[source]
generate_cond()[source]
generate_proc()[source]
generate_out()[source]
generate_chart()[source]
generate_meds()[source]
mortality_length(include_time, predW)[source]
los_length(include_time)[source]
readmission_length(include_time)[source]
smooth_meds(bucket)[source]
create_chartDict(chart, los)[source]
create_Dict(meds, proc, out, chart, los)[source]