tempor.datasources.mivdp.preproc.features.feature_preproc_icu module

ICU feature preprocessing module.

Based on preprocessing/hosp_module_preproc/feature_selection_icu.py from https://github.com/healthylaife/MIMIC-IV-Data-Pipeline

tempor.datasources.mivdp.preproc.features.feature_preproc_icu.feature_icu(cohort_output: str, root_dir: str, version: str, diag_flag: bool = True, out_flag: bool = True, chart_flag: bool = True, proc_flag: bool = True, med_flag: bool = True) → tuple[DataFrame | None, DataFrame | None, DataFrame | None, DataFrame | None, DataFrame | None]

Extracts features from ICU data.

Parameters:
cohort_output : str

Cohort output file name.

root_dir : str

Root directory of the MIMIC-IV dataset.

version : str

MIMIC-IV version string, e.g. "v1_0".

diag_flag : bool, optional

Whether to extract diagnosis data. Defaults to True.

out_flag : bool, optional

Whether to extract output events data. Defaults to True.

chart_flag : bool, optional

Whether to extract chart events data. Defaults to True.

proc_flag : bool, optional

Whether to extract procedures data. Defaults to True.

med_flag : bool, optional

Whether to extract medications data. Defaults to True.

Returns:

Output dataframes in the order diag, out, chart, proc, med. Per the return annotation, each entry may be None, depending on the flags.

Return type:

OutDfs
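A minimal sketch of consuming the returned tuple, assuming (from the return annotation, not from the implementation) that an entry is None when the corresponding data was not extracted. `FEATURE_ORDER` and `label_outputs` are hypothetical helpers, not part of TemporAI, and plain strings stand in for dataframes so the example runs without MIMIC-IV data.

```python
from typing import Any, Dict, Sequence

# Order taken from the "Returns" field of feature_icu above.
FEATURE_ORDER = ("diag", "out", "chart", "proc", "med")


def label_outputs(outputs: Sequence[Any]) -> Dict[str, Any]:
    """Map each non-None entry of the 5-tuple to its documented name."""
    if len(outputs) != len(FEATURE_ORDER):
        raise ValueError(f"expected {len(FEATURE_ORDER)} outputs, got {len(outputs)}")
    return {
        name: value
        for name, value in zip(FEATURE_ORDER, outputs)
        if value is not None
    }


# Stand-in for a call such as feature_icu(..., out_flag=False, proc_flag=False,
# med_flag=False); the strings are placeholders for real dataframes.
example_outputs = ("diag_df", None, "chart_df", None, None)
available = label_outputs(example_outputs)
print(available)  # {'diag': 'diag_df', 'chart': 'chart_df'}
```

Keying the results by name avoids relying on tuple positions, which differ between the functions on this page.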

tempor.datasources.mivdp.preproc.features.feature_preproc_icu.preprocess_features_icu(cohort_output: str, root_dir: str, diag_flag: bool, group_diag: Literal['both'] | Literal['convert'] | Literal['convert_group'], chart_flag: bool, clean_chart: bool, impute_outlier_chart: bool, thresh: int, left_thresh: int) → tuple[DataFrame | None, DataFrame | None]

Performs grouping on diagnosis data and/or outlier removal and imputation on chart events data.

Parameters:
cohort_output : str

Cohort output file name.

root_dir : str

Root directory of the MIMIC-IV dataset.

diag_flag : bool

Whether to process diagnosis data.

group_diag : GroupOption

Grouping option for diagnosis data. "both": keep both ICD-9 and ICD-10 codes. "convert": convert ICD-9 codes to ICD-10. "convert_group": convert ICD-9 codes to ICD-10, then group the ICD-10 codes. Only applicable if diag_flag is True.

chart_flag : bool

Whether to process chart events data.

clean_chart : bool

Whether to clean chart events data. Only applicable if chart_flag is True.

impute_outlier_chart : bool

Whether to impute outliers in chart events data. Only applicable if chart_flag is True.

thresh : int

(Right/upper) threshold for outlier removal. Only applicable if chart_flag is True.

left_thresh : int

(Left/lower) threshold for outlier removal. Only applicable if chart_flag is True.

Returns:

Dataframes in the order diag, chart. Per the return annotation, each entry may be None, depending on the flags.

Return type:

Tuple[Optional[pd.DataFrame], Optional[pd.DataFrame]]
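The two steps described for this function can be illustrated in isolation. The sketch below is not the TemporAI implementation and rests on labeled assumptions: `ICD9_TO_ICD10` is a tiny hypothetical lookup table, "grouping" is taken to mean truncating an ICD-10 code to its three-character category, and `thresh` / `left_thresh` are interpreted as upper and lower percentiles. Outliers are either clipped to the bounds (imputation) or replaced by None (removal).

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical two-entry lookup; a real conversion needs a full mapping table.
ICD9_TO_ICD10: Dict[str, str] = {"4280": "I509", "25000": "E119"}


def convert_diag(code: str, icd_version: int, group_diag: str) -> Optional[str]:
    """Apply one of the documented group_diag options to a single code."""
    if group_diag == "both":
        return code  # keep ICD-9 and ICD-10 codes unchanged
    if group_diag not in ("convert", "convert_group"):
        raise ValueError(f"unknown group_diag: {group_diag!r}")
    icd10 = code if icd_version == 10 else ICD9_TO_ICD10.get(code)
    if icd10 is None:
        return None  # no mapping known for this ICD-9 code
    # Assumption: grouping keeps only the three-character ICD-10 category.
    return icd10[:3] if group_diag == "convert_group" else icd10


def percentile(values: Sequence[float], q: float) -> float:
    """Linear-interpolation percentile for q in [0, 100]; values must be non-empty."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def handle_outliers(
    values: Sequence[float], thresh: int, left_thresh: int, impute: bool
) -> List[Optional[float]]:
    """Clip (impute=True) or blank out (impute=False) values outside the bounds."""
    if not values:
        return []
    lower = percentile(values, left_thresh)
    upper = percentile(values, thresh)
    result: List[Optional[float]] = []
    for v in values:
        if lower <= v <= upper:
            result.append(v)
        elif impute:
            result.append(min(max(v, lower), upper))
        else:
            result.append(None)
    return result


readings = [1, 2, 3, 4, 100]
print(convert_diag("4280", 9, "convert_group"))
print(handle_outliers(readings, thresh=75, left_thresh=0, impute=True))
print(handle_outliers(readings, thresh=75, left_thresh=0, impute=False))
```

With a real chart events table the bounds would normally be computed per measurement type rather than over all values at once; that detail is omitted here.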

tempor.datasources.mivdp.preproc.features.feature_preproc_icu.generate_summary_icu(cohort_output: str, root_dir: str, diag_flag: bool, proc_flag: bool, med_flag: bool, out_flag: bool, chart_flag: bool) → tuple[DataFrame | None, DataFrame | None, DataFrame | None, DataFrame | None, DataFrame | None]

Generates summary of features.

Parameters:
cohort_output : str

Cohort output file name.

root_dir : str

Root directory of the MIMIC-IV dataset.

diag_flag : bool

Whether to generate summary of diagnosis data.

proc_flag : bool

Whether to generate summary of procedures data.

med_flag : bool

Whether to generate summary of medications data.

out_flag : bool

Whether to generate summary of output events data.

chart_flag : bool

Whether to generate summary of chart events data.

Returns:

Output dataframes in the order summary_diag, summary_med, summary_proc, summary_out, summary_chart. Per the return annotation, each entry may be None, depending on the flags.

Return type:

OutDfs
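The flag parameters appear in a different positional order in feature_icu (diag, out, chart, proc, med) than in generate_summary_icu (diag, proc, med, out, chart), so passing them by keyword is the safer habit. The sketch below shows this with stand-in functions that only copy the documented parameter names and order; `make_flags`, `fake_feature_icu`, and `fake_generate_summary_icu` are hypothetical and do no real processing, and the cohort name and root path are placeholders.

```python
from typing import Dict, Tuple


def make_flags(diag: bool = True, proc: bool = True, med: bool = True,
               out: bool = True, chart: bool = True) -> Dict[str, bool]:
    """Build the five *_flag keyword arguments shared by the documented functions."""
    return {
        "diag_flag": diag,
        "proc_flag": proc,
        "med_flag": med,
        "out_flag": out,
        "chart_flag": chart,
    }


# Stand-in mirroring the documented signature of feature_icu (not the real function).
def fake_feature_icu(cohort_output, root_dir, version, diag_flag=True, out_flag=True,
                     chart_flag=True, proc_flag=True, med_flag=True) -> Tuple[bool, ...]:
    return (diag_flag, out_flag, chart_flag, proc_flag, med_flag)


# Stand-in mirroring the documented signature of generate_summary_icu.
def fake_generate_summary_icu(cohort_output, root_dir, diag_flag, proc_flag, med_flag,
                              out_flag, chart_flag) -> Tuple[bool, ...]:
    return (diag_flag, proc_flag, med_flag, out_flag, chart_flag)


flags = make_flags(med=False)  # skip medications everywhere

# Keyword unpacking routes each flag to the right parameter in both functions,
# regardless of their differing positional order.
extracted = fake_feature_icu("cohort", "/path/to/mimic", "v1_0", **flags)
summarized = fake_generate_summary_icu("cohort", "/path/to/mimic", **flags)
```

Keeping a single flags dictionary also ensures that extraction and summary generation agree on which feature types are enabled.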

tempor.datasources.mivdp.preproc.features.feature_preproc_icu.features_selection_icu(cohort_output: str, root_dir: str, diag_flag: bool, proc_flag: bool, med_flag: bool, out_flag: bool, chart_flag: bool, select_diag: bool, select_med: bool, select_proc: bool, select_out: bool, select_chart: bool)

Selects features based on the summary.

This currently requires that the user manually edit the summary files (<root_dir>/data/summary/{diag,proc,med,out,chart}_features.csv) to select the features.

Parameters:
cohort_output : str

Cohort output file name.

root_dir : str

Root directory of the MIMIC-IV dataset.

diag_flag : bool

Whether to select diagnosis data.

proc_flag : bool

Whether to select procedures data.

med_flag : bool

Whether to select medications data.

out_flag : bool

Whether to select output events data.

chart_flag : bool

Whether to select chart events data.

select_diag : bool

Whether to select diagnosis data based on the summary.

select_med : bool

Whether to select medications data based on the summary.

select_proc : bool

Whether to select procedures data based on the summary.

select_out : bool

Whether to select output events data based on the summary.

select_chart : bool

Whether to select chart events data based on the summary.

Returns:

Output dataframes diag, out, chart, proc, med, depending on the flags.

Return type:

OutDfs
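Because feature selection relies on manually edited summary files, it can help to list their locations before calling features_selection_icu. The sketch below builds the five paths from the pattern given above, <root_dir>/data/summary/{diag,proc,med,out,chart}_features.csv. `summary_feature_files` and `missing_summary_files` are hypothetical helpers, not part of TemporAI, and the root directory shown is a placeholder.

```python
from pathlib import Path
from typing import Dict, List

# Feature kinds as they appear in the documented file name pattern.
FEATURE_KINDS = ("diag", "proc", "med", "out", "chart")


def summary_feature_files(root_dir: str) -> Dict[str, Path]:
    """Return the expected summary CSV path for each feature kind."""
    summary_dir = Path(root_dir) / "data" / "summary"
    return {kind: summary_dir / f"{kind}_features.csv" for kind in FEATURE_KINDS}


def missing_summary_files(root_dir: str) -> List[Path]:
    """List expected summary files that do not exist on disk yet."""
    return [path for path in summary_feature_files(root_dir).values() if not path.exists()]


files = summary_feature_files("/path/to/mimic")
for kind, path in files.items():
    print(f"{kind}: {path}")
```

A check such as `missing_summary_files(root_dir)` before selection gives an early, readable error when a summary has not been generated for an enabled feature type.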