tempor.datasources.mivdp.preproc.features.feature_preproc_icu module
ICU feature preprocessing module.
Based on:
https://github.com/healthylaife/MIMIC-IV-Data-Pipeline
preprocessing/hosp_module_preproc/feature_selection_icu.py
- tempor.datasources.mivdp.preproc.features.feature_preproc_icu.feature_icu(cohort_output: str, root_dir: str, version: str, diag_flag: bool = True, out_flag: bool = True, chart_flag: bool = True, proc_flag: bool = True, med_flag: bool = True) -> tuple[DataFrame | None, DataFrame | None, DataFrame | None, DataFrame | None, DataFrame | None]
Extracts features from ICU data.
- Parameters:
- cohort_output : str
Cohort output file name.
- root_dir : str
Root directory of the MIMIC-IV dataset.
- version : str
MIMIC-IV version string, e.g. "v1_0".
- diag_flag : bool, optional
Whether to extract diagnosis data. Defaults to True.
- out_flag : bool, optional
Whether to extract output events data. Defaults to True.
- chart_flag : bool, optional
Whether to extract chart events data. Defaults to True.
- proc_flag : bool, optional
Whether to extract procedures data. Defaults to True.
- med_flag : bool, optional
Whether to extract medications data. Defaults to True.
- Returns:
Output dataframes diag, out, chart, proc, med, depending on the flags.
- Return type:
OutDfs
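The returned tuple is positional, so it is easy to mix up which element is which. The following is a minimal sketch of one way to label the results, assuming that an element is None when its flag was False (the signature's `DataFrame | None` types suggest this, but the page does not state it outright). The helper `label_outputs` and the stand-in values are hypothetical and are not part of tempor.

```python
from typing import Dict, Optional, Sequence

# Order of the returned tuple as documented above: diag, out, chart, proc, med.
OUTPUT_NAMES = ("diag", "out", "chart", "proc", "med")


def label_outputs(outputs: Sequence[Optional[object]]) -> Dict[str, object]:
    """Map each non-None element of the returned tuple to its documented name.

    Hypothetical helper for illustration only; it is not part of tempor.
    """
    if len(outputs) != len(OUTPUT_NAMES):
        raise ValueError(
            f"expected {len(OUTPUT_NAMES)} outputs, got {len(outputs)}"
        )
    return {
        name: value
        for name, value in zip(OUTPUT_NAMES, outputs)
        if value is not None
    }


# Stand-in values: in real use this tuple would come from feature_icu(...),
# and the non-None entries would be pandas DataFrames.
example = ("diag-frame", None, "chart-frame", None, None)
labelled = label_outputs(example)
print(sorted(labelled))
```

Keeping the name order in one constant means the labelling only has to change in one place if the documented order changes.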
- tempor.datasources.mivdp.preproc.features.feature_preproc_icu.preprocess_features_icu(cohort_output: str, root_dir: str, diag_flag: bool, group_diag: Literal["both"] | Literal["convert"] | Literal["convert_group"], chart_flag: bool, clean_chart: bool, impute_outlier_chart: bool, thresh: int, left_thresh: int) -> tuple[DataFrame | None, DataFrame | None]
Performs grouping on diagnosis data and/or outlier removal and imputation on chart events data.
- Parameters:
- cohort_output : str
Cohort output file name.
- root_dir : str
Root directory of the MIMIC-IV dataset.
- diag_flag : bool
Whether to process diagnosis data.
- group_diag : GroupOption
Grouping option for diagnosis data.
"both": Keep both ICD-9 and ICD-10 codes.
"convert": Convert ICD-9 to ICD-10 codes.
"convert_group": Convert ICD-9 to ICD-10 and group ICD-10 codes.
Only applicable if diag_flag is True.
- chart_flag : bool
Whether to process chart events data.
- clean_chart : bool
Whether to clean chart events data. Only applicable if chart_flag is True.
- impute_outlier_chart : bool
Whether to impute outliers in chart events data. Only applicable if chart_flag is True.
- thresh : int
(Right/upper) threshold for outlier removal. Only applicable if chart_flag is True.
- left_thresh : int
(Left/lower) threshold for outlier removal. Only applicable if chart_flag is True.
- Returns:
Dataframes diag, chart, depending on the flags.
- Return type:
Tuple[Optional[pd.DataFrame], Optional[pd.DataFrame]]
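To make the roles of thresh, left_thresh and impute_outlier_chart concrete, here is a small standalone sketch of two-sided outlier handling. It rests on assumptions this page does not confirm: that both thresholds are percentiles (0 to 100), that imputing means clipping a value to the nearest threshold value, and that without imputation the outlying values are dropped. The function names and the nearest-rank percentile rule are illustrative choices, not the module's implementation.

```python
import math
from typing import List, Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """Nearest-rank percentile of ``values`` for 0 <= q <= 100."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # 1-based nearest rank, kept inside the valid range [1, len(ordered)].
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[min(rank, len(ordered)) - 1]


def handle_outliers(
    values: Sequence[float], thresh: float, left_thresh: float, impute: bool
) -> List[float]:
    """Clip (impute=True) or drop (impute=False) values outside the bounds.

    Assumption: ``thresh`` and ``left_thresh`` are upper and lower percentiles.
    """
    lower = percentile(values, left_thresh)
    upper = percentile(values, thresh)
    if impute:
        # Replace each outlier with the bound it crossed.
        return [min(max(v, lower), upper) for v in values]
    # Otherwise keep only the values that lie within the bounds.
    return [v for v in values if lower <= v <= upper]


readings = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1000]
print(handle_outliers(readings, thresh=90, left_thresh=20, impute=True))
print(handle_outliers(readings, thresh=90, left_thresh=20, impute=False))
```

Under these assumptions, clipping keeps the number of observations unchanged, while dropping shortens the series, which matters if later steps expect aligned time points.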
- tempor.datasources.mivdp.preproc.features.feature_preproc_icu.generate_summary_icu(cohort_output: str, root_dir: str, diag_flag: bool, proc_flag: bool, med_flag: bool, out_flag: bool, chart_flag: bool) -> tuple[DataFrame | None, DataFrame | None, DataFrame | None, DataFrame | None, DataFrame | None]
Generates summary of features.
- Parameters:
- cohort_output : str
Cohort output file name.
- root_dir : str
Root directory of the MIMIC-IV dataset.
- diag_flag : bool
Whether to generate summary of diagnosis data.
- proc_flag : bool
Whether to generate summary of procedures data.
- med_flag : bool
Whether to generate summary of medications data.
- out_flag : bool
Whether to generate summary of output events data.
- chart_flag : bool
Whether to generate summary of chart events data.
- Returns:
Output dataframes summary_diag, summary_med, summary_proc, summary_out, summary_chart, depending on the flags.
- Return type:
OutDfs
- tempor.datasources.mivdp.preproc.features.feature_preproc_icu.features_selection_icu(cohort_output: str, root_dir: str, diag_flag: bool, proc_flag: bool, med_flag: bool, out_flag: bool, chart_flag: bool, select_diag: bool, select_med: bool, select_proc: bool, select_out: bool, select_chart: bool)
Selects features based on the summary.
This currently requires that the user manually edit the summary files (<root_dir>/data/summary/{diag,proc,med,out,chart}_features.csv) to select the features.
- Parameters:
- cohort_output : str
Cohort output file name.
- root_dir : str
Root directory of the MIMIC-IV dataset.
- diag_flag : bool
Whether to select diagnosis data.
- proc_flag : bool
Whether to select procedures data.
- med_flag : bool
Whether to select medications data.
- out_flag : bool
Whether to select output events data.
- chart_flag : bool
Whether to select chart events data.
- select_diag : bool
Whether to select diagnosis data based on the summary.
- select_med : bool
Whether to select medications data based on the summary.
- select_proc : bool
Whether to select procedures data based on the summary.
- select_out : bool
Whether to select output events data based on the summary.
- select_chart : bool
Whether to select chart events data based on the summary.
- Returns:
Output dataframes diag, out, chart, proc, med, depending on the flags.
- Return type:
OutDfs
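The selection step depends on the manually edited summary files, so a toy version of the idea may help. This sketch assumes that editing a summary file means deleting the rows for unwanted features, and that selection then keeps only the records whose identifier still appears in the file. The column name `itemid`, the identifiers, and both helper functions are hypothetical. The real summary files may use different columns, and the real function operates on the pipeline's data files.

```python
import csv
import io
from typing import Dict, List, Set


def read_selected_ids(summary_csv: str, id_column: str) -> Set[str]:
    """Collect the feature identifiers that remain in an edited summary file."""
    reader = csv.DictReader(io.StringIO(summary_csv))
    return {row[id_column] for row in reader}


def select_rows(
    rows: List[Dict[str, str]], selected_ids: Set[str], id_column: str
) -> List[Dict[str, str]]:
    """Keep only the records whose identifier was left in the summary."""
    return [row for row in rows if row[id_column] in selected_ids]


# Hypothetical summary content after the user removed the row for feature 102.
edited_summary = "itemid,label\n101,feature_a\n103,feature_c\n"

# Hypothetical event records standing in for one of the feature tables.
events = [
    {"itemid": "101", "value": "80"},
    {"itemid": "102", "value": "37"},
    {"itemid": "103", "value": "18"},
]

kept = select_rows(events, read_selected_ids(edited_summary, "itemid"), "itemid")
print([row["itemid"] for row in kept])
```

Reading the identifiers into a set keeps the membership check cheap even when the event table is large.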