Neural Language Modeling of Unstructured Clinical Notes for Automated Patient Phenotyping

dc.contributor.authorPrabhakar, A.
dc.contributor.authorShidharth, S.
dc.contributor.authorKamath S., S.
dc.date.accessioned2026-02-06T06:35:40Z
dc.date.issued2022
dc.description.abstractThe availability of a large volume and variety of healthcare data provides wide scope for designing cutting-edge clinical decision support systems (CDSS) that can improve the quality of patient care. Identifying patients suffering from certain conditions or symptoms, commonly referred to as phenotyping, is a fundamental problem that can be addressed using the rich health-related data collected for Electronic Health Records (EHRs). Phenotyping forms the foundation for translational research and effectiveness studies, and is used for analyzing population health using regularly collected EHR data. Also, determining whether a patient has a particular medical condition is crucial for secondary analysis, such as predicting potential drug interactions and adverse events in critical care situations. In this paper, we consider all categories of unstructured clinical notes of patients, typically stored in raw form as part of EHRs. The standard MIMIC-III dataset is used for benchmark experiments on patient phenotyping. Experiments revealed that our proposed models outperformed state-of-the-art works built on vanilla BERT and ClinicalBERT models on the patient cohort considered, measured in terms of standard multi-label classification metrics: AUROC score (improvement of 6%), F1-score (4%), and Hamming loss (17%), when only patient discharge summaries and radiology notes were considered. Further experiments with other note categories showed that using discharge summaries and physician notes yields significant improvements on the entire dataset, giving an AUROC score of 0.8, an F1 score of 0.72, and a Hamming loss of 0.09. © 2022 IEEE.
dc.identifier.citation2022 56th Annual Conference on Information Sciences and Systems, CISS 2022, 2022, p. 142-147
dc.identifier.urihttps://doi.org/10.1109/CISS53076.2022.9751198
dc.identifier.urihttps://idr.nitk.ac.in/handle/123456789/29997
dc.publisherInstitute of Electrical and Electronics Engineers Inc.
dc.subjectclinical decision support systems
dc.subjecthealthcare analytics
dc.subjectpatient phenotyping
dc.subjectunstructured text modeling
dc.titleNeural Language Modeling of Unstructured Clinical Notes for Automated Patient Phenotyping
