Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506

Now showing 1 - 3 of 3
  • Item
    Integrating Structured and Unstructured Patient Data for ICD9 Disease Code Group Prediction
    (Association for Computing Machinery, 2020) Prabhakar, P.; Shidharth, S.; S. Krishnan, G.S.; Kamath S․, S.
    The large-scale availability of healthcare data provides significant opportunities for the development of advanced Clinical Decision Support Systems that can enhance patient care. One such essential application is automated ICD-9 diagnosis group prediction, which is useful for a variety of healthcare delivery tasks, including documentation, billing, and insurance claims. Past attempts considered patients' multivariate lab events data and clinical text notes independently. To the best of our knowledge, ours is the first attempt to investigate the efficacy of integrating both of these aspects for this task. Experiments on the MIMIC-III dataset showed promising results. © 2021 Owner/Author.
  • Item
    Diagnostic Code Group Prediction by Integrating Structured and Unstructured Clinical Data
    (Springer Science and Business Media Deutschland GmbH, 2021) Prabhakar, A.; Shidharth, S.; S. Krishnan, G.S.; Kamath S․, S.
    Diagnostic coding is a process by which written, verbal, and other patient-case-related documentation is used to enable disease prediction, accurate documentation, and insurance settlements. It remains a predominantly manual process even in countries that have successfully adopted Electronic Health Record (EHR) systems. The problem is exacerbated in developing countries, where widespread adoption of EHR systems is still not on par with Western counterparts. EHRs contain a wealth of patient information embedded in numerical, text, and image formats. A disease prediction model that exploits all this information, enabling more accurate and faster diagnosis, would be quite beneficial. We address this challenging task by proposing mixed ensemble models consisting of boosting and deep learning architectures for the task of diagnostic code group prediction. The models are trained on a dataset created by integrating features from structured (lab test reports) as well as unstructured (clinical text) data. We analyze the proposed model’s performance on MIMIC-III, an open dataset of clinical data, using standard multi-label metrics. Empirical evaluations underscored the significant performance of our approach for this task compared to state-of-the-art works that rely on a single data source. Our novelty lies in effectively integrating relevant information from both data sources, thereby ensuring larger ICD-9 code coverage, handling the inherent class imbalance, and adopting a novel approach to form the ensemble models. © 2021, Springer Nature Switzerland AG.
  • Item
    Neural Language Modeling of Unstructured Clinical Notes for Automated Patient Phenotyping
    (Institute of Electrical and Electronics Engineers Inc., 2022) Prabhakar, A.; Shidharth, S.; Kamath S․, S.
    The availability of a huge volume and variety of healthcare data provides wide scope for designing cutting-edge clinical decision support systems (CDSS) that can improve the quality of patient care. Identifying patients suffering from certain conditions/symptoms, commonly referred to as phenotyping, is a fundamental problem that can be addressed using the rich health-related data collected for the generation of Electronic Health Records (EHRs). Phenotyping forms the foundation for translational research and effectiveness studies, and is used for analyzing population health using regularly collected EHR data. Also, determining whether a patient has a particular medical condition is crucial for secondary analysis, such as predicting potential drug interactions and adverse events in critical care situations. In this paper, we consider all categories of unstructured clinical notes of patients, typically stored in raw form as part of EHRs. The standard MIMIC-III dataset is used for benchmark experiments on patient phenotyping. Experiments revealed that our proposed models outperformed state-of-the-art works built on vanilla BERT and ClinicalBERT models on the patient cohort considered, measured in terms of standard multi-label classification metrics such as AUROC score (improvement of 6%), F1-score (4%), and Hamming loss (17%), when we considered only patient discharge summaries and radiology notes. Further experiments with other note categories showed that using discharge summaries and physician notes yields significant improvements on the entire dataset, giving an AUROC score of 0.8, an F1 score of 0.72, and a Hamming loss of 0.09. © 2022 IEEE.