Faculty Publications
Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736
Publications by NITK Faculty
14 results
Search Results
Dynamic and temporal user profiling for personalized recommenders using heterogeneous data sources
(Institute of Electrical and Electronics Engineers Inc., 2017) S. Krishnan, G.S.; Kamath S., S.
In modern Web applications, the process of user profiling provides a way to capture user-specific information, which then serves as a source for designing personalized user experiences. Currently, such information about a particular user is available from multiple online sources and services, like social media applications, professional/social networking sites, location-based service providers or even simple Web pages. Because this data is truly heterogeneous, high in volume and highly dynamic over time, collecting these data artifacts from disparate sources to enable complete user profiling can be challenging. In this paper, we present an approach to dynamically build a structured user profile that emphasizes the temporal nature of the data to capture dynamic user behavior. The user profile is compiled from multiple, heterogeneous data sources which capture dynamic user actions over time, to capture changing preferences accurately. Natural language processing techniques, machine learning and concepts of the Semantic Web were used for capturing relevant user data and implementing the proposed '3D User Profile'. Our technique also supports the representation of the generated user profiles as structured data so that other personalized recommendation systems and Semantic Web/Linked Open Data applications can consume them for providing intelligent, personalized services. © 2017 IEEE.

A supervised learning approach for ICU mortality prediction based on unstructured electrocardiogram text reports
(Springer Verlag, 2018) S. Krishnan, G.S.; Kamath S., S.
Extracting patient data documented in text-based clinical records into a structured form is a predominantly manual process, both time and cost-intensive.
Moreover, structured patient records often fail to effectively capture the nuances of patient-specific observations noted in doctors’ unstructured clinical notes and diagnostic reports. Automated techniques that utilize such unstructured text reports for modeling useful clinical information for supporting predictive analytics applications can thus be highly beneficial. In this paper, we propose a neural network based method for predicting mortality risk of ICU patients using unstructured Electrocardiogram (ECG) text reports. Word2Vec word embedding models were adopted for vectorizing and modeling textual features extracted from the patients’ reports. An unsupervised data cleansing technique for identification and removal of anomalous data/special cases was designed for optimizing the patient data representation. Further, a neural network model based on Extreme Learning Machine architecture was proposed for mortality prediction. ECG text reports available in the MIMIC-III dataset were used for experimental validation. The proposed model, when benchmarked against four standard ICU severity scoring methods, outperformed all of them by 10–13% in terms of prediction accuracy. © 2018, Springer International Publishing AG, part of Springer Nature.

A Supervised Approach for Patient-Specific ICU Mortality Prediction Using Feature Modeling
(Springer Verlag, 2019) S. Krishnan, G.S.; Kamath S., S.K.
Intensive Care Units (ICUs) are one of the most essential but expensive healthcare services provided in hospitals. Modern monitoring machines in critical care units continuously generate huge amounts of data, which can be used for intelligent decision-making. Prediction of mortality risk of patients is one such predictive analytics application, which can assist hospitals and healthcare personnel in making informed decisions. Traditional scoring systems currently in use are parametric scoring methods which often suffer from low accuracy.
This paper presents an empirical study on the effect of feature selection on the feature set of traditional scoring methods, for modeling an optimal feature set to represent each patient’s profile, along with a supervised learning approach for ICU mortality prediction. Experimental evaluation of the proposed approach in comparison to standard severity scores like SAPS-II, SOFA and OASIS showed that the proposed model outperformed them by a margin of 12–16% in terms of prediction accuracy. © 2019, Springer Nature Singapore Pte Ltd.

TAGS: Towards Automated Classification of Unstructured Clinical Nursing Notes
(Springer Verlag, 2019) Gangavarapu, T.; Jayasimha, A.; S. Krishnan, G.S.; Kamath S., S.K.
Accurate risk management and disease prediction are vital in intensive care units to channel prompt care to patients in critical conditions and aid medical personnel in effective decision making. Clinical nursing notes document subjective assessments and crucial information of a patient’s state, which is mostly lost when transcribed into Electronic Medical Records (EMRs). The Clinical Decision Support Systems (CDSSs) in the existing body of literature are heavily dependent on the structured nature of EMRs. Moreover, works which aim at benchmarking deep learning models are limited. In this paper, we aim at leveraging the underutilized treasure-trove of patient-specific information present in the unstructured clinical nursing notes towards the development of CDSSs. We present a fuzzy token-based similarity approach to aggregate the voluminous clinical documentation of a patient. To structure the free text in the unstructured notes, vector space and coherence-based topic modeling approaches that capture the syntactic and latent semantic information are presented. Furthermore, we utilize the predictive capabilities of deep neural architectures for disease prediction as ICD-9 code groups.
Experimental validation revealed that the proposed Term weighting of nursing notes AGgregated using Similarity (TAGS) model outperformed the state-of-the-art model by 5% in AUPRC and 1.55% in AUROC. © 2019, Springer Nature Switzerland AG.

Hybrid text feature modeling for disease group prediction using unstructured physician notes
(Springer Science and Business Media Deutschland GmbH, 2020) S. Krishnan, G.S.; Kamath S., S.
Existing Clinical Decision Support Systems (CDSSs) largely depend on the availability of structured patient data and Electronic Health Records (EHRs) to aid caregivers. However, in hospitals in developing countries, structured patient data formats are not widely adopted, and medical professionals still rely on clinical notes in the form of unstructured text. Such unstructured clinical notes recorded by medical personnel can also be a potential source of rich patient-specific information which can be leveraged to build CDSSs, even for hospitals in developing countries. If such unstructured clinical text can be used, the manual and time-consuming process of EHR generation will no longer be required, yielding large savings in person-hours and cost. In this article, we propose a generic ICD9 disease group prediction CDSS built on unstructured physician notes modeled using hybrid word embeddings. These word embeddings are used to train a deep neural network for effectively predicting ICD9 disease groups. Experimental evaluation showed that the proposed approach outperformed the state-of-the-art disease group prediction model built on structured EHRs by 15% in terms of AUROC and 40% in terms of AUPRC, thus proving our hypothesis and eliminating dependency on availability of structured patient data. © Springer Nature Switzerland AG 2020.

Deep neural learning for automated diagnostic code group prediction using unstructured nursing notes
(Association for Computing Machinery, 2020) Jayasimha, A.; Gangavarapu, T.; Kamath S., S.; S. Krishnan, G.S.
Disease prediction, a central problem in clinical care and management, has gained much significance over the last decade. Nursing notes documented by caregivers contain valuable information concerning a patient's state, which can aid in the development of intelligent clinical prediction systems. Moreover, due to the limited adoption of structured electronic health records in developing countries, the need for disease prediction from such clinical text has garnered substantial interest from the research community. The availability of large, publicly available databases such as MIMIC-III, and advancements in machine and deep learning models with high predictive capabilities have further facilitated research in this direction. In this work, we model the latent knowledge embedded in the unstructured clinical nursing notes, to address the clinical task of disease prediction as a multi-label classification of ICD-9 code groups. We present EnTAGS, which facilitates aggregation of the data in the clinical nursing notes of a patient, by modeling them independent of one another. To handle the sparsity and high dimensionality of clinical nursing notes effectively, our proposed EnTAGS is built on the topics extracted using Non-negative matrix factorization. Furthermore, we explore the applicability of deep learning models for the clinical task of disease prediction, and assess the reliability of the proposed models using standard evaluation metrics. Our experimental evaluation revealed that the proposed approach consistently exceeded the state-of-the-art prediction model by 1.87% in accuracy, 12.68% in AUPRC, and 11.64% in MCC score. © 2020 Association for Computing Machinery.

Integrating Structured and Unstructured Patient Data for ICD9 Disease Code Group Prediction
(Association for Computing Machinery, 2020) Prabhakar, P.; Shidharth, S.; S. Krishnan, G.S.; Kamath S., S.
The large-scale availability of healthcare data provides significant opportunities for development of advanced Clinical Decision Support Systems that can enhance patient care. One such essential application is automated ICD-9 diagnosis group prediction, useful for a variety of healthcare delivery related tasks including documenting, billing and insurance claims. Past attempts considered patients' multivariate lab events data and clinical text notes independently. To the best of our knowledge, ours is the first attempt to investigate the efficacy of integration of both these aspects for this task. Experiments on the MIMIC-III dataset showed promising results. © 2021 Owner/Author.

Analysis and Prediction of Fantasy Cricket Contest Winners Using Machine Learning Techniques
(Springer Science and Business Media Deutschland GmbH, 2021) Karthik, K.; S. Krishnan, G.S.; Shetty, S.; Bankapur, S.; Kolkar, R.; Ashwin, T.S.; Vanahalli, M.K.
Cricket is one of the best-known sports in the world. Growing interest in cricket in recent years has produced formats such as T20 and T10 alongside the Test and one-day formats. The popularity of all these formats has carried over into online fantasy cricket league games. Dream11 is one of the most popular apps in this context, along with many similar apps. Creating a dream team of 11 players from the playing elevens of both teams involves skill, ideas and luck. Predicting a winner among all the joined contestants based on historical data is a challenging task. In this paper, we used a feed-forward deep neural network (DNN) classifier for predicting the winning contestant for the top three positions in a fantasy league cricket contest.
The performance of the DNN approach was compared against that of state-of-the-art machine learning approaches like k-nearest neighbours (KNN), logistic regression (LR), Naive Bayes (NB), random forest (RF) and support vector machines (SVM) in predicting the fantasy cricket contest winners. Among the methods used, DNN showed the best results for all three positions, demonstrating its consistency in predicting the winners, and outperformed the state-of-the-art machine learning classifiers by 13%, 8% and 9% for the first, second and third winning positions, respectively. © 2021, The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Predicting Vaccine Hesitancy and Vaccine Sentiment Using Topic Modeling and Evolutionary Optimization
(Springer Science and Business Media Deutschland GmbH, 2021) S. Krishnan, G.S.; Kamath S., S.; Sugumaran, V.
The ongoing COVID-19 pandemic has posed serious threats to the world population, affecting over 219 countries with a staggering impact of over 162 million cases and 3.36 million casualties. With the availability of multiple vaccines across the globe, framing vaccination policies for effectively inoculating a country’s population against such diseases is currently a crucial task for public health agencies. Social network users post their views and opinions on vaccines publicly and these posts can be put to good use in identifying vaccine hesitancy. In this paper, a vaccine hesitancy identification approach is proposed, built on novel text feature modeling based on evolutionary computation and topic modeling. The proposed approach was experimentally validated on two standard tweet datasets – the flu vaccine dataset and UK COVID-19 vaccine tweets. On the first dataset, the proposed approach outperformed the state-of-the-art in terms of standard metrics.
The proposed model was also evaluated on the UKCOVID dataset and the results are presented in this paper, as our work is the first to benchmark a vaccine hesitancy model on this dataset. © 2021, Springer Nature Switzerland AG.

Diagnostic Code Group Prediction by Integrating Structured and Unstructured Clinical Data
(Springer Science and Business Media Deutschland GmbH, 2021) Prabhakar, A.; Shidharth, S.; S. Krishnan, G.S.; Kamath S., S.
Diagnostic coding is a process by which written, verbal and other patient-case related documentation is used for enabling disease prediction, accurate documentation, and insurance settlements. It is a prevalently manual process even in countries that have successfully adopted Electronic Health Record (EHR) systems. The problem is exacerbated in developing countries, where adoption of EHR systems is not yet on par with Western counterparts. EHRs contain a wealth of patient information embedded in numerical, text, and image formats. A disease prediction model that exploits all this information, enabling accurate and faster diagnosis, would be quite beneficial. We address this challenging task by proposing mixed ensemble models consisting of boosting and deep learning architectures for the task of diagnostic code group prediction. The models are trained on a dataset created by integrating features from structured (lab test reports) as well as unstructured (clinical text) data. We analyze the proposed model’s performance on MIMIC-III, an open dataset of clinical data, using standard multi-label metrics. Empirical evaluations underscored the significant performance of our approach for this task, compared to state-of-the-art works which rely on a single data source. Our novelty lies in effectively integrating relevant information from both data sources, thereby ensuring larger ICD-9 code coverage, handling the inherent class imbalance, and adopting a novel approach to form the ensemble models.
© 2021, Springer Nature Switzerland AG.
