Faculty Publications
Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736
Publications by NITK Faculty
3 results
Search Results
Item
Identifying Similar Questions in the Medical Domain Using a Fine-tuned Siamese-BERT Model
(Institute of Electrical and Electronics Engineers Inc., 2022) Merchant, A.; Shenoy, N.; Bharali, A.; Anand Kumar, A.M.
A large number of people search the web for information about their health-related problems. However, the number of sites where qualified and verified people answer such queries is quite low compared with the number of questions being posted. The rate at which queries are searched on such sites has further increased due to the COVID-19 pandemic. The main reason people find it difficult to find solutions to their queries is the ineffective identification of semantically similar questions in the medical domain. In most cases, answers to the queries people ask are already present, the only caveat being that the question may be present in a different form than the one asked by the particular user. In this research, we propose a Siamese-based BERT model to detect similar questions using a fine-tuning approach. The network is fine-tuned with medical question-answer pairs and then with question-question pairs to get a better question similarity prediction.
© 2022 IEEE.

Item
Effective Information Retrieval, Question Answering and Abstractive Summarization on Large-Scale Biomedical Document Corpora
(Springer Science and Business Media Deutschland GmbH, 2023) Shenoy, N.; Nayak, P.; Jain, S.; Kamath S․, S.; Sugumaran, V.
During the COVID-19 pandemic, a concentrated effort was made to collate published literature on SARS-CoV-2 and other coronaviruses for the benefit of the medical community. One such initiative is the COVID-19 Open Research Dataset, which contains over 400,000 published research articles. To expedite access to relevant information sources for health workers and researchers, it is vital to design effective information retrieval and information extraction systems. In this article, an IR approach leveraging transformer-based models to enable question answering and abstractive summarization is presented. Various keyword-based and neural-network-based models are experimented with and incorporated to reduce the search space and determine relevant sentences from the vast corpus for ranked retrieval. For abstractive summarization, candidate sentences are determined using a combination of various standard scoring metrics. Finally, the summary and the user query are utilized for supporting question answering. The proposed model is evaluated based on standard metrics on the standard CovidQA dataset for both natural language and keyword queries. The proposed approach achieved promising performance for both query classes, while outperforming various unsupervised baselines.
© 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Item
Ensemble neural models for ICD code prediction using unstructured and structured healthcare data
(Elsevier Ltd, 2024) Merchant, A.M.; Shenoy, N.; Lanka, S.; Kamath S․, S.
Disease coding is the process of assigning one or more standardized diagnostic codes to clinical notes that are maintained by health practitioners (e.g., clinicians) to track patient condition. Such a coding process is often expensive and error-prone, as human medical coders primarily perform it. Automating diagnostic coding using Artificial Intelligence is seen as an essential solution in Hospital Information Management Systems, and approaches built on Convolutional Neural Networks currently perform best. In this work, a neural model built on unstructured clinical text for enabling automatic diagnostic coding for given patient discharge summaries is proposed. We incorporate a structured self-attention mechanism designed to boost learning of label-specific vectors and the significant clinical text snippets associated with a certain label for this purpose. These vectors are then combined with a novel code description pipeline leveraging the descriptions provided for each standardized diagnostic code. The proposed model achieved the best performance in terms of standard metrics over the MIMIC-III dataset, outperforming models based on Longformers and knowledge graphs. Furthermore, to leverage structured clinical data to enhance the proposed model, and to enable improved diagnostic code prediction, model ensembling is considered. A neural model built on structured data by leveraging supervised machine learning algorithms such as random forest and boosting is designed for multi-class code classification. Experimental results revealed that the proposed ensemble models show promising performance compared to traditional models that rely solely on unstructured or structured clinical data, emphasizing their suitability for real-world deployment.
© 2024 The Author(s)
