Effective Information Retrieval, Question Answering and Abstractive Summarization on Large-Scale Biomedical Document Corpora

dc.contributor.author: Shenoy, N.
dc.contributor.author: Nayak, P.
dc.contributor.author: Jain, S.
dc.contributor.author: Kamath S․, S.
dc.contributor.author: Sugumaran, V.
dc.date.accessioned: 2026-02-06T06:34:50Z
dc.date.issued: 2023
dc.description.abstract: During the COVID-19 pandemic, a concentrated effort was made to collate published literature on SARS-CoV-2 and other coronaviruses for the benefit of the medical community. One such initiative is the COVID-19 Open Research Dataset, which contains over 400,000 published research articles. To expedite access to relevant information sources for health workers and researchers, it is vital to design effective information retrieval and information extraction systems. In this article, an IR approach leveraging transformer-based models to enable question-answering and abstractive summarization is presented. Various keyword-based and neural-network-based models are experimented with and incorporated to reduce the search space and determine relevant sentences from the vast corpus for ranked retrieval. For abstractive summarization, candidate sentences are determined using a combination of various standard scoring metrics. Finally, the summary and the user query are utilized for supporting question answering. The proposed model is evaluated based on standard metrics on the standard CovidQA dataset for both natural language and keyword queries. The proposed approach achieved promising performance for both query classes, while outperforming various unsupervised baselines. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
dc.identifier.citation: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2023, Vol. 13913 LNCS, p. 404-415
dc.identifier.issn: 0302-9743
dc.identifier.uri: https://doi.org/10.1007/978-3-031-35320-8_29
dc.identifier.uri: https://idr.nitk.ac.in/handle/123456789/29500
dc.publisher: Springer Science and Business Media Deutschland GmbH
dc.subject: Abstractive Summarization
dc.subject: Information retrieval
dc.subject: PageRank
dc.subject: Question-answering
dc.subject: Transformer models
dc.title: Effective Information Retrieval, Question Answering and Abstractive Summarization on Large-Scale Biomedical Document Corpora
