Gaining Actionable Insights in COVID-19 Dataset Using Word Embeddings

dc.contributor.author: Jha, R.A.
dc.contributor.author: Ananthanarayana, V.S.
dc.date.accessioned: 2026-02-08T16:50:13Z
dc.date.issued: 2022
dc.description.abstract: Unsupervised natural language processing (NLP) is growing in prominence because of the overwhelming amount of scientific and medical data available as text, such as published journals and papers. Several techniques are used to extract information from these texts. In this paper, we use the COVID-19 corpus (https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge), related to the coronavirus SARS-CoV-2, to extract information that can be invaluable in finding a cure for the disease. We use two word-embedding models, Word2Vec and Global Vectors for Word Representation (GloVe), to efficiently encode the information available in the corpus. We then follow some simple steps to find possible cures for the disease. Both word-embedding models gave useful results, and we observed that the Word2Vec model performed better than the GloVe model on this dataset. This work also highlights that latent information about potential future discoveries is, to a significant extent, contained in past papers and publications. © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
dc.identifier.citation: Lecture Notes in Electrical Engineering, 2022, Vol. 888, p. 459-466
dc.identifier.isbn: 9789819680023
dc.identifier.isbn: 9789819542734
dc.identifier.isbn: 9789819540440
dc.identifier.isbn: 9789819658473
dc.identifier.isbn: 9789819600571
dc.identifier.isbn: 9783032147417
dc.identifier.isbn: 9789819540488
dc.identifier.isbn: 9789819644292
dc.identifier.isbn: 9789819637577
dc.identifier.isbn: 9789819663392
dc.identifier.issn: 18761100
dc.identifier.uri: https://doi.org/10.1007/s12008-025-02321-7
dc.identifier.uri: https://idr.nitk.ac.in/handle/123456789/33697
dc.publisher: Springer Science and Business Media Deutschland GmbH
dc.subject: COVID-19
dc.subject: Embeddings
dc.subject: GloVe
dc.subject: NLP
dc.subject: Word2Vec
dc.title: Gaining Actionable Insights in COVID-19 Dataset Using Word Embeddings
