Faculty Publications
Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736
Publications by NITK Faculty
14 results
Search Results
Item An unified approach for multimedia document representation and document similarity (Institute of Electrical and Electronics Engineers Inc., 2015) Pushpalatha, K.; Ananthanarayana, V.S.
In recent years, the evolution of multimedia technology has accelerated the growth of multimedia data. Even though multimedia data are heterogeneous, the rich information they carry has created high demand for sophisticated multimedia knowledge discovery systems. To mine knowledge from a multimedia document, each type of multimedia data has to undergo its own processing and knowledge discovery steps because of its distinct nature. However, processing and analyzing each type of multimedia data separately may make the system more complicated for large databases. Alternatively, it is more advantageous if heterogeneous objects are represented in a common domain, so that the same processing and knowledge discovery methods can be used. Motivated by this concept, a method known as the domain converter is proposed to represent heterogeneous multimedia objects in a spatial domain. In addition, a similarity measure based on information theory is proposed to find the similarity between documents. To evaluate the proposed framework, experiments have been conducted on the retrieval of multimedia documents. The proposed domain converter represents the multimedia document in a homogeneous domain, and with the proposed similarity measure, a better document retrieval rate has been achieved. © 2014 IEEE.

Item Intelligent Data Mining for Collaborative Information Seeking (Springer Nature, 2020) Kumar, A.; Chandrasekaran, K.; Shukla, A.; Usha, D.
The World Wide Web (WWW) contains many kinds of information: social, educational, historical, sports, news, financial, weather, technology, politics, etc. Most people spend time on the internet to access data for information-seeking purposes.
Information on the web is available in different formats, such as text, image or video, and can be accessed through different access interfaces. Accessing information from a place as large as the World Wide Web through so many websites is a very cumbersome process. Therefore, in this paper, we present a new method that produces information based on user input with appropriate keywords. The data are retrieved from the internet using a data mining approach without the need for rules or training on pages. The main focus is to extract or retrieve data about a person, such as educational qualifications, gender, contact information, contributions to their work, their social nature, etc. The query to be searched on the platform or model should have meaningful keywords attached to it that best describe the person, or else data about a different person might be fetched. © 2020, Springer Nature Switzerland AG.

Item Improving convergence in IRGAN with PPO (Association for Computing Machinery, 2020) Jain, M.; Kamath S․, S.
Information retrieval modeling aims to optimise generative and discriminative retrieval strategies, where generative retrieval focuses on predicting query-specific relevant documents and discriminative retrieval tries to predict relevancy given a query-document pair. IRGAN unifies the generative and discriminative retrieval approaches through a minimax game. However, training IRGAN is unstable and varies largely with the random initialization of parameters. In this work, we propose improvements to IRGAN training through a novel optimization objective based on proximal policy optimisation and Gumbel-softmax based sampling for the generator, along with a modified training algorithm which performs the gradient update on both models simultaneously in each training iteration.
We benchmark our proposed approach against IRGAN on three different information retrieval tasks and present empirical evidence of improved convergence. © 2020 Copyright held by the owner/author(s). Publication rights licensed to ACM.

Item A Probabilistic Precision Information Retrieval Model for Personalized Clinical Trial Recommendation based on Heterogeneous Data (Institute of Electrical and Electronics Engineers Inc., 2021) Kamath S․, S.; Veena Mayya; Priyadarshini, R.
In modern healthcare practices, diagnosis and treatment for certain complex illnesses require specific information on the patients' background, genealogy, heredity, demographic data, etc. Even with a similar diagnosis, treatments may need to be designed specifically to adapt well to the patients' genetic, cultural, and lifestyle aspects. Precision medicine mainly deals with enabling personalized care based on a given patient's conditions in a scientifically rigorous way. Because this entails recommending personalized therapies to patients and has the potential to affect the health of other people, the performance of a designed system must be accurate and exact. In this paper, a precision information retrieval system is proposed that leverages structured and unstructured data to retrieve relevant knowledge for enabling personalized recommendations. The proposed pipeline is validated with the clinical trial dataset of the Precision Medicine track of TREC 2017. A set of relevant ranked clinical trials for a given condition/disease that could not be cured using any of the traditional treatments suggested is retrieved using structured and unstructured patient data. We employ multiple IR techniques like Best Match 25, query reformulation and re-ranking facilitated through deep neural networks, focusing on extracting highly accurate and relevant trials.
The proposed pipeline achieved a Normalized Discounted Cumulative Gain (NDCG) score of 0.58 for ranking the relevant clinical trials, outperforming the state-of-the-art approaches. © 2021 IEEE.

Item Sketch-Based Image Retrieval Using Convolutional Neural Networks Based on Feature Adaptation and Relevance Feedback (Springer Science and Business Media Deutschland GmbH, 2022) Kumar, N.; Ahmed, R.; B Honnakasturi, V.; Kamath S․, S.; Mayya, V.
Sketch-based Image Retrieval (SBIR) is an approach where natural images are retrieved according to a given input sketch query. SBIR has many applications, for example, searching for a product in digital catalogs given a sketch pattern, or searching for missing people in a digital photo repository given their prominent features. The main challenge in implementing such a system is the absence of semantic information in the sketch query. In this work, we propose a combination of image preprocessing and deep learning-based methods to tackle this issue. A binary image highlighting the edges in the natural image is obtained using the Canny edge detection algorithm. The deep features are extracted by an ImageNet-based CNN model. Cosine similarity and Euclidean distance measures are adopted to generate the rank list of candidate natural images. Relevance feedback using Rocchio’s method is used to adapt the sketch query and the feature weights according to relevant and non-relevant images. During the experimental evaluation, the proposed approach achieved a mean average precision (MAP) of 71.84%, a promising result for retrieving relevant images for input sketch queries.
© 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Item Effective Information Retrieval, Question Answering and Abstractive Summarization on Large-Scale Biomedical Document Corpora (Springer Science and Business Media Deutschland GmbH, 2023) Shenoy, N.; Nayak, P.; Jain, S.; Kamath S․, S.; Sugumaran, V.
During the COVID-19 pandemic, a concentrated effort was made to collate published literature on SARS-CoV-2 and other coronaviruses for the benefit of the medical community. One such initiative is the COVID-19 Open Research Dataset, which contains over 400,000 published research articles. To expedite access to relevant information sources for health workers and researchers, it is vital to design effective information retrieval and information extraction systems. In this article, an IR approach leveraging transformer-based models to enable question answering and abstractive summarization is presented. Various keyword-based and neural-network-based models are experimented with and incorporated to reduce the search space and determine relevant sentences from the vast corpus for ranked retrieval. For abstractive summarization, candidate sentences are determined using a combination of various standard scoring metrics. Finally, the summary and the user query are utilized for supporting question answering. The proposed model is evaluated using standard metrics on the standard CovidQA dataset for both natural language and keyword queries. The proposed approach achieved promising performance for both query classes, while outperforming various unsupervised baselines.
© 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Item Transformer and Knowledge Based Siamese Models for Medical Document Retrieval (Institute of Electrical and Electronics Engineers Inc., 2023) Dash, A.; Merchant, A.M.; Chintawar, S.; Kamath S․, S.
Vocabulary mismatch is a significant issue in query-based document retrieval in the medical field. Since the documents are typically authored by professionals, they may contain many specialized terms that are not widely understood or used. Traditional information retrieval (IR) models like vector space and best match-based models fail in this regard. Neural Learning to Rank (NLtR) and transformer models have attracted significant research attention in the field of IR. Recent works in the medical field utilize medical knowledge bases (KB) that map words to concepts and aid in connecting several words to the same concept. In this paper, we present various Siamese-structured transformer and knowledge-based retrieval models designed to address the retrieval issues in the medical domain. The experimental evaluation highlighted the superior performance of the proposed retrieval models, and the best one, based on the UMLSBert ENG transformer, achieved best-in-class performance with respect to all evaluation metrics. © 2023 IEEE.

Item Semantic similarity based context-aware web service discovery using NLP techniques (Rinton Press Inc., 2016) Kamath S․, S.S.; Ananthanarayana, V.S.
Due to the high availability and distributed nature of published web services on the Web, efficient discovery and retrieval of relevant services that meet user requirements can be a challenging task. In this paper, we present a semantics-based web service retrieval framework that uses natural language processing techniques to extract a service’s functional information.
The extracted information is used to compute the similarity between any given service pair, to generate additional metadata for each service, and to classify the services based on their functional similarity. The framework also adds natural language querying capabilities that support exact and approximate matching of relevant services to a given user query. We present experimental results showing that the semantic analysis and automatic tagging effectively captured the inherent functional details of a service and also the similarity between different services. In addition, a significant improvement in precision and recall over simple keyword matching search was observed during Web service retrieval when using the natural language querying interface provided by the proposed framework. © Rinton Press.

Item Feature pattern based representation of multimedia documents for efficient knowledge discovery (Springer New York LLC, 2016) Pushpalatha, K.; Ananthanarayana, V.S.
The rapid growth of multimedia documents has raised huge demand for sophisticated multimedia knowledge discovery systems. Knowledge extraction from the documents mainly relies on the data representation model and the document representation model. As a multimedia document comprises multimodal multimedia objects, the data representation depends on the modality of the objects. The multimodal objects require distinct processing and feature extraction methods, resulting in different features with different dimensionalities. Managing multiple types of features is challenging for knowledge extraction tasks. A unified representation of multimedia documents benefits the knowledge extraction process, as the documents are represented by the same type of features. An appropriate document representation benefits the overall decision-making process by reducing the search time and memory requirements.
In this paper, we propose a domain converting method known as the Multimedia to Signal converter (MSC) to represent a multimodal multimedia document in a unified representation by converting multimodal objects into signal objects. A tree-based approach known as the Multimedia Feature Pattern (MFP) tree is proposed for the compact representation of multimedia documents in terms of the features of multimedia objects. The effectiveness of the proposed framework is evaluated through experiments on four multimodal datasets. Experimental results show that the unified representation of multimedia documents helped improve the classification accuracy for the documents. The MFP tree-based representation of multimedia documents not only reduces the search time and memory requirements, but also outperforms the competitive approaches for search and retrieval of multimedia documents. © 2016, Springer Science+Business Media New York.

Item Keyword-based private searching on cloud data along with keyword association and dissociation using cuckoo filter (Springer Verlag, 2019) Vora, A.V.; Hegde, S.
Outsourcing of data is a very common scenario in the present-day world, and quite often we need to outsource confidential data whose privacy is of utmost concern. Performing encryption before outsourcing the data is a simple solution to preserve privacy. Preferably, a public-key encryption technique is used to encrypt the data. A demerit of encrypting data is that, when requesting the data from the cloud, we need some technique that supports search functionality on encrypted data. Without a searchable encryption technique, the cloud is forced to send the whole database, which is highly inefficient and impractical. To address this problem, we consider the email scenario, in which the sender of the email encrypts the email contents using the receiver’s public key; hence, only the receiver can decrypt the email contents.
We propose a scheme in which encrypted emails are stored on the cloud and which supports searching through the encrypted database. This enables the cloud to reply to a request with a more precise response without compromising privacy in terms of email contents or access patterns. We provide a solution for the email scenario in which we can tag or associate emails with keywords, and during retrieval, the email owner can request all the emails associated with a particular keyword. Although attempts to solve this issue are seen in the literature, they do not have the flexibility of dissociating keywords from an email. Keyword dissociation is essential for modifying the association between keywords and emails to enable better filtering of emails. Our technique also supports keyword dissociation. The solution allows single-database private information retrieval writing in an oblivious way with sublinear communication cost. We have theoretically proved the correctness and security of our technique. © 2018, Springer-Verlag GmbH Germany, part of Springer Nature.
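
The last abstract above relies on a cuckoo filter for keyword association and dissociation. As a rough illustration of why that data structure suits dissociation, here is a minimal plaintext sketch in Python of a generic cuckoo filter, which (unlike a plain Bloom filter) supports deletion. This is not the authors' scheme: it has no encryption and no oblivious writing, and the class name `CuckooFilter`, its parameters, and the `"keyword|email-id"` tag format are all assumptions made for this example.

```python
import hashlib
import random


class CuckooFilter:
    """Generic cuckoo filter sketch (hypothetical, not the paper's construction)."""

    def __init__(self, num_buckets=64, bucket_size=4, max_kicks=200):
        # A power-of-two bucket count keeps the XOR-based alternate index
        # in range and makes it its own inverse.
        assert num_buckets > 0 and num_buckets & (num_buckets - 1) == 0
        self.num_buckets = num_buckets
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]

    @staticmethod
    def _hash(data: bytes) -> int:
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

    def _fingerprint(self, item: str) -> int:
        # 16-bit fingerprint in the range 1..65535 (never zero).
        return self._hash(b"fp:" + item.encode()) % 0xFFFF + 1

    def _index(self, item: str) -> int:
        return self._hash(item.encode()) % self.num_buckets

    def _alt_index(self, index: int, fp: int) -> int:
        # Partial-key cuckoo hashing: the other bucket depends only on the
        # current bucket and the fingerprint, not on the original item.
        return (index ^ self._hash(fp.to_bytes(2, "big"))) % self.num_buckets

    def insert(self, item: str) -> bool:
        fp = self._fingerprint(item)
        i1 = self._index(item)
        i2 = self._alt_index(i1, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets are full: evict a resident fingerprint and relocate it.
        i = random.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = random.randrange(len(self.buckets[i]))
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt_index(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        return False  # filter is considered full; the last evicted entry is lost

    def contains(self, item: str) -> bool:
        # May return a false positive, never a false negative for stored items.
        fp = self._fingerprint(item)
        i1 = self._index(item)
        i2 = self._alt_index(i1, fp)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

    def delete(self, item: str) -> bool:
        # Only delete items that were actually inserted.
        fp = self._fingerprint(item)
        i1 = self._index(item)
        for i in (i1, self._alt_index(i1, fp)):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False


# Associate a keyword with an email, query it, then dissociate it.
tags = CuckooFilter()
tags.insert("invoice|email-17")
print(tags.contains("invoice|email-17"))
tags.delete("invoice|email-17")
print(tags.contains("invoice|email-17"))
```

In this sketch, association is an `insert` of a keyword/email tag and dissociation is a `delete` of the same tag. Deletion works because each bucket stores a removable fingerprint rather than shared bits.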
