2. Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/1/7

Search Results

Now showing 1 - 10 of 22

Novel hybrid feature selection models for unsupervised document categorization
(2017) Bhopale, A.P.; Sowmya, Kamath S.
Dealing with high-dimensional data is a challenging and computationally complex task in the data pre-processing phase of text clustering. Conventionally, union and intersection approaches have been used to combine the results of different feature selection methods to optimize the relevant feature space for a document collection. The union method selects all features from the considered sub-models, whereas the intersection method selects only the common features identified by the sub-models. However, in reality, any type of feature selection can cause a loss of some potentially important features. In this paper, a hybrid feature selection model called Modified Hybrid Union (MHU) is proposed, which selects features by considering the individual strengths and weaknesses of each constituent component of the model. A comparative evaluation of its performance for K-means clustering and bio-inspired flock-based clustering is also presented on standard data sets such as OWL-S TC and Reuters-21578. © 2017 IEEE.
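The union and intersection baselines described in the abstract above can be sketched as plain set operations. This is only a minimal illustration, not the MHU model itself, whose selection rule the abstract does not specify. The function names and the sample feature sets are hypothetical.

```python
def combine_union(selections):
    """Keep every feature chosen by at least one sub-model."""
    combined = set()
    for selected in selections:
        combined |= set(selected)
    return combined


def combine_intersection(selections):
    """Keep only the features chosen by every sub-model."""
    feature_sets = [set(selected) for selected in selections]
    if not feature_sets:
        return set()
    return set.intersection(*feature_sets)


# Hypothetical outputs of three feature selection sub-models.
selections = [
    {"price", "stock", "market", "trade"},
    {"stock", "market", "oil"},
    {"market", "stock", "bank"},
]

union_features = combine_union(selections)
common_features = combine_intersection(selections)
```

The union keeps a large feature space, while the intersection can discard features that only one sub-model rates highly, which is the trade-off the abstract points to.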
Medical Image Retrieval Using Manifold Ranking with Relevance Feedback
(2018) Soundalgekar, P.; Kulkarni, M.; Nagaraju, D.; Sowmya, Kamath S.
Medical image retrieval (MedIR) is a challenging field in visual information retrieval, due to the multi-dimensional and multi-modal context of the underlying content. Traditional models do not take the intrinsic characteristics of data into consideration and have achieved limited accuracy in application to medical images. Manifold Ranking (MR) is a technique that can be used to further optimize precision and recall in MedIR applications, as it ranks items by traversing a dynamically constructed content-specific information graph. In this paper, a MedIR approach based on Manifold Ranking is proposed. Medical images, being multi-dimensional, exhibit underlying cluster and manifold information which enhances semantic relevance and allows for label uniformity. Hence, when adapted for MedIR, MR can help in achieving large-scale ranking across datasets, as is the case in most medical imaging applications. In addition, a relevance feedback mechanism was also incorporated to support a learning-based system. We show that MR achieved significant improvement in retrieval results with relevance feedback as compared to the Euclidean Distance (ED) rankings. This showcases the importance of analyzing the inherent latent structure in medical image data for better performance over traditional methods. © 2018 IEEE.
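As a rough illustration of graph-based ranking of the kind the abstract above refers to, the sketch below uses the commonly cited iterative form of manifold ranking: build a Gaussian affinity graph, normalize it symmetrically, and propagate the query score over the graph. It is an assumption that this general form matches the paper; the graph construction and relevance feedback used by the authors are not reproduced. The function name, the parameters `alpha` and `sigma`, and the toy 2-D points are hypothetical.

```python
import math


def manifold_rank(points, query_index, alpha=0.9, sigma=1.0, iterations=200):
    """Score every point by propagating the query's score over an affinity graph."""
    n = len(points)

    # Gaussian affinity between distinct points (no self-loops).
    affinity = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                squared = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                affinity[i][j] = math.exp(-squared / (2.0 * sigma ** 2))

    # Symmetric normalization: S = D^(-1/2) W D^(-1/2).
    degree = [sum(row) for row in affinity]
    normalized = [
        [affinity[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]

    # Initial scores: only the query carries a score.
    initial = [1.0 if i == query_index else 0.0 for i in range(n)]
    scores = initial[:]

    # Iterate f <- alpha * S f + (1 - alpha) * y.
    for _ in range(iterations):
        scores = [
            alpha * sum(normalized[i][j] * scores[j] for j in range(n))
            + (1.0 - alpha) * initial[i]
            for i in range(n)
        ]
    return scores


# Toy data: three points in one cluster, two in a distant cluster.
points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 5.0), (5.5, 5.0)]
scores = manifold_rank(points, query_index=0)
```

Points that share a cluster with the query receive higher scores than points in the distant cluster, because scores spread along strong graph edges rather than depending on the direct distance to the query alone.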
Jamura: A Conversational Smart Home Assistant Built on Telegram and Google Dialogflow
(2019) Salvi, S.; Geetha, V.; Sowmya, Kamath S.
With an ever-increasing number of smart connected devices for various applications, there is a need for finding a new, smarter way of communicating with all the homogeneous and heterogeneous devices in a particular network. Conversational bots, also known as chatbots, are currently a popular solution in many applications, as they provide a user-friendly interface and more intuitive recommendations to user queries. In this work, the domain of home automation is considered from the area of the Internet of Things, and a chatbot application built using technologies like Natural Language Processing, Machine Learning, and Service-Oriented Computing is designed as an intuitive user interface for smart home products. The aim of this paper is to build an easy-to-implement and easy-to-integrate DIY Smart Home Assistant using available technologies. The proposed Conversational Artificial Intelligence system can aid the user in smart decision making and in predictive and preventive analytics, and showed promising results during experimental evaluation. © 2019 IEEE.
Enhancing web service discovery using meta-heuristic CSO and PCA based clustering
(2018) Kotekar, S.; Sowmya, Kamath S.
Web service discovery is one of the crucial tasks in service-oriented applications and workflows. For a targeted objective to be achieved, it is still challenging to identify all appropriate services from a repository containing diverse service collections. To identify the most suitable services, it is necessary to capture service-specific terms that comply with its natural language documentation. Clustering available Web services as per their domain, based on functional similarities, would enhance a service search engine's ability to recommend relevant services. In this paper, we propose a novel approach for automatically categorizing the Web services available in a repository into functionally similar groups. Our proposed approach is based on the meta-heuristic Cat Swarm Optimization (CSO) algorithm, further optimized by the Principal Component Analysis (PCA) dimension reduction technique. Results obtained by experiments show that the proposed approach was useful and enhanced the service discovery process when compared to traditional approaches. © Springer Nature Singapore Pte Ltd. 2018.
Improving convergence in IRGAN with PPO
(2020) Jain, M.; Sowmya, Kamath S.
Information retrieval modeling aims to optimise generative and discriminative retrieval strategies, where generative retrieval focuses on predicting query-specific relevant documents and discriminative retrieval tries to predict relevancy given a query-document pair. IRGAN unifies the generative and discriminative retrieval approaches through a minimax game. However, training IRGAN is unstable and varies largely with the random initialization of parameters. In this work, we propose improvements to IRGAN training through a novel optimization objective based on proximal policy optimisation and Gumbel-softmax based sampling for the generator, along with a modified training algorithm which performs the gradient update on both the models simultaneously for each training iteration. We benchmark our proposed approach against IRGAN on three different information retrieval tasks and present empirical evidence of improved convergence. © 2020 Copyright held by the owner/author(s). Publication rights licensed to ACM.
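The two building blocks named in the abstract above, proximal policy optimisation and Gumbel-softmax sampling, can be sketched in their standard textbook forms. This is an assumption-laden illustration: the paper's actual objective and training loop are not given in the abstract, and the function names, the default `epsilon` and `temperature` values, and the sample logits are hypothetical.

```python
import math
import random


def ppo_clipped_term(ratio, advantage, epsilon=0.2):
    """Standard PPO clipped surrogate for one sample: min(r*A, clip(r)*A)."""
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    return min(ratio * advantage, clipped_ratio * advantage)


def gumbel_softmax_sample(logits, temperature=1.0, rng=random):
    """Draw a relaxed one-hot sample: softmax((logits + Gumbel noise) / temperature)."""
    tiny = 1e-12
    noisy = []
    for logit in logits:
        # Keep the uniform draw strictly inside (0, 1) so both logs are defined.
        u = min(max(rng.random(), tiny), 1.0 - tiny)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)

    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(value - peak) for value in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical generator logits over three candidate documents.
rng = random.Random(0)
sample = gumbel_softmax_sample([2.0, 0.5, -1.0], temperature=0.5, rng=rng)
```

Clipping the probability ratio limits how far a single update can move the policy, and lower temperatures push the Gumbel-softmax sample closer to a one-hot choice.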
Frame instance extraction and clustering for default knowledge building
(2017) Shah, A.; Basile, V.; Cabrio, E.; Sowmya, Kamath S.
Obtaining and representing common-sense knowledge, useful in a robotics scenario for planning and making inferences about the robots' surroundings, is a challenging problem, because such knowledge is typically found in unstructured repositories such as text corpora or small handmade resources. The work described in this paper presents a methodology for automatically creating a default knowledge base about real-world objects for the robotics domain. The proposed method relies on clustering frame instances extracted from natural language text as a way of distilling default knowledge. We collect and parse a natural language corpus using the Web as a source, then perform an agglomerative clustering of frame instances according to an appropriately defined similarity measure, and finally extract prototypical frame instances from each cluster and publish them in LOD-compliant format to promote reuse and interoperability.
Dynamic and temporal user profiling for personalized recommenders using heterogeneous data sources
(2017) Krishnan, G.S.; Sowmya, Kamath S.
In modern Web applications, the process of user profiling provides a way to capture user-specific information, which then serves as a source for designing personalized user experiences. Currently, such information about a particular user is available from multiple online sources/services, like social media applications, professional/social networking sites, location-based service providers or even simple Web pages. Because this data is truly heterogeneous, high in volume and also highly dynamic over time, collecting these data artifacts from disparate sources to enable complete user profiling can be challenging. In this paper, we present an approach to dynamically build a structured user profile that emphasizes the temporal nature to capture dynamic user behavior. The user profile is compiled from multiple, heterogeneous data sources which capture dynamic user actions over time, to capture changing preferences accurately. Natural language processing techniques, machine learning and concepts of the Semantic Web were used for capturing relevant user data and implementing the proposed '3D User Profile'. Our technique also supports the representation of the generated user profiles as structured data so that other personalized recommendation systems and Semantic Web/Linked Open Data applications can consume them for providing intelligent, personalized services. © 2017 IEEE.
Domain-specific sentiment analysis approaches for code-mixed social network data
(2017) Pravalika, A.; Oza, V.; Meghana, N.P.; Sowmya, Kamath S.
Sentiment Analysis is one of the prominent research fields in Natural Language Processing because of its widespread real-world applications. Customer preferences, options and experiences can be analyzed through social media, reviews, blogs and other online social networking site data. However, due to increasing informal usage of local languages in social media platforms, multi-lingual or code-mixed data is fast becoming a common occurrence. Mixed code is generated when users use more than a single language in social network comments. Such data presents a significant challenge for applications using sentiment analysis and is yet to be fully explored by researchers. Existing sentiment analysis methods applied to monolingual social data are not suitable for code-mixed data due to the inconsistency in the grammatical structure in these sentences. In this paper, a novel method focused on performing effective sentiment analysis of bilingual sentences written in Hindi and English is proposed, that takes into account linguistic code switching and the grammatical transitions between the two considered languages. Experimental evaluation using real-world, code-mixed datasets obtained from Facebook showed that the proposed approach achieved very good accuracy and was also efficient performance-wise. © 2017 IEEE.
Deep Neural Network Models for Question Classification in Community Question-Answering Forums
(2019) Upadhya, B.A.; Udupa, S.; Sowmya, Kamath S.
Automatic generation of responses to questions is a challenging problem that has applications in fields like customer support and question-answering forums. A prerequisite to developing such systems is a methodology that classifies questions as yes/no or opinion-based questions, so that quick and accurate responses can be provided. Performing this classification is advantageous, as yes/no questions can generally be answered using the data that is already available. In the case of an opinion-based or a yes/no question that wasn't previously answered, an external knowledge source is needed to generate the answer. We propose an LSTM-based model that performs question classification into the two aforementioned categories. Given a question as an input, the objective is to classify it as an opinion-based or a yes/no question. The proposed model was tested on the Amazon community question-answer dataset as it is reflective of the problem statement we are trying to solve. The proposed methodology achieved promising results, with a high accuracy rate of 91% in question classification. © 2019 IEEE.
Deep neural learning for automated diagnostic code group prediction using unstructured nursing notes
(2020) Jayasimha, A.; Gangavarapu, T.; Sowmya, Kamath S.; Krishnan, G.S.
Disease prediction, a central problem in clinical care and management, has gained much significance over the last decade. Nursing notes documented by caregivers contain valuable information concerning a patient's state, which can aid in the development of intelligent clinical prediction systems. Moreover, due to the limited adaptation of structured electronic health records in developing countries, the need for disease prediction from such clinical text has garnered substantial interest from the research community. The availability of large, publicly available databases such as MIMIC-III, and advancements in machine and deep learning models with high predictive capabilities have further facilitated research in this direction. In this work, we model the latent knowledge embedded in the unstructured clinical nursing notes, to address the clinical task of disease prediction as a multi-label classification of ICD-9 code groups. We present EnTAGS, which facilitates aggregation of the data in the clinical nursing notes of a patient, by modeling them independent of one another. To handle the sparsity and high dimensionality of clinical nursing notes effectively, our proposed EnTAGS is built on the topics extracted using Non-negative matrix factorization. Furthermore, we explore the applicability of deep learning models for the clinical task of disease prediction, and assess the reliability of the proposed models using standard evaluation metrics. Our experimental evaluation revealed that the proposed approach consistently exceeded the state-of-the-art prediction model by 1.87% in accuracy, 12.68% in AUPRC, and 11.64% in MCC score. © 2020 Association for Computing Machinery.
