Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506

  • Item
    KCE DALab-APDA@FIRE2019: Author profiling and deception detection in Arabic using weighted embedding
    (CEUR-WS, 2019) Sharmila Devi, V.; Subramanian, S.; Ravikumar, G.; Anand Kumar, M.A.
    This paper explains the work submitted to the Author Profiling and Deception Detection in Arabic Tweets shared task organized at the Forum for Information Retrieval Evaluation (FIRE) 2019. The first task, author profiling, involves identifying the categories of authors based on their Arabic tweets. In the second task, the aim is to detect deception in Arabic for two genres, Twitter and news. Deception detection is the automatic identification of false messages in text content on social networks or in news. For each task, we submitted three different systems. For submission 1, we used Term Frequency and Inverse Document Frequency (TFIDF) based Support Vector Machine classification, and in submission 2, we used the fastText classifier. For submission 3, we proposed a low-dimensional weighted document embedding (TFIDF + word embedding) with SVM classification. We attained second place in Deception detection and third in Author profiling. The performance difference between the top team's results and the submitted runs is only 3.34% for Author profiling and 1.16% for Deception detection. © Copyright 2019 for this paper by its authors.
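The weighted document embedding mentioned in this abstract (TFIDF + word embedding) can be illustrated with a minimal sketch. It assumes the document vector is a TF-IDF weighted average of word vectors; the abstract does not give the exact weighting scheme, so the smoothed IDF formula, the function names, and the toy two-dimensional word vectors below are all hypothetical.

```python
import math
from collections import Counter


def idf_weights(tokenized_docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter(term for doc in tokenized_docs for term in set(doc))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def weighted_doc_embedding(tokens, idf, word_vectors, dim):
    """TF-IDF weighted average of the word vectors of one document."""
    tf = Counter(tokens)
    total = [0.0] * dim
    weight_sum = 0.0
    for term, count in tf.items():
        if term not in word_vectors or term not in idf:
            continue  # skip out-of-vocabulary terms
        weight = (count / len(tokens)) * idf[term]
        for i, value in enumerate(word_vectors[term]):
            total[i] += weight * value
        weight_sum += weight
    if weight_sum == 0.0:
        return total  # no known terms: return the all-zero vector
    return [x / weight_sum for x in total]


# Toy corpus and toy 2-dimensional word vectors (illustrative values only).
docs = [["good", "news", "today"], ["fake", "news", "alert"]]
word_vectors = {
    "good": [1.0, 0.0],
    "news": [0.0, 1.0],
    "today": [1.0, 1.0],
    "fake": [-1.0, 0.0],
    "alert": [-1.0, 1.0],
}
idf = idf_weights(docs)
doc_vecs = [weighted_doc_embedding(d, idf, word_vectors, dim=2) for d in docs]
```

Each resulting low-dimensional vector could then be passed to an SVM classifier, as the abstract describes for submission 3.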
  • Item
    ARS NITK at MEDIQA 2019: Analysing various methods for natural language inference, recognising question entailment and medical question answering system
    (Association for Computational Linguistics (ACL), 2019) Agrawal, A.; George, R.A.; Ravi, S.S.; Kamath S․, S.S.; Anand Kumar, M.A.
    This paper describes the approaches we have taken for Natural Language Inference, Question Entailment Recognition and Question-Answering tasks to improve domain-specific Information Retrieval. Natural Language Inference (NLI) is a task that aims to determine whether a given hypothesis is an entailment of, a contradiction of, or neutral to the given premise. Recognizing Question Entailment (RQE) focuses on identifying entailment between two questions, while the objective of Question-Answering (QA) is to filter and improve the ranking of automatically retrieved answers. For addressing the NLI task, the UMLS Metathesaurus was used to find the synonyms of medical terms in given sentences, on which the InferSent model was trained to predict whether the given sentence is an entailment, a contradiction or neutral. We also introduce a new Extreme gradient boosting model built on PubMed embeddings to perform RQE. Further, a closed-domain Question Answering technique that uses Bi-directional LSTMs trained on the SQuAD dataset to determine relevant ranks of answers for a given question is also discussed. Experimental validation showed that the proposed models achieved promising results. © 2019 Association for Computational Linguistics
  • Item
    Intrinsic evaluation for English–Tamil bilingual word embeddings
    (Springer Verlag, 2020) Jp, J.P.; Krishna Menon, V.K.; Rajendran, S.; Padannayil, K.P.; Anand Kumar, M.A.
    Despite the growth of bilingual word embeddings, no work has so far directly evaluated them for the English–Tamil language pair. In this paper, we present a data resource and an evaluation paradigm for the English–Tamil bilingual word vector model. The dataset contains words that cover a range of concepts that occur in natural language. The dataset is scored based on similarity rather than association or relatedness. Hence, word pairs that are associated but not literally similar have a low rating. The measures are quantified further to ensure consistency in the dataset, mimicking the cognitive phenomena. Henceforth, the dataset can be used by non-native speakers with minimal effort. We also present some inferences and insights into the semantics captured by word vectors and human cognition. © Springer Nature Singapore Pte Ltd. 2020.
  • Item
    Dynamic mode-based feature with random mapping for sentiment analysis
    (Springer Verlag, 2020) Sachin Kumar, S.; Anand Kumar, M.A.; Padannayil, K.P.; Poornachandran, P.
    Sentiment analysis (SA), or polarity identification, is a research topic that receives considerable attention. Work in this area attempts to explore the sentiments or opinions in text data related to any event, politics, movies, product reviews, sports, etc. The present article discusses the use of dynamic modes from the dynamic mode decomposition (DMD) method with random mapping for sentiment classification. Random mapping is performed using the random kitchen sink (RKS) method. The present work aims to explore the use of dynamic modes as the feature for the sentiment classification task. For the experiments and analysis, the dataset consists of tweets from the SAIL 2015 shared task (in Tamil, Bengali, and Hindi) and tweets in Malayalam. The dataset for Malayalam was prepared by us for this work. The evaluations are performed using accuracy, F1-score, recall, and precision. The evaluations show that the proposed approach provides competitive results. © Springer Nature Singapore Pte Ltd. 2020.
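The random mapping step named in this abstract can be sketched as follows. This is a minimal illustration of random kitchen sinks in the random Fourier feature form, which approximates an RBF kernel; it is not the authors' implementation. The function name `make_rks_map`, the `gamma` value, and the input vectors (standing in for dynamic-mode features) are assumptions.

```python
import math
import random


def make_rks_map(input_dim, n_features, gamma, seed=0):
    """Build a random Fourier feature map approximating exp(-gamma * ||x - y||^2)."""
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * gamma)  # std. dev. of the random projection weights
    weights = [[rng.gauss(0.0, sigma) for _ in range(input_dim)]
               for _ in range(n_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def transform(x):
        # z_j(x) = sqrt(2 / D) * cos(w_j . x + b_j)
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return transform


# Hypothetical 2-dimensional feature vectors.
x = [1.0, 0.0]
y = [0.0, 1.0]
gamma = 0.5
transform = make_rks_map(input_dim=2, n_features=4000, gamma=gamma, seed=1)
zx, zy = transform(x), transform(y)

# The inner product of the mapped vectors approximates the RBF kernel value.
approx = sum(a * b for a, b in zip(zx, zy))
exact = math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))
```

With more random features, `approx` moves closer to `exact`, so a linear classifier on the mapped features behaves roughly like a kernel classifier on the original ones.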
  • Item
    Analyzing Banking Services Applicability Using Explainable Artificial Intelligence
    (Association for Computing Machinery, 2022) Sriram, A.; Gorti, S.S.; Amin, E.G.; Anand Kumar, M.A.
    Over the last few years, the banking sector has had a pivotal role to play in the global economy, comprising about 24% of the global GDP and employing millions of people worldwide. Banks have a wide array of products and services to offer, including ATMs, Tele-Banking, Credit Cards, Debit Cards, Electronic Fund Transfers (EFT), Internet Banking, and Mobile Banking. Machine learning is a method of data analysis that automates analytical model building and can be an essential decision support tool for banks in providing services to certain customers and in improving customer satisfaction and experience based on collected data. In this study, we made use of several machine learning models and Artificial Neural Networks (ANN) to help banks make predictions about timely customer loan repayment and customer satisfaction. We explored different machine learning algorithms and performed SHAP analysis, which helped draw conclusions about the significant features driving these decisions. © 2022 ACM.
  • Item
    Multi-Vehicle Tracking and Speed Estimation Model using Deep Learning
    (Association for Computing Machinery, 2022) Prajwal, K.; Navaneeth, P.; Tharun, K.; Anand Kumar, M.A.
    Speed estimation of vehicles is one of the prime applications of speed estimation of moving objects. The YOLOv5 model has proven to have very good accuracy in detecting moving objects in real time. The vehicles on the road are extracted from each frame of the video by running it through a custom YOLOv5 object detector. The YOLO model splits the frame into a grid, and each grid cell detects a vehicle within itself. An instance identifier tracks the vehicle across the frames. The tracking algorithm computes deep features for every bounding box and utilizes the similarities within the deep features to identify and track the object. The pixel per meter metric has to be adjusted based on perspective, after which the speed of the vehicle can be estimated. Finally, a comparison of our model metrics with the existing state-of-the-art models is provided. © 2022 ACM.
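The final step in this abstract, converting tracked pixel displacement into speed, can be sketched as below. The sketch assumes a single constant pixel per meter value, whereas the abstract notes that this metric must be adjusted for perspective. The function name, the centroid coordinates, and the calibration value are hypothetical.

```python
import math


def estimate_speed_kmh(p_prev, p_curr, frames_elapsed, fps, pixels_per_meter):
    """Estimate speed in km/h from two tracked centroid positions (in pixels)."""
    if frames_elapsed <= 0 or fps <= 0 or pixels_per_meter <= 0:
        raise ValueError("frames_elapsed, fps and pixels_per_meter must be positive")
    # Pixel displacement of the bounding-box centroid between the two frames.
    pixel_distance = math.hypot(p_curr[0] - p_prev[0], p_curr[1] - p_prev[1])
    meters = pixel_distance / pixels_per_meter
    seconds = frames_elapsed / fps
    return meters / seconds * 3.6  # m/s -> km/h


# Hypothetical track: the centroid moves 6 pixels over 3 frames of a 30 fps
# video, with an assumed calibration of 8 pixels per meter.
speed = estimate_speed_kmh((100, 200), (100, 206), frames_elapsed=3, fps=30.0,
                           pixels_per_meter=8.0)
```

In this toy case the vehicle covers 0.75 m in 0.1 s, which gives about 27 km/h. A perspective-aware version would replace the constant `pixels_per_meter` with a value that depends on the vehicle's position in the frame.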
  • Item
    Representation Learning in Continuous-Time Dynamic Signed Networks
    (Association for Computing Machinery, 2023) Sharma, K.; Raghavendra, M.; Lee, Y.-C.; Anand Kumar, M.A.; Kumar, S.
    Signed networks allow us to model conflicting relationships and interactions, such as friend/enemy and support/oppose. These signed interactions happen in real-time. Modeling such dynamics of signed networks is crucial to understanding the evolution of polarization in the network and enabling effective prediction of the signed structure (i.e., link signs) in the future. However, existing works have modeled either (static) signed networks or dynamic (unsigned) networks but not dynamic signed networks. Since both sign and dynamics inform the graph structure in different ways, it is non-trivial to model how to combine the two features. In this work, we propose a new Graph Neural Network (GNN)-based approach to model dynamic signed networks, named SEMBA: Signed link's Evolution using Memory modules and Balanced Aggregation. Here, the idea is to incorporate the signs of temporal interactions using separate modules guided by balance theory and to evolve the embeddings from a higher-order neighborhood. Experiments on 4 real-world datasets and 3 different tasks demonstrate that SEMBA consistently and significantly outperforms the baselines by up to 80% on the tasks of predicting signs of future links while matching the state-of-the-art performance on predicting existence of these links in the future. We find that this improvement is due specifically to superior performance of SEMBA on the minority negative class. Code is made available at https://github.com/claws-lab/semba. © 2023 Copyright held by the owner/author(s). ACM ISBN 979-8-4007-0124-5/23/10.
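Balance theory, which this abstract uses to guide aggregation, can be illustrated with a small sketch. The sketch is not SEMBA's implementation (see the repository linked in the abstract for that). It only shows the general rule that the implied sign of a two-hop path is the product of its edge signs, so that "the enemy of my enemy is my friend". The edge list and function names are hypothetical.

```python
from collections import defaultdict


def is_balanced_triangle(s_ab, s_bc, s_ca):
    """A triangle is balanced when the product of its edge signs is positive."""
    return s_ab * s_bc * s_ca > 0


def two_hop_neighbors(edges, node):
    """Split two-hop neighbors of `node` into implied-positive and implied-negative sets."""
    adjacency = defaultdict(dict)
    for (u, v), sign in edges.items():
        adjacency[u][v] = sign
        adjacency[v][u] = sign  # treat the signed graph as undirected

    positive, negative = set(), set()
    for mid, s1 in adjacency[node].items():
        for far, s2 in adjacency[mid].items():
            if far == node:
                continue
            # Implied sign of the path node -> mid -> far.
            (positive if s1 * s2 > 0 else negative).add(far)
    return positive, negative


# Hypothetical signed graph: +1 is a friendly link, -1 a hostile one.
edges = {("a", "b"): +1, ("b", "c"): -1, ("a", "d"): -1, ("d", "e"): -1}
pos, neg = two_hop_neighbors(edges, "a")
```

Here `c` is an enemy of a friend of `a`, so it lands in the negative set, while `e` is an enemy of an enemy and lands in the positive set.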
  • Item
    Multimodal Meme Troll and Domain Classification Using Contrastive Learning
    (Institute of Electrical and Electronics Engineers Inc., 2024) Phadatare, A.; Jayanth, P.; Anand Kumar, M.A.
    This paper presents a holistic approach to meme trolling detection and domain classification, focusing on Telugu and Kannada languages. Leveraging a spectrum of methodologies ranging from basic machine learning models such as Support Vector Machines (SVM), Random Forest, and Naive Bayes, to image-based models like Convolutional Neural Networks (CNN) and ResNet-50, and state-of-the-art models such as CLIP, multilingual BERT, XLM-BERT, and Vision Transformers, we explore diverse modalities including image classification, extracted text classification, and combined text-caption classification. Our system integrates multiple models to achieve two primary goals: accurately detecting trolling behavior and classifying memes into thematic domains like politics, movies, and sports. By training on multilingual data and considering linguistic diversity, our approach ensures robust performance across different linguistic contexts, providing valuable insights into meme culture and trolling behavior in Telugu and Kannada-speaking communities. © 2024 IEEE.
  • Item
    Depression Severity Detection from Social Media Posts
    (Springer Science and Business Media Deutschland GmbH, 2024) Recharla, N.; Bolimera, P.; Gupta, Y.; Anand Kumar, M.A.
    Regardless of age, gender, or color, mental health problems affect people all over the world. People feel increasingly at ease sharing their opinions on social networking sites (SNS) practically every day in the present era of communication and technology. Reddit is a social networking site that consists of subreddits, or single-topic communities, that are created, maintained, and frequented by anonymous users. The dataset used in the paper is the eRisk2021 dataset provided for task 3, which is used for depression severity measurement. It consists of posts by Reddit users. The approach discussed in this paper estimates a user's depression severity from their Reddit history with the help of the BDI-II questionnaire. The paper provides three different approaches to finding users' depression severity from their social media data. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.
  • Item
    Detecting Suicide Risk Patterns using Hierarchical Attention Networks with Large Language Models
    (Association for Computational Linguistics (ACL), 2024) Koushik, L.; Vishruth, M.; Anand Kumar, M.A.
    Suicide has become a major public health and social concern worldwide. This paper examines a method that uses Large Language Models (LLMs) to extract the likely reason for a person's suicide attempt by analysing their social media text posts describing the event. From this data, the reasons behind such a mental state can be extracted, which can support suicide prevention. This submission presents our approach for the CLPsych Shared Task 2024. Our model uses Hierarchical Attention Networks (HAN) and Llama2 for finding supporting evidence about an individual’s suicide risk level. ©2024 Association for Computational Linguistics.