Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506

Now showing 1 - 4 of 4
  • Item
    A New Glowworm Swarm Optimization Based Clustering Algorithm for Multimedia Documents
    (Institute of Electrical and Electronics Engineers Inc., 2016) Pushpalatha, K.; Ananthanarayana, V.S.
    Due to the explosion of multimedia data, the demand for sophisticated multimedia knowledge discovery systems has increased. The multimodal nature of multimedia data is a major barrier to knowledge extraction, and representing multimodal data in a unimodal space is advantageous for any mining task. We therefore first represent multimodal multimedia documents in a unimodal space by converting the multimedia objects into signal objects. The dynamic nature of glowworms motivated us to propose the Glowworm Swarm Optimization based Multimedia Document Clustering (GSOMDC) algorithm to group the multimedia documents into topics. Better purity and entropy values indicate that the GSOMDC algorithm successfully clusters the multimedia documents into topics, and the goodness of the clustering is further confirmed by cluster-based retrieval of multimedia documents with better precision values. © 2015 IEEE.
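The glowworm swarm dynamics that GSOMDC builds on can be sketched as follows. This is a minimal illustrative toy, not the GSOMDC algorithm: the 2-D "signal object" positions, the density-style fitness function, and the parameter values (RHO, GAMMA, STEP, RADIUS) are all assumptions made for the sketch.

```python
import math
import random

random.seed(0)

# Toy 2-D stand-ins for unimodal "signal object" representations:
# two well-separated groups of documents.
docs = [(random.gauss(0, 0.3), random.gauss(0, 0.3)) for _ in range(10)] \
     + [(random.gauss(5, 0.3), random.gauss(5, 0.3)) for _ in range(10)]

RHO, GAMMA, STEP, RADIUS = 0.4, 0.6, 0.05, 1.5  # assumed GSO parameters

def fitness(p):
    # Density-style objective: glowworms near many documents glow brighter.
    return sum(math.exp(-((p[0] - d[0])**2 + (p[1] - d[1])**2)) for d in docs)

positions = [list(d) for d in docs]   # one glowworm per document
luciferin = [1.0] * len(positions)

for _ in range(30):
    # 1) Luciferin update: decay plus fitness reward.
    luciferin = [(1 - RHO) * l + GAMMA * fitness(p)
                 for l, p in zip(luciferin, positions)]
    # 2) Movement: each glowworm steps toward the brightest neighbour
    #    within its sensing radius.
    for i, p in enumerate(positions):
        nbrs = [j for j in range(len(positions))
                if j != i and luciferin[j] > luciferin[i]
                and math.dist(p, positions[j]) < RADIUS]
        if nbrs:
            j = max(nbrs, key=lambda k: luciferin[k])
            d = math.dist(p, positions[j])
            if d > 0:
                p[0] += STEP * (positions[j][0] - p[0]) / d
                p[1] += STEP * (positions[j][1] - p[1]) / d

# After convergence, co-located glowworms indicate one cluster (topic).
```

Because the sensing radius is smaller than the gap between the two groups, glowworms only ever move toward brighter members of their own group, so the two groups condense into two separate clusters.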
  • Item
    IIMH: Intention Identification in Multimodal Human Utterances
    (Association for Computing Machinery, 2023) Keerthan Kumar, T.G.; Dhakate, H.; Koolagudi, S.G.
    Intention identification is a challenging problem at the intersection of natural language processing, speech processing, and computer vision. People often use contradictory or ambiguous words in different contexts, which can make it difficult to identify the intention behind an utterance. Intention identification has many practical applications in natural language processing, sentiment analysis, social media analysis, robotics, and human-computer interaction, where identifying intention yields valuable insights into user behavior. In this work, we propose a model to determine whether an utterance made by a person is intentional or non-intentional. To achieve this, we collected a multimodal dataset containing text, video, and speech from various TV shows, movies, and YouTube videos and labeled each utterance with its corresponding intention. Feature extraction is performed at both the utterance and word levels to obtain useful information from all three modalities. We trained a baseline model using SVM to set a benchmark performance, and we designed an architecture that detects the contradiction between positive spoken words and negative facial expressions or speech to identify an utterance as non-intentional. Alongside this architecture, we evaluated different classification approaches and obtained the best results with a Support Vector Machine (SVM) classifier using an RBF kernel, achieving an accuracy of 78.83%, which outperforms the baseline approach. © 2023 ACM.
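The core computation behind an RBF-kernel SVM over fused multimodal features can be sketched as below. The per-modality feature vectors, the early-fusion-by-concatenation step, and the `gamma` value are hypothetical stand-ins, not the paper's actual features or settings; the sketch only shows the kernel such a classifier evaluates between two utterances.

```python
import math

def fuse(text_feats, audio_feats, video_feats):
    # Early fusion: concatenate per-modality feature vectors into one
    # utterance-level representation.
    return text_feats + audio_feats + video_feats

def rbf_kernel(u, v, gamma=0.5):
    # RBF kernel K(u, v) = exp(-gamma * ||u - v||^2): the similarity an
    # RBF-kernel SVM computes between two fused utterance vectors.
    sq = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq)

# Hypothetical 2-dim features per modality for two utterances.
u1 = fuse([0.9, 0.1], [0.2, 0.8], [0.5, 0.5])
u2 = fuse([0.8, 0.2], [0.3, 0.7], [0.4, 0.6])

print(rbf_kernel(u1, u1))            # identical vectors give exactly 1.0
print(round(rbf_kernel(u1, u2), 4))  # near 1.0: the utterances are similar
```

The kernel value shrinks toward 0 as fused vectors move apart, which is what lets the SVM separate intentional from non-intentional utterances non-linearly in the fused feature space.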
  • Item
    Creation and Classification of Kannada Meme Dataset: Exploring Domain and Troll Categories
    (Springer Science and Business Media Deutschland GmbH, 2024) Kundargi, S.Y.; N, N.; Anand Kumar, M.; Chakravarthi, B.R.
    In this pioneering research, the first-ever Kannada meme dataset is established. The dataset comprises 2002 memes spanning categories such as movies, politics, sports, and troll and non-troll memes. Classification models have been fine-tuned for memes, incorporating image-based models using DenseNet169 and text-based models using BERT for text encoding. A multimodal approach combines insights from images and text, acknowledging the comprehensive nature of meme content. Throughout the study, model strengths and weaknesses are assessed, emphasizing their reliance on Deep Learning and Natural Language Processing techniques. Improvements such as oversampling and regular dataset updates are recommended to enhance relevance and accuracy. This work extends beyond the immediate research, contributing to the development of adaptive meme classification systems for Kannada-speaking audiences within the evolving meme-culture landscape. Notably, the results indicate that multimodal models achieved the best scores for domain classification, while image-based models excelled in troll-meme classification, further highlighting the significance of this approach. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
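One common way to combine image-side and text-side evidence, as the multimodal approach above does, is late fusion of per-modality similarity scores. The sketch below is an assumption-laden toy: the 3-dimensional vectors standing in for DenseNet169 and BERT embeddings, the class prototypes, and the equal fusion weights are all invented for illustration.

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def classify(image_emb, text_emb, prototypes, w_img=0.5, w_txt=0.5):
    # Late fusion: score each class by a weighted combination of the
    # image-side and text-side similarities, then take the argmax.
    best, best_score = None, -2.0
    for label, (proto_img, proto_txt) in prototypes.items():
        score = (w_img * cosine(image_emb, proto_img)
                 + w_txt * cosine(text_emb, proto_txt))
        if score > best_score:
            best, best_score = label, score
    return best

# Hypothetical 3-dim stand-ins for DenseNet169 / BERT class prototypes.
prototypes = {
    "troll":     ([1.0, 0.0, 0.0], [0.9, 0.1, 0.0]),
    "non-troll": ([0.0, 1.0, 0.0], [0.0, 0.9, 0.1]),
}
print(classify([0.8, 0.1, 0.1], [0.7, 0.2, 0.1], prototypes))  # troll
```

Weighting the two modalities separately also makes it easy to reproduce the paper's observation that image features dominate for troll classification: simply raise `w_img` relative to `w_txt` for that task.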
  • Item
    Multimodal Propaganda Detection in Memes with Tolerance-Based Soft Computing Method
    (Springer Science and Business Media Deutschland GmbH, 2024) Kelkar, S.; Ravi, S.; Ramanna, S.; Anand Kumar, M.
    This paper presents a tolerance-based near sets classifier applied to a multimodal propaganda detection task using text and image data originating from memes. Internet memes consist of an image superimposed with text and are very popular on social media. They are often used as part of disinformation campaigns in which social media users are influenced via a number of rhetorical and psychological techniques known as persuasion techniques. The focus of this paper is a subtask of the SemEval-2024 Multilingual Detection of Persuasion Techniques in Memes competition: detecting the presence or absence of a persuasion technique. We introduce a multimodal Tolerance Near Sets Classifier (MTNSC) trained on a combination of word embeddings (RoBERTa) and pre-trained image features (ResNet and ResNet-Memes) using the competition data. This work extends our earlier work in the Natural Language Processing domain, where a tolerance-based near sets sentiment classifier was introduced. The proposed MTNSC achieves a macro-F1 score of 70.15% and a micro-F1 score of 75.33% on the test dataset, demonstrating satisfactory performance of TNS-based classifiers in a multimodal setting. Our findings point to the model's effectiveness when compared to a few leading submissions based on deep learning techniques. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
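The tolerance relation at the heart of tolerance near sets can be sketched in a few lines: two objects are "near" when their feature vectors differ by at most a threshold, and a test object is labelled by voting over its tolerance neighbourhood. This is a generic illustration of the idea, not the MTNSC itself; the fused 2-D features, the epsilon value, and the voting rule are assumptions of the sketch.

```python
import math

EPSILON = 0.4  # assumed tolerance threshold on feature distance

def tolerant(u, v, eps=EPSILON):
    # Tolerance relation: two objects are near when their feature
    # vectors differ by at most eps in Euclidean distance.
    return math.dist(u, v) <= eps

def tns_classify(x, train, default="unknown"):
    # Collect the tolerance neighbourhood of x over the training set
    # and vote; an empty neighbourhood falls back to a default label.
    nbhd = [label for feats, label in train if tolerant(x, feats)]
    if not nbhd:
        return default
    return max(set(nbhd), key=nbhd.count)

# Hypothetical fused text+image features for labelled training memes.
train = [
    ([0.9, 0.8], "propaganda"), ([0.8, 0.9], "propaganda"),
    ([0.1, 0.2], "not_propaganda"), ([0.2, 0.1], "not_propaganda"),
]
print(tns_classify([0.85, 0.85], train))  # both neighbours are "propaganda"
```

Unlike a nearest-neighbour rule, the tolerance relation is not transitive: an object can be near two training examples that are not near each other, which is what gives near-set methods their characteristic overlapping neighbourhood classes.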