IIMH: Intention Identification in Multimodal Human Utterances
Date
2023
Publisher
Association for Computing Machinery
Abstract
Intention identification is a challenging problem that spans natural language processing, speech processing, and computer vision. People often use contradictory or ambiguous words in different contexts, which can make the intention behind an utterance difficult to identify. Intention identification has many practical applications in natural language processing, sentiment analysis, social media analysis, robotics, and human-computer interaction, where identifying intention yields valuable insights into user behavior. In this work, we propose a model to determine whether an utterance made by a person is intentional or non-intentional. To achieve this, we collected a multimodal dataset containing text, video, and speech from various TV shows, movies, and YouTube videos and labeled them with their corresponding intention. Features are extracted at both the utterance and word levels to capture useful information from all three modalities. We trained a baseline model using a support vector machine (SVM) to set a benchmark performance. We designed an architecture that detects contradictions between positive spoken words and negative facial expressions or speech in order to identify an utterance as non-intentional. Along with the architecture, we tried different classification approaches and obtained the best results with an SVM classifier using an RBF kernel, which reached an accuracy of 78.83% and outperformed the baseline approach. © 2023 ACM.
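The contradiction idea in the abstract (positive words paired with a negative facial expression or negative speech) can be illustrated with a minimal rule-based sketch. This is not the authors' architecture, which feeds extracted features into a trained classifier. The `ModalityScores` class, the per-modality polarity scores in [-1, 1], and the `margin` threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ModalityScores:
    """Hypothetical sentiment polarity per modality, each in [-1, 1]."""
    text: float    # polarity of the spoken words
    face: float    # polarity of the facial expression
    speech: float  # polarity of the vocal tone


def is_contradictory(scores: ModalityScores, margin: float = 0.2) -> bool:
    """Return True when positive words co-occur with a negative face or voice.

    `margin` is an assumed dead zone around zero so that near-neutral
    scores are not treated as clearly positive or negative.
    """
    text_positive = scores.text > margin
    nonverbal_negative = scores.face < -margin or scores.speech < -margin
    return text_positive and nonverbal_negative


def label_utterance(scores: ModalityScores) -> str:
    """Map the contradiction flag to the two labels used in the abstract."""
    return "non-intentional" if is_contradictory(scores) else "intentional"


if __name__ == "__main__":
    # Positive words, clearly negative facial expression.
    print(label_utterance(ModalityScores(text=0.8, face=-0.6, speech=0.1)))
    # All three modalities agree.
    print(label_utterance(ModalityScores(text=0.8, face=0.5, speech=0.3)))
```

In a learned pipeline, a flag like this would more plausibly serve as one input feature alongside the utterance-level and word-level features than as the final decision.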
Keywords
BERT, Deep Learning, Intention, Multimodal, NLP, Sentiment, SVM, Utterance-level features, Word-level features
Citation
ACM International Conference Proceeding Series, 2023, p. 337-344
