IIMH: Intention Identification in Multimodal Human Utterances

Date

2023

Publisher

Association for Computing Machinery

Abstract

Intention identification is a challenging problem spanning natural language processing, speech processing, and computer vision. People often use contradictory or ambiguous words in different contexts, which can make the intention behind an utterance difficult to identify. Intention identification has practical applications in natural language processing, sentiment analysis, social media analysis, robotics, and human-computer interaction, where identifying intention can yield valuable insights into user behavior. In this work, we propose a model to determine whether an utterance made by a person is intentional or non-intentional. To this end, we collected a multimodal dataset containing text, video, and speech from various TV shows, movies, and YouTube videos, and labeled the data with the corresponding intention. Features are extracted at both the utterance and word levels to capture useful information from all three modalities. We trained a baseline model using a support vector machine (SVM) to set a benchmark performance. We designed an architecture that detects contradictions between positive spoken words and negative facial expressions or speech in order to identify an utterance as non-intentional. Along with this architecture, we tried different classification approaches and obtained the best results with an SVM classifier using an RBF kernel, which reached an accuracy of 78.83% and outperformed the baseline approach. © 2023 ACM.
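
The abstract's two core ideas, a contradiction cue between positive words and negative facial or vocal signals, and an RBF-kernel SVM over features from all three modalities, can be illustrated with a minimal Python sketch using scikit-learn. This is not the authors' implementation. The polarity scores, feature dimensions, synthetic data, label convention (1 = non-intentional), and the helper names `contradiction_flag`, `fuse_features`, and `make_synthetic` are all hypothetical, chosen only for illustration.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def contradiction_flag(text_polarity, face_polarity, speech_polarity):
    """Return 1.0 where the text is positive but the face or speech is negative.

    Assumption: each modality has already been reduced to a polarity score
    in [-1, 1]; the paper does not specify such a representation.
    """
    text_positive = text_polarity > 0
    nonverbal_negative = (face_polarity < 0) | (speech_polarity < 0)
    return (text_positive & nonverbal_negative).astype(float)


def fuse_features(text_feats, video_feats, audio_feats, flag):
    """Concatenate per-modality feature matrices and the contradiction flag."""
    return np.hstack([text_feats, video_feats, audio_feats, flag[:, None]])


def make_synthetic(n=200, seed=0):
    """Build toy data; labels are 1 (non-intentional) exactly when the flag fires."""
    rng = np.random.default_rng(seed)
    text_pol = rng.uniform(-1, 1, n)
    face_pol = rng.uniform(-1, 1, n)
    speech_pol = rng.uniform(-1, 1, n)
    flag = contradiction_flag(text_pol, face_pol, speech_pol)
    # Two columns per modality: the polarity score plus one noise feature.
    text_feats = np.column_stack([text_pol, rng.normal(size=n)])
    video_feats = np.column_stack([face_pol, rng.normal(size=n)])
    audio_feats = np.column_stack([speech_pol, rng.normal(size=n)])
    X = fuse_features(text_feats, video_feats, audio_feats, flag)
    y = flag.astype(int)
    return X, y


def build_classifier():
    """Standardize features, then fit an SVM with an RBF kernel."""
    return make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))


if __name__ == "__main__":
    X, y = make_synthetic()
    clf = build_classifier().fit(X[:150], y[:150])
    print("held-out accuracy on toy data:", clf.score(X[150:], y[150:]))
```

The toy labels are derived directly from the contradiction flag, so any accuracy this sketch prints says nothing about the reported 78.83%; it only shows how fused features and an RBF-kernel SVM fit together.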

Keywords

BERT, Deep Learning, Intention, Multimodal, NLP, Sentiment, SVM, Utterance-level features, Word-level features

Citation

ACM International Conference Proceeding Series, 2023, pp. 337-344
