Analysis of Speaker Recognition in Blended Emotional Environment Using Deep Learning Approaches
Date
2023
Publisher
Springer Science and Business Media Deutschland GmbH
Abstract
Human conversation generally carries some emotion, and natural emotions are often blended. Today’s Speaker Recognition (SR) systems do not account for emotion. This work proposes a Speaker Recognition in Blended Emotion Environment (SRBEE) system to enhance SR in an emotional context. Speaker Recognition algorithms achieve nearly perfect performance on neutral speech, but this does not hold for emotional speech. This work attempts to recognize speakers in blended emotion using Mel-Frequency Cepstral Coefficients (MFCC) feature extraction and a Conv2D classifier. In a blended emotional environment, calculating the accuracy of the Speaker Recognition task is complex. Utterances blending four basic natural emotions (happy, sad, angry, and fearful) were tested in the proposed system to reduce SR’s complexity in a blended emotional environment. The proposed system achieves an average accuracy of 99.3% for blended emotion with neutral speech and 92.8% for the four basic blended natural emotions (happy, sad, angry, and fearful). The dataset was prepared by blending two emotions in one utterance. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023.
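The abstract states that the dataset was prepared by blending two emotions in one utterance, but it does not say how the blend was formed. The following is a minimal sketch under one assumption: that a blended utterance is two single-emotion clips from the same speaker joined end to end. The function names (blend_utterances, emotion_pairs, build_blended_set) and the clip layout are hypothetical and are not taken from the paper.

```python
from itertools import combinations

# The four basic emotions named in the abstract.
EMOTIONS = ("happy", "sad", "angry", "fearful")


def blend_utterances(first, second):
    """Join two sample sequences end to end.

    Assumption: concatenation stands in for the paper's blending step,
    which the abstract does not describe.
    """
    return list(first) + list(second)


def emotion_pairs(emotions=EMOTIONS):
    """Return every unordered pair of distinct emotions."""
    return list(combinations(emotions, 2))


def build_blended_set(clips):
    """Build (speaker, (emotion_a, emotion_b), samples) records.

    clips is a hypothetical dict mapping (speaker, emotion) to a list of
    audio samples. The speaker ID is kept as the recognition label.
    """
    blended = []
    speakers = sorted({speaker for speaker, _ in clips})
    for speaker in speakers:
        for a, b in emotion_pairs():
            if (speaker, a) in clips and (speaker, b) in clips:
                samples = blend_utterances(clips[(speaker, a)], clips[(speaker, b)])
                blended.append((speaker, (a, b), samples))
    return blended
```

In a pipeline like the one the abstract outlines, each blended sample sequence would then go through MFCC extraction before reaching the classifier; that stage is left out here because the abstract gives no parameters for it.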
Keywords
Blended emotion, Convolutional Neural Network, Mel Frequency Cepstral Coefficients, Speaker Recognition, Speaker Recognition in Blended Emotion Environment, Valence
Citation
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2023, Vol. 14301 LNCS, p. 691-698
