Conference Papers
Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506
EnsembleWave: An ensembled approach for Automatic Speech Emotion Recognition
(Institute of Electrical and Electronics Engineers Inc., 2022)
Barkur, R.; Deepansh; I Suresh, D.; Mahesh Kumar, T.N.; Narasimhadhan, A.V.

Accurately recognizing emotions from speech, and understanding the factors that determine the judgment, can improve the quality of a machine's decision-making. Current state-of-the-art architectures have focused on either deep learning-based approaches or hand-engineered features. As a result, models fail to gather complete contextual information and generalize weakly across different datasets. This paper presents an end-to-end, ensemble-based deep learning architecture that examines raw speech signals and classifies them into four basic emotions: Sad, Angry, Happy, and Neutral. The proposed EnsembleWave architecture combines Attention Wavenet with hand-engineered feature extraction to assimilate a larger field of view and capture dataset-independent characteristics. The model achieves overall accuracies of 98%, 85%, 74%, and 99% on four well-known Speech Emotion Recognition (SER) datasets (EMO-DB, SAVEE, CREMA-D, and TESS, respectively), outperforming state-of-the-art techniques both quantitatively and qualitatively. The proposed architecture can also learn a generalized categorization of emotions across different datasets. The Python source code of the proposed model will be available at https://github.com/deepanshi-s/EnsembleWave

© 2022 IEEE.
