Transformation of Emotional Speech to Anger Speech to Reduce Mismatches in Testing and Enrollment Speech for Speaker Recognition System

Tomar, S.; Koolagudi, S.G.

dc.contributor.author	Tomar, S.
dc.contributor.author	Koolagudi, S.G.
dc.date.accessioned	2026-02-06T06:33:15Z
dc.date.issued	2025
dc.description.abstract	Speaker Recognition (SR) is a critical component of digital speech processing. The robustness of Speaker Recognition systems is compromised by the variance in speakers’ emotional states. According to a study on SR utilizing emotive speech, it seems complicated to distinguish between emotions like “anger,” “sad,” “fear,” and “happy”. Developing a speaker recognition model that works effectively using emotional speech is challenging, specifically in the case of some intense emotions like anger. This work explores emotional speech transformation approaches to reduce the mismatch between training and testing emotional speech for the SR tasks. The recommended effort aims to develop speech transformation techniques to transform different emotional speech into anger. This study modifies the prosodic features “TPIB” (Tempo, Pitch, Intensity, and Brightness) to transform the speech from neutral, happy, fearful, and sad emotions to anger. Performance evaluations of the SR system employing transformed emotional speech are obtained through integrating Mel-Spectrogram feature extraction and deep learning techniques, including the CREMA-D and NITK-KLESC datasets. The experiment results demonstrate that the suggested emotional speech transformation technique increases SR accuracy in transforming neutral by approximately 15%, happy by 11%, sad by 32%, and fear by 30%. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
dc.identifier.citation	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2025, Vol.15300 LNAI, p. 185-200
dc.identifier.issn	0302-9743
dc.identifier.uri	https://doi.org/10.1007/978-3-031-78014-1_14
dc.identifier.uri	https://idr.nitk.ac.in/handle/123456789/28543
dc.publisher	Springer Science and Business Media Deutschland GmbH
dc.subject	Brightness
dc.subject	Emotional speech transformation
dc.subject	Intensity
dc.subject	Pitch
dc.subject	Speaker recognition
dc.subject	Tempo
dc.subject	Timbre features
dc.title	Transformation of Emotional Speech to Anger Speech to Reduce Mismatches in Testing and Enrollment Speech for Speaker Recognition System

Collections

Conference Papers
