Identification of Speaker-Specific Features to Minimize the Mismatch Outcomes for Speaker Recognition Using Anger and Happy Emotional Speech

dc.contributor.authorTomar, S.
dc.contributor.authorKoolagudi, S.G.
dc.date.accessioned2026-02-06T06:33:27Z
dc.date.issued2025
dc.description.abstractA vital component of digital speech processing is Speaker Recognition (SR). However, variation in speakers’ emotional states, such as happiness, anger, sadness, or fear, poses a significant challenge that compromises the robustness of speaker recognition systems. Research on SR using emotive speech shows that emotions such as “anger” and “happy” are particularly difficult to distinguish. This study examines prosody-related speech characteristics to determine how to separate “anger” and “happy” emotional speech for SR tasks, with the goal of identifying speaker-specific features. The experimental outcomes demonstrate that Intensity, Pitch, and Brightness (IPB) variables, used as speaker-specific features for the SR task, can distinguish between angry and happy emotional speech. Combining IPB and MFCC (IPBCC) feature extraction with a hybrid CNN-LSTM model augmented with an attention mechanism achieves an SR accuracy of 95.45% for anger and 96.22% for happy emotional speech. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
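The abstract names Intensity, Pitch, and Brightness (IPB) as the prosody-related features. A minimal NumPy sketch of how such frame-level features are commonly computed is shown below; the authors' exact extraction settings (frame size, pitch search band, windowing) are not given in this record, so the values here are illustrative assumptions, and the MFCC component of IPBCC is omitted.

```python
import numpy as np

def ipb_features(signal, sr, frame_len=1024, hop=512):
    """Frame-level intensity (RMS energy), pitch estimate (autocorrelation
    peak), and brightness (spectral centroid) for a mono signal.
    Illustrative sketch only; not the paper's exact pipeline."""
    feats = []
    window = np.hanning(frame_len)  # reduce spectral leakage for the centroid
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        # Intensity: root-mean-square energy of the frame
        intensity = np.sqrt(np.mean(frame ** 2))
        # Pitch: autocorrelation peak within an assumed 60-400 Hz F0 band
        ac = np.correlate(frame, frame, mode="full")[frame_len - 1:]
        lo, hi = sr // 400, sr // 60
        pitch = sr / (lo + np.argmax(ac[lo:hi]))
        # Brightness: spectral centroid (energy-weighted mean frequency)
        spec = np.abs(np.fft.rfft(frame * window))
        freqs = np.fft.rfftfreq(frame_len, 1.0 / sr)
        brightness = np.sum(freqs * spec) / (np.sum(spec) + 1e-12)
        feats.append((intensity, pitch, brightness))
    return np.array(feats)
```

For a pure 200 Hz tone, each frame's pitch and brightness estimates land near 200 Hz, which is a quick sanity check before running the extractor on real emotional speech.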
dc.identifier.citationCommunications in Computer and Information Science, 2025, Vol. 2389 CCIS, p. 63-76
dc.identifier.issn1865-0929
dc.identifier.urihttps://doi.org/10.1007/978-3-031-91331-0_5
dc.identifier.urihttps://idr.nitk.ac.in/handle/123456789/28665
dc.publisherSpringer Science and Business Media Deutschland GmbH
dc.subjectBrightness
dc.subjectIntensity
dc.subjectPitch
dc.subjectSpeaker Recognition using Emotional Speech
dc.titleIdentification of Speaker-Specific Features to Minimize the Mismatch Outcomes for Speaker Recognition Using Anger and Happy Emotional Speech