NITK-TIEKLS: A Text-Independent Emotional Kannada Language Speech Dataset for Speaker Recognition
Date
2025
Publisher
Springer Science and Business Media Deutschland GmbH
Abstract
Speaker recognition systems have traditionally relied on consistent speech content to identify individuals. Text-independent speaker recognition, which identifies speakers irrespective of the spoken content, is a more flexible and robust alternative, especially in real-world scenarios. This research aims to advance text-independent speaker recognition by introducing a dataset for the Speaker Recognition (SR) task: the National Institute of Technology Karnataka - Text-Independent Emotional Kannada Language Speech (NITK-TIEKLS) dataset. For this dataset, 200 natives of the state of Karnataka, India, recorded emotional speech in the Kannada language. The neutral text-independent speech consists of 4 min per speaker. The two emotional utterances, drawn from any two of the emotions anger, happiness, sadness, and fear, are also text-independent and consist of 2 min. The total duration is approximately 30 h. The study covers acquiring, developing, processing, analyzing, and evaluating the dataset. The dataset is evaluated through the performance of an SR system that uses deep learning techniques with the proposed Wavelet-Mel Spectrogram. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
Keywords
Mel Spectrogram, Pitch, Speaker Recognition, Speaker Recognition in Emotional Environment, Tempo, Text-independent emotional speech, Wavelet Spectrogram, Zero Crossing Rate
Citation
Communications in Computer and Information Science, 2025, Vol. 2389 CCIS, pp. 137-151
