NITK-TIEKLS: A Text-Independent Emotional Kannada Language Speech Dataset for Speaker Recognition

Date

2025

Publisher

Springer Science and Business Media Deutschland GmbH

Abstract

Speaker recognition systems have traditionally relied on the consistency of speech content to identify individuals. However, text-independent speaker recognition, which identifies speakers irrespective of the spoken content, presents a more flexible and robust alternative, especially in real-world scenarios. This research focuses on enhancing text-independent speaker recognition by introducing a dataset for the Speaker Recognition (SR) task: the National Institute of Technology Karnataka - Text-Independent Emotional Kannada Language Speech (NITK-TIEKLS) dataset. For this dataset, 200 native speakers from the state of Karnataka, India, recorded emotional speech in the Kannada language. The neutral text-independent speech consists of 4 min of speech for each speaker. The two emotional utterances per speaker, drawn from any two of the emotions anger, happiness, sadness, and fear, are also text-independent and consist of 2 min of speech. The total duration is approximately 30 h. The study covers acquiring, developing, processing, analyzing, and evaluating the proposed dataset. The evaluation reports the performance of an SR system on the dataset using deep learning techniques with the proposed Wavelet-Mel Spectrogram. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

Keywords

Mel Spectrogram, Pitch, Speaker Recognition, Speaker Recognition in Emotional Environment, Tempo, Text-independent emotional speech, Wavelet Spectrogram, Zero Crossing Rate

Citation

Communications in Computer and Information Science, 2025, Vol. 2389 CCIS, pp. 137-151
