Faculty Publications

Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736

Publications by NITK Faculty

Search Results

Now showing 1 - 4 of 4
  • Item
    NITK-KLESC: Kannada Language Emotional Speech Corpus for Speaker Recognition
    (Institute of Electrical and Electronics Engineers Inc., 2023) Tomar, S.; Gupta, P.; Koolagudi, S.G.
    This work introduces an emotional speech dataset for the Speaker Recognition (SR) task. The proposed dataset is recorded in the Kannada language from the people of the Karnataka state of India. The speech dataset is collected by simulating five different emotions: Fear, Sad, Anger, Happy, and Neutral. The dataset is named the National Institute of Technology Karnataka, India - Kannada Language Emotional Speech Corpus (NITK-KLESC). The proposed dataset will be useful for SR tasks in various emotions, as well as for emotion recognition, analysis of emotional speech, speech recognition, gender identification, and age identification of the age group 20 to 50 years. The proposed work describes the development, processing, analysis, acquisition, and evaluation of the NITK-KLESC dataset. The analysis of emotional speech was done by considering basic speech parameters such as Pitch, Tempo, Intensity, and Zero Crossing Rate (ZCR). The characteristics of the dataset are reported using MFCC feature extraction with a CNN model as the classifier, and compared with the existing EmoDB dataset. The average accuracy of the Emotional Speech Speaker Recognition (ESSR) task was measured at 84.44% with the EmoDB dataset and 95.2% with the proposed NITK-KLESC dataset. © 2023 IEEE.
  • Item
    NITK-TIEKLS: A Text-Independent Emotional Kannada Language Speech Dataset for Speaker Recognition
    (Springer Science and Business Media Deutschland GmbH, 2025) Tomar, S.; Koolagudi, S.G.
    Speaker recognition systems have traditionally relied on the consistency of speech content to identify individuals. However, text-independent speaker recognition, which works irrespective of the spoken content, presents a more flexible and robust alternative, especially in real-world scenarios. This research focuses on enhancing text-independent speaker recognition by introducing a dataset for the Speaker Recognition (SR) task. The dataset is named the National Institute of Technology Karnataka - Text-Independent Emotional Kannada Language Speech (NITK-TIEKLS) dataset. Two hundred natives of the Karnataka state of India recorded emotional speech in the Kannada language for the proposed dataset. The neutral text-independent speech consists of a 4-min speech duration for each speaker. The two emotional speech utterances, from any two of the emotions anger, happiness, sadness, and fear, are text-independent speech utterances that consist of 2 min. The total duration is approximately 30 h. The proposed study includes developing, processing, analyzing, acquiring, and evaluating the proposed dataset. The performance of the SR system on the proposed dataset is evaluated through deep learning techniques with the proposed Wavelet-Mel Spectrogram. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
  • Item
    Transformation of Emotional Speech to Anger Speech to Reduce Mismatches in Testing and Enrollment Speech for Speaker Recognition System
    (Springer Science and Business Media Deutschland GmbH, 2025) Tomar, S.; Koolagudi, S.G.
    Speaker Recognition (SR) is a critical component of digital speech processing. The robustness of Speaker Recognition systems is compromised by the variance in speakers’ emotional states. According to a study on SR utilizing emotive speech, it seems complicated to distinguish between emotions like “anger,” “sad,” “fear,” and “happy”. Developing a speaker recognition model that works effectively using emotional speech is challenging, specifically in the case of some intense emotions like anger. This work explores emotional speech transformation approaches to reduce the mismatch between training and testing emotional speech for the SR tasks. The recommended effort aims to develop speech transformation techniques to transform different emotional speech into anger. This study modifies the prosodic features “TPIB” (Tempo, Pitch, Intensity, and Brightness) to transform the speech from neutral, happy, fearful, and sad emotions to anger. Performance evaluations of the SR system employing transformed emotional speech are obtained through integrating Mel-Spectrogram feature extraction and deep learning techniques, including the CREMA-D and NITK-KLESC datasets. The experiment results demonstrate that the suggested emotional speech transformation technique increases SR accuracy in transforming neutral by approximately 15%, happy by 11%, sad by 32%, and fear by 30%. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
  • Item
    Novel eco-friendly synthesis of graphene directly from graphite using 2,2,6,6-tetramethylpiperidine 1-oxyl and study of its electrochemical properties
    (Elsevier B.V., 2015) Subramanya, B.; Bhat, D.K.
    Herein we report a simple, low-cost, highly efficient and environment-friendly one-pot method for the high-throughput synthesis of graphene directly from graphite using 2,2,6,6-tetramethylpiperidine 1-oxyl (TEMPO) and H₂O₂ under microwave irradiation. The formation mechanism of graphene nanosheets (GNS), as investigated by Raman spectroscopy and electron microscopy techniques, reveals surface defect generation, intercalation and exfoliation as the main steps. The rapid and local Joule heating of graphite by microwave radiation results in simultaneous deoxygenation and exfoliation, forming GNS. The as-synthesized GNS are a few layers thick with a high surface area of 937.6 m² g⁻¹ and a high C/O ratio of 9.2. These results open the perspective of replacing toxic oxidizing and reducing agents by environment-friendly chemicals of similar efficacy, thus facilitating the large-scale production of GNS by a greener method. Furthermore, GNS exhibit good electrochemical performance with a large specific capacitance (197 F g⁻¹), excellent rate capability and a long cycle life (1000 cycles) in neat 1-ethyl-3-methylimidazolium tetrafluoroborate (EMIMBF₄) electrolyte. They also have a high energy density of 76.03 W h kg⁻¹ while simultaneously possessing a high power density of 1.12 kW kg⁻¹. © 2014 Elsevier B.V.