Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506

  • Item
    Gender Identification from Children's Speech
    (Institute of Electrical and Electronics Engineers Inc., 2018) Ramteke, P.B.; Dixit, A.A.; Supanekar, S.; Dharwadkar, N.V.; Koolagudi, S.G.
    Children's speech is characterized by higher pitch and formant frequencies than adult speech. Identifying gender from children's speech is difficult because there is no significant difference between the acoustic properties of male and female children. Here, an attempt has been made to explore features that discriminate gender efficiently from children's speech. Different combinations of spectral features such as Mel-frequency cepstral coefficients (MFCCs), ΔMFCCs and ΔΔMFCCs, formants, and linear predictive cepstral coefficients (LPCCs); shimmer and jitter; and prosodic features such as pitch and its statistical variations, along with Δpitch-related features, are explored. The features are evaluated using nonlinear classifiers, namely Artificial Neural Networks (ANNs), Deep Neural Networks (DNNs) and Random Forest (RF). The results show that RF achieves the highest accuracy among the classifiers, 84.79%. © 2018 IEEE.
  • Item
    NITK Kids' Speech Corpus
    (International Speech Communication Association, 2019) Ramteke, P.B.; Supanekar, S.; Hegde, P.; Nelson, H.; Aithal, V.; Koolagudi, S.G.
    This paper introduces a speech database for analyzing children's speech. The proposed database is recorded in Kannada (one of the South Indian languages) from children aged 2½ to 6½ years. The database is named the National Institute of Technology Karnataka Kids' Speech Corpus (NITK Kids' Speech Corpus). The relevant design considerations for the database collection are discussed in detail. The database is divided into four age groups with an interval of 1 year between each age group. The speech corpus includes nearly 10 hours of speech recordings from 160 children. For each age range, the data is recorded from 40 children (20 male and 20 female). Further, the effect of developmental changes on speech from 2½ to 6½ years is analyzed using pitch and formant analysis. Some potential applications of the NITK Kids' Speech Corpus, such as systematic study of the language learning ability of children, phonological process analysis and children's speech recognition, are discussed. © 2019 ISCA
  • Item
    Gender Identification using Spectral Features and Glottal Closure Instants (GCIs)
    (Institute of Electrical and Electronics Engineers Inc., 2019) Ramteke, P.B.; Supanekar, S.; Koolagudi, S.G.
    Automatic identification of gender from speech may help to improve the performance of systems such as speaker speech recognition, forensic analysis, and authentication processes. The difference in the physiological parameters of male and female vocal folds results in significant changes in their vocal fold vibration patterns. These changes can be characterized by the differences in the duration of their glottal closure. In this paper, an attempt has been made at gender recognition from speech using spectral features such as MFCCs, LPCCs, etc.; pitch (F0); and excitation source features such as glottal closure instants (GCIs) and their statistical variations. Western Michigan University's Gender dataset is used for experimentation. The dataset is collected from 93 speakers, 45 male and 48 female. Random forests (RFs) and support vector machines (SVMs) are used to measure the performance of the proposed features. Random forest is observed to achieve an average frame-level accuracy of 96.908% using 13 MFCCs, 13 LPCCs, Pitch (F0) and GCI Stats (5). SVM is observed to achieve an average accuracy of 98.607% using 13 MFCCs, 13 LPCCs and GCI Stats (5). The results show that the proposed features are efficient in discriminating gender from speech. © 2019 IEEE.