Conference Papers

Permanent URI for this collectionhttps://idr.nitk.ac.in/handle/123456789/28506

Browse

Search Results

Now showing 1 - 5 of 5
  • Item
    Feature analysis for mispronounced phonemes in the case of alvoelar approximant (/r/) substituted with voiced dental consonant (/∂/)
    (Institute of Electrical and Electronics Engineers Inc., 2015) Ramteke, P.B.; Koolagudi, S.G.; Prabhakar, A.
    Mispronunciation is commonly observed in children from age 2 to 8 years. Some of the common mispronunciations are stopping, fronting, backing and affrication. These processes are known as phonological processes. Identification of these processes is crucial in studying the vocal tract development pattern and treating the phonological disorders in children. The features that clearly discriminate correctly pronounced phoneme from corresponding mispronounced phoneme have to be compared to identify the phonological processes. This paper focuses on the analysis of mispronounced alveolar approximant (/r/) substituted with voiced fricative consonant (/∂/). In this work, spectral and pitch related features are considered for the analysis using scatter plots and histograms. From the analysis, it is observed that the energy feature against 2nd and 4th cepstral coefficients achieves 75% and 65% discrimination respectively. © 2015 IEEE.
  • Item
    Recognition of repetition and prolongation in stuttered speech using ANN
    (Springer Science and Business Media Deutschland GmbH info@springer-sbm.com, 2016) Savin, P.S.; Ramteke, P.B.; Koolagudi, S.G.
    This paper mainly focuses on repetition and prolongation detection in stuttered speech signal. The acoustic and pitch related features like Mel-frequency cepstral coefficients (MFCCs), formants, pitch, zero crossing rate (ZCR) and Energy are used to test the effectiveness in recognizing repetitions and prolongations in stammered speech. Artificial Neural Networks (ANN) are used as classifier. The results are evaluated using combination of different features. The results show that the ANN classifier trained using MFCC features achieves an average accuracy of 87.39% for repetition and prolongation recognition. © Springer India 2016.
  • Item
    Repetition detection in stuttered speech
    (Springer Science and Business Media Deutschland GmbH info@springer-sbm.com, 2016) Ramteke, P.B.; Koolagudi, S.G.; Afroz, F.
    This paper mainly focuses on detection of repetitions in stuttered speech. The stuttered speech signal is divided into isolated units based on energy. Mel-frequency cepstrum coefficients (MFCCs), formants and shimmer are used as features for repetition recognition. These features are extracted from each isolated unit. Using Dynamic Time Warping (DTW) the features of each isolated unit are compared with those subsequent units within one second interval of speech. Based on the analysis of scores obtained from DTW a threshold is set, if the score is below the set threshold then the units are identified as repeated events. Twenty seven seconds of speech data used in this work, consists of 50 repetition events. The result shows that the combination of MFCCs, formants and shimmer can be used for the recognition of repetitions in stuttered speech. Out of 50 repetitions, 47 are correctly identified. © Springer India 2016.
  • Item
    Text-independent automatic accent identification system for Kannada language
    (Springer Verlag service@springer.de, 2017) Soorajkumar, R.; Girish, G.N.; Ramteke, P.B.; Joshi, S.S.; Koolagudi, S.G.
    Accent identification is one of the applications paid more attention in speech processing.Atext-independent accent identification system is proposed using Gaussian mixturemodels (GMMs) for Kannada language. Spectral and prosodic features such as Mel-frequency cepstral coefficients (MFCCs), pitch, and energy are considered for the experimentation. The dataset is collected from three regions of Karnataka namely Mumbai Karnataka, Mysore Karnataka, and Karavali Karnataka having significant variations in accent. Experiments are conducted using 32 speech samples from each region where each clip is of one minute duration spoken by native speakers. The baseline system implemented using MFCC features found to achieve 76.7% accuracy. From the results it is observed that the hybrid features improve the performance of the system by 3 %. © Springer Science+Business Media Singapore 2017.
  • Item
    Automatic text-independent Kannada dialect identification system
    (Springer Verlag service@springer.de, 2019) Chittaragi, N.B.; Limaye, A.; Chandana, N.T.; Annappa, B.; Koolagudi, S.G.
    This paper proposes a dialect identification system for the Kannada language. A system that can automatically identify the dialects of the language being spoken has a wide variety of applications. However, not many Automatic Speech Recognition (ASR) and dialect identification tasks are carried out in majority of the Indian languages. Further, there are only a few good quality annotated audio datasets available. In this paper, a new dataset for 5 spoken dialects of the Kannada language is introduced. Spectral and prosodic features have captured the most prominent features for recognition of Kannada dialects. Support Vector Machine (SVM) and neural networks algorithms are used for modeling text-independent recognition system. A neural network model that attempts for identification dialects based on sentence level cues has also been built. Hyper-parameters for SVM and neural network models are chosen using grid search. Neural network models have outperformed SVMs when complete utterances are considered. © Springer Nature Singapore Pte Ltd. 2019.