Conference Papers

Feature selection and model optimization for semi-supervised speaker spotting
(European Signal Processing Conference, EUSIPCO, 2016) Chetupalli, S.R.; Gopalakrishnan, A.; Sreenivas, T.V.
We explore experimentally the selection of features and the optimization of stochastic model parameters for the problem of speaker spotting. Starting from an initially identified segment of a speaker's speech, an iterative model refinement method is developed together with a latent variable mixture model, so that segments of the same speaker can be identified in a long speech record. We find that a GMM with a moderate number of mixture components is better suited to the task than the large mixture models used in speaker identification. Similarly, a PCA-based low-dimensional projection of the MFCC-based feature vector gives better performance. We show that about 6 seconds of initially identified speaker data is sufficient to achieve > 90% performance in speaker segment identification. © 2016 IEEE.
Identification of Phonological Process: Final Consonant Deletion from Childrens' Speech
(Institute of Electrical and Electronics Engineers Inc., 2018) Ramteke, P.B.; Supanekar, S.; Koolagudi, S.G.
Children in the age range of 2 1/2 to 6 1/2 years face difficulties in pronunciation due to an underdeveloped vocal tract and neuromotor control. They tend to substitute a simpler class of sounds for sounds that are difficult for them to pronounce. These pronunciation error patterns are called phonological processes. Phonological processes disappear as the child grows older, and their analysis gives a measure of children's language learning ability over time. The appearance of these processes after the specified age (8 years) indicates a phonological disorder. In this paper, final consonant deletion, one of the phonological processes in the Kannada language, is considered for analysis. In final consonant deletion, a consonant, part of a syllable, a syllable, or part of a word that appears at the end of the word is deleted. Because part of the word is deleted, features that are effective in speech recognition, namely MFCCs and LPCCs, are explored for the analysis. The dynamic time warping (DTW) algorithm is used to compare the correct and mispronounced words and to identify the region of final consonant deletion. The DTW comparison path is observed to warp around the end of the mispronounced word, where part of the word is deleted. A combination of 13 MFCCs and 13 LPCCs is observed to achieve the highest accuracy, 72.68%, within a tolerance range of ±50 ms. The results show that features effective in speech recognition are also effective in identifying final consonant deletion. © 2018 IEEE.
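
The DTW comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `dtw` and `trailing_stall` are hypothetical, the frames are one-dimensional toy values rather than MFCC/LPCC vectors, and the Euclidean frame distance is an assumption. It shows only the general idea that, when the end of a word is missing, the alignment path keeps advancing through the reference while staying on the last frame of the shortened word.

```python
import math


def dtw(ref, test):
    """Align two sequences of feature vectors; return (total_cost, path).

    path is a list of (ref_index, test_index) pairs from start to end.
    """
    n, m = len(ref), len(test)
    # cost[i][j] = best cost of aligning ref[:i] with test[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(ref[i - 1], test[j - 1])  # Euclidean frame distance (assumed)
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end to recover the warping path.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
        path.append((i - 1, j - 1))
    path.reverse()
    return cost[n][m], path


def trailing_stall(path):
    """Count reference frames at the end of the path that have no new test frame.

    A long trailing stall suggests the end of the test word is missing.
    """
    last_j = path[-1][1]
    count = 0
    for _, j in reversed(path):
        if j != last_j:
            break
        count += 1
    return count - 1  # the first match on the last test frame is not a stall


# Toy data: the "mispronounced" word lacks the last two frames of the reference.
reference = [[0.0], [1.0], [2.0], [3.0], [4.0]]
shortened = [[0.0], [1.0], [2.0]]

total_cost, path = dtw(reference, shortened)
stall = trailing_stall(path)
```

In this toy case the last two reference frames are both matched to the final frame of the shortened sequence, so `stall` is 2. A real system would compute frame-level features from audio and convert the stalled frames to a time region.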
