Conference Papers
Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506
Search Results (2 items)
Item: Contribution of Telugu vowels in identifying emotions (Institute of Electrical and Electronics Engineers Inc., 2015) — Shashidhar Koolagudi, G.; Shivakranthi, B.; Sreenivasa Rao, K.S.; Ramteke, P.B.

This work mainly aims to identify the contribution of different vowels to emotion in the Telugu language. Instead of processing the entire speech signal, we propose to focus only on the vowel parts of the utterance (/a/, /i/, /u/, /e/ and /o/). By analysing the vowels, we can discriminate between emotions. In this work, spectral and prosodic features are used to study the effect of emotions on different vowels. Even though prosodic features are the best discriminators of emotions at the utterance level, spectral features are more useful at the phoneme level. One may observe that the same vowel exhibits different spectral behaviour when expressed in different emotions. Shimmer and jitter play a crucial role in classifying emotions using vowels. The semi-natural database used in this work is collected from Telugu movies. Gaussian Mixture Models (GMMs) are used as the mathematical models for classification. The emotions considered in this work are anger, fear, happy, sad and neutral. The average emotion recognition performance obtained by combining MFCCs, formants, intensity, shimmer and jitter is around 78%. © 2015 IEEE.

Item: Repetition detection in stuttered speech (Springer Science and Business Media Deutschland GmbH, 2016) — Ramteke, P.B.; Koolagudi, S.G.; Afroz, F.

This paper mainly focuses on the detection of repetitions in stuttered speech. The stuttered speech signal is divided into isolated units based on energy. Mel-frequency cepstrum coefficients (MFCCs), formants and shimmer are used as features for repetition recognition. These features are extracted from each isolated unit. Using Dynamic Time Warping (DTW), the features of each isolated unit are compared with those of subsequent units within a one-second interval of speech.
Based on an analysis of the scores obtained from DTW, a threshold is set; if the score falls below this threshold, the units are identified as repeated events. The twenty-seven seconds of speech data used in this work contain 50 repetition events. The results show that the combination of MFCCs, formants and shimmer can be used to recognise repetitions in stuttered speech. Out of 50 repetitions, 47 are correctly identified. © Springer India 2016.
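The DTW-and-threshold comparison described in the second abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the feature sequences are synthetic stand-ins for per-unit MFCC/formant/shimmer frames, and the threshold value of 1.0 is an assumed placeholder (the paper derives its threshold from an analysis of DTW scores).

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping distance between two feature
    sequences a (n x d) and b (m x d), normalised by path length."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m] / (n + m)

# Synthetic stand-ins for isolated speech units (frames x features).
rng = np.random.default_rng(0)
unit = rng.standard_normal((20, 13))
repeat = unit + 0.05 * rng.standard_normal((20, 13))  # near-copy of unit
other = rng.standard_normal((25, 13))                 # unrelated unit

THRESHOLD = 1.0  # assumed value; set from DTW score analysis in practice
print(dtw_distance(unit, repeat) < THRESHOLD)  # near-copy: repetition
print(dtw_distance(unit, other) < THRESHOLD)   # unrelated: no repetition
```

In the paper's setting, `unit` would be compared against every subsequent unit whose onset falls within one second of speech, and pairs scoring below the threshold are flagged as repeated events.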
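The GMM-based emotion classification in the first abstract follows a standard pattern: fit one generative model per emotion on vowel features, then label a test vector with the emotion whose model scores it highest. Below is a toy sketch of that pattern in which a single diagonal Gaussian per emotion stands in for a full mixture; the features and data are synthetic and illustrative, not the paper's MFCC/formant/intensity/shimmer/jitter vectors.

```python
import numpy as np

rng = np.random.default_rng(1)
emotions = ["anger", "fear", "happy", "sad", "neutral"]

# Synthetic training data: each emotion's vowel features cluster
# around a different mean (purely illustrative).
train = {e: rng.standard_normal((100, 8)) + 3 * i
         for i, e in enumerate(emotions)}

# "Training": estimate per-emotion mean and variance.
params = {e: (x.mean(axis=0), x.var(axis=0) + 1e-6)
          for e, x in train.items()}

def log_likelihood(x, mean, var):
    """Diagonal-Gaussian log-likelihood of a feature vector."""
    return -0.5 * np.sum((x - mean) ** 2 / var + np.log(2 * np.pi * var))

def classify(x):
    """Pick the emotion whose model assigns x the highest likelihood."""
    return max(emotions, key=lambda e: log_likelihood(x, *params[e]))

# A test vector drawn near the "sad" cluster.
sample = rng.standard_normal(8) + 3 * emotions.index("sad")
print(classify(sample))
```

A full GMM replaces each single Gaussian with a weighted sum of components (e.g. `sklearn.mixture.GaussianMixture`), but the decision rule — maximum likelihood across per-emotion models — is the same.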
