Faculty Publications
Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736
Publications by NITK Faculty
3 results
Search Results
Item
Age approximation from speech using Gaussian mixture models
(IEEE Computer Society help@computer.org, 2013) Mittal, T.; Barthwal, A.; Koolagudi, S.G.
In this work, spectral features are extracted from speech to classify speakers by their age. Mel frequency cepstral coefficients (MFCCs) are explored as features. Gaussian mixture models (GMMs) are proposed as classifiers. The age groups considered in this study are 1-10, 11-20, 21-30, 31-40 and 41-50. The age-group database used in this work is recorded in Hindi from speakers of different ages and dialects and contains five Hindi text prompts. The text prompts are constructed from textually neutral Hindi words recorded in neutral emotion, which are used for characterizing the age group for both male and female speakers. Average age recognition performance on the multiple-speaker database is observed to be around 92.0%. © 2013 IEEE.

Item
Sentence-Based Dialect Identification System Using Extreme Gradient Boosting Algorithm
(Springer, 2020) Chittaragi, N.B.; Koolagudi, S.G.
In this paper, a dialect identification system (DIS) is proposed by exploring dialect-specific prosodic features and cepstral coefficients from sentence-level utterances. People belonging to a specific region commonly share a unique speaking style, known as a dialect. Sentence speech units are chosen for dialect identification since it is observed that unique intonation and energy patterns are followed in sentences. Sentences are derived from the standard Intonational Variations in English (IViE) speech dataset. In this paper, pitch and energy contours are used to derive intonation and energy features, respectively, by using a Legendre polynomial fit function along with five statistical features. Further, Mel frequency cepstral coefficients (MFCCs) are added to capture dialect-specific spectral information. The Extreme Gradient Boosting (XGB) ensemble method is employed to evaluate the system on individual features and on combinations of features. The results indicate the influence of both prosodic and spectral features on the recognition of dialects; combined feature vectors show a better DIS performance of about 89.6%. © 2020, Springer Nature Singapore Pte Ltd.

Item
Dialect Identification Using Spectral and Prosodic Features on Single and Ensemble Classifiers
(Springer Verlag, 2018) Chittaragi, N.B.; Prakash, A.; Koolagudi, S.G.
In this paper, the significance of the spectral and prosodic behaviors of the speech signal is investigated for dialect identification. Spectral features such as cepstral coefficients, spectral flux, and entropy are extracted from shorter frames. Prosodic attributes such as pitch, energy, and duration are derived from longer frames. The IViE (Intonational Variations in English) speech corpus, covering nine dialectal regions of the British Isles, is used to evaluate the proposed approach. Since the corpus is available in both read and semi-spontaneous modes, the influence of spectral and prosodic behavior over these datasets is distinguishably articulated. Further, two distinct classification approaches, namely a support vector machine (SVM) and an ensemble of decision trees along with the SVM, are used for identification of the nine dialects. Dialect-discriminating information captured from both feature types is used for constructing feature vectors. Experiments have been conducted on individual features and on combinations of features. A better dialect recognition performance is observed with ensemble methods than with a single independent SVM. © 2017, King Fahd University of Petroleum & Minerals.
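
As an illustration of the contour features mentioned in the Sentence-Based Dialect Identification System abstract above, the following is a minimal sketch of a Legendre polynomial fit over a pitch or energy contour, combined with five summary statistics. It is not the authors' implementation. The function names, the polynomial degree of 3, and the choice of statistics (mean, standard deviation, minimum, maximum, range) are assumptions made for this example; the abstract does not specify them.

```python
from statistics import mean, pstdev


def legendre_basis(x, degree):
    """Return [P_0(x), ..., P_degree(x)] using Bonnet's recurrence."""
    values = [1.0, x]
    for n in range(1, degree):
        # (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: degree + 1]


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def legendre_fit(contour, degree=3):
    """Least-squares Legendre coefficients of a contour mapped onto [-1, 1].

    The contour needs at least degree + 1 samples.
    """
    n = len(contour)
    xs = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    basis = [legendre_basis(x, degree) for x in xs]
    k = degree + 1
    # Normal equations: (B^T B) c = B^T y
    gram = [[sum(row[i] * row[j] for row in basis) for j in range(k)] for i in range(k)]
    rhs = [sum(row[i] * y for row, y in zip(basis, contour)) for i in range(k)]
    return solve_linear(gram, rhs)


def contour_features(contour, degree=3):
    """Concatenate Legendre coefficients with five statistics (assumed set)."""
    stats = [
        mean(contour),
        pstdev(contour),
        min(contour),
        max(contour),
        max(contour) - min(contour),
    ]
    return legendre_fit(contour, degree) + stats
```

Calling `contour_features(pitch_values)` on a list of per-frame pitch values (in Hz, for instance) returns a fixed-length vector regardless of sentence duration, which is what makes such contour summaries usable as input to a classifier such as a gradient-boosted ensemble.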
