Sentence-Based Dialect Identification System Using Extreme Gradient Boosting Algorithm

dc.contributor.author: Chittaragi, N.B.
dc.contributor.author: Koolagudi, S.G.
dc.date.accessioned: 2026-02-06T06:37:07Z
dc.date.issued: 2020
dc.description.abstract: In this paper, a dialect identification system (DIS) is proposed that exploits dialect-specific prosodic features and cepstral coefficients extracted from sentence-level utterances. People from a particular region commonly share a distinctive speaking style, known as a dialect. Sentences are chosen as the speech unit for dialect identification because they are observed to follow distinctive intonation and energy patterns. The sentences are drawn from the standard Intonational Variations in English (IViE) speech dataset. Pitch and energy contours are used to derive intonation and energy features, respectively, by fitting Legendre polynomials, and five statistical features are added. Mel frequency cepstral coefficients (MFCCs) are further added to capture dialect-specific spectral information. The Extreme Gradient Boosting (XGB) ensemble method is used to evaluate the system on individual features and on feature combinations. The results indicate that both prosodic and spectral features contribute to dialect recognition, and the combined feature vectors give better DIS performance, about 89.6%. © 2020, Springer Nature Singapore Pte Ltd.
dc.identifier.citation: Advances in Intelligent Systems and Computing, 2020, Vol. 766, p. 131-138
dc.identifier.issn: 2194-5357
dc.identifier.uri: https://doi.org/10.1007/978-981-13-9683-0_14
dc.identifier.uri: https://idr.nitk.ac.in/handle/123456789/30877
dc.publisher: Springer
dc.subject: Dialect identification system
dc.subject: IViE speech corpus
dc.subject: Prosodic features
dc.subject: Sentence segmentation
dc.subject: Spectral features
dc.subject: XGB model
dc.title: Sentence-Based Dialect Identification System Using Extreme Gradient Boosting Algorithm
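
The abstract describes turning a pitch or energy contour into Legendre polynomial fit features plus five statistics. The following is a minimal sketch of that idea, not the authors' implementation. The record does not state the polynomial order or which five statistics were used, so the order (3) and the statistics (mean, standard deviation, minimum, maximum, range) are assumptions chosen for illustration, and all function names are hypothetical.

```python
import statistics


def legendre_basis(x, order):
    """Return [P_0(x), ..., P_order(x)] using Bonnet's recurrence."""
    values = [1.0, x]
    for n in range(1, order):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: order + 1]


def solve_linear(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def legendre_fit(contour, order=3):
    """Least-squares Legendre coefficients of a contour rescaled onto [-1, 1]."""
    n = len(contour)
    if n <= order:
        raise ValueError("contour must have more samples than the polynomial order")
    # Map frame indices onto [-1, 1], the interval where Legendre polynomials live.
    xs = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    basis = [legendre_basis(x, order) for x in xs]
    k = order + 1
    # Normal equations: (B^T B) c = B^T y
    gram = [[sum(row[i] * row[j] for row in basis) for j in range(k)] for i in range(k)]
    rhs = [sum(row[i] * y for row, y in zip(basis, contour)) for i in range(k)]
    return solve_linear(gram, rhs)


def contour_features(contour, order=3):
    """Concatenate Legendre coefficients with five assumed summary statistics."""
    coeffs = legendre_fit(contour, order)
    stats = [
        statistics.mean(contour),
        statistics.pstdev(contour),
        min(contour),
        max(contour),
        max(contour) - min(contour),
    ]
    return coeffs + stats


# Toy pitch contour in Hz (made-up values, one per voiced frame).
toy_pitch = [180.0, 185.0, 192.0, 198.0, 201.0, 199.0, 190.0, 178.0, 165.0, 150.0]
features = contour_features(toy_pitch)
```

With these assumptions each contour yields a 9-dimensional vector (4 coefficients and 5 statistics). Vectors from the pitch and energy contours could then be concatenated with MFCC-based features before being passed to a gradient-boosted classifier, which is the role the abstract assigns to XGB.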
