Automatic dialect identification system for Kannada language using single and ensemble SVM algorithms

Chittaragi, N.B.; Koolagudi, S.G.

dc.contributor.author	Chittaragi, N.B.
dc.contributor.author	Koolagudi, S.G.
dc.date.accessioned	2026-02-05T09:28:36Z
dc.date.issued	2020
dc.description.abstract	In this paper, an automatic dialect identification (ADI) system is proposed for the Kannada language, based on extracted spectral and prosodic features. A new dialect dataset is collected from native speakers of Kannada (a Dravidian language). This dataset includes five distinct dialects of Kannada representing five geographical regions of Karnataka state. The significance of spectral and prosodic variations across the five Kannada dialects is investigated. Mel-frequency cepstral coefficients (MFCCs), spectral flux, and entropy are used as representative spectral features, while pitch and energy features are extracted as representative prosodic parameters for dialect identification. These raw feature vectors are further processed statistically to obtain new derived feature vectors. A single-classifier multi-class support vector machine (SVM) and a multiple-classifier ensemble SVM (ESVM) are employed for dialect classification. The explored features are evaluated on the newly collected Kannada speech corpus, with five Kannada dialects, and on the internationally known standard Intonation Variation in English (IViE) dataset, with nine British English dialects. Experimental results demonstrate that the derived feature vectors perform better than the raw feature vectors, and that the ESVM technique performs better than a single SVM. Individually, spectral and prosodic features yield dialect recognition performance of 83.12% and 44.52%, respectively. Further, the complementary nature of spectral and prosodic features is evaluated by combining both feature vectors for dialect recognition; the combination increases dialect recognition performance to about 86.25%. This indicates the existence of complementary dialect-specific evidence in spectral and prosodic features. The experiments conducted on the standard IViE corpus show a higher recognition rate of 91.38% using ESVM. The proposed ADI systems with derived features outperform state-of-the-art i-vector feature based systems on both datasets. © 2019, Springer Nature B.V.
dc.identifier.citation	Language Resources and Evaluation, 2020, 54, 2, pp. 553-585
dc.identifier.issn	1574020X
dc.identifier.uri	https://doi.org/10.1007/s10579-019-09481-5
dc.identifier.uri	https://idr.nitk.ac.in/handle/123456789/23893
dc.publisher	Springer
dc.subject	Derived features
dc.subject	Dialect identification
dc.subject	Ensemble SVM
dc.subject	IViE dialect dataset
dc.subject	Kannada dialect dataset
dc.subject	Single SVM
dc.subject	Spectral and prosodic features
dc.title	Automatic dialect identification system for Kannada language using single and ensemble SVM algorithms
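
The abstract describes turning raw frame-level features into derived feature vectors by statistical processing, but it does not say which statistics are used. The following is a minimal sketch under stated assumptions, not the authors' implementation: the per-dimension mean, standard deviation, minimum, and maximum are assumed statistics; the spectral flux and spectral entropy formulas are common textbook definitions; and all function names are hypothetical. MFCC, pitch, and energy extraction and the SVM/ESVM classification stage are omitted.

```python
import math
from statistics import mean, pstdev


def spectral_flux(prev_mag, cur_mag):
    """Squared L2 distance between two successive magnitude spectra
    (one common definition; assumed, not taken from the paper)."""
    return sum((c - p) ** 2 for p, c in zip(prev_mag, cur_mag))


def spectral_entropy(mag):
    """Shannon entropy, in bits, of the normalized power spectrum."""
    power = [m * m for m in mag]
    total = sum(power)
    if total == 0:
        return 0.0
    probs = [p / total for p in power]
    return -sum(p * math.log2(p) for p in probs if p > 0)


def derive_features(frames):
    """Collapse frame-level vectors into one utterance-level vector.

    For each feature dimension, append mean, population standard
    deviation, minimum, and maximum (an assumed set of statistics).
    """
    derived = []
    for dim in zip(*frames):  # iterate over feature dimensions
        derived.extend([mean(dim), pstdev(dim), min(dim), max(dim)])
    return derived


# Toy magnitude spectra standing in for three short-time frames.
spectra = [
    [1.0, 1.0, 1.0, 1.0],
    [2.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]

# Raw frame-level features: [flux, entropy] for each frame after the first.
raw = [
    [spectral_flux(prev, cur), spectral_entropy(cur)]
    for prev, cur in zip(spectra, spectra[1:])
]

# One fixed-length derived vector per utterance (2 dimensions x 4 statistics).
utterance_vector = derive_features(raw)
```

A fixed-length vector of this kind can be passed to any standard classifier regardless of utterance duration, which is one plausible reason for using derived rather than raw frame-level features.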

Collections

Journal Articles
