Kannada Dialect Identification from Case-Based Word Utterances Using Gradient Boosting Algorithm

Date

2022

Publisher

Springer Science and Business Media Deutschland GmbH

Abstract

Dialects or accents are varieties of a language marked by grammatical, phonological, and lexical variations that appear as minor and subtle differences in usage. Dialectal variations arise mainly from the distinct speaking patterns followed by groups of speakers. Dialect processing systems are essential for developing automatic speech recognition (ASR) systems for regional and resource-constrained languages in a country such as India, which has a rich diversity of languages. In this paper, a language-dependent dialect identification system is proposed for the Kannada language, based on words that carry Kannada-specific case (Vibhakthi Prathyayas) information. Cases are special morphological operations in Kannada that mark the grammatical function of a noun or pronoun. These word utterances are used for the classification of five dialects of Kannada. Using short word utterances that carry dialect-specific information representing each dialect's unique characteristics is a novel idea. A case-based word utterance dataset is prepared covering five Kannada dialects from the Kannada Dialect Speech Corpus (KDSC). Dynamic and static prosodic features are extracted to capture dialectal variations. In addition to these features, spectral MFCC features are considered for evaluating differences among dialects from these word-level units. Initially, a multi-class support vector machine (SVM) is used; later, extreme gradient boosting (XGB) ensemble algorithms are used to develop an automatic Kannada dialect recognition system. The findings demonstrate that words with case information convey dialect-specific linguistic cues effectively. The combination of dynamic and static prosodic cues, together with spectral features, has a significant effect on the characterization of dialects.
© 2022, Springer Nature Switzerland AG.
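
The abstract does not say which statistics make up the static and dynamic prosodic features, so the following is only an illustrative sketch of the distinction, not the authors' method. It assumes frame-wise pitch (F0) and energy contours are already available for a word utterance; the function names, the chosen statistics, and the sample values are all hypothetical. Static features summarize a contour over the whole utterance, while dynamic features summarize its frame-to-frame changes.

```python
from statistics import mean, pstdev


def static_features(contour):
    """Utterance-level summary statistics of a frame-wise contour."""
    return {
        "mean": mean(contour),
        "std": pstdev(contour),
        "min": min(contour),
        "max": max(contour),
        "range": max(contour) - min(contour),
    }


def dynamic_features(contour):
    """Statistics of frame-to-frame changes (first-order deltas)."""
    deltas = [b - a for a, b in zip(contour, contour[1:])]
    return {
        "delta_mean": mean(deltas),
        "delta_std": pstdev(deltas),
        "delta_max_rise": max(deltas),
        "delta_max_fall": min(deltas),
    }


def prosodic_vector(f0, energy):
    """Concatenate static and dynamic statistics for F0 and energy."""
    vec = []
    for contour in (f0, energy):
        vec.extend(static_features(contour).values())
        vec.extend(dynamic_features(contour).values())
    return vec


# Hypothetical frame-wise values for one short word utterance.
f0 = [120.0, 125.0, 131.0, 128.0, 118.0]
energy = [0.20, 0.35, 0.50, 0.40, 0.25]
features = prosodic_vector(f0, energy)
```

A vector of this kind, concatenated with MFCC statistics, is the sort of fixed-length input that a multi-class SVM or a gradient boosting classifier can consume, one vector per word utterance.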

Keywords

Cases (Vibhakthi Prathyayas), Dynamic and static prosodic features, Gradient boosting algorithm, Kannada dialect identification, Support vector machines

Citation

Communications in Computer and Information Science, 2022, Vol. 1534 CCIS, pp. 675-686
