Faculty Publications

Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736

Publications by NITK Faculty

Search Results

  • Item
    Text-independent automatic accent identification system for Kannada language
    (Springer Verlag, 2017) Soorajkumar, R.; Girish, G.N.; Ramteke, P.B.; Joshi, S.S.; Koolagudi, S.G.
    Accent identification is one of the applications that has received more attention in speech processing. A text-independent accent identification system is proposed using Gaussian mixture models (GMMs) for the Kannada language. Spectral and prosodic features such as Mel-frequency cepstral coefficients (MFCCs), pitch, and energy are considered for the experimentation. The dataset is collected from three regions of Karnataka, namely Mumbai Karnataka, Mysore Karnataka, and Karavali Karnataka, which have significant variations in accent. Experiments are conducted using 32 speech samples from each region, where each clip is of one minute duration and spoken by native speakers. The baseline system implemented using MFCC features is found to achieve 76.7% accuracy. From the results, it is observed that the hybrid features improve the performance of the system by 3%. © Springer Science+Business Media Singapore 2017.
  • Item
    Automatic text-independent Kannada dialect identification system
    (Springer Verlag, 2019) Chittaragi, N.B.; Limaye, A.; Chandana, N.T.; Annappa, B.; Koolagudi, S.G.
    This paper proposes a dialect identification system for the Kannada language. A system that can automatically identify the dialects of the language being spoken has a wide variety of applications. However, not many Automatic Speech Recognition (ASR) and dialect identification tasks have been carried out in the majority of the Indian languages. Further, there are only a few good quality annotated audio datasets available. In this paper, a new dataset for 5 spoken dialects of the Kannada language is introduced. Spectral and prosodic features have captured the most prominent features for recognition of Kannada dialects. Support Vector Machine (SVM) and neural network algorithms are used for modeling the text-independent recognition system. A neural network model that attempts to identify dialects based on sentence level cues has also been built. Hyper-parameters for the SVM and neural network models are chosen using grid search. Neural network models have outperformed SVMs when complete utterances are considered. © Springer Nature Singapore Pte Ltd. 2019.
  • Item
    Acoustic-phonetic feature based Kannada dialect identification from vowel sounds
    (Springer New York LLC, 2019) Chittaragi, N.B.; Koolagudi, S.G.
    In this paper, a dialect identification system is proposed for the Kannada language using vowel sounds. Dialectal cues are characterized through acoustic parameters such as formant frequencies (F1–F3), and prosodic features [energy, pitch (F0), and duration]. For this purpose, a vowel dataset is collected from native speakers of Kannada belonging to different dialectal regions. Global features representing frame level global statistics such as mean, minimum, maximum, standard deviation and variance are extracted from vowel sounds. Local features representing temporal dynamic properties from the contour level are derived from the steady-state vowel region. Three decision tree-based ensemble algorithms, namely random forest, extreme random forest (ERF) and extreme gradient boosting algorithms, are used for classification. Performance of both global and local features is evaluated individually. Further, the significance of every feature in dialect discrimination is analyzed using single factor-ANOVA (analysis of variances) tests. Global features with the ERF ensemble model have shown a better average dialect identification performance of around 76%. Also, the contribution of every feature in dialect identification is verified. The role of duration, energy, pitch, and three formant features is found to be evidential in Kannada dialect classification. © 2019, Springer Science+Business Media, LLC, part of Springer Nature.
  • Item
    Automatic dialect identification system for Kannada language using single and ensemble SVM algorithms
    (Springer, 2020) Chittaragi, N.B.; Koolagudi, S.G.
    In this paper, an automatic dialect identification (ADI) system is proposed by extracting spectral and prosodic features for the Kannada language. A new dialect dataset is collected from native speakers of the Kannada language (a Dravidian language). This dataset includes five distinct dialects of the Kannada language representing five geographical regions of Karnataka state. Investigation of the significance of spectral and prosodic variations on five Kannada dialects is carried out. Mel-frequency cepstral coefficients (MFCCs), spectral flux, and entropy are used as representatives of spectral features. Besides, pitch and energy features are extracted as representatives of prosodic parameters for identification of dialects. These raw feature vectors are further processed to get new derived feature vectors by using statistical processing. In this paper, a single classifier based multi-class support vector machine (SVM) and multiple classifier based ensemble SVM (ESVM) techniques are employed for classification of dialects. The effectiveness and performance evaluation of the explored features are carried out on the newly collected Kannada speech corpus, with five Kannada dialects, and the internationally known standard Intonation Variation in English (IViE) dataset with nine British English dialects. Experimental results have demonstrated that the derived feature vectors perform better when compared to raw feature vectors. Moreover, the ESVM technique has demonstrated better performance over a single SVM. Spectral and prosodic features have individually resulted in dialect recognition performance of 83.12% and 44.52% respectively. Further, the complementary nature of both spectral and prosodic features is evaluated by combining both feature vectors for dialect recognition. With the combination, an increase in dialect recognition performance to about 86.25% is observed. This indicates the existence of complementary dialect specific evidence with spectral and prosodic features. The experiments conducted on the standard IViE corpus have shown a higher recognition rate of 91.38% using ESVM. Proposed ADI systems with derived features have shown better performance over the state-of-the-art i-vector feature based systems on both datasets. © 2019, Springer Nature B.V.
  • Item
    Dialect Identification using Chroma-Spectral Shape Features with Ensemble Technique
    (Academic Press, 2021) Chittaragi, N.B.; Koolagudi, S.G.
    The present work proposes a text-independent dialect identification system. Generally, dialects of a language exhibit varying pronunciation styles followed in a specific geographical region. In this paper, chroma features, familiar from music-related systems, are employed for identification of dialects. In addition, eight significant spectral shape related features from short term spectra are computed, combined with the chroma features, and named chroma-spectral shape features. Chroma features try to aggregate spectral information and attempt to encapsulate the evidential variations concerning timbre, correlated melody, rhythmic, and intonation patterns found prominently among dialects of a few languages. The effectiveness of the proposed features and approach is evaluated on five prominent Kannada dialects spoken in Karnataka, India, and the internationally known standard Intonation Variation in English (IViE) dataset with nine British English dialects. Discriminative models such as a single classifier based Support Vector Machine (SVM) and ensemble based support vector machines (ESVM) are employed for classification. The proposed features have shown better performance over state-of-the-art i-vector features on both datasets. The highest recognition performances of 95.6% and 97.52% are achieved for the Kannada and IViE dialect datasets, respectively, using ESVM. Proposed features have also demonstrated robust performance with small-sized (limited data) audio clips even in noisy conditions. © 2021 Elsevier Ltd
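
Illustrative note: the vowel-sound entry in this listing describes "global features" as frame-level statistics (mean, minimum, maximum, standard deviation and variance). The sketch below shows one way such a summary might be computed for a single contour. It is not the authors' implementation: the function name `global_features`, the sample pitch values, and the choice of population (rather than sample) variance are all assumptions made for illustration.

```python
import statistics


def global_features(contour):
    """Summarize a frame-level contour (e.g. pitch in Hz) with five global statistics.

    Assumption: population standard deviation and variance are used; the
    listed abstract does not say which variant the authors chose.
    """
    if len(contour) < 2:
        raise ValueError("need at least two frames")
    return {
        "mean": statistics.fmean(contour),
        "min": min(contour),
        "max": max(contour),
        "std": statistics.pstdev(contour),
        "var": statistics.pvariance(contour),
    }


# Hypothetical pitch values (Hz) for one vowel segment; not data from the paper.
pitch = [118.0, 120.0, 122.0, 120.0]
features = global_features(pitch)
print(features)
```

In a fuller pipeline, the same summary would presumably be repeated for each contour (energy, pitch, formants) and the results concatenated into one feature vector per vowel before classification.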