2. Thesis and Dissertations
Permanent URI for this community: https://idr.nitk.ac.in/handle/1/10
2 results
Search Results
Item
Phonology Analysis From Childrens' Speech (National Institute of Technology Karnataka, Surathkal, 2022)
Bhaskar, Ramteke Pravin; Koolagudi, Shashidhar G.

The human vocal tract can produce a wide variety of sounds. Speech sounds are a relatively small subset of these that appear uniquely qualified for use in producing speech. Speech production involves the positions of the parts of the body necessary for producing spoken words and the effect of air rushing from the lungs as it passes through the larynx, pharynx, vocal cords, nasal passages and mouth. Phonetic sounds (phones) are the actual speech sounds, classified by the manner and place of articulation (i.e. the way in which air is forced through the mouth and shaped by the tongue, teeth, palate, lips and, in some languages, the uvula). Children begin language acquisition with their first meaningful word. They then acquire language by mimicking adult pronunciation. This development depends mainly on the development of the vocal tract, neuro-motor control and the influence of the language spoken by the people around them. A significant difference can be observed between the vocal tracts of children and adults: the vocal tract in children is underdeveloped and short in comparison with the adult vocal tract. Other oral cavity parameters, such as the tongue, larynx, epiglottis and vocal cords, are also underdeveloped. As a result, children have difficulty producing speech sounds and simplify pronunciations by substituting difficult speech sounds with simpler ones. This results in significant deviations and replacements in the pronunciation of phonemes, leading to mispronunciation or pronunciation errors. These processes are referred to as phonological processes. The phonological processes that appear in children reflect their age-wise speech learning ability. Analysing them helps Speech Language Pathologists (SLPs) study the language learning ability of children.
The manual process of phonology analysis involves a lot of human effort and time. The literature reports that phonological processes have been properly studied in children who speak English as their native language. Indian languages are syllabic in nature and differ from English, which is phonemic in nature. Hence, the observations made for English-speaking children may not be directly applicable to the phonological development observed in Indian children. In general, the appearance of phonological processes in Indian children is not well studied and documented. The appearance of these processes beyond a certain age may indicate the presence of a phonological disorder. Automatic identification would help SLPs identify the processes and analyse the language learning pattern, along with any disorders present when processes are observed beyond a certain age. In this work, we aim to develop systems for the automatic identification of phonological processes in the Kannada language. Applications of this research include evaluation of language learning ability, identification of speech and motor disorders, gender-based analysis of phonological processes, etc. Some of the important issues in this research area are: the large number of non-standardized phonological processes; the lack of detailed studies in Indian languages; the availability of children's speech databases in the required age range of 3½ to 6½ years; difficulties in adapting existing mispronunciation identification systems, due to the large differences in speech production parameters between adults and children in the proposed age range; and the need to identify features characterizing each phonological process in comparison-based algorithms. We recorded a Kannada speech dataset from children aged 3½ to 6½ years and named it the NITK Kids' Speech Corpus. It is collected in three age groups, each spanning one year.
For each age range, data is recorded from 40 children (20 male and 20 female). This work provides a detailed analysis of the phonological processes that appear in children aged 3½ to 6½ years who speak Kannada as their native language. Based on the pattern of disappearance of the phonological processes, an age-wise analysis of the acquisition of phonemes is provided. A detailed comparison of the language learning ability of children speaking English and children speaking Kannada is also performed. Because comparison-based algorithms are effective at identifying phonological processes in the smaller age range, they are adopted for the analysis. The commonly observed phonological processes considered in our study are: aspiration, nasalization and nasal assimilation, palatal fricative fronting, final consonant deletion, voicing assimilation and vowel deviations. Spectral, prosodic and excitation source features that efficiently discriminate between the correct pronunciation of a phoneme and its mispronounced counterpart are identified and exploited for the identification of phonological processes. Two case studies are considered for the evaluation. Based on the availability of a dataset for phonological disorder, 'rhotacism' is considered for the analysis. The spectral and prosodic features efficient in characterizing the phonological disorder are explored. While identifying phonological processes, we came across the interesting problem of identifying the gender of children. Gender identification from children's speech is difficult compared to adult gender identification. Gender identification from adult speech is also performed, to analyze the difficulties of the task for children's speech in comparison with adult speech. Spectral, prosodic and excitation source features are proposed for gender identification in both implementations, using suitable machine learning algorithms.
Detailed experimental evaluation is carried out to compare the performance of each of the proposed approaches against baseline and state-of-the-art systems.

Item
Speech Processing Approaches towards Characterization and Identification of Dialects (National Institute of Technology Karnataka, Surathkal, 2020)
Chittaragi, Nagaratna B.; Koolagudi, Shashidhar G.

Dialects constitute the phonological, lexical, and grammatical variations in the usage of a language, with very minor and subtle differences. These variations are mainly due to the specific speaking patterns followed within a group of speakers. In the recent past, dialect identification from speech has emerged as one of the prominent speech research areas. This is mainly due to the extensive increase in the use of interactive voice-based systems. Therefore, it is essential to address speech variability caused by dialectal differences in order to achieve effective, realistic man-machine interaction. Existing research on the characterization and identification of dialects has mainly focused on acoustic, phonetic and phonotactic approaches for several languages such as English, Chinese, Arabic, Hindi, Spanish, etc. However, these models have not been proven to be language independent. Applied to other languages, they may not perform equally well, as there are many fundamental differences between the dialects of different languages. Moreover, considerably fewer dialect processing models are reported in the literature for Indian regional languages. In this thesis, an attempt is made to develop a few useful language-independent and language-dependent Automatic Dialect Identification (ADI) systems for the Kannada language. To begin with, a new text-independent Kannada Dialect Speech Corpus (KDSC) is collected from native speakers belonging to five prominent dialectal regions of Karnataka.
This thesis investigates the significance of the excitation source, spectral, and prosodic features of speech for dialect identification. Additionally, spectrotemporal variations across dialects are captured through 2D Gabor features, which are known to be biologically inspired. Further, the existence of non-conventional dialect-specific rhythmic and melodic correlations among dialects is explored using chroma features, which are well established in music-related applications. The robustness of the proposed features has been investigated under noisy background conditions and with small (limited data) audio clips. In addition, word and sentence based ADI systems are proposed using intonation and intensity variations representing the dynamic and static prosodic behaviors. Further, language-dependent dialect identification systems are proposed for the Kannada language using dialect information at the level of basic phonetic units. Additionally, Kannada-specific 'case' (Vibhakthi Prathyayas) based dialect identification approaches are proposed. Single-classifier Support Vector Machines (SVM) and multiple-classifier ensemble algorithms are used for the classification of dialects. Experiments are carried out using individual features and combinations of features. The use of different features has illustrated their complementary nature in dialect processing. A performance comparison of the two categories of classification algorithms has shown that ensemble algorithms perform better than single-classifier algorithms. Further, the use of rhythm-based aspects of dialects, through chroma and spectral-shape features, has shown better performance than state-of-the-art i-vector features. Moreover, this feature set has shown greater noise robustness than conventional MFCCs. In this work, we have also proposed intonation and intensity features to capture dialectal information from words and sentences for effective classification of dialects.
In addition, duration, energy, pitch, three formants, and spectral features are also found to provide evidence for Kannada dialect classification.