Browsing by Author "Chittaragi, N.B."
Now showing 1 - 20 of 32
Item: A Novel Approach to Video Steganography using a 3D Chaotic Map (Institute of Electrical and Electronics Engineers Inc., 2019)
Narayanan, G.; Narayanan, R.; Haneef, N.; Chittaragi, N.B.; Koolagudi, S.G.
In this paper, we introduce a novel approach for data-hiding in videos using 3-dimensional chaotic maps. A video is represented as a 3-dimensional image, with the third axis constituting the frames of the video. Existing chaotic map based data-hiding techniques on videos are confined to applications of 2-dimensional chaotic maps on a per-frame basis. In this paper, a 3-dimensional extension of the logistic chaos map is applied to identify pixels for encoding information in the video's 3-dimensional space, and 3-3-2 Least Significant Bit (LSB) substitution is used to encode 1 byte of information per pixel. We have implemented and presented a proof of concept that has been analyzed on a test video using various quality metrics. The chaotic map based data-hiding approach proposed in this paper is shown to be secure, and the results observed are in line with the standard results for a video steganographic algorithm using LSB substitution.
© 2019 IEEE.

Item: Acoustic features based word level dialect classification using SVM and ensemble methods (2018)
Chittaragi, N.B.; Koolagudi, S.G.
In this paper, a word-based dialect classification system is proposed using acoustic characteristics of the speech signal. Dialects mainly represent the different pronunciation patterns of a language. Dialectal cues can exist at various levels such as phoneme, syllable, word, sentence, and phrase in an utterance. Word-level dialectal traits are extracted to recognize dialects, since every word exhibits significant dialect-discriminating cues. The Intonational Variations in English (IViE) speech corpus, recorded in British English, has been considered. The corpus includes nine dialects covering nine distinct regions of the British Isles. Acoustic properties such as spectral and prosodic features are derived at the word level to construct the feature vector. Further, two classification algorithms, support vector machine (SVM) and tree-based extreme gradient boosting (XGB) ensembles, are used to extract the prominent patterns that discriminate the dialects. From the experiments, better performance is observed with word-level traits using ensemble methods than with the SVM classification method.
© 2017 IEEE.

Item: Acoustic features based word level dialect classification using SVM and ensemble methods (Institute of Electrical and Electronics Engineers Inc., 2017)
Chittaragi, N.B.; Koolagudi, S.G.
In this paper, a word-based dialect classification system is proposed using acoustic characteristics of the speech signal. Dialects mainly represent the different pronunciation patterns of a language. Dialectal cues can exist at various levels such as phoneme, syllable, word, sentence, and phrase in an utterance. Word-level dialectal traits are extracted to recognize dialects, since every word exhibits significant dialect-discriminating cues. The Intonational Variations in English (IViE) speech corpus, recorded in British English, has been considered. The corpus includes nine dialects covering nine distinct regions of the British Isles. Acoustic properties such as spectral and prosodic features are derived at the word level to construct the feature vector. Further, two classification algorithms, support vector machine (SVM) and tree-based extreme gradient boosting (XGB) ensembles, are used to extract the prominent patterns that discriminate the dialects. From the experiments, better performance is observed with word-level traits using ensemble methods than with the SVM classification method.
© 2017 IEEE.
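The 3-3-2 Least Significant Bit substitution named in the video steganography item above can be sketched in a few lines: one payload byte is split across the low-order bits of an RGB pixel, with 3 bits hidden in red, 3 in green, and 2 in blue. This is an illustrative reconstruction, not the authors' code; the chaotic-map selection of which pixels to use is omitted here.

```python
def embed_byte(pixel, byte):
    """Hide one byte in an (R, G, B) pixel via 3-3-2 LSB substitution:
    the 3 low bits of R, 3 low bits of G, and 2 low bits of B carry
    the payload."""
    r, g, b = pixel
    r = (r & ~0b111) | (byte >> 5)            # top 3 payload bits
    g = (g & ~0b111) | ((byte >> 2) & 0b111)  # middle 3 payload bits
    b = (b & ~0b11) | (byte & 0b11)           # bottom 2 payload bits
    return (r, g, b)


def extract_byte(pixel):
    """Recover the hidden byte from a stego pixel."""
    r, g, b = pixel
    return ((r & 0b111) << 5) | ((g & 0b111) << 2) | (b & 0b11)
```

Because only the 2–3 least significant bits of each channel change, the per-channel perturbation is at most 7 intensity levels, which is why LSB-based schemes score well on distortion metrics such as PSNR.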
Item: Acoustic-phonetic feature based Kannada dialect identification from vowel sounds (Springer New York LLC, 2019)
Chittaragi, N.B.; Koolagudi, S.G.
In this paper, a dialect identification system is proposed for the Kannada language using vowel sounds. Dialectal cues are characterized through acoustic parameters such as formant frequencies (F1–F3) and prosodic features [energy, pitch (F0), and duration]. For this purpose, a vowel dataset is collected from native speakers of Kannada belonging to different dialectal regions. Global features representing frame-level statistics such as mean, minimum, maximum, standard deviation, and variance are extracted from vowel sounds. Local features representing temporal dynamic properties at the contour level are derived from the steady-state vowel region. Three decision tree-based ensemble algorithms, namely random forest, extreme random forest (ERF), and extreme gradient boosting, are used for classification. The performance of both global and local features is evaluated individually. Further, the significance of every feature in dialect discrimination is analyzed using single-factor ANOVA (analysis of variance) tests. Global features with the ERF ensemble model have shown a better average dialect identification performance of around 76%. The contribution of every feature in dialect identification is also verified; the role of duration, energy, pitch, and the three formant features is found to be evidential in Kannada dialect classification.
© 2019, Springer Science+Business Media, LLC, part of Springer Nature.

Item: Automatic diagnosis of COVID-19 related respiratory diseases from speech (Springer, 2023)
Shekhar, K.; Chittaragi, N.B.; Koolagudi, S.G.
In this work, an attempt is made to propose an intelligent and automatic system to recognize COVID-19 related illnesses from mere speech samples using automatic speech processing techniques. We used a standard crowd-sourced dataset collected by the University of Cambridge through a web-based application and an Android/iPhone app. We worked on cough and breath datasets individually, and also on a combination of both. We trained on two sets of features, one consisting of only standard audio features such as spectral and prosodic features, and one combining excitation source features with the standard audio features, and trained our model on shallow classifiers such as ensemble classifiers and SVM classification methods. Our model has shown better performance on both breath and cough datasets, but the best results in each case were obtained through different combinations of features and classifiers. We obtained our best result when we used only standard audio features and combined both cough and breath data, achieving an accuracy of 84% and an Area Under Curve (AUC) score of 84%. Intelligent systems have already started to make a mark in medical diagnosis, and this type of study can help better the health system by providing much-needed assistance to health workers.
© 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.

Item: Automatic dialect identification system for Kannada language using single and ensemble SVM algorithms (Springer, 2020)
Chittaragi, N.B.; Koolagudi, S.G.
In this paper, an automatic dialect identification (ADI) system is proposed by extracting spectral and prosodic features for the Kannada language. A new dialect dataset is collected from native speakers of Kannada (a Dravidian language). This dataset includes five distinct dialects of Kannada representing five geographical regions of Karnataka state. The significance of spectral and prosodic variations across the five Kannada dialects is investigated. Mel-frequency cepstral coefficients (MFCCs), spectral flux, and entropy are used as representatives of spectral features. Besides, pitch and energy features are extracted as representatives of prosodic parameters for identification of dialects. These raw feature vectors are further processed to obtain new derived feature vectors through statistical processing. In this paper, a single-classifier multi-class support vector machine (SVM) and a multiple-classifier ensemble SVM (ESVM) technique are employed for classification of dialects. The effectiveness and performance of the explored features are evaluated on the newly collected Kannada speech corpus, with five Kannada dialects, and on the internationally known standard Intonational Variations in English (IViE) dataset with nine British English dialects. Experimental results demonstrate that the derived feature vectors perform better than the raw feature vectors, and that the ESVM technique performs better than a single SVM. Spectral and prosodic features individually yield dialect recognition performance of 83.12% and 44.52%, respectively. Further, the complementary nature of the two feature types is evaluated by combining both feature vectors, which raises dialect recognition performance to about 86.25%. This indicates the existence of complementary dialect-specific evidence in spectral and prosodic features. The experiments conducted on the standard IViE corpus have shown a higher recognition rate of 91.38% using ESVM. The proposed ADI systems with derived features have shown better performance than state-of-the-art i-vector feature based systems on both datasets.
© 2019, Springer Nature B.V.
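The "derived feature vector" idea that recurs in the items above — collapsing frame-level raw features into utterance-level statistics — can be sketched as follows. The function name and dictionary layout are illustrative, not taken from the papers.

```python
import statistics


def derive_global_features(contour):
    """Collapse a frame-level feature contour (e.g. per-frame pitch or
    energy values) into the global statistics used as derived features:
    mean, minimum, maximum, standard deviation, and variance."""
    return {
        "mean": statistics.fmean(contour),
        "min": min(contour),
        "max": max(contour),
        "std": statistics.pstdev(contour),
        "var": statistics.pvariance(contour),
    }


# Example: a short pitch (F0) contour in Hz
features = derive_global_features([110.0, 115.0, 120.0, 125.0])
```

A full derived vector would concatenate these statistics over every raw dimension (each MFCC coefficient, pitch, energy), turning variable-length utterances into fixed-length vectors suitable for SVM or ensemble classifiers.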
Item: Automatic hate speech detection in audio using machine learning algorithms (Springer, 2024)
Imbwaga, J.L.; Chittaragi, N.B.; Koolagudi, S.G.
Even though every individual is entitled to freedom of speech, limitations exist when this freedom is used to target and harm another individual or a group of people, as it translates to hate speech. In this study, the proposed research deals with detection of hate speech from audio for the English and Kiswahili languages. The dataset used in this work was collected manually from YouTube videos and then converted to audio. Audio-based features, namely spectral, temporal, prosodic, and excitation source features, were extracted and used to train various machine learning classifiers. Initial experiments were conducted for the English language and later for Kiswahili; it is observed from the literature that research activity on the Kiswahili language is comparatively limited. The scores calculated for accuracy, recall, precision, AUC, and F1 score in detecting hate speech suggest that the Random Forest classifier performed better for the English language, while the Extreme Gradient Boosting classifier performed better for the Kiswahili language.
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Item: Automatic text-independent Kannada dialect identification system (2019)
Chittaragi, N.B.; Limaye, A.; Chandana, N.T.; Annappa, B.; Koolagudi, S.G.
This paper proposes a dialect identification system for the Kannada language. A system that can automatically identify the dialect of the language being spoken has a wide variety of applications. However, not many Automatic Speech Recognition (ASR) and dialect identification tasks have been carried out for the majority of Indian languages, and only a few good-quality annotated audio datasets are available. In this paper, a new dataset for five spoken dialects of the Kannada language is introduced. Spectral and prosodic features capture the most prominent cues for recognition of Kannada dialects. Support Vector Machine (SVM) and neural network algorithms are used for modeling the text-independent recognition system. A neural network model that attempts to identify dialects based on sentence-level cues has also been built. Hyper-parameters for the SVM and neural network models are chosen using grid search. Neural network models have outperformed SVMs when complete utterances are considered.
© Springer Nature Singapore Pte Ltd. 2019.

Item: Automatic text-independent Kannada dialect identification system (Springer Verlag, 2019)
Chittaragi, N.B.; Limaye, A.; Chandana, N.T.; Annappa, B.; Koolagudi, S.G.
This paper proposes a dialect identification system for the Kannada language. A system that can automatically identify the dialect of the language being spoken has a wide variety of applications. However, not many Automatic Speech Recognition (ASR) and dialect identification tasks have been carried out for the majority of Indian languages, and only a few good-quality annotated audio datasets are available. In this paper, a new dataset for five spoken dialects of the Kannada language is introduced. Spectral and prosodic features capture the most prominent cues for recognition of Kannada dialects. Support Vector Machine (SVM) and neural network algorithms are used for modeling the text-independent recognition system. A neural network model that attempts to identify dialects based on sentence-level cues has also been built. Hyper-parameters for the SVM and neural network models are chosen using grid search. Neural network models have outperformed SVMs when complete utterances are considered.
© Springer Nature Singapore Pte Ltd. 2019.
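The grid search mentioned above for choosing SVM and neural network hyper-parameters is an exhaustive sweep over a parameter grid, scoring each combination on held-out data. A minimal stdlib-only sketch; the grid values and the scoring callback are placeholders, not the paper's settings:

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Try every combination in the grid and return the best-scoring
    one.  `score_fn` stands in for 'train a model with these settings
    and return its validation accuracy'."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Toy scorer: pretend validation accuracy peaks at C=10, gamma=0.01
toy = lambda p: 0.9 - abs(p["C"] - 10) * 0.01 - abs(p["gamma"] - 0.01)
best, score = grid_search({"C": [0.1, 1, 10], "gamma": [0.01, 0.1]}, toy)
```

Real systems plug a training-plus-validation run in place of `toy`; the search itself is nothing more than this nested enumeration.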
Item: Dialect Identification using Chroma-Spectral Shape Features with Ensemble Technique (Academic Press, 2021)
Chittaragi, N.B.; Koolagudi, S.G.
The present work proposes a text-independent dialect identification system. Generally, dialects of a language exhibit varying pronunciation styles followed in a specific geographical region. In this paper, chroma features, familiar from music-related systems, are employed for identification of dialects. In addition, eight significant spectral shape related features computed from the short-term spectrum are combined with the chroma features and named chroma-spectral shape features. Chroma features aggregate spectral information and attempt to encapsulate the evidential variations in timbre, melody, rhythm, and intonation patterns found prominently among dialects of some languages. The effectiveness of the proposed features and approach is evaluated on five prominent Kannada dialects spoken in Karnataka, India, and on the internationally known standard Intonational Variations in English (IViE) dataset with nine British English dialects. Discriminative models, a single-classifier Support Vector Machine (SVM) and ensemble-based support vector machines (ESVM), are employed for classification. The proposed features have shown better performance than state-of-the-art i-vector features on both datasets. The highest recognition performances of 95.6% and 97.52% are achieved on the Kannada and IViE dialect datasets, respectively, using ESVM. The proposed features have also demonstrated robust performance with small-sized (limited-data) audio clips, even in noisy conditions.
© 2021 Elsevier Ltd.
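Chroma features, as used in the item above, fold spectral energy into 12 pitch classes. A minimal sketch of that folding over a plain FFT magnitude spectrum; the bin-to-pitch-class mapping is the standard one, but the paper's exact front end is not reproduced here:

```python
import math


def chroma_vector(magnitudes, sample_rate, n_fft):
    """Fold an FFT magnitude spectrum into a 12-bin chroma vector by
    mapping each bin's frequency to a pitch class (semitone distance
    from a C reference, modulo 12) and accumulating its magnitude."""
    chroma = [0.0] * 12
    for k, mag in enumerate(magnitudes):
        freq = k * sample_rate / n_fft
        if freq < 20.0:  # skip DC and sub-audible bins
            continue
        semitone = 12 * math.log2(freq / 16.3516)  # distance from C0
        chroma[round(semitone) % 12] += mag
    total = sum(chroma) or 1.0
    return [c / total for c in chroma]  # normalise to unit sum


# A pure 440 Hz tone (the pitch class A) lands in chroma bin 9
mags = [0.0] * 401
mags[44] = 1.0  # bin 44 of an 800-point FFT at 8 kHz is 440 Hz
chroma = chroma_vector(mags, 8000, 800)
```

Because octaves collapse onto the same bin, the 12-dimensional vector captures melodic and intonation structure while discarding absolute register, which is the property the chroma-spectral shape features exploit.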
Item: Dialect Identification Using Spectral and Prosodic Features on Single and Ensemble Classifiers (Springer Verlag, 2018)
Chittaragi, N.B.; Prakash, A.; Koolagudi, S.G.
In this paper, the significance of the spectral and prosodic behavior of the speech signal is investigated for dialect identification. Spectral features such as cepstral coefficients, spectral flux, and entropy are extracted from shorter frames. Prosodic attributes such as pitch, energy, and duration are derived from longer frames. The IViE (Intonational Variations in English) speech corpus, covering nine dialectal regions of the British Isles, has been considered to evaluate the proposed approach. Since the corpus is available in both read and semi-spontaneous modes, the influence of spectral and prosodic behavior over these datasets is distinguishably articulated. Further, two distinct classification algorithms, namely support vector machine (SVM) and an ensemble of decision trees along with the SVM, are used for identification of the nine dialects. Dialect-discriminating information captured from both feature types is used to construct feature vectors. Experiments have been conducted on individual features and their combinations. Better dialect recognition performance is observed with ensemble methods than with a single independent SVM.
© 2017, King Fahd University of Petroleum & Minerals.

Item: Dialect Recognition System Using Excitation Source Features (Institute of Electrical and Electronics Engineers Inc., 2018)
Choudhury, A.R.; Chittaragi, N.B.; Koolagudi, S.G.
This paper focuses on building an automatic dialect recognition system using excitation source features. Every spoken unit represents a unique articulatory configuration of the excitation source and the vocal tract system. This paper emphasizes exploring source information to capture dialectal cues, rather than vocal tract information. Epochs, representing the instants of maximum excitation of the vocal tract at glottal closure, are used as source features. Additionally, the strength and slope of epochs and instantaneous frequency features are extracted from the zero frequency filtered signal. Further, 13 cepstral coefficients are derived from the LP residual to prepare the feature vector. Two dialect datasets, a Kannada dataset with five prominent dialects and an English dataset with nine dialects, are used to evaluate the significance of the explored features. Classification experiments are conducted with support vector machines designed with the sequential minimal optimization function (SMO-SVM). Performances are analyzed individually and in combinations. The obtained results exhibit the existence of dialect information in the excitation source and complementary cues in the vocal tract system.
© 2018 IEEE.

Item: Explainable hate speech detection using LIME (Springer, 2024)
Imbwaga, J.L.; Chittaragi, N.B.; Koolagudi, S.G.
Free speech is essential, but it can conflict with protecting marginalized groups from harm caused by hate speech. Social media platforms have become breeding grounds for this harmful content. While studies exist to detect hate speech, there are significant research gaps. First, most studies used text data instead of other modalities such as videos or audio. Second, most studies explored traditional machine learning algorithms; however, due to the increasing complexity of computational tasks, there is a need to employ more complex techniques and methodologies. Third, the majority of research studies have either been evaluated using very few evaluation metrics or not statistically evaluated at all. Lastly, due to the opaque, black-box nature of complex classifiers, there is a need to use explainability techniques. This research aims to address these gaps by detecting hate speech in the English and Kiswahili languages using videos manually collected from YouTube. The videos were converted to text and used to train various classifiers. The performance of these classifiers was evaluated using various evaluation and statistical measurements. The experimental results suggest that the random forest classifier achieved the highest results for both languages across all evaluation measurements compared to all classifiers used. The results for the English language were: accuracy 98%, AUC 96%, precision 99%, recall 97%, F1 98%, specificity 98%, and MCC 96%, while the results for the Kiswahili language were: accuracy 90%, AUC 94%, precision 93%, recall 92%, F1 94%, specificity 87%, and MCC 75%. These results suggest that the random forest classifier is robust, effective, and efficient in detecting hate speech, and that it is reliable for detecting hate speech and related problems on social media. To understand the classifier's decision-making process, we used the Local Interpretable Model-agnostic Explanations (LIME) technique to explain the predictions achieved by the random forest classifier.
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Item: Extractive Document Summarization Using a Supervised Learning Approach (2019)
Charitha, S.; Chittaragi, N.B.; Koolagudi, S.G.
In this paper, we present a model for extractive multi-document text summarization using a supervised learning approach. The model uses a convolutional neural network (CNN) capable of learning sentence features on its own for sentence ranking, avoiding the overhead of extracting features from sentences manually. An integer linear programming (ILP) approach is adopted for selecting sentences to generate the summary based on sentence ranks; this ILP model minimizes redundancy in the generated summary. We have evaluated our proposed approach on the DUC 2007 dataset, and its performance is found to be competitive with or better than state-of-the-art systems.
© 2018 IEEE.

Item: Extractive Document Summarization Using a Supervised Learning Approach (Institute of Electrical and Electronics Engineers Inc., 2018)
Charitha, S.; Chittaragi, N.B.; Koolagudi, S.G.
In this paper, we present a model for extractive multi-document text summarization using a supervised learning approach. The model uses a convolutional neural network (CNN) capable of learning sentence features on its own for sentence ranking, avoiding the overhead of extracting features from sentences manually. An integer linear programming (ILP) approach is adopted for selecting sentences to generate the summary based on sentence ranks; this ILP model minimizes redundancy in the generated summary. We have evaluated our proposed approach on the DUC 2007 dataset, and its performance is found to be competitive with or better than state-of-the-art systems.
© 2018 IEEE.

Item: Kannada Dialect Classification using Artificial Neural Networks (Institute of Electrical and Electronics Engineers Inc., 2020)
Mothukuri, S.K.P.; Hegde, P.; Chittaragi, N.B.; Koolagudi, S.G.
In this paper, an Automatic Dialect Classification (ADC) system is proposed for dialects of the Kannada language (a Dravidian language spoken in southern Karnataka). The system extracts spectral Mel-frequency cepstral coefficients (MFCCs) and log filter bank features, along with linear predictive coefficients. In addition, prosodic pitch and energy features are extracted to capture dialect-specific cues. A Kannada dialect speech corpus consisting of five prominent dialects of the Kannada language is used for designing the ADC system. An attempt is made to classify Kannada dialects using the Artificial Neural Network (ANN) technique.
ANNs and their variants have recently gained popularity in the area of speech processing applications. Hyperparameter tuning of the ANN has resulted in increased performance.
© 2020 IEEE.

Item: Kannada Dialect Classification Using CNN (Springer Science and Business Media Deutschland GmbH, 2020)
Hegde, P.; Chittaragi, N.B.; Mothukuri, S.K.P.; Koolagudi, S.G.
Kannada is one of the prominent languages spoken in southern India. Since Kannada is a lingua franca spoken by more than 70 million people, it naturally has dialects. In this paper, we identify five major dialectal regions in Karnataka state. An attempt is made to classify these five dialects from sentence-level utterances. Sentences are segmented from continuous speech automatically using spectral centroid and short-term energy features. Mel-frequency cepstral coefficient (MFCC) features are extracted from these sentence units and used to train convolutional neural networks (CNNs). Along with MFCCs, shifted delta and double-delta coefficients are also used to train the CNN model. The proposed CNN-based dialect recognition system is also tested on the internationally known standard Intonational Variations in English (IViE) dataset. The CNN model has resulted in better performance. It is observed that the use of one convolution layer and three fully connected layers balances computational complexity and results in better accuracy on both the Kannada and English datasets.
© 2020, Springer Nature Switzerland AG.

Item: Kannada Dialect Identification from Case-Based Word Utterances Using Gradient Boosting Algorithm (Springer Science and Business Media Deutschland GmbH, 2022)
Chittaragi, N.B.; Koolagudi, S.G.
Dialects or accents constitute grammatical variations, along with phonological and lexical changes, that are commonly observed in the usage of a language with minor and subtle differences. Dialectal variations are mainly due to the unique speaking patterns followed among a group of speakers. Dialect processing systems are essential in the development of automatic speech recognition (ASR) systems for regional and resource-constrained languages in a country like India, which has a rich diversity of languages. In this paper, a language-dependent dialect identification system is proposed for the Kannada language from words, especially words carrying Kannada-specific case (Vibhakthi Prathyayas) information. Kannada exhibits special morphological operations in terms of various cases, commonly called the grammatical function of a noun or pronoun. These word utterances are used for the classification of five dialects of Kannada. Using smaller word utterances that carry dialect-specific information representing unique characteristics is a novel idea. In this paper, a case-based word utterance dataset is prepared by considering five Kannada dialects from the Kannada Dialect Speech Corpus (KDSC). Dynamic and static prosodic features are extracted to capture dialectal variations. In addition to these features, spectral MFCC features are also considered for evaluating differences among dialects from these word-level units. Initially, a multi-class support vector machine (SVM) technique is used, and later effective extreme gradient boosting (XGB) ensemble algorithms are used for the development of an automatic Kannada dialect recognition system. The research findings demonstrate that words with case information convey dialect-specific linguistic cues effectively. The combination of dynamic and static prosodic cues, along with spectral features, has a significant effect on the characterization of dialects.
© 2022, Springer Nature Switzerland AG.
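The "dynamic" prosodic features mentioned above are commonly first-order differences (delta coefficients) of a static feature contour. A minimal sketch of delta computation over a pitch contour, using the standard two-frame regression window with replicated edge frames — an assumption, since the paper does not give its exact formula:

```python
def delta(contour, window=2):
    """First-order dynamic (delta) coefficients of a static feature
    contour, via the standard regression formula
    d_t = sum_n n*(c[t+n] - c[t-n]) / (2 * sum_n n^2),
    with edge frames replicated for padding."""
    denom = 2 * sum(n * n for n in range(1, window + 1))
    padded = [contour[0]] * window + list(contour) + [contour[-1]] * window
    deltas = []
    for t in range(window, window + len(contour)):
        num = sum(n * (padded[t + n] - padded[t - n])
                  for n in range(1, window + 1))
        deltas.append(num / denom)
    return deltas
```

Double-delta (acceleration) coefficients, as used in the CNN item above, are simply the delta of the delta contour: `delta(delta(contour))`.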
Item: Matching Witness' Account with Mugshots for Forensic Applications (2018)
Mohan, A.; Dhir, R.; Hirashkar, H.; Chittaragi, N.B.; Koolagudi, S.G.
This paper proposes a system that can be used by the forensics department to identify and disclose criminal details automatically. The problem addressed in this work is matching the description of a suspect at a crime scene, provided by an eye-witness, to existing mugshots (a mugshot is a photograph taken when someone is arrested) in the police department's criminal database. Prominent features such as skin colour, size of the nose and lips, shape and size of the eyes, and shape of the face are considered for discriminating individual criminals. The witness fills in the description fields, through which the most appropriate images are selected from an existing database. Images are scored on the basis of their degree of closeness to the given description, and the most relevant images are displayed first, followed by the rest. The classification of images based on the explored facial features is done using extreme gradient boosting (XGBoost), a supervised ensemble learning method. Comparatively better performances are observed.
© 2018 IEEE.

Item: Matching Witness' Account with Mugshots for Forensic Applications (Institute of Electrical and Electronics Engineers Inc., 2018)
Mohan, A.; Dhir, R.; Hirashkar, H.; Chittaragi, N.B.; Koolagudi, S.G.
This paper proposes a system that can be used by the forensics department to identify and disclose criminal details automatically. The problem addressed in this work is matching the description of a suspect at a crime scene, provided by an eye-witness, to existing mugshots (a mugshot is a photograph taken when someone is arrested) in the police department's criminal database. Prominent features such as skin colour, size of the nose and lips, shape and size of the eyes, and shape of the face are considered for discriminating individual criminals. The witness fills in the description fields, through which the most appropriate images are selected from an existing database. Images are scored on the basis of their degree of closeness to the given description, and the most relevant images are displayed first, followed by the rest. The classification of images based on the explored facial features is done using extreme gradient boosting (XGBoost), a supervised ensemble learning method. Comparatively better performances are observed.
© 2018 IEEE.
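The closeness-based ordering described in the mugshot items above can be sketched as a simple attribute-match score. The field names and records here are hypothetical, purely for illustration; the papers' actual classifier is XGBoost over facial features, and this only illustrates the "degree of closeness" ranking of the retrieval step.

```python
def rank_mugshots(description, records):
    """Order mugshot records so that those matching the most fields
    of the witness's description come first."""
    def closeness(record):
        # One point per description field the record matches exactly
        return sum(1 for field, value in description.items()
                   if record.get(field) == value)
    return sorted(records, key=closeness, reverse=True)


# Hypothetical witness description and database records
witness = {"skin": "fair", "face_shape": "oval", "nose": "long"}
db = [
    {"id": 1, "skin": "dark", "face_shape": "round", "nose": "short"},
    {"id": 2, "skin": "fair", "face_shape": "oval", "nose": "long"},
    {"id": 3, "skin": "fair", "face_shape": "round", "nose": "long"},
]
ranked = rank_mugshots(witness, db)
```

Because `sorted` is stable, records with equal closeness keep their original database order, so the display degrades gracefully when the description is vague.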
