Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506

Now showing 1 - 10 of 14

Normalized videosnapping: A non-linear video synchronization approach
(Institute of Electrical and Electronics Engineers Inc., 2017) Tripathi, A.; Changmai, B.; Habib, S.; Chittaragi, N.B.; Koolagudi, S.G.
Video synchronization is the task of content-based alignment of two or more videos depicting the same event with spatial variations, or the same object with temporal changes. It is one of the most fundamental tasks in manipulating temporally or spatially multi-perspective video shots. In this paper, a model is proposed that addresses the synchronization problem and efficiently tackles issues arising when synchronizing two videos. Videos are handled at the frame level, with features from each frame forming the basis of alignment. Features are matched and mapped to generate a cost matrix of similarities among the frames of the videos concerned. A modified version of Dijkstra's algorithm that yields an optimal path through the matrix is applied. Along the optimal path, events are grouped into adjacent regions, after which temporal warps are introduced into the videos to achieve the best possible alignment between them. The model has proven to be efficient and compatible with videos of all quality levels. © 2017 IEEE.
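The path-finding step described in this abstract can be illustrated with a minimal sketch. It uses the standard Dijkstra algorithm, not the authors' modified version, and it assumes that the matrix holds non-negative dissimilarity costs and that a path may step right, down, or diagonally. The function name `cheapest_alignment_path` and the move set are hypothetical choices made for illustration only.

```python
import heapq

def cheapest_alignment_path(cost):
    """Standard Dijkstra over a frame-to-frame cost grid (illustrative only).

    cost[i][j] is an assumed non-negative dissimilarity between frame i of
    the first video and frame j of the second. Returns the cheapest path
    from (0, 0) to the last cell, together with its total cost.
    """
    n, m = len(cost), len(cost[0])
    start, goal = (0, 0), (n - 1, m - 1)
    best = {start: cost[0][0]}      # cheapest known cost to reach each cell
    parent = {}                     # back-pointers for path recovery
    heap = [(cost[0][0], start)]
    while heap:
        dist, (i, j) = heapq.heappop(heap)
        if (i, j) == goal:
            break
        if dist > best[(i, j)]:
            continue                # stale queue entry
        # Assumed moves: advance in one video, the other, or both.
        for di, dj in ((1, 0), (0, 1), (1, 1)):
            ni, nj = i + di, j + dj
            if ni < n and nj < m:
                nd = dist + cost[ni][nj]
                if nd < best.get((ni, nj), float("inf")):
                    best[(ni, nj)] = nd
                    parent[(ni, nj)] = (i, j)
                    heapq.heappush(heap, (nd, (ni, nj)))
    # Walk the back-pointers from the goal to the start.
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    path.reverse()
    return path, best[goal]
```

Each `(i, j)` pair on the returned path pairs frame `i` of one video with frame `j` of the other. Runs of horizontal or vertical steps mark where one video would need to be stretched relative to the other.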
Acoustic features based word level dialect classification using SVM and ensemble methods
(Institute of Electrical and Electronics Engineers Inc., 2017) Chittaragi, N.B.; Koolagudi, S.G.
In this paper, a word-based dialect classification system is proposed using acoustic characteristics of the speech signal. Dialects mainly represent the different pronunciation patterns of a language. Dialectal cues can exist at various levels of an utterance, such as phoneme, syllable, word, phrase, and sentence. Word-level dialectal traits are extracted to recognize dialects, since every word exhibits significant dialect-discriminating cues. The Intonational Variation in English (IViE) speech corpus, recorded in British English, is used. The corpus includes nine dialects covering nine distinct regions of the British Isles. Acoustic properties such as spectral and prosodic features are derived at the word level to construct the feature vector. Further, two classification algorithms, support vector machine (SVM) and tree-based extreme gradient boosting (XGB) ensemble, are used to extract the prominent patterns that discriminate the dialects. In the experiments, ensemble methods performed better than the SVM classification method with word-level traits. © 2017 IEEE.
Robust Dialect Identification System using Spectro-Temporal Gabor Features
(Institute of Electrical and Electronics Engineers Inc., 2018) Chittaragi, N.B.; Mothukuri, S.P.; Hegde, P.; Koolagudi, S.G.
Automatic identification of dialects of a language is gaining popularity in the field of automatic speech recognition (ASR) systems. The present work proposes an automatic dialect identification (ADI) system using 2D Gabor and spectral features. A comprehensive study of five dialects of Kannada, a Dravidian language, has been taken up. Gabor filters representing spectro-temporal modulations attempt to emulate the signal processing strategies of the human auditory system. Hence, they are able to perceive human voices well and in turn recognize dialectal variations effectively. Mel frequency cepstral coefficient (MFCC) spectral features are also derived. A single-classifier support vector machine (SVM) and an ensemble-based extreme random forest (ERF) classification method are employed for recognition. The effectiveness of the Gabor features for the ADI system is demonstrated with the proposed Kannada dialect dataset along with the standard Intonational Variation in English (IViE) dataset for British English dialects. The Gabor features have shown better performance than MFCC features on both datasets. Recognition performance of 88.75% and 99.16% is achieved with the Kannada and IViE dialect datasets, respectively. The proposed Gabor features have demonstrated better performance even under noisy conditions. © 2018 IEEE.
Extractive Document Summarization Using a Supervised Learning Approach
(Institute of Electrical and Electronics Engineers Inc., 2018) Charitha, S.; Chittaragi, N.B.; Koolagudi, S.G.
In this paper, we present a model for extractive multi-document text summarization using a supervised learning approach. The model uses a convolutional neural network (CNN), which is capable of learning sentence features on its own for sentence ranking. This approach is used to avoid the overhead of extracting features from sentences manually. An integer linear programming (ILP) approach is adopted for selecting sentences to generate the summary based on sentence ranks. This ILP model minimizes the redundancy in the generated summary. We have evaluated the proposed approach on the DUC 2007 dataset, and its performance is found to be competitive with or better than state-of-the-art systems. © 2018 IEEE.
Tomato Leaf Disease Detection Using Convolutional Neural Networks
(Institute of Electrical and Electronics Engineers Inc., 2018) Tm, P.; Pranathi, A.; Saiashritha, K.; Chittaragi, N.B.; Koolagudi, S.G.
The tomato crop is an important staple in the Indian market with high commercial value and is produced in large quantities. Diseases are detrimental to the plant's health, which in turn affects its growth. To ensure minimal losses to the cultivated crop, it is crucial to supervise its growth. There are numerous types of tomato diseases that target the crop's leaf at an alarming rate. This paper adopts a slight variation of the convolutional neural network model called LeNet to detect and identify diseases in tomato leaves. The main aim of the proposed work is to find a solution to the problem of tomato leaf disease detection using the simplest approach, while making use of minimal computing resources to achieve results comparable to state-of-the-art techniques. Neural network models employ automatic feature extraction to aid in the classification of the input image into the respective disease classes. The proposed system has achieved an average accuracy of 94-95%, indicating the feasibility of the neural network approach even under unfavourable conditions. © 2018 IEEE.
Matching Witness' Account with Mugshots for Forensic Applications
(Institute of Electrical and Electronics Engineers Inc., 2018) Mohan, A.; Dhir, R.; Hirashkar, H.; Chittaragi, N.B.; Koolagudi, S.G.
This paper proposes a system that can be used by the forensics department to identify and disclose criminal details automatically. This work addresses the problem of matching an eyewitness's description of a suspect at a crime scene to existing mugshots (a mugshot is a photograph taken when someone is arrested) in the police department's criminal database. Prominent features such as skin colour, size of the nose and lips, shape and size of the eyes, and shape of the face are considered for discriminating between individual criminals. The witness fills in the description fields, through which the most appropriate images are selected from an existing database. Images are scored on the basis of their degree of closeness to the given description, and the most relevant images are displayed first, followed by the rest. The classification of images based on the explored facial features is done using extreme gradient boosting (XGBoost), a supervised ensemble learning method. Comparatively better performances are observed. © 2018 IEEE.
Dialect Recognition System Using Excitation Source Features
(Institute of Electrical and Electronics Engineers Inc., 2018) Choudhury, A.R.; Chittaragi, N.B.; Koolagudi, S.G.
This paper focuses on building an automatic dialect recognition system using excitation source features. Every spoken unit represents a unique articulatory configuration of the excitation source and the vocal tract system. This paper emphasizes exploring source information, rather than vocal tract information, to capture dialectal cues. Epochs, representing the instants of maximum excitation of the vocal tract at the closure, are used as source features. Additionally, strength and slope of epochs and instantaneous frequency features are extracted from the zero frequency filtered signal. Further, 13 cepstral coefficients are derived from the LP residual to prepare the feature vector. Two dialect datasets, a Kannada dataset with five prominent dialects and an English dataset with nine dialects, are used to evaluate the significance of the explored features. Classification experiments are conducted with support vector machines trained using sequential minimal optimization (SMO-SVM). Performances are analyzed individually and in combination. The results obtained exhibit the existence of dialect information in the excitation source and complementary cues in the vocal tract system. © 2018 IEEE.
Automatic text-independent Kannada dialect identification system
(Springer Verlag, 2019) Chittaragi, N.B.; Limaye, A.; Chandana, N.T.; Annappa, B.; Koolagudi, S.G.
This paper proposes a dialect identification system for the Kannada language. A system that can automatically identify the dialects of the language being spoken has a wide variety of applications. However, few Automatic Speech Recognition (ASR) and dialect identification tasks have been carried out for the majority of Indian languages. Further, there are only a few good-quality annotated audio datasets available. In this paper, a new dataset for 5 spoken dialects of the Kannada language is introduced. Spectral and prosodic features capture the most prominent cues for recognition of Kannada dialects. Support Vector Machine (SVM) and neural network algorithms are used for modeling the text-independent recognition system. A neural network model that attempts to identify dialects based on sentence-level cues has also been built. Hyper-parameters for the SVM and neural network models are chosen using grid search. Neural network models have outperformed SVMs when complete utterances are considered. © Springer Nature Singapore Pte Ltd. 2019.
Spectral Feature Based Kannada Dialect Classification from Stop Consonants
(Springer, 2019) Chittaragi, N.B.; Hegde, P.; Mothukuri, S.K.P.; Koolagudi, G.K.
This study investigates the significance of stop consonants for the classification of Kannada dialects. The majority of existing studies have shown evident differences in the pronunciation of vowels across dialects. However, consonant-based studies on dialect processing are comparatively fewer. In this work, eight stop consonants are used for the characterization of five Kannada dialects. Acoustic characteristics such as cepstral coefficients, formant frequencies, spectral flux, and rolloff features are explored through spectral analysis of stops. The consonant dataset is derived from a standard Kannada dialect dataset and consists of 2417 consonants obtained from 16 native speakers from each dialect. Support vector machine (SVM) and decision tree-based extreme gradient boosting (XGB) ensemble classification methods are employed for automatic recognition of Kannada dialects. The research findings show that stops, despite their shorter duration, also convey dialectal linguistic cues. The combination of spectral properties has contributed to the identification of distinct dialect-specific information across Kannada dialects. © 2019, Springer Nature Switzerland AG.
A Novel Approach to Video Steganography using a 3D Chaotic Map
(Institute of Electrical and Electronics Engineers Inc., 2019) Narayanan, G.; Narayanan, R.; Haneef, N.; Chittaragi, N.B.; Koolagudi, S.G.
In this paper, we introduce a novel approach for data-hiding in videos using 3-dimensional chaotic maps. A video is represented as a 3-dimensional image, with the third axis constituting the frames of the video. Existing chaotic map based data-hiding techniques for videos are confined to applications of 2-dimensional chaotic maps on a per-frame basis. In this paper, a 3-dimensional extension of the logistic chaos map is applied to identify pixels in which to encode information in the video's 3-dimensional space, and 3-3-2 Least Significant Bit (LSB) substitution is used to encode 1 byte of information per pixel. We have implemented and presented a proof of concept that has been analyzed on a test video using various quality metrics. The chaotic map based data-hiding approach proposed in this paper is shown to be secure, and the results observed are in line with the standard results for a video steganographic algorithm using LSB substitution. © 2019 IEEE.
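The 3-3-2 LSB substitution mentioned in this abstract can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: it assumes 8-bit RGB pixels, with the three high bits of the payload byte placed in the red channel, the next three in green, and the low two in blue. The chaotic-map pixel selection is omitted, and the names `embed_byte` and `extract_byte` are hypothetical.

```python
def embed_byte(pixel, byte):
    """Hide one payload byte in an (R, G, B) pixel using a 3-3-2 bit split.

    Assumed layout: top 3 payload bits -> red LSBs, middle 3 -> green LSBs,
    bottom 2 -> blue LSBs. Channel values are 8-bit integers (0-255).
    """
    r, g, b = pixel
    r = (r & 0b11111000) | (byte >> 5)            # clear 3 LSBs, write bits 7-5
    g = (g & 0b11111000) | ((byte >> 2) & 0b111)  # clear 3 LSBs, write bits 4-2
    b = (b & 0b11111100) | (byte & 0b11)          # clear 2 LSBs, write bits 1-0
    return (r, g, b)

def extract_byte(pixel):
    """Recover the payload byte from a pixel written by embed_byte."""
    r, g, b = pixel
    return ((r & 0b111) << 5) | ((g & 0b111) << 2) | (b & 0b11)
```

For example, `embed_byte((200, 100, 50), 0xA7)` gives `(205, 97, 51)`, and `extract_byte` on that pixel returns `0xA7`. Red and green each change by at most 7 and blue by at most 3.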
