Faculty Publications

Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736

Publications by NITK Faculty

Search Results

Now showing 1 - 5 of 5

Audio Replay Attack Detection for Speaker Verification System Using Convolutional Neural Networks
(Springer, 2019) Kemanth, P.J.; Supanekar, S.; Koolagudi, G.K.
An audio replay attack is one of the most popular spoofing attacks on speaker verification systems because it is very economical and does not require much knowledge of signal processing. In this paper, we investigate the significance of non-voiced audio segments and deep learning models such as Convolutional Neural Networks (CNNs) for audio replay attack detection. The non-voiced segments of the audio can be used to detect reverberation and channel noise. FFT spectrograms are generated and given as input to a CNN to classify the audio as genuine or replay. The advantage of the proposed approach is that, because the voiced speech is removed, the feature vector size is reduced without compromising the necessary features. This leads to a significant reduction in the training time of the networks. The ASVspoof 2017 dataset is used to train and evaluate the model. The Equal Error Rate (EER) is computed and used as the metric to evaluate model performance. The proposed system achieves an EER of 5.62% on the development dataset and 12.47% on the evaluation dataset. © 2019, Springer Nature Switzerland AG.
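The abstract above reports results as an Equal Error Rate. As an illustration only, the following minimal sketch approximates an EER by sweeping a decision threshold over detector scores. The function name, the score lists, and the convention that higher scores mean "genuine" are assumptions made here, not details taken from the paper.

```python
def equal_error_rate(genuine_scores, replay_scores):
    """Approximate the EER by sweeping thresholds over the observed scores.

    Assumption (not from the paper): a higher score means "more likely genuine",
    and a trial is accepted when its score is >= the threshold.
    """
    thresholds = sorted(set(genuine_scores) | set(replay_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection rate: genuine trials scored below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False acceptance rate: replay trials scored at or above the threshold.
        far = sum(s >= t for s in replay_scores) / len(replay_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Hypothetical scores, for illustration only.
genuine = [0.9, 0.8, 0.7, 0.4]
replay = [0.1, 0.2, 0.3, 0.6]
print(equal_error_rate(genuine, replay))
```

With only a handful of scores the sweep is coarse; evaluation toolkits typically interpolate between thresholds, which this sketch does not attempt.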
Spectral Feature Based Kannada Dialect Classification from Stop Consonants
(Springer, 2019) Chittaragi, N.B.; Hegde, P.; Mothukuri, S.K.P.; Koolagudi, G.K.
This study investigates the significance of stop consonants for the classification of Kannada dialects. The majority of earlier studies have shown evident differences in the pronunciation of vowels across dialects. However, consonant-based studies on dialect processing are comparatively fewer. In this work, eight stop consonants are used to characterize five Kannada dialects. Acoustic characteristics such as cepstral coefficients, formant frequencies, spectral flux, and rolloff features are explored from spectral analysis of stops. The consonant dataset is derived from a standard Kannada dialect dataset and consists of 2417 consonants obtained from 16 native speakers from each dialect. Support vector machine (SVM) and decision tree-based extreme gradient boosting (XGB) ensemble classification methods are employed for automatic recognition of Kannada dialects. The research findings show that stops, despite their shorter duration, also convey dialectal linguistic cues. The combination of spectral properties has contributed to the identification of distinct dialect-specific information across Kannada dialects. © 2019, Springer Nature Switzerland AG.
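Spectral rolloff and spectral flux, two of the features named in the abstract above, can be sketched from per-frame magnitude spectra as follows. The definitions used here are common textbook ones (cumulative-magnitude rolloff and squared-difference flux); the paper's exact definitions, the 0.85 default, and the function names are assumptions.

```python
def spectral_rolloff(magnitudes, fraction=0.85):
    """Index of the lowest bin where cumulative magnitude reaches `fraction` of the total.

    The 0.85 default is a commonly used value, assumed here rather than
    taken from the paper.
    """
    total = sum(magnitudes)
    if total == 0:
        return 0
    cumulative = 0.0
    for k, m in enumerate(magnitudes):
        cumulative += m
        if cumulative >= fraction * total:
            return k
    return len(magnitudes) - 1


def spectral_flux(prev_magnitudes, magnitudes):
    """Squared Euclidean distance between two consecutive magnitude spectra."""
    return sum((b - a) ** 2 for a, b in zip(prev_magnitudes, magnitudes))


# Hypothetical magnitude spectra for two consecutive frames.
frame_a = [1.0, 2.0, 3.0]
frame_b = [1.0, 4.0, 0.0]
print(spectral_rolloff([1.0, 1.0, 1.0, 1.0], fraction=0.5))
print(spectral_flux(frame_a, frame_b))
```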
Locality-constrained linear coding based fused visual features for robust acoustic event classification
(International Speech Communication Association, 2019) Mulimani, M.; Koolagudi, G.K.
In this paper, novel Fused Visual Features (FVFs) are proposed for Acoustic Event Classification (AEC) in meeting room and office environments. The codes of Visual Features (VFs) are evaluated separately from row vectors and Scale Invariant Feature Transform (SIFT) vectors of the grayscale Gammatonegram of an acoustic event using Locality-constrained Linear Coding (LLC). The VFs from row vectors and SIFT vectors of the grayscale Gammatonegram are then fused to obtain FVFs. The performance of the proposed FVFs is evaluated on acoustic events from the publicly available UPC-TALP and DCASE datasets in clean and noisy conditions. Results show that the proposed FVFs are robust to noise and achieve overall recognition accuracies of 96.40% and 90.45% on the UPC-TALP and DCASE datasets, respectively. © 2019 ISCA
Recognition and Classification of Pauses in Stuttered Speech Using Acoustic Features
(Institute of Electrical and Electronics Engineers Inc., 2019) Afroz, F.; Koolagudi, G.K.
Pauses play an essential role in speech. Normally, they help the listener by creating time and space to decode and interpret the speaker's message. In stuttering, however, pauses disturb the normal flow of speech. The uncontrolled, frequent, and unplanned occurrence of pauses leads to a slow speaking rate, results in broken words, and increases the severity of stuttering. Hence, pauses and stuttering are closely related, and pauses are considered an important pattern in the diagnosis and treatment of stuttering. In this work, an attempt is made to identify inaudible (silent or unfilled) pauses in stuttered speech. Attributes such as the duration, frequency, position, and distribution of pauses during speech tasks are measured and quantified. The UCLASS stuttered speech corpus is used for the analysis. An automatic blind segmentation approach is adopted to segment the speech signal into voiced and unvoiced regions using a dynamic threshold based on energy and zero crossing rate (ZCR). 4th formant frequencies are analysed to identify intra-morphic (unfilled) pauses within voiced regions. The durations of intra-morphic pauses are analysed for stuttered and normal speech. It is observed that, in normal speech, intra-morphic pauses last 150 ms to 250 ms, inter-morphic pauses are <= 250 ms, and short pauses last 50 ms to 150 ms, whereas in stuttered speech short intra-morphic pauses range from 10 ms to 50 ms and long pauses range from 250 ms to 1 or 2 seconds. Segmentation of the intra-morphic pauses achieves an accuracy of 98%. The results are compared with and validated against a manual method. © 2019 IEEE.
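The energy and ZCR segmentation step mentioned in the abstract above can be illustrated with a rough sketch. It covers only frame labelling and the measurement of silent runs, not the formant analysis. The threshold rule (a fraction of the mean frame energy), the fixed ZCR ceiling, and all function and parameter names are assumptions chosen for illustration, not the authors' settings.

```python
def frame_features(samples, frame_len):
    """Return (short-time energy, zero crossing rate) for each non-overlapping frame."""
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        # Count sign changes between neighbouring samples.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        features.append((energy, crossings / (frame_len - 1)))
    return features


def label_frames(samples, frame_len, energy_scale=0.5, zcr_max=0.3):
    """Mark a frame as voiced (True) when its energy is high and its ZCR is low.

    The energy threshold adapts to the signal (energy_scale * mean energy);
    energy_scale and zcr_max are hypothetical values.
    """
    feats = frame_features(samples, frame_len)
    if not feats:
        return []
    threshold = energy_scale * sum(e for e, _ in feats) / len(feats)
    return [e > threshold and z < zcr_max for e, z in feats]


def pause_runs(labels, frame_len, sample_rate):
    """Return (start_ms, duration_ms) for each run of consecutive non-voiced frames."""
    frame_ms = 1000.0 * frame_len / sample_rate
    runs, run_start = [], None
    for i, voiced in enumerate(labels + [True]):  # sentinel closes a trailing run
        if not voiced and run_start is None:
            run_start = i
        elif voiced and run_start is not None:
            runs.append((run_start * frame_ms, (i - run_start) * frame_ms))
            run_start = None
    return runs


# Toy signal: one voiced-like frame, two silent frames, one voiced-like frame.
voiced = [1, 1, 1, 1, -1, -1, -1, -1]
signal = voiced + [0] * 16 + voiced
labels = label_frames(signal, frame_len=8)
print(pause_runs(labels, frame_len=8, sample_rate=800))
```

The resulting run durations could then be compared against duration ranges such as those quoted in the abstract.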
Academic Curriculum Load Balancing using GA
(Institute of Electrical and Electronics Engineers Inc., 2019) Chakradhar, M.; Charan, M.S.; Sai, R.U.; Kunal, M.; Vishnu Srinivasa Murthy, Y.V.S.; Koolagudi, G.K.
In this paper, we propose a genetic algorithm to find the optimal solution to the academic load balancing problem. The load balancing problem is to optimize the load of credits per semester in an academic curriculum. In the proposed method, we try to distribute the course load as evenly as possible, so the objective function minimizes the deviation from the mean credit load per semester. The proposed approach explores the solution space using only mutation operators and does not use crossover, as the solutions obtained using crossover do not create any newer and better solutions in the solution space. The algorithm is applied to three data sets and the results are compared with the solutions obtained using the existing approaches. The results obtained using the proposed approach are either better than the existing approaches or on par with the state-of-the-art optimal solutions. The solution set obtained using the proposed approach is well spread throughout all the periods, and each period contains close to the mean number of credits. © 2019 IEEE.
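A mutation-only evolutionary search of the kind described in the abstract above might look like the following minimal sketch. It ignores prerequisites and per-semester limits, and the population size, generation count, mutation operator, and all names are assumptions for illustration rather than the authors' algorithm.

```python
import random


def load_deviation(assignment, credits, n_semesters):
    """Sum of absolute deviations of each semester's credit load from the mean load."""
    loads = [0] * n_semesters
    for course, semester in enumerate(assignment):
        loads[semester] += credits[course]
    mean = sum(credits) / n_semesters
    return sum(abs(load - mean) for load in loads)


def mutate(assignment, n_semesters, rng):
    """Move one randomly chosen course to a randomly chosen semester."""
    child = list(assignment)
    child[rng.randrange(len(child))] = rng.randrange(n_semesters)
    return child


def balance(credits, n_semesters, pop_size=20, generations=200, seed=0):
    """Mutation-only search: no crossover, elitist survivor selection."""
    rng = random.Random(seed)
    population = [
        [rng.randrange(n_semesters) for _ in credits] for _ in range(pop_size)
    ]
    for _ in range(generations):
        children = [mutate(parent, n_semesters, rng) for parent in population]
        # Keep the pop_size assignments with the smallest deviation.
        population = sorted(
            population + children,
            key=lambda a: load_deviation(a, credits, n_semesters),
        )[:pop_size]
    return population[0]


# Hypothetical curriculum: four courses spread over two semesters.
credits = [3, 3, 2, 2]
best = balance(credits, n_semesters=2)
print(best, load_deviation(best, credits, 2))
```

Because survivors are chosen from parents and children together, the best deviation in the population never increases from one generation to the next.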
