Audio Replay Attack Detection for Speaker Verification System Using Convolutional Neural Networks

dc.contributor.author: Kemanth, P.J.
dc.contributor.author: Supanekar, S.
dc.contributor.author: Koolagudi, S.G.
dc.date.accessioned: 2020-03-30T09:59:02Z
dc.date.available: 2020-03-30T09:59:02Z
dc.date.issued: 2019
dc.description.abstract: An audio replay attack is one of the most popular spoofing attacks on speaker verification systems because it is inexpensive and requires little knowledge of signal processing. In this paper, we investigate the significance of non-voiced audio segments and of deep learning models such as Convolutional Neural Networks (CNNs) for audio replay attack detection. The non-voiced segments of the audio can be used to detect reverberation and channel noise. FFT spectrograms are generated and given as input to a CNN to classify the audio as genuine or replayed. The advantage of the proposed approach is that removing the voiced speech reduces the feature vector size without discarding the necessary features, which significantly reduces the training time of the network. The ASVspoof 2017 dataset is used to train and evaluate the model, and the Equal Error Rate (EER) is used as the performance metric. The proposed system achieves an EER of 5.62% on the development set and 12.47% on the evaluation set. © 2019, Springer Nature Switzerland AG.
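
The abstract describes a pipeline of removing voiced speech, computing an FFT spectrogram of the remaining segments, and classifying it with a CNN. The following is a minimal sketch of that idea in PyTorch; the energy-based voiced/unvoiced split, frame sizes, and network layout here are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch: unvoiced-segment spectrogram + small CNN classifier.
# The unvoiced/voiced split below is a simple short-time-energy threshold;
# the paper's actual voicing decision and CNN topology may differ.
import numpy as np
import torch
import torch.nn as nn

def unvoiced_spectrogram(signal, frame=512, hop=256, energy_quantile=0.5):
    """FFT magnitude spectrogram built only from low-energy (assumed non-voiced) frames."""
    n_frames = 1 + (len(signal) - frame) // hop
    window = np.hanning(frame)
    frames = np.stack([signal[i*hop:i*hop+frame] * window for i in range(n_frames)])
    energy = (frames ** 2).sum(axis=1)
    keep = energy < np.quantile(energy, energy_quantile)   # retain the quieter frames
    spec = np.abs(np.fft.rfft(frames[keep], axis=1))        # |FFT| per retained frame
    return np.log1p(spec).T                                  # (freq_bins, kept_frames)

class ReplayCNN(nn.Module):
    """Tiny 2-D CNN mapping a spectrogram to genuine/replay logits."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d((8, 8)),
        )
        self.classifier = nn.Linear(32 * 8 * 8, n_classes)

    def forward(self, x):                 # x: (batch, 1, freq, time)
        return self.classifier(self.features(x).flatten(1))

# Usage sketch on a synthetic waveform (a real input would be an ASVspoof 2017 utterance).
wave = np.random.randn(16000).astype(np.float32)
spec = unvoiced_spectrogram(wave)
logits = ReplayCNN()(torch.from_numpy(spec)[None, None].float())
print(logits.shape)   # torch.Size([1, 2])
```

The Equal Error Rate reported above is the operating point at which the false-acceptance and false-rejection rates coincide. A common way to compute it from verification scores is sketched below, assuming higher scores indicate the genuine class; the paper does not specify its exact scoring convention.

```python
# Sketch of EER computation from scores and binary labels (1 = genuine, 0 = replay).
import numpy as np

def equal_error_rate(scores, labels):
    """EER: threshold where false-acceptance rate equals false-rejection rate."""
    scores, labels = np.asarray(scores, float), np.asarray(labels, bool)
    thresholds = np.sort(np.unique(scores))
    far = np.array([(scores[~labels] >= t).mean() for t in thresholds])  # replay accepted
    frr = np.array([(scores[labels] < t).mean() for t in thresholds])    # genuine rejected
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0

# Example with a toy score distribution.
print(equal_error_rate([0.9, 0.8, 0.4, 0.3, 0.7, 0.2], [1, 1, 0, 0, 1, 0]))
```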
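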
dc.identifier.citation: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2019, Vol. 11942 LNCS, pp. 445-453
dc.identifier.uri: https://idr.nitk.ac.in/handle/123456789/7413
dc.title: Audio Replay Attack Detection for Speaker Verification System Using Convolutional Neural Networks
dc.type: Book chapter