Speaker Identification and Verification using Deep Learning
Date
2022
Publisher
Institute of Electrical and Electronics Engineers Inc.
Abstract
Voice assistants such as Cortana, Siri, and Ok Google have gained importance across the globe in recent times and are now part of everyday life. The main motivation behind the proposed system is to improve speaker recognition in such assistant systems. The speaker prediction model is trained using MFCC, Chroma, Tonnetz, Mel spectrogram, and Spectral contrast features extracted from audio samples. The proposed system has numerous real-world applications, such as meeting transcription, unlocking smart devices by voice, and online viva voce verification. It can replace existing biometric systems for faculty attendance and traditional fingerprint recognition. A Dense Neural Network was created for each audio feature, and the outputs were merged using a concatenation layer; this architecture gave the best performance compared to LSTM. The Dense Neural Network predicted the speaker with an accuracy of more than 95% most of the time. With LSTM, the accuracy of speaker prediction is around 79%, owing to the small number of samples. With CNN, the accuracy of speaker prediction is around 86%; this behavior can be attributed to the noisy environment. When an unknown speaker speaks, the Dense Neural Network handles the case by placing them in an anonymous class. © 2022 IEEE.
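The architecture described in the abstract (one dense branch per audio feature, merged by a concatenation layer, with an extra class for unknown speakers) can be illustrated with a minimal NumPy sketch of the forward pass. This is not the authors' implementation. The feature vector sizes, the branch width, the number of speakers, and the choice to model the anonymous class as one additional output index are all assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np

# Assumed sizes of the time-averaged feature vectors (not taken from the paper).
FEATURE_DIMS = {"mfcc": 40, "chroma": 12, "mel": 128, "contrast": 7, "tonnetz": 6}
NUM_SPEAKERS = 5                 # hypothetical number of enrolled speakers
NUM_CLASSES = NUM_SPEAKERS + 1   # assumption: last index is the "anonymous" class
BRANCH_UNITS = 16                # hypothetical width of each per-feature branch


def relu(x):
    return np.maximum(x, 0.0)


def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def init_params(rng):
    """Create random (untrained) weights: one dense layer per feature plus a shared head."""
    branches = {
        name: (rng.normal(0.0, 0.1, (dim, BRANCH_UNITS)), np.zeros(BRANCH_UNITS))
        for name, dim in FEATURE_DIMS.items()
    }
    merged_dim = BRANCH_UNITS * len(FEATURE_DIMS)
    head = (rng.normal(0.0, 0.1, (merged_dim, NUM_CLASSES)), np.zeros(NUM_CLASSES))
    return {"branches": branches, "head": head}


def forward(params, features):
    """Run each feature through its own branch, concatenate, then classify."""
    hidden = [
        relu(features[name] @ w + b)
        for name, (w, b) in params["branches"].items()
    ]
    merged = np.concatenate(hidden, axis=-1)  # the concatenation step
    w, b = params["head"]
    return softmax(merged @ w + b)


rng = np.random.default_rng(0)
params = init_params(rng)

# Stand-in input: random vectors in place of features extracted from 3 audio clips.
batch = {name: rng.normal(size=(3, dim)) for name, dim in FEATURE_DIMS.items()}
probs = forward(params, batch)
predicted = probs.argmax(axis=-1)  # index NUM_SPEAKERS would mean "anonymous"
```

Each row of `probs` is a probability distribution over the enrolled speakers plus the anonymous class. In practice the weights would be learned from labeled audio, and the stand-in inputs would be replaced by features computed from real recordings.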
Keywords
Chroma, CNN, DNN, LSTM, Mel spectrogram, MFCC, Speaker Recognition, Spectral contrast, Tonnetz
Citation
2022 International Conference on Signal and Information Processing, IConSIP 2022, 2022
