Spectral Features for Emotional Speaker Recognition

Sandhya P.; Spoorthy V.; Koolagudi S.G.; Sobhana N.V.

Please use this identifier to cite or link to this item: https://idr.nitk.ac.in/jspui/handle/123456789/15056

Full metadata record

DC Field	Value	Language
dc.contributor.author	Sandhya P.
dc.contributor.author	Spoorthy V.
dc.contributor.author	Koolagudi S.G.
dc.contributor.author	Sobhana N.V.
dc.date.accessioned	2021-05-05T10:16:19Z	-
dc.date.available	2021-05-05T10:16:19Z	-
dc.date.issued	2020
dc.identifier.citation	Proceedings of 2020 3rd International Conference on Advances in Electronics, Computers and Communications, ICAECC 2020	en_US
dc.identifier.uri	https://doi.org/10.1109/ICAECC50550.2020.9339502
dc.identifier.uri	http://idr.nitk.ac.in/jspui/handle/123456789/15056	-
dc.description.abstract	Speaker recognition in an emotive environment is challenging because emotions influence speech. A speaker can be identified by analyzing the features of the speech signal. Under neutral conditions this is not a difficult task, whereas identifying a speaker whose speech carries emotions such as happiness, sadness, anger, surprise, sarcasm, or fear is considerably harder, since speech is altered by emotion and noise. The spectral features of a speech signal include Mel Frequency Cepstral Coefficients (MFCC), Shifted Delta Cepstral Coefficients (SDCC), spectral centroid, spectral roll-off, spectral flatness, spectral contrast, spectral bandwidth, chroma-stft, zero crossing rate, root mean square energy, Linear Prediction Cepstral Coefficients (LPCC), spectral subband centroid, Teager energy based MFCC, line spectral frequencies, single frequency cepstral coefficients, formant frequencies, Power Normalized Cepstral Coefficients (PNCC), etc. The features extracted from the speech signal are then classified. Support Vector Machine (SVM), Gaussian Mixture Model, Gaussian Naive Bayes, K-Nearest Neighbour, Random Forest, and a simple neural network built with Keras are used for classification. An important application is security systems in which a person is identified biometrically by voice. The work aims to identify the speaker in an emotional environment using spectral features and any of these classification techniques, and to achieve a high speaker recognition rate. Feature combinations can also be used to improve accuracy. The proposed model performed better than most state-of-the-art methods. © 2020 IEEE.	en_US
dc.title	Spectral Features for Emotional Speaker Recognition	en_US
dc.type	Conference Paper	en_US
Appears in Collections:	2. Conference Papers
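
The abstract lists frame-level features such as zero crossing rate, root mean square energy, and spectral centroid. The record does not include the authors' implementation, so the following is only a minimal sketch of how three of those features could be computed for a single frame, using the Python standard library. The function names, the 8000 Hz sample rate, the 256-sample frame length, and the synthetic sine-wave input are all assumptions made for illustration; they do not come from the paper.

```python
import cmath
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def rms_energy(frame):
    """Root mean square amplitude of the frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def magnitude_spectrum(frame):
    """Naive O(N^2) DFT magnitudes for bins 0..N/2 (fine for short frames)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of the frame, in Hz."""
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0:
        return 0.0
    n = len(frame)
    return sum((k * sample_rate / n) * m for k, m in enumerate(mags)) / total


def sine_frame(freq_hz, sample_rate=8000, n=256, phase=0.1):
    """Hypothetical test input: one frame of a pure tone (not real speech)."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate + phase) for t in range(n)]


def extract_features(frame, sample_rate=8000):
    """Collect the three illustrative features into one dictionary."""
    return {
        "zero_crossing_rate": zero_crossing_rate(frame),
        "rms_energy": rms_energy(frame),
        "spectral_centroid": spectral_centroid(frame, sample_rate),
    }


if __name__ == "__main__":
    for freq in (500.0, 2000.0):
        print(freq, extract_features(sine_frame(freq)))
```

In a pipeline like the one the abstract describes, per-frame values of this kind would typically be aggregated per utterance and passed to one of the listed classifiers. A library such as librosa would normally replace the naive DFT used here.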
