Choice of a classifier, based on properties of a dataset: case study-speech emotion recognition

Koolagudi, S.G.; Murthy, Y.V.S.; Bhaskar, S.P.

Please use this identifier to cite or link to this item: https://idr.nitk.ac.in/jspui/handle/123456789/10210

Full metadata record

DC Field	Value	Language
dc.contributor.author	Koolagudi, S.G.
dc.contributor.author	Murthy, Y.V.S.
dc.contributor.author	Bhaskar, S.P.
dc.date.accessioned	2020-03-31T08:18:44Z	-
dc.date.available	2020-03-31T08:18:44Z	-
dc.date.issued	2018
dc.identifier.citation	International Journal of Speech Technology, 2018, Vol.21, 1, pp.167-183	en_US
dc.identifier.uri	http://idr.nitk.ac.in/jspui/handle/123456789/10210	-
dc.description.abstract	This paper designs a process for selecting a classifier based on the properties of a dataset, since it is impractical to evaluate the data on an arbitrary number of classifiers. Speech emotion recognition is considered as a case study. Different combinations of spectral and prosodic features relevant to emotions are explored. The best subset of the chosen features is recommended for each classifier, based on the properties of the chosen dataset. Various statistical tests are used to estimate the properties of the dataset, and the nature of the dataset guides the selection of a relevant classifier. To make the selection more precise, three other clustering and classification techniques, namely K-means clustering, vector quantization and artificial neural networks, are used for experimentation, and their results are compared with those of the selected classifier. Prosodic features such as pitch, intensity, jitter and shimmer, and spectral features such as mel frequency cepstral coefficients (MFCCs) and formants, are considered in this work. Statistical parameters of prosody such as minimum, maximum, mean (μ) and standard deviation (σ) are extracted from speech and combined with basic spectral (MFCC) features to get better performance. Five basic emotions, namely anger, fear, happiness, neutral and sadness, are considered. To analyse the performance of different datasets on different classifiers, content- and speaker-independent emotional data collected from Telugu movies is used. The mean opinion score of fifty users is collected to label the emotional data. To generalize the conclusions, the benchmark IIT-Kharagpur emotional database is also used. © 2018, Springer Science+Business Media, LLC, part of Springer Nature.	en_US
dc.title	Choice of a classifier, based on properties of a dataset: case study-speech emotion recognition	en_US
dc.type	Article	en_US
Appears in Collections:	1. Journal Articles
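
The abstract mentions extracting the minimum, maximum, mean and standard deviation of prosodic features and combining them with MFCCs. A minimal sketch of that idea is given below; it is not the authors' implementation. The function names, the use of pitch and intensity contours as inputs, and the averaging of MFCC frames into a single vector are all assumptions made for illustration.

```python
import numpy as np


def prosodic_stats(contour):
    """Return [min, max, mean, std] of one prosodic contour (e.g. pitch)."""
    values = np.asarray(contour, dtype=float)
    return np.array([values.min(), values.max(), values.mean(), values.std()])


def utterance_features(pitch, intensity, mfcc_frames):
    """Build one feature vector per utterance.

    pitch, intensity: 1-D sequences of per-frame values (hypothetical inputs).
    mfcc_frames: 2-D array, shape (n_frames, n_coefficients).
    Averaging the MFCC frames is an assumption; the record does not say
    how the spectral and prosodic features are combined.
    """
    mfcc = np.asarray(mfcc_frames, dtype=float)
    return np.concatenate(
        [prosodic_stats(pitch), prosodic_stats(intensity), mfcc.mean(axis=0)]
    )
```

With 13 MFCCs per frame, this would give a 21-dimensional vector per utterance (4 pitch statistics, 4 intensity statistics, 13 averaged coefficients), which could then be passed to whichever classifier is selected.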

