Faculty Publications

Choice of a classifier, based on properties of a dataset: case study-speech emotion recognition
(Springer New York LLC, 2018) Koolagudi, S.G.; Vishnu Srinivasa Murthy, Y.V.S.; Bhaskar, S.P.
This paper presents a process for selecting a classifier based on the properties of a dataset, since it is impractical to test the data on an arbitrary number of classifiers. Speech emotion recognition is considered as a case study. Different combinations of spectral and prosodic features relevant to emotions are explored. The best subset of the chosen features is recommended for each classifier based on the properties of the chosen dataset. Various statistical tests are used to estimate the properties of the dataset, and the nature of the dataset guides the selection of a relevant classifier. To make the selection more precise, three other clustering and classification techniques, namely K-means clustering, vector quantization and artificial neural networks, are used for experimentation, and the results are compared with those of the selected classifier. Prosodic features such as pitch, intensity, jitter and shimmer, and spectral features such as mel frequency cepstral coefficients (MFCCs) and formants, are considered in this work. Statistical parameters of prosody such as minimum, maximum, mean (μ) and standard deviation (σ) are extracted from speech and combined with basic spectral (MFCC) features to improve performance. Five basic emotions, namely anger, fear, happiness, neutral and sadness, are considered. To analyse the performance of different datasets on different classifiers, content- and speaker-independent emotional data collected from Telugu movies is used. The mean opinion score of fifty users is collected to label the emotional data. To generalize the conclusions, the benchmark IIT-Kharagpur emotional database is also used. © 2018, Springer Science+Business Media, LLC, part of Springer Nature.
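As a rough illustration of the feature combination this abstract describes, the sketch below computes the minimum, maximum, mean and standard deviation of prosodic contours and concatenates them with an MFCC summary. It is not the authors' implementation. The function names, the toy contours, the 13-coefficient MFCC size, and the choice to average MFCC frames per utterance are all assumptions made for the example.

```python
import numpy as np

def prosody_stats(contour):
    """Return [min, max, mean, std] of a 1-D prosodic contour (e.g. pitch in Hz)."""
    contour = np.asarray(contour, dtype=float)
    return np.array([contour.min(), contour.max(), contour.mean(), contour.std()])

def utterance_features(pitch, intensity, mfcc_frames):
    """Concatenate prosodic statistics with a frame-averaged MFCC vector.

    Averaging MFCCs over frames is an assumption of this sketch, not a
    detail taken from the paper.
    """
    mfcc_mean = np.asarray(mfcc_frames, dtype=float).mean(axis=0)
    return np.concatenate([prosody_stats(pitch), prosody_stats(intensity), mfcc_mean])

# Toy inputs standing in for contours extracted from one utterance (hypothetical values).
pitch = [110.0, 120.0, 130.0, 140.0]
intensity = [60.0, 62.0, 64.0, 66.0]
mfcc_frames = np.ones((4, 13))  # 4 frames x 13 coefficients

features = utterance_features(pitch, intensity, mfcc_frames)
print(features.shape)
```

With these toy inputs the vector holds four statistics per prosodic contour followed by the 13 averaged MFCCs. Jitter, shimmer and formants, which the abstract also lists, would be appended in the same way.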
Singer identification for Indian singers using convolutional neural networks
(Springer, 2021) Vishnu Srinivasa Murthy, Y.V.S.; Koolagudi, S.G.; Jeshventh Raja, T.K.
Singer identification is one of the important aspects of music information retrieval (MIR). In this work, traditional feature-based approaches and more recent convolutional neural network (CNN) based approaches are considered and compared for identifying singers. Two datasets, namely artist20 and the Indian popular singers’ database with 20 singers, are used to evaluate the proposed approaches. Cepstral features such as Mel-frequency cepstral coefficients (MFCCs) and linear prediction cepstral coefficients (LPCCs) are considered to represent timbre information. Shifted delta cepstral (SDC) features are also computed alongside the cepstral coefficients to capture temporal information. In addition, chroma features are computed from the 12 semitones of a musical octave, overall forming a 46-dimensional feature vector. Experiments are conducted with different feature combinations, and suitable features are selected using the genetic algorithm-based feature selection (GAFS) approach. Two classification techniques, namely artificial neural networks (ANNs) and random forest (RF), are applied to the features mentioned above. Further, spectrograms and chromagrams of audio clips are fed directly to a CNN for classification. The singer identification results obtained using CNNs seem to be better than those of the traditional isolated and ensemble classifiers. An average accuracy of around 75% is observed with the CNN on the Indian popular singers’ database. On the artist20 dataset, however, the proposed configuration of the feature-based approach and the CNN did not exceed 60% accuracy. © 2021, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
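The shifted delta cepstral features mentioned in this abstract can be sketched as follows, using the common textbook definition: for frame t and block i, the delta is c(t + iP + d) - c(t + iP - d), and k such blocks are stacked. The abstract does not give the SDC parameters, so the values d=1, P=3, k=7, the 7-coefficient input, the edge padding, and the function name are all assumptions for illustration.

```python
import numpy as np

def shifted_delta_cepstra(cepstra, d=1, P=3, k=7):
    """Stack k shifted delta blocks per frame.

    cepstra: array of shape (n_frames, n_coeffs).
    Returns an array of shape (n_frames, n_coeffs * k).
    Parameter values are illustrative defaults, not taken from the paper.
    """
    cepstra = np.asarray(cepstra, dtype=float)
    n_frames = cepstra.shape[0]
    # Repeat edge frames so every frame has the neighbours it needs
    # (the padding strategy is an assumption of this sketch).
    padded = np.pad(cepstra, ((d, (k - 1) * P + d), (0, 0)), mode="edge")
    blocks = []
    for i in range(k):
        ahead = padded[i * P + 2 * d : i * P + 2 * d + n_frames]   # c(t + iP + d)
        behind = padded[i * P : i * P + n_frames]                  # c(t + iP - d)
        blocks.append(ahead - behind)
    return np.hstack(blocks)

# Toy cepstra: 20 frames x 7 coefficients, each coefficient equal to the frame index.
cepstra = np.arange(20, dtype=float)[:, None] * np.ones((1, 7))
sdc = shifted_delta_cepstra(cepstra)
print(sdc.shape)
```

On this linear ramp every interior delta equals 2d, which gives a quick sanity check of the indexing. In practice the SDC blocks would be concatenated with the MFCC, LPCC and chroma features before feature selection.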
