Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506

Now showing 1 - 10 of 69
  • Item
    Age approximation from speech using Gaussian mixture models
    (IEEE Computer Society, 2013) Mittal, T.; Barthwal, A.; Koolagudi, S.G.
    In this work, spectral features are extracted from speech to classify speakers by their age. Mel frequency cepstral coefficients (MFCCs) are explored as features. Gaussian mixture models (GMMs) are proposed as classifiers. The age groups considered in this study are 1-10, 11-20, 21-30, 31-40 and 41-50. The age-group database used in this work is recorded in Hindi from speakers of different ages and dialects and contains five Hindi text prompts. The text prompts are constructed using textually neutral Hindi words recorded in neutral emotion, which are used for characterizing the age group for both male and female speakers. Average age recognition performance in the case of the multiple-speaker database is observed to be around 92.0%. © 2013 IEEE.
  • Item
    Raga classification for Carnatic music
    (Springer Verlag, 2015) Suma, S.M.; Koolagudi, S.G.
    In this work, an effort has been made to identify the raga of a given piece of Carnatic music. In the proposed method, direct raga classification without the use of the note sequence has been performed using pitch as the primary feature. The primitive features that are extracted from the probability density function (pdf) of the pitch contour are used for classification. A feature vector of 36 dimensions is obtained by extracting some parameters from the pdf. Since non-sequential features are extracted from the signal, an artificial neural network (ANN) is used as the classifier. The database used for validating the system consists of 162 songs from 12 ragas. The average classification accuracy is found to be 89.5%. © Springer India 2015.
  • Item
    Closed Item-Set Mining for Prediction of Indian Summer Monsoon Rainfall: A Data Mining Model with Land and Ocean Variables as Predictors
    (Elsevier, 2015) Vathsala, H.; Koolagudi, S.G.
    Practical application of data mining in scientific and engineering domains, when explored, poses many problems and provides interesting results. In this paper, we attempt to mine association rules from 37 years (1969-2005) of Indian summer monsoon rainfall data and test their applicability in improving prediction of Indian summer monsoon rainfall. We shortlist 36 variables as possible predictors of Indian summer monsoon rainfall based on previous literature and compare prediction using all 36 variables with prediction using attributes selected from the derived association rules. Results show better performance in prediction of All India region, West central region and Peninsular region rainfall when attribute selection is employed than when all 36 variables are used for prediction. © 2015 The Authors.
  • Item
    Classification of vocal and non-vocal regions from audio songs using spectral features and pitch variations
    (Institute of Electrical and Electronics Engineers Inc., 2015) Vishnu Srinivasa Murthy, Y.V.S.; Koolagudi, S.G.
    In this work, an effort has been made to identify vocal and non-vocal regions from a given song using signal processing techniques and a machine learning algorithm. Initially, spectral features like mel-frequency cepstral coefficients (MFCCs) are used to develop the baseline system. Statistical values of pitch, jitter and shimmer are considered to improve the performance of the system. Artificial neural networks (ANNs) are used to capture the characteristics of vocal and non-vocal segments of the songs. The experiment is conducted on 60 vocal and 60 non-vocal clips extracted from Telugu albums. An 11-point moving window is used to ensure the continuity of vocal and non-vocal segments, thus improving the accuracy of the system. With this approach, the system achieves 85.59% accuracy for vocal and 88.52% for non-vocal segment classification. © 2015 IEEE.
  • Item
    Identification of allied raagas in Carnatic music
    (Institute of Electrical and Electronics Engineers Inc., 2015) Upadhyaya, P.; Suma, S.M.; Koolagudi, S.G.
    In this work, an effort has been made to differentiate the allied raagas in Carnatic music. Allied raagas are raagas that are composed using the same set of notes. The features derived from the pitch sequence are used for differentiating these raagas. The coefficients of Legendre polynomials used to fit the pitch contours of the song clips are used for identifying raagas. The obtained features are validated using different classifiers such as Neural networks, Naive Bayes, Multi class classifier, Bagging and Random forest. The proposed system is tested on 4 sets of allied raagas. The Naive Bayes classifier gives an average accuracy of 86.67% for the allied set of Todi-Dhanyasi, and the Multi class classifier gives an average accuracy of 86.67% for the allied set of Kharaharapriya-Anandabhairavi-Reethigoula. In general, the Neural network classifier performance is found to be better than that of the other classifiers. © 2015 IEEE.
  • Item
    Identifying gamakas in Carnatic music
    (Institute of Electrical and Electronics Engineers Inc., 2015) Vyas, H.M.; Suma, S.M.; Koolagudi, S.G.; Guruprasad, K.R.
    In this work, an effort has been made to identify the gamakas present in a given Carnatic music clip. Gamakas are the beautification elements used to improve the melody. The identification of gamakas is a very important stage in note transcription. In the proposed method, features that correspond to melodic variations, such as pitch and energy, are used for characterizing the gamakas. The input pitch contour is modelled using a Hidden Markov Model with 3 states, namely Attack, Sustain and Decay. These states correspond to ups and downs in the melody of the music. The system is validated using a comprehensive data set consisting of 160 songs from 8 different ragas. An average accuracy of 75.86% is achieved using this method. © 2015 IEEE.
  • Item
    Feature analysis for mispronounced phonemes in the case of alveolar approximant (/r/) substituted with voiced dental consonant (/∂/)
    (Institute of Electrical and Electronics Engineers Inc., 2015) Ramteke, P.B.; Koolagudi, S.G.; Prabhakar, A.
    Mispronunciation is commonly observed in children from age 2 to 8 years. Some of the common mispronunciations are stopping, fronting, backing and affrication. These processes are known as phonological processes. Identification of these processes is crucial in studying the vocal tract development pattern and treating phonological disorders in children. The features that clearly discriminate a correctly pronounced phoneme from the corresponding mispronounced phoneme have to be compared to identify the phonological processes. This paper focuses on the analysis of the mispronounced alveolar approximant (/r/) substituted with the voiced fricative consonant (/∂/). In this work, spectral and pitch-related features are considered for the analysis using scatter plots and histograms. From the analysis, it is observed that the energy feature against the 2nd and 4th cepstral coefficients achieves 75% and 65% discrimination respectively. © 2015 IEEE.
  • Item
    Analytic technique for optimal workload scheduling in data-center using phase detection
    (Institute of Electrical and Electronics Engineers Inc., 2015) Gupta, P.; Koolagudi, S.G.; Khanna, R.; Ganguli, M.; Sankaranarayanan, A.N.
    Typically, complex resource interdependence and heterogeneous workload patterns can result in sub-optimal job allocation, leading to performance loss or under-utilization of compute resources. A well-behaved model can anticipate the demand patterns and proactively react to the dynamic stresses in a timely and well-optimized manner. For a workload hosting environment, a pool of available resources is optimally configured and utilized to sustain a certain expectation of Quality-of-Service (QoS) in the presence of power, thermal and reliability constraints. The workload (or job) scheduling mechanism is expected to withstand dynamic variations in demand stresses while maximizing resource utilization and minimizing performance loss. Furthermore, workloads can be co-allocated to the clusters with the least amount of resource contention. In this paper we introduce a methodology that facilitates the coordinated scheduling of workloads to the systems with the least contentious resources through phase-assisted dynamic characterization. We describe a method to perform optimal job scheduling by using a phase model synthesized by learning and classifying the run-time behavior of workloads. © 2015 IEEE.
  • Item
    Recognition of repetition and prolongation in stuttered speech using ANN
    (Springer Science and Business Media Deutschland GmbH, 2016) Savin, P.S.; Ramteke, P.B.; Koolagudi, S.G.
    This paper mainly focuses on repetition and prolongation detection in stuttered speech signals. Acoustic and pitch-related features like Mel-frequency cepstral coefficients (MFCCs), formants, pitch, zero crossing rate (ZCR) and energy are tested for their effectiveness in recognizing repetitions and prolongations in stammered speech. Artificial Neural Networks (ANNs) are used as the classifier. The results are evaluated using combinations of different features. The results show that the ANN classifier trained using MFCC features achieves an average accuracy of 87.39% for repetition and prolongation recognition. © Springer India 2016.
  • Item
    Rhythm and timbre analysis for Carnatic music processing
    (Springer Science and Business Media Deutschland GmbH, 2016) Heshi, R.; Suma, S.M.; Koolagudi, S.G.; Bhandari, S.; Sreenivasa Rao, K.S.
    In this work, an effort has been made to analyze rhythm and timbre related features to identify raga and tala from a piece of Carnatic music. Raga and tala classification is performed using both rhythm and timbre features. Rhythm patterns and rhythm histograms are used as rhythm features. Zero crossing rate (ZCR), centroid, spectral roll-off, flux and entropy are used as timbre features. Music clips contain both instrumental and vocal content. The T-test is used as a similarity measure between the feature vectors. Further, classification is done using Gaussian Mixture Models (GMMs). The results show that the rhythm patterns are able to distinguish different ragas and talas with an average accuracy of 89.98% and 86.67% respectively. © Springer India 2016.
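
Two of the timbre features named in the last abstract above, zero crossing rate and spectral centroid, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the single-frame input, and the naive DFT are assumptions chosen to keep the example self-contained, and a practical system would use an FFT library and windowed frames.

```python
import cmath
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (hypothetical helper)."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency in Hz, via a naive DFT (hypothetical helper)."""
    n = len(frame)
    half = n // 2 + 1  # non-negative frequency bins only
    mags = []
    for k in range(half):
        coeff = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        mags.append(abs(coeff))
    total = sum(mags)
    if total == 0:
        return 0.0  # silent frame: centroid is undefined, report 0
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


# Example: a 1 kHz sine sampled at 8 kHz, 64 samples (exactly 8 cycles).
sr = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sr) for t in range(64)]
zcr = zero_crossing_rate(tone)
centroid = spectral_centroid(tone, sr)
```

For a pure tone that falls exactly on a DFT bin, the centroid coincides with the tone's frequency; for real music frames it summarizes where spectral energy is concentrated, which is why it is commonly grouped with timbre descriptors.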