2. Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/1/7

Rhythm and timbre analysis for carnatic music processing
(2016) Heshi, R.; Suma, S.M.; Koolagudi, S.G.; Bhandari, S.; Rao, K.S.
In this work, an effort has been made to analyze rhythm and timbre related features to identify the raga and tala of a piece of Carnatic music. Raga and tala classification is performed using both rhythm and timbre features. Rhythm patterns and rhythm histograms are used as rhythm features. Zero crossing rate (ZCR), centroid, spectral roll-off, flux and entropy are used as timbre features. The music clips contain both instrumental and vocal content. A t-test is used as the similarity measure between feature vectors. Further, classification is done using Gaussian Mixture Models (GMM). The results show that the rhythm patterns are able to distinguish different ragas and talas with average accuracies of 89.98% and 86.67%, respectively. © Springer India 2016.
Repetition detection in stuttered speech
(2016) Ramteke, P.B.; Koolagudi, S.G.; Afroz, F.
This paper mainly focuses on the detection of repetitions in stuttered speech. The stuttered speech signal is divided into isolated units based on energy. Mel-frequency cepstral coefficients (MFCCs), formants and shimmer are used as features for repetition recognition. These features are extracted from each isolated unit. Using Dynamic Time Warping (DTW), the features of each isolated unit are compared with those of subsequent units within a one-second interval of speech. Based on an analysis of the scores obtained from DTW, a threshold is set; if the score is below the threshold, the units are identified as repeated events. The 27 seconds of speech data used in this work contain 50 repetition events. The results show that the combination of MFCCs, formants and shimmer can be used for the recognition of repetitions in stuttered speech. Out of 50 repetitions, 47 are correctly identified. © Springer India 2016.
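The comparison step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes each isolated unit is already a sequence of equal-length feature vectors, uses Euclidean distance as the local cost, and takes the threshold as a given parameter. The names `dtw_distance` and `find_repetitions` are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # Euclidean local cost (assumption)
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def find_repetitions(units, start_times, threshold, window=1.0):
    """Return index pairs (i, j) whose DTW score is below `threshold`.

    Only units starting within `window` seconds of unit i are compared.
    `start_times` is assumed to be sorted in ascending order.
    """
    pairs = []
    for i in range(len(units)):
        for j in range(i + 1, len(units)):
            if start_times[j] - start_times[i] > window:
                break  # later units are even further away
            if dtw_distance(units[i], units[j]) < threshold:
                pairs.append((i, j))
    return pairs
```

In practice the threshold would be chosen from the observed score distribution, as the abstract indicates; the value passed here is left to the caller.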
Reconstruction of Edges from Fan-Beam Projections
(2019) Narasimhadhan, A.V.; Sharma, A.; Koolagudi, S.G.; Naganjaneyulu, G.V.S.S.K.R.; Avinash, S.; Peddireddy, V.; Kishan, N.B.; Rajan, J.
The goal of computerised tomography is to reconstruct a cross-sectional image of the object under consideration from its projections, whereas edge detection is an image analysis problem of utmost importance in medical imaging for outlining the boundaries of tumours, bones, etc. In this paper, a technique to reconstruct the edges directly from fan-beam projections, using the Marr-Hildreth operator, is presented. To obtain the edge map of the object under consideration, the divergent beam transform of the Marr-Hildreth operator is convolved with a ramp filter to yield an edge reconstruction filter, which is finally convolved with the acquired fan-beam projections and back-projected, resulting in a convolution back-projection, to reconstruct the edges. The paper also discusses the use of the state-of-the-art Noo's algorithm to reconstruct the edges directly from equi-angular fan-beam projections. Finally, the proposed technique is simulated to draw relevant conclusions and inferences. © 2018 IEEE.
Recognition of repetition and prolongation in stuttered speech using ANN
(2016) Savin, P.S.; Ramteke, P.B.; Koolagudi, S.G.
This paper mainly focuses on repetition and prolongation detection in stuttered speech signals. Acoustic and pitch related features such as Mel-frequency cepstral coefficients (MFCCs), formants, pitch, zero crossing rate (ZCR) and energy are tested for their effectiveness in recognizing repetitions and prolongations in stammered speech. Artificial Neural Networks (ANN) are used as the classifier. The results are evaluated using combinations of different features. The results show that the ANN classifier trained using MFCC features achieves an average accuracy of 87.39% for repetition and prolongation recognition. © Springer India 2016.
Recognition and Classification of Pauses in Stuttered Speech Using Acoustic Features
(2019) Afroz, F.; Koolagudi, S.G.
Pauses play an essential role in speech activities. Normally a pause helps the listener by creating time and space to decode and interpret the speaker's message. In the case of stuttering, however, pauses disturb the normal flow of speech. The uncontrolled, frequent and unplanned occurrence of pauses leads to a slow speaking rate, results in broken words and increases the severity level of stuttering. Hence pauses and stuttering have a close relationship. Pauses are considered one of the important patterns in the diagnosis and treatment of stuttering. In this work, an attempt has been made to identify inaudible (silent or unfilled) pauses in stuttered speech. Attributes such as the duration, frequency, position and distribution of pauses during speech tasks are measured and quantified. The UCLASS stuttered speech corpus is considered for the analysis. An automatic blind segmentation approach is adopted to segment the speech signal into voiced and unvoiced regions using a dynamic threshold set based on energy and zero crossing rate (ZCR). Fourth formant frequencies are analysed to identify intra-morphic (unfilled) pauses present within voiced regions. The durations of intra-morphic pauses are analysed for stuttered speech and normal speech. It is observed that the duration of normal intra-morphic pauses ranges from 150 ms to 250 ms, inter-morphic pauses are <= 250 ms, and short pauses have durations ranging from 50 ms to 150 ms. In stuttering, by contrast, short intra-morphic pauses range from 10 ms to 50 ms and long pauses range from 250 ms to 1 or 2 seconds. Segmentation of the intra-morphic pauses is observed to achieve an accuracy of 98%. Results are compared and validated against the manual method. © 2019 IEEE.
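A rough sketch of energy and ZCR based frame labelling, followed by measuring the duration of unvoiced runs, is given below. The abstract does not state how its dynamic threshold is derived, so this example assumes thresholds scaled from the mean energy and mean ZCR of the recording. The function names and the scale factors `energy_scale` and `zcr_scale` are hypothetical, and the formant-based step is not covered.

```python
def frame_features(signal, frame_len):
    """Short-time energy and zero crossing rate for non-overlapping frames."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        # Count sign changes between consecutive samples.
        crossings = sum(1 for p, q in zip(frame, frame[1:]) if (p >= 0) != (q >= 0))
        feats.append((energy, crossings / (frame_len - 1)))
    return feats


def label_frames(signal, frame_len, energy_scale=0.5, zcr_scale=1.5):
    """Label each frame True (voiced) or False (unvoiced).

    Thresholds adapt to the recording: a frame counts as voiced when its
    energy exceeds a fraction of the mean energy and its ZCR stays at or
    below a multiple of the mean ZCR (an assumed rule, for illustration).
    """
    feats = frame_features(signal, frame_len)
    if not feats:
        return []
    mean_e = sum(e for e, _ in feats) / len(feats)
    mean_z = sum(z for _, z in feats) / len(feats)
    return [e > energy_scale * mean_e and z <= zcr_scale * mean_z for e, z in feats]


def pause_durations(labels, frame_ms):
    """Durations in milliseconds of each run of consecutive unvoiced frames."""
    durations, run = [], 0
    for voiced in labels:
        if voiced:
            if run:
                durations.append(run * frame_ms)
            run = 0
        else:
            run += 1
    if run:
        durations.append(run * frame_ms)
    return durations
```

The resulting durations could then be compared against ranges such as those reported in the abstract, for example 10 ms to 50 ms for short intra-morphic pauses in stuttered speech.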
Realistic golf flight simulation
(2016) Sumukha, R.M.; Koolagudi, S.G.; Naresh, V.; Afroz, F.; Reddy, Y.N.A.
The motion of a projectile is an easily observable phenomenon. Knowledge of the behavior of projectiles has been used extensively in warfare for many centuries. From cannons to present-day GPS-guided missiles, all rely on the principles of projectile motion. Apart from missiles, a flying golf ball is an interesting subject for studying projectile motion. The actual flight path can be simulated on a digital computer with computer graphics. In a natural setting, the golf ball's motion depends on various environmental factors. In this paper, apart from the initial velocity and angle of launch, air resistance and cross wind effects are also considered. At the end of the projectile's flight, the landing is simulated using bouncing ball physics. The theory of the object's motion is adapted for simulation. The position and configuration of the object and the environmental conditions are taken as variables while modelling its flight. © 2016 IEEE.
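The ingredients named in this abstract (launch speed and angle, air resistance, cross wind, and a bounce on landing) can be combined in a small numerical sketch. This is an illustrative model under stated assumptions, not the paper's simulator: drag is taken as quadratic in the ball's velocity relative to the air with a lumped constant `drag_k`, the bounce simply reverses the vertical velocity scaled by `restitution`, and integration uses a fixed-step semi-implicit Euler scheme. All names and default values are hypothetical.

```python
import math


def simulate_flight(speed, launch_deg, wind=(0.0, 0.0, 0.0), drag_k=0.0,
                    restitution=0.0, bounces=0, dt=0.001, g=9.81):
    """Return the flight path as a list of (x, y, z) points.

    x is downrange, y is lateral, z is height. `wind` is the air velocity,
    `drag_k` a lumped quadratic drag constant in 1/m (assumed), and
    `bounces` the number of rebounds simulated before the ball stops.
    """
    theta = math.radians(launch_deg)
    pos = [0.0, 0.0, 0.0]
    vel = [speed * math.cos(theta), 0.0, speed * math.sin(theta)]
    path = [tuple(pos)]
    remaining = bounces
    while True:
        # Drag opposes the velocity relative to the moving air.
        rel = [v - w for v, w in zip(vel, wind)]
        rel_speed = math.sqrt(sum(c * c for c in rel))
        acc = [-drag_k * rel_speed * c for c in rel]
        acc[2] -= g
        vel = [v + a * dt for v, a in zip(vel, acc)]
        pos = [p + v * dt for p, v in zip(pos, vel)]
        landed = pos[2] <= 0.0 and vel[2] < 0.0
        if landed:
            pos[2] = 0.0
        path.append(tuple(pos))
        if landed:
            if remaining == 0:
                break
            remaining -= 1
            vel[2] = -restitution * vel[2]  # rebound; horizontal speed kept (assumption)
    return path
```

With `drag_k=0` and no wind the result should approach the textbook range `speed**2 * sin(2 * angle) / g`, which gives a quick sanity check before enabling drag or wind.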
Raga classification for Carnatic music
(2015) Suma, S.M.; Koolagudi, S.G.
In this work, an effort has been made to identify the raga of a given piece of Carnatic music. In the proposed method, direct raga classification without the use of note sequences is performed using pitch as the primary feature. The primitive features extracted from the probability density function (pdf) of the pitch contour are used for classification. A feature vector of 36 dimensions is obtained by extracting some parameters from the pdf. Since non-sequential features are extracted from the signal, an artificial neural network (ANN) is used as the classifier. The database used for validating the system consists of 162 songs from 12 ragas. The average classification accuracy is found to be 89.5%. © Springer India 2015.
Product review based on optimized facial expression detection
(2017) Chaugule, V.; Abhishek, D.; Vijayakumar, A.; Ramteke, P.B.; Koolagudi, S.G.
This paper proposes a method to review public acceptance of products based on their brand by analyzing the facial expression of a customer intending to buy the product from a supermarket or hypermarket. In such cases, facial expression recognition plays a significant role in product review. Here, facial expression detection is performed by extracting feature points using a modified Harris algorithm. The modified Harris algorithm reduces the time complexity of the existing Harris feature extraction algorithm. The time complexities of existing algorithms are compared with that of the proposed algorithm. The algorithm proved to be significantly faster while remaining nearly accurate for the intended application, by reducing the time complexity of corner point detection. © 2016 IEEE.
Matching Witness' Account with Mugshots for Forensic Applications
(2018) Mohan, A.; Dhir, R.; Hirashkar, H.; Chittaragi, N.B.; Koolagudi, S.G.
This paper proposes a system that can be used by the forensics department to identify and disclose criminal details automatically. The problem of matching the description of a suspect at a crime scene, provided by an eye-witness, to existing mugshots (a mugshot is a photograph taken when someone is arrested) in the police department's criminal database is addressed in this work. Prominent features such as skin colour, size of the nose and lips, shape and size of the eyes, and shape of the face are considered for discriminating between individual criminals. The witness fills in the description fields, through which the most appropriate images are selected from an existing database. Images are scored on the basis of their degree of closeness to the given description, and the most relevant images are displayed first, followed by the rest. The classification of images based on the explored facial features is done using extreme gradient boosting (XGBoost), a supervised ensemble learning method. Comparatively better performance is observed. © 2018 IEEE.
Polygonal Meshes Predicated Watermarking Algorithm to Avert Misinterpretation of ATM Cards
(2016) Verma, G.; Gawande, S.M.; Bhura, M.; Koolagudi, S.G.
There is a loophole between the Mazuma card standards followed in the banks and the financial frauds carried out by Mazuma card cloning. This work undertakes two primary tasks: understanding the traditional standard cash card provided by the banks, and proposing a methodology to make such cards more secure in order to reduce Mazuma card frauds. The methodology utilizes a watermarking procedure to embed the customer's unique signature in the magnetic stripe of the Mazuma card, which plays a prominent role in authenticating the utilizer. This authentication mechanism is a subsidiary during transactions to secure the cash card from being cloned via a skimming contrivance. In this paper we compute the Laplacian coordinates and then construct vectors (a histogram), followed by embedding the watermark by adjusting the state of that histogram. We hide all of the user's details in this watermark. The watermark extraction is done blindly without referencing the host model. It is also robust and resists geometrical transformations such as translation, uniform scaling, rotation and vertex reordering. © 2016 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license.
