Browsing by Author "Mulimani, M."

Now showing 1 - 20 of 24
  • Item
    Acoustic Event and Scene Classification: A Review
    (Springer, 2025) Mulimani, M.; Venkatesh, S.; Koolagudi, S.G.
    This paper gives deeper insight into the range of recent approaches developed and reported in the literature for monophonic acoustic event classification (AEC), polyphonic acoustic event detection (AED) and acoustic scene classification (ASC), with respect to datasets, features and classifiers. A list of datasets used for monophonic AEC, polyphonic AED and ASC is introduced. The features and classifiers used for each task are reviewed along with their successes and failures. A list of research issues is derived from the critical review of the available literature at the end of the paper. © The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd. 2025.
  • Item
    Acoustic event classification using graph signals
    (Institute of Electrical and Electronics Engineers Inc., 2017) Mulimani, M.; Jahnavi, U.P.; Koolagudi, S.G.
    In this paper, a graph signal is generated from a spectrogram and features are investigated from the graph signal for Acoustic Event Classification (AEC). Different acoustic events are selected from the Sound Scene Database of the Real World Computing Partnership (RWCP). Three different noises are selected from the NOISEX'92 database and added to the test samples separately under different noise conditions. The recognition performance of acoustic events using the proposed features and Mel-frequency cepstral coefficients (MFCCs), with clean and noisy test samples, is compared. The proposed features show significantly improved recognition accuracy over MFCCs in noisy conditions. © 2017 IEEE.
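The noise-addition step described in the abstract above (adding NOISEX'92 noise to clean test samples under different noise conditions) can be sketched in a few lines of numpy. This is a generic illustration, not the paper's exact procedure; the function name and the 440 Hz test tone are illustrative.

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so the mixture has the requested SNR (in dB), then add it."""
    noise = noise[:len(clean)]                      # align lengths
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    scale = np.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10.0)))
    return clean + scale * noise

rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # 1 s of a 440 Hz tone
noisy = mix_at_snr(clean, rng.standard_normal(16000), snr_db=10)
```

Evaluating the same classifier on `noisy` versions generated at several SNRs is the usual way such robustness comparisons are run.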
  • Item
    Acoustic Event Classification Using Spectrogram Features
    (Institute of Electrical and Electronics Engineers Inc., 2018) Mulimani, M.; Koolagudi, S.G.
    This paper investigates a new feature extraction method that extracts different features from the spectrogram of an audio signal for Acoustic Event Classification (AEC). A new set of features is formulated and extracted from local spectrogram regions named blocks. The average recognition performance of the proposed spectrogram-based features and Mel-frequency cepstral coefficients (MFCCs), with their deltas and accelerations, on Support Vector Machines (SVM) is compared. In this work, different categories of acoustic events are considered from the Freiburg-106 dataset. The proposed features show significantly improved performance over conventional MFCCs for Acoustic Event Classification. © 2018 IEEE.
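The idea of extracting features from local spectrogram blocks can be illustrated with a minimal numpy sketch. The paper's actual feature formulation is not reproduced here; computing a mean and standard deviation per block is an assumption used only to show the block decomposition.

```python
import numpy as np

def block_features(spec, block_shape=(8, 8)):
    """Split a spectrogram into non-overlapping blocks and keep the
    mean and standard deviation of each block as the feature vector."""
    bf, bt = block_shape
    F, T = spec.shape
    F, T = F - F % bf, T - T % bt                 # crop to a multiple of the block size
    blocks = spec[:F, :T].reshape(F // bf, bf, T // bt, bt)
    means = blocks.mean(axis=(1, 3)).ravel()
    stds = blocks.std(axis=(1, 3)).ravel()
    return np.concatenate([means, stds])

rng = np.random.default_rng(1)
spec = np.abs(rng.standard_normal((64, 128)))     # stand-in magnitude spectrogram
feats = block_features(spec)                      # (64/8) * (128/8) * 2 = 256 values
```

The resulting fixed-length vector is what a classifier such as an SVM would consume.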
  • Item
    Acoustic Scene Classification using Deep Fisher network
    (Elsevier Inc., 2023) Venkatesh, S.; Mulimani, M.; Koolagudi, S.G.
    Acoustic Scene Classification (ASC) is the task of assigning a semantic label to an audio recording based on the surrounding environment. In this work, a Fisher network is introduced for ASC. The proposed method mimics the working mechanism of a feed-forward Convolutional Neural Network (CNN), in which the output of a layer is fed as input to the succeeding layer. The Fisher network consists of a feature extraction step followed by a Fisher layer. The Fisher layer has three sub-layers, namely a Fisher Vector (FV) encoder, a temporal pyramid and a normalization layer, along with a feature reduction layer. Gammatone Time Cepstral Coefficients (GTCCs) and Mel-spectrograms are the features encoded as Fisher vector representations in the FV encoder sub-layer. Temporal information of the Fisher vectors is retained using the temporal pyramid sub-layer. After temporal pyramids are extracted from the Fisher vectors, they are available as a feature vector. Irrelevant dimensions of the temporal pyramids are further reduced using Principal Component Analysis (PCA) in the normalization and PCA sub-layers. The proposed model is evaluated on five DCASE datasets: TUT Urban Acoustic Scenes 2018 and Mobile, DCASE 2019 Acoustic Scene Classification Task 1(a) and Task 1(b), and TAU Urban Acoustic Scenes 2020. The overall classification accuracy is 93%, 91%, 92%, 91% and 89% for the TUT 2018, TUT Mobile 2018, DCASE Task 1(a) 2019, DCASE Task 1(b) 2019 and TAU Urban Acoustic Scenes 2020 datasets, respectively. The proposed model performed much better than state-of-the-art ASC systems. © 2023 Elsevier Inc.
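The temporal-pyramid sub-layer mentioned above retains coarse temporal order by pooling frame-level vectors over progressively finer time segments and concatenating the results. A minimal numpy sketch of that pooling idea (the pyramid levels and mean pooling are illustrative assumptions, not the paper's exact configuration):

```python
import numpy as np

def temporal_pyramid(frames, levels=(1, 2, 4)):
    """Average frame-level feature vectors over 1, 2 and 4 equal time
    segments and concatenate the results, retaining coarse temporal order."""
    T, D = frames.shape
    pooled = []
    for n in levels:
        for seg in np.array_split(frames, n, axis=0):
            pooled.append(seg.mean(axis=0))
    return np.concatenate(pooled)                  # (1 + 2 + 4) * D values

rng = np.random.default_rng(2)
frames = rng.standard_normal((100, 16))           # 100 frames of 16-D features
pyramid = temporal_pyramid(frames)                # 7 * 16 = 112 values
```

In the paper the pooled vectors are Fisher vectors rather than raw frames, and the pyramid output is then normalized and reduced with PCA.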
  • Item
    Acoustic Scene Classification using Deep Learning Architectures
    (Institute of Electrical and Electronics Engineers Inc., 2021) V Spoorthy; Mulimani, M.; Koolagudi, S.G.
    Enabling devices to make sense of sound is known as Acoustic Scene Classification (ASC). The analysis of various scenes by applying computational algorithms is known as computational auditory scene analysis. The main aim of this paper is to classify audio recordings based on the scenes/environments in which they were recorded. In this paper, two deep learning architectures are used to perform the classification of acoustic scenes, namely a Convolutional Neural Network (CNN) and a Convolutional-Recurrent Neural Network (CRNN). The models are evaluated with three activation functions, namely ReLU, LeakyReLU and ELU. The highest recognition accuracy achieved for the ASC task is 90.96%, from the CRNN model. The model performed well with a basic convolutional architecture, a 10.9% improvement over the baseline system for this task. © 2021 IEEE.
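The three activation functions compared in the entry above have simple closed forms, shown here in numpy (the `alpha` defaults are common conventions, not values taken from the paper):

```python
import numpy as np

def relu(x):
    # zero for negative inputs, identity for positive ones
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # small non-zero slope for negative inputs avoids dead units
    return np.where(x > 0, x, alpha * x)

def elu(x, alpha=1.0):
    # smooth exponential for negative inputs, identity for positive ones
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

x = np.array([-2.0, -0.5, 0.0, 1.5])
```

The practical difference is how each handles negative pre-activations: ReLU zeroes them, LeakyReLU keeps a small gradient, and ELU saturates smoothly toward -alpha.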
  • Item
    Acoustic scene classification using projection Kervolutional neural network
    (Springer, 2023) Mulimani, M.; Nandi, R.; Koolagudi, S.G.
    In this paper, a novel Projection Kervolutional Neural Network (ProKNN) is proposed for Acoustic Scene Classification (ASC). ProKNN is a combination of two special filters known as the left and right projection layers and Kervolutional Neural Network (KNN). KNN replaces the linearity of the Convolutional Neural Network (CNN) with a non-linear polynomial kernel. We extend the ProKNN to learn from the features of two channels of audio recordings in the initial stage. The performance of the ProKNN is evaluated on the two publicly available datasets: TUT Urban Acoustic Scenes 2018 and TUT Urban Acoustic Scenes Mobile 2018 development datasets. Results show that the proposed ProKNN outperforms the existing systems with an absolute improvement of accuracy of 8% and 14% on TUT Urban Acoustic Scenes 2018 and TUT Urban Acoustic Scenes Mobile 2018 development datasets respectively, as compared to the baseline model of Detection and Classification of Acoustic Scene and Events (DCASE) - 2018 challenge. © 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
  • Item
    Currency recognition system using image processing
    (Institute of Electrical and Electronics Engineers Inc., 2017) Abburu, V.; Gupta, S.; Rimitha, S.R.; Mulimani, M.; Koolagudi, S.G.
    In this paper, we propose a system for automated currency recognition using image processing techniques. The proposed method can be used for recognizing both the country of origin and the denomination (value) of a given banknote. Only paper currencies have been considered. The method works by first identifying the country of origin using certain predefined areas of interest, and then extracting the denomination value using characteristics such as size, color, or text on the note, depending on how much the notes within the same country differ. We have considered 20 of the most traded currencies, as well as their denominations. Our system is able to accurately and quickly identify test notes. © 2017 IEEE.
  • Item
    Dynamic 3D graph visualizations in Julia
    (The Society for Modeling and Simulation International, 2016) Anilkumar, A.; Mathew, K.T.; Mulimani, M.; Koolagudi, S.G.; Jamadagni, C.
    A major problem with graph visualization libraries and packages is the lack of interactivity and 3D visualization. This makes understanding and analyzing complex graphs and topologies difficult. Existing packages and tools which do provide similar functionality are difficult to use, install and integrate, and have many dependencies. This paper discusses NetworkViz.jl, a Julia package which addresses the issues of existing graph visualization platforms while ensuring simplicity, efficiency, a diverse set of features and easy integration with other packages. The package supports two- and three-dimensional visualizations and uses a force-directed graph drawing approach to generate aesthetically pleasing and easy-to-use graphs. The library was built entirely in Julia due to its good documentation and large open-source community, and in order to fully utilize the inherent advantages provided by the language. As graph visualizations are important for analyzing complex networks, testing routing algorithms, as teaching aids, etc., we believe that NetworkViz.jl will be of integral use in the fields of research and education. © 2016 Society for Modeling & Simulation International (SCS).
  • Item
    Extraction of MapReduce-based features from spectrograms for audio-based surveillance
    (Elsevier Inc., 2019) Mulimani, M.; Koolagudi, S.G.
    In this paper, we propose a novel parallel method for extracting significant information from spectrograms using the MapReduce programming model for an audio-based surveillance system, which effectively recognizes critical acoustic events in the surrounding environment. Extracting reliable information as features from the spectrograms of a big, noisy audio event dataset demands high computational time. Parallelizing the feature extraction using the MapReduce programming model on Hadoop improves the efficiency of the overall system. Acoustic events with real-time background noise from the Mivia Lab audio event dataset are used for surveillance applications. The proposed approach is time-efficient and achieves high performance in recognizing critical acoustic events, with an average recognition rate of 96.5% under different noisy conditions. © 2019 Elsevier Inc.
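The MapReduce split described above, per-recording feature extraction in the map phase and dataset-level aggregation in the reduce phase, can be sketched without a Hadoop cluster using Python's built-in `map` and `functools.reduce`. The framing, FFT-based features and aggregation below are illustrative stand-ins, not the paper's feature set.

```python
import numpy as np
from functools import reduce

def extract(recording):
    """Map step: one recording -> mean per-bin spectral magnitude."""
    frames = recording.reshape(-1, 256)           # crude non-overlapping framing
    spec = np.abs(np.fft.rfft(frames, axis=1))    # (n_frames, 129) magnitude spectrum
    return spec.mean(axis=0)                      # one feature vector per recording

def combine(acc, feats):
    """Reduce step: accumulate feature vectors for dataset-level statistics."""
    return acc + feats

rng = np.random.default_rng(3)
dataset = [rng.standard_normal(4096) for _ in range(8)]   # stand-in recordings
features = list(map(extract, dataset))                    # map phase
total = reduce(combine, features, np.zeros(129))          # reduce phase
mean_feats = total / len(dataset)
```

Because the map phase is embarrassingly parallel, replacing `map` with `multiprocessing.Pool.map` (or a Hadoop job, as in the paper) parallelizes it without changing the reduce step.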
  • Item
    Gender Detection using Handwritten Signatures
    (Institute of Electrical and Electronics Engineers Inc., 2018) Mohit Reddy, J.; Guru Pradeep Reddy, T.; Mishra, S.; Mulimani, M.; Koolagudi, S.G.
    In this paper, a method is proposed that uses both image processing and machine learning techniques to detect the gender of a person from a handwritten signature. A photograph of a handwritten signature is given as input to the model, which extracts different features such as pen pressure, slant angle, and counts of external and internal contours. The features extracted from multiple images in the dataset are used to train the model, which then predicts the output for a new input. Our objective is to collect unbiased datasets from a set of people and feed those signatures to the model, carrying out statistical analysis and calculating the accuracy of the algorithm after every signature classification. We used an AdaBoost classifier, which gave a cross-validation accuracy of 73.2%, compared to the Gradient Boosting, Random Forest and Multi-Layer Perceptron classifiers, which gave 73.2%, 63.2% and 59.6% respectively. Copyright © INDIACom-2018.
  • Item
    Image processing approach to diagnose eye diseases
    (Springer Verlag, 2017) Prashasthi, M.; Shravya, K.S.; Deepak, A.; Mulimani, M.; Koolagudi, S.G.
    Image processing and machine learning techniques are used for automatic detection of abnormalities in the eye. The proposed methodology requires a clear photograph of the eye (not necessarily a fundoscopic image), from which the chromatic and spatial properties of the sclera and iris are extracted. These features are used in the diagnosis of the various diseases considered. Changes in the colour of the iris are a symptom of corneal infections and cataract, the spatial distribution of different colours distinguishes diseases like subconjunctival haemorrhage and conjunctivitis, and the spatial arrangement of iris and sclera is an indicator of palsy. We used various classifiers, of which the AdaBoost classifier was found to give a substantially higher accuracy (about 95%) compared to the others (k-NN and naive Bayes). To evaluate the accuracy of the proposed method, we used 150 samples, of which 23% were used for testing and 77% for training. © Springer International Publishing AG 2017.
  • Item
    Locality-constrained linear coding based fused visual features for robust acoustic event classification
    (International Speech Communication Association, 2019) Mulimani, M.; Koolagudi, G.K.
    In this paper, novel Fused Visual Features (FVFs) are proposed for Acoustic Event Classification (AEC) in meeting room and office environments. The codes of Visual Features (VFs) are evaluated from the row vectors and the Scale Invariant Feature Transform (SIFT) vectors of the grayscale Gammatonegram of an acoustic event separately, using Locality-constrained Linear Coding (LLC). Further, the VFs from the row vectors and SIFT vectors of the grayscale Gammatonegram are fused to get the FVFs. The performance of the proposed FVFs is evaluated on acoustic events of the publicly available UPC-TALP and DCASE datasets in clean and noisy conditions. Results show that the proposed FVFs are robust to noise and achieve overall recognition accuracies of 96.40% and 90.45% on the UPC-TALP and DCASE datasets, respectively. © 2019 ISCA
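Locality-constrained Linear Coding, as used above, encodes a descriptor by reconstructing it from its k nearest codebook entries under a sum-to-one constraint. A minimal numpy sketch of the approximated LLC solution (the codebook, `k` and regularizer here are illustrative, not the paper's settings):

```python
import numpy as np

def llc_code(x, codebook, k=5, reg=1e-4):
    """Approximated Locality-constrained Linear Coding for one descriptor:
    reconstruct x from its k nearest codewords under a sum-to-one constraint."""
    d2 = ((codebook - x) ** 2).sum(axis=1)
    idx = np.argsort(d2)[:k]                      # k nearest codebook entries
    z = codebook[idx] - x                         # shift codewords to the descriptor
    C = z @ z.T + reg * np.eye(k)                 # local covariance, regularized
    w = np.linalg.solve(C, np.ones(k))
    w /= w.sum()                                  # enforce sum(w) == 1
    code = np.zeros(len(codebook))
    code[idx] = w                                 # sparse code over the full codebook
    return code

rng = np.random.default_rng(4)
codebook = rng.standard_normal((64, 16))          # 64 codewords of dimension 16
x = rng.standard_normal(16)
code = llc_code(x, codebook)
```

Pooling such sparse codes over all descriptors of a Gammatonegram yields the fixed-length visual-feature representation that is then fused and classified.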
  • Item
    Polyphonic sound event detection using transposed convolutional recurrent neural network
    (Institute of Electrical and Electronics Engineers Inc., 2020) Chatterjee, C.C.; Mulimani, M.; Koolagudi, S.G.
    In this paper, we propose a Transposed Convolutional Recurrent Neural Network (TCRNN) architecture for polyphonic sound event recognition. A transposed convolution layer, which carries out a regular convolution operation but reverts the spatial transformation, is combined with a bidirectional Recurrent Neural Network (RNN) to obtain the TCRNN. Instead of the traditional mel spectrogram features, the proposed methodology incorporates mel-IFgram (Instantaneous Frequency spectrogram) features. The performance of the proposed approach is evaluated on sound events of the publicly available TUT-SED 2016 and Joint sound scene and polyphonic sound event recognition datasets. Results show that the proposed approach outperforms state-of-the-art methods. © 2020 Institute of Electrical and Electronics Engineers Inc. All rights reserved.
  • Item
    Robust acoustic event classification using bag-of-visual-words
    (International Speech Communication Association, 2018) Mulimani, M.; Koolagudi, S.G.
    This paper presents a novel Bag-of-Visual-Words (BoVW) approach to represent the grayscale spectrograms of acoustic events. Such BoVW representations are referred to as histograms of visual features, used for Acoustic Event Classification (AEC). Further, the Chi-square distance between histograms of visual features is evaluated, which generates a kernel for the Support Vector Machine (Chi-square SVM) classifier. Evaluation of the proposed histograms of visual features together with the Chi-square SVM classifier is conducted on different categories of acoustic events from the UPC-TALP corpora in clean and different noise conditions. Results show that the proposed approach is more robust to noise and achieves improved recognition accuracy compared to other methods. © 2018 International Speech Communication Association. All rights reserved.
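The Chi-square kernel used with the SVM above has a standard closed form over histogram features. A minimal numpy sketch of the exponential Chi-square kernel (the `gamma` value and toy histograms are illustrative):

```python
import numpy as np

def chi_square_kernel(X, Y, gamma=1.0, eps=1e-12):
    """Exponential chi-square kernel between rows of X and Y (histograms):
    k(x, y) = exp(-gamma * sum_i (x_i - y_i)^2 / (x_i + y_i))."""
    D = np.zeros((len(X), len(Y)))
    for i, x in enumerate(X):
        num = (x - Y) ** 2
        den = x + Y + eps                         # eps guards empty bins
        D[i] = (num / den).sum(axis=1)
    return np.exp(-gamma * D)

rng = np.random.default_rng(5)
H = rng.random((6, 32))
H /= H.sum(axis=1, keepdims=True)                 # rows are normalized histograms
K = chi_square_kernel(H, H)                       # precomputed Gram matrix
```

A Gram matrix built this way can be handed to an SVM that accepts precomputed kernels (e.g. scikit-learn's `SVC(kernel='precomputed')`), which is the usual way a Chi-square SVM is implemented.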

Maintained by Central Library NITK | DSpace software copyright © 2002-2026 LYRASIS
