Faculty Publications

Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736

Publications by NITK Faculty

Search Results

Now showing 1 - 10 of 28

Convolutional Neural Network-Enabling Speech Command Recognition
(Springer Science and Business Media Deutschland GmbH, 2023) Patra, A.; Pandey, C.; Palaniappan, K.; Sethy, P.K.
Speech command recognition based on deep image classification promises to advance research and development by overcoming the communication barrier between humans and machines. Identifying voice commands is challenging in the presence of noise and of variability in speed, pitch, and projection. This paper develops an efficient and highly accurate speech command recognition system for smart speech processing applications such as modern telecommunication. In particular, a novel convolutional neural network (CNN) is presented that works with one-second audio clips, each containing a single word: one of ten speech commands, or another word labeled as “unknown.” The model implementations were operated in a noisy environment. The CNNs are designed to recognize the speech commands using deep learning (DL) concepts for image classification. Thus, this research uses DL for image classification to translate the problem of speech command recognition into the image domain. © 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Multimodal Meme Troll and Domain Classification Using Contrastive Learning
(Institute of Electrical and Electronics Engineers Inc., 2024) Phadatare, A.; Jayanth, P.; Anand Kumar, M.A.
This paper presents a holistic approach to meme trolling detection and domain classification, focusing on Telugu and Kannada languages. Leveraging a spectrum of methodologies ranging from basic machine learning models such as Support Vector Machines (SVM), Random Forest, Naive Bayes, to image-based models like Convolutional Neural Networks (CNN), ResNet-50, and state-of-the-art models such as CLIP, multilingual BERT, XLM-BERT, and Vision Transformers, we explore diverse modalities including image classification, extracted text classification, and combined text-caption classification. Our system integrates multiple models to achieve two primary goals: accurately detecting trolling behavior and classifying memes into thematic domains like politics, movies, and sports. By training on multilingual data and considering linguistic diversity, our approach ensures robust performance across different linguistic contexts, providing valuable insights into meme culture and trolling behavior in Telugu and Kannada-speaking communities. © 2024 IEEE.
A metaheuristic framework based automated Spatial-Spectral graph for land cover classification from multispectral and hyperspectral satellite images
(Elsevier B.V., 2020) Suresh, S.; Lal, S.
Land cover classification of satellite images has been a very predominant area since the last few years. An increase in the amount of information acquired by satellite imaging systems, urges the need for automatic tools for classification. Satellite images exhibit spatial and/or temporal dependencies in which the conventional machine learning algorithms fail to perform well. In this paper, we propose an improved framework for automated land cover classification using Spatial Spectral Schroedinger Eigenmaps (SSSE) optimized by Cuckoo Search (CS) algorithm. Support Vector Machine (SVM) is adopted for the final thematic map generation following dimensionality reduction and clustering by the proposed approach. The novelty of the proposed framework is that the applicability of optimized SSSE for land cover classification of medium and high resolution multi-spectral satellite images is tested for the first time. The proposed method makes land cover classification system fully automatic by optimizing the algorithm specific image dependent parameter ? using CS algorithm. Experiments are carried out over publicly available high and medium resolution multi-spectral satellite image datasets (Landsat 5 TM and IKONOS 2 MS) and hyper-spectral satellite image datasets (Pavia University and Indian Pines) to assess the robustness of the proposed approach. Performance comparisons of the proposed method against state-of-the-art multi-spectral and hyper-spectral land cover classification methods reveal the efficiency of the proposed method. © 2020 Elsevier B.V.
Islanding detection method based on image classification technique using histogram of oriented gradient features
(Institution of Engineering and Technology, 2020) Manikonda, S.K.G.; Gaonkar, D.N.
A new islanding detection method based on image classification with support vector machine is proposed in this study. Histogram of oriented gradient features is extracted from the image for classifying non-islanding and islanding events. In the proposed technique, the time-series signal acquired from the point of common coupling is first converted into an image. Histogram of oriented gradient features is extracted from the image, which is used as an input feature vector for training and testing multiple support vector machine classifiers. Parameters such as voltage, rate of change of voltage, and rate of change of negative sequence voltage are used. Furthermore, a feature for early islanding detection is also presented to detect an islanding event even before it has occurred. The detection accuracy of the proposed method is tested with different kernels. The performance of all the classifiers is tested with 5-fold cross-validation. The classification results show that islanding detection with image classification based on the histogram of oriented gradient feature and multiple support vector machine classifiers can achieve excellent results. © The Institution of Engineering and Technology 2020
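The abstract above describes a pipeline of signal-to-image conversion followed by gradient-orientation features. A minimal sketch of those two steps is given here for illustration only. The rasterization scheme, the image height, the single-cell histogram without block normalization, and the function names `signal_to_image` and `orientation_histogram` are all assumptions, not the authors' method. The support vector machine stage is omitted.

```python
import math

def signal_to_image(signal, height=16):
    """Rasterize a 1-D signal into a height x len(signal) binary image (assumed scheme)."""
    lo, hi = min(signal), max(signal)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat signal
    image = [[0] * len(signal) for _ in range(height)]
    for col, value in enumerate(signal):
        # Map the sample to a row index; row 0 is the top of the image.
        row = height - 1 - int((value - lo) / span * (height - 1))
        image[row][col] = 1
    return image

def orientation_histogram(image, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations (one cell)."""
    hist = [0.0] * bins
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Central differences approximate the horizontal and vertical gradients.
            gx = image[r][c + 1] - image[r][c - 1]
            gy = image[r + 1][c] - image[r - 1][c]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            # Fold the angle into [0, 180) so opposite directions share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[min(int(angle / (180.0 / bins)), bins - 1)] += magnitude
    total = sum(hist) or 1.0
    return [h / total for h in hist]

# Example: a synthetic sine wave standing in for a measured voltage signal.
signal = [math.sin(2 * math.pi * t / 32) for t in range(64)]
features = orientation_histogram(signal_to_image(signal))
```

The resulting `features` list is the kind of fixed-length vector that could be passed to a classifier. A full histogram of oriented gradients descriptor would additionally divide the image into cells and normalize over blocks.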
HybridCNN based hyperspectral image classification using multiscale spatiospectral features
(Elsevier B.V., 2020) Mohan, A.; Venkatesan, M.
Hyperspectral images (HSIs) are contiguous band images widely used in remote sensing applications. The evolution of deep learning techniques has made a significant impact on HSI classification. Several HSI processing applications rely on various Convolutional Neural Network (CNN) models. However, the high dimensionality of HSIs increases the computational complexity and leads to the Hughes phenomenon. Therefore, most CNN models perform dimensionality reduction (DR) as a preprocessing step. Another challenge in HSI classification is the consideration of both spatial and spectral features for obtaining accurate results. A few 3-D-CNN models are designed to overcome this challenge, but they take more execution time than other methods. This research work proposes a multiscale spatio-spectral feature based hybrid CNN model for hyperspectral image classification. Hybrid DR is used for optimal band extraction, performing linear Gaussian Random Projection (GRP) and non-linear Kernel Principal Component Analysis (KPCA). The proposed hybrid CNN classification technique extracts the spectral and spatial features for different window sizes using 3D-CNN. These features are concatenated and fed into a 2D-CNN for further feature extraction and classification. The hybrid model is compared against various state-of-the-art CNN based techniques and found to showcase a satisfactory result with less computational complexity. © 2020 Elsevier B.V.
A deep neural network model for content-based medical image retrieval with multi-view classification
(Springer Science and Business Media Deutschland GmbH, 2021) Karthik, K.; Kamath S., S.S.
In medical applications, retrieving similar images from repositories is most essential for supporting diagnostic imaging-based clinical analysis and decision support systems. However, this is a challenging task, due to the multi-modal and multi-dimensional nature of medical images. In practical scenarios, the availability of large and balanced datasets that can be used for developing intelligent systems for efficient medical image management is quite limited. Traditional models often fail to capture the latent characteristics of images and have achieved limited accuracy when applied to medical images. For addressing these issues, a deep neural network-based approach for view classification and content-based image retrieval is proposed and its application for efficient medical image retrieval is demonstrated. We also designed an approach for body part orientation view classification labels, intending to reduce the variance that occurs in different types of scans. The learned features are used first to predict class labels and later used to model the feature space for similarity computation for the retrieval task. The outcome of this approach is measured in terms of error score. When benchmarked against 12 state-of-the-art works, the model achieved the lowest error score of 132.45, with 9.62–63.14% improvement over other works, thus highlighting its suitability for real-world applications. © 2020, Springer-Verlag GmbH Germany, part of Springer Nature.
V3O2: hybrid deep learning model for hyperspectral image classification using vanilla-3D and octave-2D convolution
(Springer Science and Business Media Deutschland GmbH, 2021) Mohan, A.; Sundaram, V.
Remote sensing image analysis is an emerging area of research and is used for various applications such as climate analysis, crop monitoring and change detection. Hyperspectral image (HSI) is one of the dominant remote sensing imaging modalities that captures information beyond the visible spectrum. The evolution of deep learning has made a significant impact on HSI analysis, mainly for its classification. The spatial–spectral feature-based classification model improves the classification accuracy of hyperspectral images (HSIs). However, these models are computationally expensive, and redundancy exists in the spatial dimension of features. This research work proposes a hybrid convolutional neural network (CNN) for HSI classification. The proposed model uses principal component analysis (PCA) as a preprocessing technique for optimal band extraction from HSIs. The hybrid CNN classification technique extracts the spectral and spatial features using three-dimensional CNN (3D CNN). These features are fed into a two-dimensional CNN (2D CNN) for further feature extraction and classification. The redundancy in spatial features of the hybrid CNN model is reduced by octave convolution (OctConv) instead of standard vanilla convolution. OctConv factorizes the spatial features into lower and higher spatial frequencies, and different convolutions are performed on them based on their frequencies. The hybrid model is compared against various state-of-the-art CNN-based techniques and found that the accuracy is boosted with a lesser computational cost. © 2020, Springer-Verlag GmbH Germany, part of Springer Nature.
Swarm optimisation-based bag of visual words model for content-based X-ray scan retrieval
(Inderscience Publishers, 2022) Karthik, K.; Kamath S․, S.
Classification and retrieval of medical images (MedIR) are emerging applications of computer vision for enabling intelligent medical diagnostics. Medical images are multi-dimensional and require specialised processing for the extraction of features from their manifold underlying content. Existing models often fail to consider the inherent characteristics of data and have thus often fallen short when applied to medical images. In this paper, we present a MedIR approach based on the bag of visual words (BoVW) model for content-based medical image retrieval. Dataset imbalance is a common issue for medical imaging models, so the approach also considers a balanced set of categories drawn from an imbalanced dataset. The proposed BoVW model extracts features from each image, which are used to train a supervised machine learning classifier for X-ray medical image classification and retrieval. During the experimental validation, the proposed model performed well, with a classification accuracy of 89.73% and a good retrieval result using our filter-based approach. © 2022 Inderscience Enterprises Ltd.
Crossover based technique for data augmentation
(Elsevier Ireland Ltd, 2022) Raj, R.; Mathew, J.; Kannath, S.K.; Rajan, J.
Background and Objective: Medical image classification problems are frequently constrained by the availability of datasets. “Data augmentation” has come as a data enhancement and data enrichment solution to the challenge of limited data. Traditionally data augmentation techniques are based on linear and label preserving transformations; however, recent works have demonstrated that even non-linear, non-label preserving techniques can be unexpectedly effective. This paper proposes a non-linear data augmentation technique for the medical domain and explores its results. Methods: This paper introduces “Crossover technique”, a new data augmentation technique for Convolutional Neural Networks in Medical Image Classification problems. Our technique synthesizes a pair of samples by applying two-point crossover on the already available training dataset. By this technique, we create N new samples from N training samples. The proposed crossover based data augmentation technique, although non-label preserving, has performed significantly better in terms of increased accuracy and reduced loss for all the tested datasets over varied architectures. Results: The proposed method was tested on three publicly available medical datasets with various network architectures. For the mini-MIAS database of mammograms, our method improved the accuracy by 1.47%, achieving 80.15% using VGG-16 architecture. Our method works fine for both gray-scale as well as RGB images, as on the PH2 database for Skin Cancer, it improved the accuracy by 3.57%, achieving 85.71% using VGG-19 architecture. In addition, our technique improved accuracy on the brain tumor dataset by 0.40%, achieving 97.97% using VGG-16 architecture. Conclusion: The proposed novel crossover technique for training the Convolutional Neural Network (CNN) is painless to implement by applying two-point crossover on two images to form new images. The method would go a long way in tackling the challenges of limited datasets and problems of class imbalances in medical image analysis. Our code is available at https://github.com/rishiraj-cs/Crossover-augmentation © 2022
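The two-point crossover described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation (their code is at the repository linked above). It assumes that images are lists of pixel rows, that the crossover swaps a contiguous band of rows, and that parents are paired at random. The names `two_point_crossover` and `augment` are hypothetical, and label assignment for the children is not covered because the abstract does not describe it.

```python
import random

def two_point_crossover(img_a, img_b, rng=None):
    """Swap the rows between two cut points of two equally sized images."""
    if len(img_a) != len(img_b):
        raise ValueError("images must have the same number of rows")
    rng = rng or random.Random()
    # Pick two distinct cut points and order them so that lo < hi.
    lo, hi = sorted(rng.sample(range(len(img_a) + 1), 2))
    child_a = img_a[:lo] + img_b[lo:hi] + img_a[hi:]
    child_b = img_b[:lo] + img_a[lo:hi] + img_b[hi:]
    return child_a, child_b

def augment(images, seed=0):
    """Pair the images at random and return two children per pair."""
    rng = random.Random(seed)
    order = list(range(len(images)))
    rng.shuffle(order)
    children = []
    # Walk the shuffled indices two at a time; an odd leftover is skipped.
    for i in range(0, len(order) - 1, 2):
        a, b = images[order[i]], images[order[i + 1]]
        children.extend(two_point_crossover(a, b, rng))
    return children

# Example: four tiny 4x4 "images", each filled with a constant value.
images = [[[k] * 4 for _ in range(4)] for k in range(4)]
new_images = augment(images, seed=1)
```

With an even number of parents, each pair yields two children, so N training samples produce N new samples, matching the count stated in the abstract.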
Detection of retinal disorders from OCT images using generative adversarial networks
(Springer, 2022) Smitha, A.; Padikkal, J.
Retinal image analysis has opened up a new window for prompt diagnosis and detection of various retinal disorders. Optical Coherence Tomography (OCT) is one of the major diagnostic tools to identify retinal abnormalities related to macular disorders like Age-Related Macular Degeneration (AMD) and Diabetic Macular Edema (DME). The clinical findings include retinal layer analysis to spot the abnormalities on OCT images. Though various models are proposed over the years to diagnose these disorders automatically, an end-to-end system that performs automatic denoising, segmentation, and classification does not exist to the best of our knowledge. This paper proposes a Generative Adversarial Network (GAN) based approach for automated segmentation and classification of OCT-B scans to diagnose AMD and DME. The proposed method incorporates the integration of handcrafted Gabor features to enhance the retina layer segmentation and non-local denoising to remove speckle noise. The classification metrics of GAN are compared with existing methods. The accuracy of up to 92.42% and F1-score of 0.79 indicates that the GANs can perform well for segmentation and classification of OCT images. © 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
