Faculty Publications

Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736

Publications by NITK Faculty

Search Results

  • Item
    Multimodal group activity state detection for classroom response system using convolutional neural networks
    (Springer Verlag, 2019) Sebastian, A.G.; Singh, S.; Manikanta, P.B.T.; Ashwin, T.S.; Guddeti, R.M.R.
    Human–Computer Interaction is a crucial and emerging field in computer science, because computers are replacing humans in many service jobs. As a result, computers need to interact with humans in the same way that humans interact with one another. When humans talk to each other, they gain feedback from how the other person responds non-verbally. Since computers are now interacting with humans, they need to be able to detect these facial cues and adjust their services based on this feedback. Our proposed method aims at building a Multimodal Group Activity State Detection for Classroom Response System, which tries to recognize the learning behavior of a classroom in order to provide effective feedback and inputs to the teacher. The key challenges addressed here are to detect and analyze as many students as possible for an unbiased evaluation of the mood of the students and to classify them into three defined activity states: active, passive, and inactive. © Springer Nature Singapore Pte Ltd. 2019
  • Item
    Inner Attention Based bi-LSTMs with Indexing for non-Factoid Question Answering
    (Institute of Electrical and Electronics Engineers Inc., 2018) Sharma, A.; Harithas, C.
    In this paper, we focus on the non-factoid question answering problem, using a bidirectional LSTM with an inner attention mechanism and indexing for better accuracy. Non-factoid QA is an important task and can be applied in constructing useful knowledge bases and extracting valuable information. The advantage of using Deep Learning frameworks to solve this kind of problem is that they do not require any feature engineering or other linguistic tools. The proposed approach is to extend an LSTM (Long Short Term Memory) model in two directions, one with a Convolutional layer and the other with an inner attention mechanism for LSTMs, proposed by Bingning Wang et al., to generate answer representations in accordance with the question. On top of this Deep Learning model, we used an information retrieval model based on indexing to generate answers and improve the accuracy. The proposed methodology showed an improvement in accuracy over the referred model and the respective baselines, and also with respect to the answer lengths used. The models are tested with two non-factoid QA data sets: TREC-QA and InsuranceQA. © 2018 IEEE.
  • Item
    Dynamic Approach for Lane Detection using Google Street View and CNN
    (Institute of Electrical and Electronics Engineers Inc., 2019) Mamidala, R.S.; Uthkota, U.; Shankar, M.B.; Antony, A.J.; Narasimhadhan, A.V.
    Lane detection algorithms have been key enablers for fully assistive and autonomous navigation systems. In this paper, a novel and pragmatic approach for lane detection is proposed using a convolutional neural network (CNN) model based on the SegNet encoder-decoder architecture. The encoder block renders low-resolution feature maps of the input, and the decoder block provides pixel-wise classification from the feature maps. The proposed model has been trained on a 2000-image data set and tested against the corresponding ground truth provided in the data set for evaluation. To enable real-time navigation, we extend our model's predictions by interfacing it with the existing Google APIs, evaluating the metrics of the model while tuning the hyper-parameters. The novelty of this approach lies in the integration of the existing SegNet architecture with Google APIs. This interface makes it handy for assistive robotic systems. The observed results show that the proposed method is robust under challenging occlusion conditions owing to the pre-processing involved and gives superior performance when compared to the existing methods. © 2019 IEEE.
  • Item
    Spatiospectral feature extraction and classification of hyperspectral images using 3D-CNN + ConvLSTM model
    (Springer, 2020) Mohan, A.; Venkatesan, M.
    Hyperspectral images (HSIs) comprise contiguous bands captured beyond the visible spectrum. The evolution of deep learning techniques has had a massive impact on hyperspectral image classification. The curse of dimensionality is one of the significant issues in hyperspectral image analysis. Therefore, most of the existing classification models perform principal component analysis (PCA) as the dimensionality reduction (DR) technique. Since hyperspectral images are nonlinear, linear DR techniques fail to preserve the nonlinear features. Using spatial and spectral features together improves the classification accuracy of the model. 3D convolutional neural networks (CNNs) extract the spatiospectral features for classification, but they do not consider the dependencies among features. This research work proposes a new model for HSI classification using 3D-CNN and convolutional long short-term memory (ConvLSTM). The optimal band extraction is performed by a hybrid DR technique, which is the combination of Gaussian random projection (GRP) and Kernel PCA (KPCA). The proposed deep learning model extracts spatiospectral features using 3D-CNN and dependent spatial features using 2D-ConvLSTM in parallel. The combination of extracted features is fed into a fully connected network for classification. The experiment is performed on three widely used datasets, and the proposed model is compared against various state-of-the-art techniques and found to achieve better classification accuracy. © Springer Nature Singapore Pte Ltd 2020.
  • Item
    A hybrid model of convo-GAN to detect fake images
    (Grenze Scientific Society, 2021) Saha, S.; Rudra, B.
    With advancements in the field of Deep Learning, it has become easy to generate face swaps, thereby creating fake images that look extremely realistic and leave only a few traces, which cannot be detected by the naked human eye. Such images, known as ‘DeepFakes’, can be used to create a ruckus and affect the quality of public discourse on sensitive issues, defame an individual’s profile, create political distress, blackmail a person or envision fake cyber terrorists. This paper proposes methods to detect fake images with the help of hybrid models combining a Convolutional Neural Network with Error Level Analysis, a Gated Recurrent Unit neural network, a Long Short Term Memory recurrent neural network and a Generative Adversarial Network, respectively. The 2019 ‘Real and Fake Face Detection’ dataset from Kaggle [7] is used to train the models, and we show by experimentation that the combined model of Convolutional Neural Network and Generative Adversarial Network outperforms the other models. © Grenze Scientific Society, 2021.
  • Item
    An Improved Method for Speech Enhancement Using Convolutional Neural Network Approach
    (Institute of Electrical and Electronics Engineers Inc., 2022) Mahesh Kumar, T.N.; Hegde, P.; Deepak, K.T.; Narasimhadhan, A.V.
    In the speech processing domain, speech enhancement is one of the most widely used techniques. With the development of deep neural networks and the availability of powerful hardware, multiple deep learning-based speech enhancement models have come up in recent years. In this work, a speech enhancement technique using a Convolutional Neural Network (CNN) as Denoising Autoencoders (DAEs) is investigated and compared with the conventional feed-forward topology. Further, the proposed model is analyzed at various SNR levels to process corrupted English speech and is also tested on unseen speech data that includes additional SNR levels. It is observed from simulation results that the proposed model outperforms the existing model in terms of Perceptual Evaluation of Speech Quality (PESQ) and Log Spectral Distance (LSD). The network achieved 3% higher scores than feed-forward neural networks, and it is found that the convolutional DAEs perform better than their feed-forward counterparts. © 2022 IEEE.
  • Item
    Human Activity Recognition for Online Examination Environment Using CNN
    (Springer Science and Business Media Deutschland GmbH, 2023) Ramu, S.; Guddeti, R.M.R.; Mohan, B.R.
    Human Activity Recognition (HAR) is an intelligent system that recognizes activities based on a sequence of observations about human behavior. Human activity recognition is essential in human-to-human interactions to identify interesting patterns. It is not easy to extract these patterns, since they contain information about a person’s identity, personality, and state of mind. Many studies have been conducted on recognizing human behavior using machine learning techniques. However, HAR in an online examination environment has not yet been explored. As a result, the primary focus of this work is on the recognition of human activity in the context of an online examination. This work aims to classify normal and abnormal behavior during an online examination employing the Convolutional Neural Network (CNN) technique. In this work, we considered two-, three-, and four-layered CNN architectures, and we fine-tuned the hyper-parameters of the CNN architectures to obtain better results. The three-layered CNN architecture performed better than the other CNN architectures in terms of accuracy. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
  • Item
    Automatic Abnormality Detection in Musculoskeletal Radiographs Using Ensemble of Pre-trained Networks
    (Springer Science and Business Media Deutschland GmbH, 2023) Verma, R.; Jain, S.; Saritha, S.K.; Dodia, S.
    Musculoskeletal disabilities (MSDs) are defined as injuries that affect the movement or musculoskeletal system of the human body. Worldwide, they are the second most common cause of physical disability. Musculoskeletal disability worsens over time and can result in long-term discomfort and severe disability. As a result, early detection and diagnosis of these anomalies is essential. But the diagnosis process is very time-consuming and error-prone, and it requires a diagnostic professional. Deep learning algorithms have recently been applied in medical imaging, providing a robust platform with very reliable outcomes. The development of a Computer Aided Detection (CAD) system considerably speeds up the diagnosis process. In this paper, a weighted ensemble model is proposed, which is the combination of three pre-trained models (DenseNet169, MobileNet, and XceptionNet). The weighted ensemble model is tested on the MURA dataset, a large public dataset provided by the Stanford ML Group. Our model achieved a Cohen’s kappa score of 0.739 with a precision of 0.885 and a recall of 0.854, which is higher than many existing approaches such as the DenseNet169 and ensemble200 models. © 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
  • Item
    Detection and Visualization of Corroded Surfaces Using Machine Learning
    (Springer Science and Business Media Deutschland GmbH, 2024) Shrivathsa, B.J.; Dhanya, R.; Meghana Nayak, D.; Pavan, G.S.
    The use of artificial intelligence in asset management greatly assists the industry and structural health monitoring systems. Using machine learning techniques for asset inspections can increase safety, reduce access costs, provide objective classification, and improve digital asset management systems. The detection and visualization of corrosion from digital images present significant advantages such as automation, access to remote locations, mitigation of risk to inspectors, cost savings, and detection speed. This paper used deep learning convolutional neural networks to build simple corrosion detection models and used an extreme gradient boosting algorithm to visualize the corroded surfaces. A large dataset of 1900 images with and without corrosion was collected using web scraping techniques and labeled accordingly. Training a deep learning model requires massive, high-resolution image datasets and intensive image labeling to approach high-level accuracy. The results and findings will improve the development of deep learning models for detecting and visualizing specific features on corroded surfaces. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.
  • Item
    Recent developments in wireless capsule endoscopy imaging: Compression and summarization techniques
    (Elsevier Ltd, 2022) Sushma, B.; Aparna, P.
    Wireless capsule endoscopy (WCE) can be viewed as an innovative technology introduced in the medical domain to directly visualize the digestive system using a battery-powered electronic capsule. It is considered a desirable substitute for conventional digestive tract diagnostic methods for a comfortable and painless inspection. Despite many benefits, WCE results in poor video quality due to low frame resolution and diagnostic accuracy. Many research groups have presented diversified, low-complexity compression techniques to economize the battery power consumed in the radio-frequency transmission of the captured video, which allows for capturing the images at high resolution. Many vision-based computational methods have been developed to improve the diagnostic yield. These methods include approaches for automatically detecting abnormalities and reducing the amount of time needed for video analysis. Though various research works have been put forth in the WCE imaging field, there is still a wide gap between the existing techniques and the current needs. Hence, this article systematically reviews recent WCE video compression and summarization techniques. The review's objectives are as follows: first, to provide the details of the requirements, challenges and design precepts for the low-complexity WCE video compressor; second, to discuss the most recent compression methods, emphasizing simple distributed video coding methods; next, to review the most recent summarization techniques and the significance of using deep neural networks; further, to provide a quantitative analysis of the state-of-the-art methods along with their advantages and drawbacks; and finally, to discuss existing problems and possible future directions for building a robust WCE imaging framework. © 2022 Elsevier Ltd