Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506

Search Results

Now showing 1 - 7 of 7
  • Item
    COVID-19 Prediction Using Chest X-rays Images
    (Institute of Electrical and Electronics Engineers Inc., 2021) Kumar, A.; Sharma, N.; Naik, D.
    Understanding COVID-19 became very important since large-scale vaccination was not yet possible. The chest X-ray is the first imaging technique to play an important role in the diagnosis of COVID-19. Convolutional neural networks (CNNs) have achieved great success in image recognition and classification across many fields; however, due to the limited availability of annotated medical images, the classification of medical images remains the biggest challenge in medical diagnosis. The proposed research work performs transfer learning with deep learning models such as ResNet50 and VGG16 and compares their performance with a newly developed CNN-based model. ResNet50 and VGG16 are state-of-the-art models that have been used extensively, so a comparative analysis against them indicates how well our model performs. This research work also develops a CNN model, as CNNs are expected to perform well on image classification problems. The Kaggle radiography dataset is used for training, validation, and testing, along with another X-ray image dataset created from two different sources. The results show that the CNN model developed by us outperforms the VGG16 and ResNet50 models. © 2021 IEEE.
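The core of the transfer-learning setup described above is freezing a pretrained backbone (ResNet50 or VGG16) and training only a new classification head. A minimal NumPy sketch of that head-training step, using synthetic vectors as stand-ins for frozen backbone features (all sizes and class names here are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    # numerically stable row-wise softmax
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_head(features, labels, n_classes, lr=0.5, epochs=200):
    """Train a linear softmax head on frozen feature vectors
    (the only trainable part in this transfer-learning sketch)."""
    n, d = features.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(epochs):
        probs = softmax(features @ W + b)
        grad = probs - onehot            # dL/dlogits for cross-entropy
        W -= lr * features.T @ grad / n
        b -= lr * grad.mean(axis=0)
    return W, b

# Simulated "frozen backbone" features for 3 hypothetical classes
# (e.g. COVID / normal / other pneumonia)
centers = rng.normal(size=(3, 16))
labels = rng.integers(0, 3, size=300)
features = centers[labels] + 0.1 * rng.normal(size=(300, 16))

W, b = train_head(features, labels, n_classes=3)
preds = softmax(features @ W + b).argmax(axis=1)
accuracy = (preds == labels).mean()
```

In a real pipeline the feature vectors would come from the pretrained network's penultimate layer; only the head above is updated, which is what makes transfer learning feasible with limited annotated medical images.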
  • Item
    Deep Learning based detection of Diabetic Retinopathy from Inexpensive fundus imaging techniques
    (Institute of Electrical and Electronics Engineers Inc., 2021) Mukesh, B.R.; Harish, T.; Mayya, V.; Kamath S․, S.
    Diabetic Retinopathy is the leading cause of blindness worldwide, according to statistics published by the World Health Organization. Recently, there has been significant research on adopting deep learning methodologies to automate and improve the evaluation of the onset and progression of chronic eye diseases using eye fundus images. Typically, fundus imaging equipment is operated by trained specialists to evaluate eye health; however, such imaging tends to be expensive, and the high-end equipment used is typically available only in large hospitals and urban areas. This cost barrier leads to an imbalance in care between the developed and developing parts of the world. In this paper, we propose an inexpensive stand-in for such a device, along with a deep neural model pipeline that analyzes these images to determine the need for further evaluation by a trained ophthalmologist. The pipeline achieves an AUC score of 0.9781 in detecting referable DR. We also benchmark the proposed deep learning pipeline against other pipelines on standard datasets to demonstrate the capability of the network. © 2021 IEEE.
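The AUC metric the abstract reports can be computed directly from model scores via the Mann-Whitney pairwise formulation: the fraction of (positive, negative) example pairs the model ranks correctly. A small self-contained sketch (the data below is made up for illustration, not the paper's):

```python
import numpy as np

def roc_auc(y_true, y_score):
    """ROC AUC via pairwise ranking: fraction of (positive, negative)
    pairs where the positive example scores higher; ties count half."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

# toy labels (1 = referable DR) and classifier scores
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The O(n²) pair comparison is fine for small evaluation sets; production code would typically use a rank-based O(n log n) routine such as scikit-learn's `roc_auc_score`.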
  • Item
    Loss Optimised Video Captioning using Deep-LSTM, Attention Mechanism and Weighted Loss Metrices
    (Institute of Electrical and Electronics Engineers Inc., 2021) Yadav, N.; Naik, D.
    The aim of the video captioning task is to describe video content using multiple natural-language sentences. Videos carry photographic, graphical, and auditory data. Our goal is to investigate and recognize a video's visual features and to create a caption so that anyone can grasp the video's information within seconds. Although encoder-decoder models have made significant progress, they still need many improvements. In the present work, we enhance the top-down architecture using Bahdanau attention, Deep Long Short-Term Memory (Deep-LSTM), and a weighted loss function. VGG16 is used to extract features from the frames, and Deep-LSTM is paired with an attention mechanism to understand the actions in the video. We analysed the efficiency of our model on the MSVD dataset, which indicates a major improvement over other state-of-the-art models. © 2021 IEEE.
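The two ingredients named in the abstract, Bahdanau (additive) attention and a weighted loss, can both be sketched in a few lines of NumPy. This is an illustrative sketch, not the paper's implementation: the weight matrices are random and the weighting scheme (per-word weights) is an assumption about how a weighted caption loss is typically formed.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d_enc, d_dec, d_att = 6, 8, 8, 4   # toy sizes: frames, feature dims

def bahdanau_attention(enc_states, dec_state, W1, W2, v):
    """Additive attention: score each encoder (frame) state against the
    current decoder state, softmax over time, return the context vector."""
    scores = np.tanh(enc_states @ W1 + dec_state @ W2) @ v   # (T,)
    weights = np.exp(scores - scores.max())
    weights = weights / weights.sum()                        # softmax
    context = weights @ enc_states                           # (d_enc,)
    return context, weights

def weighted_cross_entropy(probs, target_ids, word_weights):
    """Cross-entropy with a per-vocabulary-word weight (an assumed form
    of the 'weighted loss': e.g. up-weighting rare content words)."""
    losses = -np.log(probs[np.arange(len(target_ids)), target_ids])
    w = word_weights[target_ids]
    return float((w * losses).sum() / w.sum())

enc = rng.normal(size=(T, d_enc))        # stand-in for VGG16 frame features
dec = rng.normal(size=d_dec)             # stand-in for a Deep-LSTM state
W1, W2 = rng.normal(size=(d_enc, d_att)), rng.normal(size=(d_dec, d_att))
v = rng.normal(size=d_att)
context, weights = bahdanau_attention(enc, dec, W1, W2, v)

uniform = np.full((3, 5), 0.2)           # toy decoder word distributions
loss = weighted_cross_entropy(uniform, np.array([0, 1, 2]), np.ones(5))
```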
  • Item
    Deployment of Computer Vision Application on Edge Platform
    (Institute of Electrical and Electronics Engineers Inc., 2021) Geetha, V.; Kiran, C.; Sharma, M.; Rakshith Kumar, J.
    In our work, we propose a low-cost device that helps visually impaired people understand their surroundings without requiring internet access. Current technology relies on cloud architectures and needs internet connectivity for this purpose, so such systems fail in areas with poor connectivity. We use an edge platform built on a Raspberry Pi powered by an Intel Neural Compute Stick. A multi-label image classification deep learning model is trained in the cloud and later optimised and deployed on the edge device, the Raspberry Pi. The setup also includes a PiCamera, which records video and feeds it as input to the deployed model. The model describes the items present in the video, essentially describing the surroundings. The output is audio played through speakers, enabling visually impaired people to understand their surroundings without internet access. The deployment of popular machine learning and deep learning models on the edge device is also examined, and a comprehensive performance evaluation is performed. © 2021 IEEE.
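The last stage of the pipeline above turns multi-label classifier outputs into a spoken description. A hypothetical sketch of that step: threshold the per-class sigmoid scores and build the sentence that would be handed to a text-to-speech engine (the class names, scores, and wording are illustrative, not from the paper):

```python
def describe_scene(class_names, sigmoid_scores, threshold=0.5):
    """Convert multi-label scores into a sentence for text-to-speech.
    A class is reported as present when its score meets the threshold."""
    present = [name for name, s in zip(class_names, sigmoid_scores)
               if s >= threshold]
    if not present:
        return "Nothing recognised in the surroundings."
    return "I can see: " + ", ".join(present) + "."

# toy per-frame scores from a deployed multi-label model
sentence = describe_scene(["person", "car", "dog"], [0.92, 0.18, 0.71])
```

On the device, `sentence` would then be passed to an offline text-to-speech library and played through the speakers, keeping the whole loop internet-free.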
  • Item
    Generating Short Video Description using Deep-LSTM and Attention Mechanism
    (Institute of Electrical and Electronics Engineers Inc., 2021) Yadav, N.; Naik, D.
    Nowadays, an extensive amount of data is produced from videos, because most people own video-capturing devices such as mobile phones and cameras. A video comprises photographic, textual, and auditory data. Our aim is to investigate and recognize the visual features of a video and to generate a caption so that users can get the video's information in an instant. Many technologies capture the static content of a frame, but for video captioning, dynamic information is more important than static information. In this work, we introduce an encoder-decoder architecture using Deep Long Short-Term Memory (Deep-LSTM) and Bahdanau attention. In the encoder, the convolutional neural network (CNN) VGG16 and Deep-LSTM are used to deduce information from frames, while Deep-LSTM combined with the attention mechanism describes the actions performed in the video. We evaluated the performance of our model on the MSVD dataset, which shows significant improvement compared to other video captioning models. © 2021 IEEE.
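At inference time, encoder-decoder captioners like the one described generate the caption one word at a time, feeding each predicted word back into the decoder. A minimal greedy-decoding sketch with a toy stand-in for the Deep-LSTM + attention step (the token ids and stub decoder are hypothetical, for illustration only):

```python
import numpy as np

def greedy_decode(step_fn, init_state, bos_id, eos_id, max_len=10):
    """Greedy caption decoding: take the argmax word at each step and
    feed it back until <eos> or max_len. step_fn stands in for one
    decoder (Deep-LSTM + attention) step: (token, state) -> (logits, state)."""
    state, token, out = init_state, bos_id, []
    for _ in range(max_len):
        logits, state = step_fn(token, state)
        token = int(np.argmax(logits))
        if token == eos_id:
            break
        out.append(token)
    return out

# toy decoder that deterministically emits word ids 3, 4, then <eos> (id 2)
script = [3, 4, 2]
def toy_step(token, state):
    logits = np.zeros(5)
    logits[script[state]] = 1.0
    return logits, state + 1

caption = greedy_decode(toy_step, init_state=0, bos_id=1, eos_id=2)
```

Real systems often replace the argmax with beam search, which keeps several candidate captions and usually yields better sentences.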
  • Item
    Multi-stream Multi-attention Deep Neural Network for Context-Aware Human Action Recognition
    (Institute of Electrical and Electronics Engineers Inc., 2022) Rashmi, M.; Guddeti, R.M.R.
    Technological innovations in deep learning models have enabled reasonably close solutions to a wide variety of computer vision tasks such as object detection, face recognition, and many more. On the other hand, Human Action Recognition (HAR) is still far from human-level ability due to several challenges, such as the diversity in how actions are performed. Because data is available in multiple modalities, HAR using video data recorded by RGB-D cameras is frequently used in current research. This paper proposes an approach for recognizing human actions using depth and skeleton data captured by the Kinect depth sensor. Attention modules have been introduced in recent years to help focus on the most important features in computer vision tasks. This paper proposes a multi-stream deep learning model with multiple attention blocks for HAR. First, the action data of the depth and skeletal modalities are represented using two distinct action descriptors, each of which generates an image from action data gathered across numerous frames. The proposed deep learning model is trained using these descriptors. Additionally, we propose a set of score fusion techniques for accurate HAR using all the features and trained CNN + LSTM streams. The proposed method is evaluated on two benchmark datasets using the well-known cross-subject evaluation protocol. The proposed technique achieved 89.83% and 90.7% accuracy on the MSRAction3D and UTD-MHAD datasets, respectively. The experimental results establish the validity and effectiveness of the proposed model. © 2022 IEEE.
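The late score fusion mentioned above combines the per-class score vectors produced by the separate modality streams. A minimal sketch of two common fusion rules, weighted sum and element-wise max (the paper's exact fusion set and weights may differ; the numbers below are illustrative):

```python
import numpy as np

def fuse_scores(stream_scores, weights=None, method="weighted_sum"):
    """Late fusion of per-stream class scores (one softmax vector per
    modality stream). Returns the fused label and the fused scores."""
    s = np.vstack([np.asarray(x, float) for x in stream_scores])
    if method == "max":
        fused = s.max(axis=0)            # element-wise max across streams
    else:
        w = np.ones(len(s)) if weights is None else np.asarray(weights, float)
        fused = (w[:, None] * s).sum(axis=0) / w.sum()
    return int(np.argmax(fused)), fused

# toy softmax outputs from a depth stream and a skeleton stream
label, fused = fuse_scores([[0.7, 0.2, 0.1],
                            [0.3, 0.6, 0.1]], weights=[1, 2])
```

Here the up-weighted second stream flips the decision to class 1, showing why per-stream weights matter when modalities differ in reliability.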
  • Item
    Comparative Analysis of Machine Learning Algorithms for Disease Detection in Apple Leaves
    (Institute of Electrical and Electronics Engineers Inc., 2022) Sai, A.M.; Patil, N.
    Leaves serve as unique indicators for distinguishing diseased plants, because a leaf's image information changes when the plant is suffering from disease. To detect these diseases, we need to recognize the patterns the diseases form on the leaves. Generally, plants are observed with the naked eye by experts or farmers to detect and identify diseases. But this method can be expensive and time-consuming; therefore, it is essential to automate crop disease diagnosis in regions with few experts. This work develops a plant disease detection model based on apple leaves. The proposed methodology uses three feature extraction techniques: Hu Moments, Haralick Texture, and Color Histogram. The research work provides a comparative analysis of machine learning models for detecting diseases in apple leaves, namely Black Rot, Cedar Apple Rust, and Apple Scab. The models are evaluated on the subset of the 'Plant Village Dataset' dealing with apple leaves. Of all the machine learning models fitted, Random Forest obtained the highest test accuracy of 98.125 percent. © 2022 IEEE.
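Of the three hand-crafted descriptors named above, the color histogram is the simplest to sketch: bin each channel's intensities, normalise, and concatenate into one global feature vector that a classifier such as Random Forest can consume (bin count and image here are illustrative; Hu Moments and Haralick texture would be concatenated alongside in the full pipeline):

```python
import numpy as np

def color_histogram(image, bins=8):
    """Per-channel color histogram, normalised and concatenated into a
    single global feature vector (one of the paper's three descriptors)."""
    feats = []
    for c in range(image.shape[2]):
        hist, _ = np.histogram(image[:, :, c], bins=bins, range=(0, 256))
        feats.append(hist / hist.sum())   # normalise so scale is image-size free
    return np.concatenate(feats)

# toy RGB leaf image; a real pipeline would load and resize a photo
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
feature = color_histogram(img)            # length = 3 channels x 8 bins
```

Normalising each channel's histogram makes the feature invariant to image resolution, so leaves photographed at different sizes map into the same feature space.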