Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506

Search Results

Now showing 1 - 4 of 4
  • Item
    An Image Transmission Technique using Low-Cost Li-Fi Testbed
    (Institute of Electrical and Electronics Engineers Inc., 2021) Salvi, S.; Geetha, V.; Maru, H.; Kumar, N.; Ahmed, R.
    Visible Light Communication (VLC), or Light Fidelity (Li-Fi), with Light Emitting Diodes (LEDs) as the transmitter and a light sensor as the receiver, can turn an existing lighting system into a communication system. Li-Fi-based data communication provides secure communication within the luminous coverage of the light source, and thus has several applications in places where Radio Frequency interference is not desirable. Like other wireless communication techniques, Li-Fi can be used for the transmission and reception of digital data; a Li-Fi system can also transfer images from one device to another. In this paper, a preliminary study is presented by proposing and implementing an encoding and decoding scheme for the transmission of binary images using Li-Fi. The proposed system is evaluated with respect to light intensity, distance, accuracy, image size, image resolution, and transmission time. © 2021 IEEE.
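    An encoding/decoding scheme of the kind the abstract describes could take the following shape. This is a minimal Python sketch, not the paper's actual scheme: the bitstream carries a small fixed-width header with the image dimensions followed by the pixel bits, and each bit would map to an LED on/off state on the transmitter side.

    ```python
    # Minimal sketch (not the paper's exact scheme) of serializing a binary
    # image into an on/off bitstream for LED transmission. A 16-bit width and
    # 16-bit height header lets the receiver reconstruct the image shape.

    def encode_image(pixels, width, height, header_bits=16):
        """Flatten a binary image: 16-bit width, 16-bit height, then pixel bits."""
        bits = []
        for value in (width, height):
            bits += [(value >> i) & 1 for i in range(header_bits - 1, -1, -1)]
        bits += [1 if p else 0 for row in pixels for p in row]
        return bits

    def decode_image(bits, header_bits=16):
        """Reverse of encode_image: parse the header, then reshape the pixel bits."""
        def read_int(offset):
            return int("".join(map(str, bits[offset:offset + header_bits])), 2)
        width, height = read_int(0), read_int(header_bits)
        body = bits[2 * header_bits:]
        rows = [body[r * width:(r + 1) * width] for r in range(height)]
        return rows, width, height
    ```

    On real hardware the bit list would additionally need clocking and thresholding at the light sensor; this sketch covers only the serialization round trip.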
  • Item
    Sketch-Based Image Retrieval Using Convolutional Neural Networks Based on Feature Adaptation and Relevance Feedback
    (Springer Science and Business Media Deutschland GmbH, 2022) Kumar, N.; Ahmed, R.; B Honnakasturi, V.; Kamath S., S.; Mayya, V.
    Sketch-based Image Retrieval (SBIR) is an approach in which natural images are retrieved according to a given input sketch query. SBIR has many applications, for example, searching for a product in a digital catalog given its sketch pattern, or searching for a missing person in a digital photo repository given their prominent features. The main challenge in implementing such a system is the absence of semantic information in the sketch query. In this work, we propose a combination of image preprocessing and deep learning-based methods to tackle this issue. A binary image highlighting the edges in the natural image is obtained using the Canny edge detection algorithm, and deep features are extracted with an ImageNet-pretrained CNN model. Cosine similarity and Euclidean distance measures are adopted to generate the ranked list of candidate natural images. Relevance feedback using Rocchio's method adapts the sketch query and feature weights according to the relevant and non-relevant images. In the experimental evaluation, the proposed approach achieved a mean average precision (MAP) of 71.84%, a promising performance in retrieving relevant images for input sketch queries. © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
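    The ranking and feedback loop described above can be sketched on plain feature vectors. This is an illustrative NumPy sketch, not the paper's implementation: `alpha`, `beta`, and `gamma` are the textbook Rocchio defaults, not values reported by the authors, and the gallery features stand in for the CNN embeddings.

    ```python
    import numpy as np

    def cosine_rank(query, gallery):
        """Rank gallery feature vectors by cosine similarity to the query."""
        q = query / np.linalg.norm(query)
        g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
        scores = g @ q
        return np.argsort(-scores), scores  # best-first index order

    def rocchio_update(query, relevant, non_relevant,
                       alpha=1.0, beta=0.75, gamma=0.15):
        """Rocchio's method: pull the query toward the centroid of relevant
        feature vectors and push it away from the non-relevant centroid."""
        new_q = alpha * query
        if len(relevant):
            new_q = new_q + beta * np.mean(relevant, axis=0)
        if len(non_relevant):
            new_q = new_q - gamma * np.mean(non_relevant, axis=0)
        return new_q
    ```

    After each round of user feedback, the updated query is re-ranked against the gallery, which is how the retrieval adapts toward the images the user marked as relevant.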
  • Item
    Deep Learning Framework Based on Audio–Visual Features for Video Summarization
    (Springer Science and Business Media Deutschland GmbH, 2022) Rhevanth, M.; Ahmed, R.; Shah, V.; Mohan, B.R.
    Video summarization (VS) techniques have garnered immense interest in recent years, with numerous applications across computer vision domains such as video extraction, image captioning, indexing, and browsing. Conventional VS studies often pursue algorithmic success by adding high-quality features and clusters to pick representative visual elements. However, many existing VS mechanisms consider only the visual aspect of the video input, ignoring the influence of audio features on the generated summary. To address this, we propose an efficient video summarization technique that processes both visual and audio content while extracting key frames from the raw video input. The structural similarity index is used to check the similarity between frames, while mel-frequency cepstral coefficients (MFCCs) are used to extract features from the corresponding audio signal. By combining these two features, the redundant frames of the video are removed. The resulting key frames are refined using a deep convolutional neural network (CNN) model to retrieve a list of candidate key frames that finally constitute the summary. The proposed system is evaluated on video datasets from YouTube containing events that aid in understanding the video summary. Experimental observations indicate that the inclusion of audio features and an efficient refinement technique, followed by an optimization function, yields better summaries than standard VS techniques. © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
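    The similarity-based redundant-frame removal could be sketched as follows. This is a simplified illustration, not the paper's pipeline: it uses a single-window (global) variant of SSIM rather than the standard windowed computation, ignores the audio/MFCC branch, and the 0.8 threshold is an assumed value.

    ```python
    import numpy as np

    def global_ssim(a, b, L=255.0):
        """Simplified SSIM computed over the whole grayscale frame at once
        (one global window), using the standard c1/c2 stabilizers."""
        c1, c2 = (0.01 * L) ** 2, (0.03 * L) ** 2
        mu_a, mu_b = a.mean(), b.mean()
        var_a, var_b = a.var(), b.var()
        cov = ((a - mu_a) * (b - mu_b)).mean()
        return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
            (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))

    def key_frames(frames, threshold=0.8):
        """Keep a frame only when it differs enough (SSIM below the threshold)
        from the most recently kept frame; near-duplicates are dropped."""
        kept = [0]
        for i in range(1, len(frames)):
            if global_ssim(frames[kept[-1]], frames[i]) < threshold:
                kept.append(i)
        return kept
    ```

    In the full approach the surviving frames would then be scored by audio features and refined by the CNN model before forming the summary.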
  • Item
    Continuous Sign Language Recognition Using Leap Motion Sensor
    (Institute of Electrical and Electronics Engineers Inc., 2024) Kumar, N.; Ahmed, R.; Venkatesh, B.H.; Salvi, S.; Panjwani, Y.
    Sign language is a vital communication tool that connects persons with hearing and speech impairments worldwide. It consists mostly of hand movements and facial gestures, whose meaning is conveyed by interpreting these gestures to form meaningful sentences. In this study, we use two machine learning models, Long Short-Term Memory (LSTM) and Support Vector Machines (SVM), to predict signs. A dataset of 42 unique sign words and 28 sentences was used to train and evaluate our models. The LSTM model achieved an accuracy of 90.35% for word prediction and 98.21% for sentence prediction, outperforming the SVM model, whose accuracies were 85.96% and 89.58% for words and sentences, respectively. By using depth sensors such as the Leap Motion device, our approach aims to enhance sign language recognition (SLR). © 2024 IEEE.
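    Feeding Leap Motion data to a sequence model like an LSTM requires batching variable-length gesture recordings into fixed-size tensors. The sketch below is an assumed preprocessing step, not taken from the paper: each recording is a list of per-frame feature vectors (e.g. palm and fingertip coordinates), padded to a common length with a mask marking the real time steps.

    ```python
    import numpy as np

    def pad_sequences(seqs, max_len=None):
        """Pad variable-length per-frame feature sequences to a fixed length
        for batching, returning the batch tensor and a boolean mask that
        marks the real (unpadded) time steps."""
        max_len = max_len or max(len(s) for s in seqs)
        feat_dim = len(seqs[0][0])
        batch = np.zeros((len(seqs), max_len, feat_dim))
        mask = np.zeros((len(seqs), max_len), dtype=bool)
        for i, s in enumerate(seqs):
            t = min(len(s), max_len)
            batch[i, :t] = np.asarray(s)[:t]
            mask[i, :t] = True
        return batch, mask
    ```

    The resulting `(batch, time, features)` tensor is the conventional input shape for an LSTM classifier, while an SVM baseline would typically consume a flattened or statistically pooled version of the same features.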