Faculty Publications

Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736

Publications by NITK Faculty

Search Results

Now showing 1 - 9 of 9

Skeleton based Human Action Recognition for Smart City Application using Deep Learning
(Institute of Electrical and Electronics Engineers Inc., 2020) Rashmi, M.; Guddeti, R.M.R.
These days the Human Action Recognition (HAR) is playing a vital role in several applications such as surveillance systems, gaming, robotics, and so on. Interpreting the actions performed by a person from the video is one of the essential tasks of intelligent surveillance systems in the smart city, smart building, etc. Human action can be recognized either by using models such as depth, skeleton, or combinations of these models. In this paper, we propose the human action recognition system based on the 3D skeleton model. Since the role of different joints varies while performing the action, in the proposed work, we use the most informative distance and the angle between joints in the skeleton model as a feature set. Further, we propose a deep learning framework for human action recognition based on these features. We performed experiments using MSRAction3D, a publicly available dataset for 3D HAR, and the results demonstrated that the proposed framework obtained the accuracies of 95.83%, 92.9%, and 98.63% on three subsets of the dataset AS1, AS2, and AS3, respectively, using the protocols of [19]. © 2020 IEEE.
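As an illustration of the feature set described in this abstract, the following minimal sketch computes a Euclidean distance between two 3D joints and the angle at a joint. It is not the authors' implementation: the joint names, the chosen pairs and triples, and the helper `frame_features` are hypothetical, and the abstract does not say which joints are the "most informative".

```python
import math


def joint_distance(a, b):
    """Euclidean distance between two 3D joints given as (x, y, z)."""
    return math.dist(a, b)


def joint_angle(a, b, c):
    """Angle in degrees at joint b, between segments b->a and b->c."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0.0:
        return 0.0  # degenerate case: two joints coincide
    cos = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cos))


def frame_features(joints, distance_pairs, angle_triples):
    """Build one feature vector for a frame.

    joints: dict mapping a (hypothetical) joint name to (x, y, z).
    distance_pairs / angle_triples: assumed selections of joints.
    """
    feats = [joint_distance(joints[p], joints[q]) for p, q in distance_pairs]
    feats += [joint_angle(joints[p], joints[q], joints[r])
              for p, q, r in angle_triples]
    return feats


# Hypothetical single-frame skeleton with three joints.
frame = {
    "shoulder": (0.0, 1.0, 0.0),
    "elbow": (0.0, 0.0, 0.0),
    "wrist": (1.0, 0.0, 0.0),
}
features = frame_features(
    frame,
    distance_pairs=[("shoulder", "wrist")],
    angle_triples=[("shoulder", "elbow", "wrist")],
)
```

Per-frame vectors like `features` would then be stacked over time before being passed to a classifier; how the paper does this is not stated in the abstract.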
Skeleton-Based Human Action Recognition Using Motion and Orientation of Joints
(Springer Science and Business Media Deutschland GmbH, 2022) Ghosh, S.K.; Rashmi, M.; Mohan, B.R.; Guddeti, R.M.R.
Perceiving human actions accurately from a video is one of the most challenging tasks demanded by many real-time applications in smart environments. Recently, several approaches have been proposed for human action representation and further recognizing actions from the videos using different data modalities. Especially in the case of images, deep learning-based approaches have demonstrated their classification efficiency. Here, we propose an effective framework for representing actions based on features obtained from 3D skeleton data of humans performing actions. We utilized motion, pose orientation, and transition orientation of skeleton joints for action representation in the proposed work. In addition, we introduced a lightweight convolutional neural network model for learning features from action representations in order to recognize the different actions. We evaluated the proposed system on two publicly available datasets using a cross-subject evaluation protocol, and the results showed better performance compared to the existing methods. © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Vision-based Hand Gesture Interface for Real-time Computer Operation Control
(Institute of Electrical and Electronics Engineers Inc., 2022) Praneeth, G.; Recharla, R.; Prakash, A.S.; Rashmi, M.; Guddeti, R.M.R.
Humans typically perform simple actions with hand gestures. If a computer interprets gestures, then human-computer interaction can be enhanced. This paper proposes a hand gesture-based interface for controlling computer operations using deep learning and a custom dataset. © 2022 IEEE.
Fall Detection and Elderly Monitoring System Using the CNN
(Springer Science and Business Media Deutschland GmbH, 2023) Reddy Anakala, V.M.; Rashmi, M.; Natesha, B.V.; Reddy Guddeti, R.M.
Fall detection has become a critical concern in the medical and healthcare fields due to the growing population of the elderly people. The research on fall and movement detection using wearable devices has made strides. Accurately recognizing the fall behavior in surveillance video and providing the early feedback can significantly minimize the fall-related injury and death of elderly people. However, the fall event is highly dynamic, impairing categorization accuracy. The current study sought to construct a fall detection architecture based on deep learning to predict falls and the Activities of Daily Living (ADLs). This paper proposes an efficient method for representing extracted features as RGB images and a CNN model for learning the features needed for accurate fall detection. Additionally, the proposed CNN model is used to test for and locate the target in video using threshold-based categorization. The suggested CNN model was evaluated on the SisFall dataset and was found to be capable of detecting falls prior to impact with a sensitivity of 100%, a specificity of 96.48%, and a response time of 223ms. The experimental findings attained an overall accuracy of 97.43%. © 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
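For readers unfamiliar with the metrics quoted in this abstract, the sketch below shows how sensitivity, specificity, and accuracy follow from confusion-matrix counts. The function name and the counts in the usage example are hypothetical; they are not the paper's data.

```python
def fall_detection_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, and accuracy from counts.

    tp: falls detected as falls       fn: falls missed
    tn: ADLs classified as ADLs       fp: ADLs flagged as falls
    """
    total = tp + fn + tn + fp
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return sensitivity, specificity, accuracy


# Hypothetical counts chosen for illustration only.
sens, spec, acc = fall_detection_metrics(tp=50, fn=0, tn=45, fp=5)
```

A sensitivity of 100% means no fall in the test data was missed (`fn == 0`), while a specificity below 100% means some daily activities were flagged as falls.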
Surveillance video analysis for student action recognition and localization inside computer laboratories of a smart campus
(Springer, 2021) Rashmi, M.; Ashwin, T.S.; Guddeti, G.R.M.
In the era of the smart campus, unobtrusive monitoring of students is a challenging task. The monitoring system must have the ability to recognize and detect the actions performed by the students. Recently, many deep neural network based approaches have been proposed to automate Human Action Recognition (HAR) in different domains, but these have not been explored in learning environments. HAR can be used in classrooms, laboratories, and libraries to make the teaching-learning process more effective. To make the learning process more effective in computer laboratories, in this study, we proposed a system for recognition and localization of student actions from still images extracted from Closed Circuit Television (CCTV) videos. The proposed method uses You Only Look Once (YOLOv3), a state-of-the-art real-time object detection technology, for localization and recognition of students’ actions. Further, the image template matching method is used to decrease the number of image frames and thus process the video quickly. As actions performed by humans are domain specific and no standard dataset is available for students’ action recognition in smart computer laboratories, we created the STUDENT ACTION dataset using the image frames obtained from the CCTV cameras placed in the computer laboratory of a university campus. The proposed method recognizes various actions performed by students in different locations within an image frame. It shows excellent performance in identifying actions with more samples compared to actions with fewer samples. © 2020, Springer Science+Business Media, LLC, part of Springer Nature.
Human identification system using 3D skeleton-based gait features and LSTM model
(Academic Press Inc., 2022) Rashmi, M.; Guddeti, R.M.R.
Vision-based gait emerged as the preferred biometric in smart surveillance systems due to its unobtrusive nature. Recent advancements in low-cost depth sensors resulted in numerous 3D skeleton-based gait analysis techniques. For spatial–temporal analysis, existing state-of-the-art algorithms use frame-level information as the timestamp. This paper proposes gait event-level spatial–temporal features and LSTM-based deep learning model that treats each gait event as a timestamp to identify individuals from walking patterns observed in single and multi-view scenarios. On four publicly available datasets, the proposed system stands superior to state-of-the-art approaches utilizing a variety of conventional benchmark protocols. The proposed system achieved a recognition rate of greater than 99% in low-level ranks during the CMC test, making it suitable for practical applications. The statistical study of gait event-level features demonstrated retrieved features’ discriminating capacity in classification. Additionally, the ANOVA test performed on findings from K folds demonstrated the proposed system's significance in human identification. © 2021 Elsevier Inc.
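The CMC (Cumulative Match Characteristic) test mentioned in this abstract reports, for each rank k, the fraction of probe samples whose true identity appears among the top k ranked candidates. The sketch below is a generic illustration of that definition, not the authors' evaluation code; the function name and the toy identities are hypothetical.

```python
def cmc_curve(ranked_ids_per_probe, true_ids, max_rank):
    """Return [rate at rank 1, ..., rate at rank max_rank].

    ranked_ids_per_probe: for each probe, gallery identities sorted
        from best to worst match.
    true_ids: the correct identity for each probe.
    """
    if not true_ids:
        raise ValueError("at least one probe is required")
    hits = [0] * max_rank
    for ranked, true_id in zip(ranked_ids_per_probe, true_ids):
        top = ranked[:max_rank]
        if true_id in top:
            first = top.index(true_id)
            # A match at position `first` counts for every rank >= first + 1.
            for k in range(first, max_rank):
                hits[k] += 1
    return [h / len(true_ids) for h in hits]


# Hypothetical rankings for three probes that all belong to identity "a".
rankings = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
curve = cmc_curve(rankings, ["a", "a", "a"], max_rank=3)
```

The curve is non-decreasing in k, so a high rate at low ranks indicates that the correct identity is usually among the first few candidates.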
Deep learning-based multi-view 3D-human action recognition using skeleton and depth data
(Springer, 2023) Ghosh, S.K.; Rashmi, M.; Mohan, B.R.; Guddeti, R.M.R.
Human Action Recognition (HAR) is a fundamental challenge that smart surveillance systems must overcome. With the rising affordability of capturing human actions with more advanced depth cameras, HAR has garnered increased interest over the years; however, the majority of these efforts have focused on single-view HAR. Recognizing human actions from arbitrary viewpoints is more challenging, as the same action is observed differently from different angles. This paper proposes a multi-stream Convolutional Neural Network (CNN) model for multi-view HAR using depth and skeleton data. We also propose a novel and efficient depth descriptor, Edge Detected-Motion History Image (ED-MHI), based on Canny Edge Detection and Motion History Image. Also, the proposed skeleton descriptor, Motion and Orientation of Joints (MOJ), represents the appropriate action by using joint motion and orientation. Experimental results on two datasets of human actions, NUCLA Multiview Action3D and NTU RGB-D, using a cross-subject evaluation protocol demonstrated that the proposed system exhibits superior performance compared to the state-of-the-art works, with 93.87% and 85.61% accuracy, respectively. © 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
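The ED-MHI descriptor in this abstract builds on the Motion History Image. The sketch below shows only the standard MHI update rule (recent motion is bright, older motion decays), using plain nested lists as grayscale frames. It omits the Canny edge detection step and is not the authors' ED-MHI; the function names, `tau`, and `diff_threshold` are assumptions made for illustration.

```python
def update_mhi(mhi, prev_frame, curr_frame, tau, diff_threshold):
    """Apply one MHI step: moving pixels are set to tau, others decay by 1."""
    updated = []
    for r, row in enumerate(curr_frame):
        new_row = []
        for c, value in enumerate(row):
            moved = abs(value - prev_frame[r][c]) >= diff_threshold
            new_row.append(tau if moved else max(0, mhi[r][c] - 1))
        updated.append(new_row)
    return updated


def motion_history_image(frames, tau=None, diff_threshold=30):
    """Accumulate an MHI over a list of equally sized grayscale frames."""
    if len(frames) < 2:
        raise ValueError("at least two frames are required")
    if tau is None:
        tau = len(frames) - 1  # assumed default: one unit per frame step
    mhi = [[0] * len(frames[0][0]) for _ in frames[0]]
    for prev_frame, curr_frame in zip(frames, frames[1:]):
        mhi = update_mhi(mhi, prev_frame, curr_frame, tau, diff_threshold)
    return mhi


# Hypothetical 1x3 frames: pixel 0 changes first, pixel 1 changes later.
toy_frames = [[[0, 0, 0]], [[100, 0, 0]], [[100, 100, 0]]]
toy_mhi = motion_history_image(toy_frames)
```

In `toy_mhi`, the pixel that moved most recently holds the largest value, which is how a single image can encode the temporal order of motion.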
Exploiting skeleton-based gait events with attention-guided residual deep learning model for human identification
(Springer, 2023) Rashmi, M.; Guddeti, R.M.R.
Human identification using unobtrusive visual features is a daunting task in smart environments. Gait is among adequate biometric features when the camera cannot correctly capture the human face due to environmental factors. In recent years, gait-based human identification using skeleton data has been intensively studied using a variety of feature extractors and more sophisticated deep learning models. Although skeleton data is susceptible to changes in covariate variables, resulting in noisy data, most existing algorithms employ a single feature extraction technique for all frames to generate frame-level feature maps. This results in degraded performance and additional features, necessitating increased computing power. This paper proposes a robust feature extractor that extracts a quantitative summary of gait event-specific information, thereby reducing the total number of features throughout the gait cycle. In addition, a novel Attention-guided LSTM-based deep learning model with residual connections is proposed to learn the extracted features for gait recognition. The proposed approach outperforms the state-of-the-art works on five publicly available datasets on various benchmark evaluation protocols and metrics. Further, the CMC test revealed that the proposed model obtained higher than 97% Accuracy in lower-level ranks on these datasets. © 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
Human action recognition using multi-stream attention-based deep networks with heterogeneous data from overlapping sub-actions
(Springer Science and Business Media Deutschland GmbH, 2024) Rashmi, M.; Guddeti, R.M.R.
Vision-based Human Action Recognition is difficult owing to the variations in the same action performed by various people, the temporal variations in actions, and the difference in viewing angles. Researchers have recently adopted multi-modal visual data fusion strategies to address the limitations of single-modality methodologies. Many researchers strive to produce more discriminative features because most existing techniques’ success relies on feature representation in the data modality under consideration. Human action consists of several sub-actions whose durations vary between individuals. This paper proposes a multifarious learning framework employing action data in depth and skeleton formats. Firstly, a novel action representation named Multiple Sub-action Enhanced Depth Motion Map (MS-EDMM), integrating depth features from overlapping sub-actions, is proposed. Secondly, an efficient method is introduced for extracting spatio-temporal features from skeleton data. This is achieved by dividing the skeleton sequence into sub-actions and summarizing skeleton joint information for five distinct human body regions. Next, a multi-stream deep learning model with Attention-guided CNN and residual LSTM is proposed for classification, followed by several score fusion operations to reap the benefits of streams trained with multiple data types. The proposed method outperformed an existing method that utilized skeleton and depth data by 1.62%, achieving an accuracy of 89.76% on the single-view UTD-MHAD dataset. Furthermore, on the multi-view NTU RGB+D dataset, it demonstrated encouraging performance with an accuracy of 89.75% in cross-view and 83.8% in cross-subject evaluations. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.
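The abstract mentions "several score fusion operations" without naming them. As a hedged illustration, the sketch below applies three common late-fusion rules (average, product, maximum) to per-class scores from separate streams. Which rules the paper uses is not stated here, and the function names and scores are hypothetical.

```python
import math


def fuse_scores(stream_scores, mode="average"):
    """Fuse per-class scores from several streams into one score list.

    stream_scores: one list of class scores per stream, all equal length.
    mode: "average", "product", or "max" (assumed fusion rules).
    """
    fused = []
    for class_scores in zip(*stream_scores):
        if mode == "average":
            fused.append(sum(class_scores) / len(class_scores))
        elif mode == "product":
            fused.append(math.prod(class_scores))
        elif mode == "max":
            fused.append(max(class_scores))
        else:
            raise ValueError(f"unknown fusion mode: {mode}")
    return fused


def predict(stream_scores, mode="average"):
    """Return the index of the class with the highest fused score."""
    fused = fuse_scores(stream_scores, mode)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical softmax outputs of a depth stream and a skeleton stream.
depth_scores = [0.6, 0.3, 0.1]
skeleton_scores = [0.1, 0.7, 0.2]
label = predict([depth_scores, skeleton_scores], mode="average")
```

Here the depth stream alone would pick class 0, but the fused scores favor class 1, which shows how a second modality can overturn a single-stream decision.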
