Faculty Publications

Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736

Publications by NITK Faculty

  • Item
    Deep learning-based multi-view 3D-human action recognition using skeleton and depth data
    (Springer, 2023) Ghosh, S.K.; Rashmi, M.; Mohan, B.R.; Guddeti, R.M.R.
    Human Action Recognition (HAR) is a fundamental challenge that smart surveillance systems must overcome. With the rising affordability of advanced depth cameras for capturing human actions, HAR has garnered increased interest over the years; however, the majority of these efforts have focused on single-view HAR. Recognizing human actions from arbitrary viewpoints is more challenging, as the same action is observed differently from different angles. This paper proposes a multi-stream Convolutional Neural Network (CNN) model for multi-view HAR using depth and skeleton data. We also propose a novel and efficient depth descriptor, Edge Detected-Motion History Image (ED-MHI), based on Canny edge detection and the Motion History Image. In addition, the proposed skeleton descriptor, Motion and Orientation of Joints (MOJ), represents an action through joint motion and orientation. Experimental results on two human action datasets, NUCLA Multiview Action3D and NTU RGB-D, using a cross-subject evaluation protocol show that the proposed system outperforms state-of-the-art works, achieving 93.87% and 85.61% accuracy, respectively. © 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
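    The ED-MHI descriptor combines per-frame edge detection with a motion history image. A minimal sketch of that idea follows; it is illustrative only, not the paper's implementation. The gradient-magnitude edge map stands in for Canny edge detection (which would typically come from `cv2.Canny`), and the `tau`/`delta` update parameters are assumptions.

    ```python
    import numpy as np

    def simple_edges(frame, thresh=0.2):
        # Gradient-magnitude edge map: a numpy-only stand-in for the
        # Canny edge detection the paper uses (cv2.Canny in practice).
        gy, gx = np.gradient(frame.astype(float))
        mag = np.hypot(gx, gy)
        if mag.max() == 0:
            return np.zeros(mag.shape, dtype=bool)
        return mag > thresh * mag.max()

    def ed_mhi(frames, tau=255, delta=32):
        # Edge Detected-Motion History Image (sketch): pixels whose edge
        # status changed between consecutive depth frames are set to tau;
        # all other pixels decay by delta, so recent motion stays bright.
        history = np.zeros(frames[0].shape, dtype=float)
        prev = simple_edges(frames[0])
        for frame in frames[1:]:
            cur = simple_edges(frame)
            moving = cur ^ prev  # edge pixels that appeared or vanished
            history = np.where(moving, float(tau), np.maximum(history - delta, 0.0))
            prev = cur
        return history
    ```

    The resulting single-channel image, one per depth sequence, could then be fed to a CNN stream alongside the skeleton-based MOJ features.
    
    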
  • Item
    Human action recognition using multi-stream attention-based deep networks with heterogeneous data from overlapping sub-actions
    (Springer Science and Business Media Deutschland GmbH, 2024) Rashmi, M.; Guddeti, R.M.R.
    Vision-based Human Action Recognition is difficult owing to variations in the same action performed by different people, temporal variations within actions, and differences in viewing angle. Researchers have recently adopted multi-modal visual data fusion strategies to address the limitations of single-modality methods. Because the success of most existing techniques rests on feature representation in the data modality under consideration, many researchers strive to produce more discriminative features. A human action consists of several sub-actions whose durations vary between individuals. This paper proposes a multifarious learning framework employing action data in depth and skeleton formats. First, a novel action representation named Multiple Sub-action Enhanced Depth Motion Map (MS-EDMM), which integrates depth features from overlapping sub-actions, is proposed. Second, an efficient method is introduced for extracting spatio-temporal features from skeleton data by dividing the skeleton sequence into sub-actions and summarizing skeleton joint information for five distinct human body regions. Next, a multi-stream deep learning model with an attention-guided CNN and a residual LSTM is proposed for classification, followed by several score fusion operations to reap the benefits of streams trained on multiple data types. The proposed method outperformed an existing method that utilized skeleton and depth data by 1.62%, achieving an accuracy of 89.76% on the single-view UTD-MHAD dataset. Furthermore, it demonstrated encouraging performance on the multi-view NTU RGB+D dataset, with accuracies of 89.75% in cross-view and 83.8% in cross-subject evaluations. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.
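    The "score fusion operations" step combines per-class scores from the separately trained streams. A minimal late-fusion sketch is shown below; the weighted-average and max fusion rules and the weights are illustrative assumptions, not the paper's exact operations.

    ```python
    import numpy as np

    def softmax(x):
        # Numerically stable softmax over a 1-D logit vector.
        e = np.exp(x - x.max())
        return e / e.sum()

    def fuse_scores(stream_logits, weights=None, method="average"):
        # Late score fusion (sketch): stream_logits is a list of
        # (num_classes,) logit vectors, one per stream (e.g. a depth
        # MS-EDMM stream and a skeleton stream). Returns the fused
        # predicted class index.
        probs = np.stack([softmax(s) for s in stream_logits])
        if method == "max":
            fused = probs.max(axis=0)          # element-wise max rule
        else:
            w = np.ones(len(probs)) if weights is None else np.asarray(weights, float)
            fused = (w[:, None] * probs).sum(axis=0) / w.sum()
        return int(np.argmax(fused))
    ```

    Weighting lets a more reliable stream dominate the decision; the max rule instead trusts whichever stream is most confident per class.
    
    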