Faculty Publications
Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736
Publications by NITK Faculty
Browse
11 results
Search Results
Item: Skeleton based Human Action Recognition for Smart City Application using Deep Learning (Institute of Electrical and Electronics Engineers Inc., 2020) Rashmi, M.; Guddeti, R.M.R.
These days, Human Action Recognition (HAR) plays a vital role in several applications such as surveillance systems, gaming, robotics, and so on. Interpreting the actions performed by a person in a video is one of the essential tasks of intelligent surveillance systems in smart cities, smart buildings, etc. Human action can be recognized using models such as depth, skeleton, or combinations of these models. In this paper, we propose a human action recognition system based on the 3D skeleton model. Since the roles of different joints vary while performing an action, the proposed work uses the most informative distances and angles between joints in the skeleton model as the feature set. Further, we propose a deep learning framework for human action recognition based on these features. We performed experiments using MSRAction3D, a publicly available dataset for 3D HAR, and the results demonstrated that the proposed framework obtained accuracies of 95.83%, 92.9%, and 98.63% on the three subsets of the dataset, AS1, AS2, and AS3, respectively, using the protocols of [19]. © 2020 IEEE.

Item: Skeleton-Based Human Action Recognition Using Motion and Orientation of Joints (Springer Science and Business Media Deutschland GmbH, 2022) Ghosh, S.K.; Rashmi, M.; Mohan, B.R.; Guddeti, R.M.R.
Perceiving human actions accurately from a video is one of the most challenging tasks demanded by many real-time applications in smart environments. Recently, several approaches have been proposed for representing human actions and recognizing them from videos using different data modalities. Especially in the case of images, deep learning-based approaches have demonstrated their classification efficiency.
Here, we propose an effective framework for representing actions based on features obtained from 3D skeleton data of humans performing actions. In the proposed work, we utilize the motion, pose orientation, and transition orientation of skeleton joints for action representation. In addition, we introduce a lightweight convolutional neural network model for learning features from the action representations in order to recognize the different actions. We evaluated the proposed system on two publicly available datasets using a cross-subject evaluation protocol, and the results showed better performance compared to existing methods. © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Item: Vision-based Hand Gesture Interface for Real-time Computer Operation Control (Institute of Electrical and Electronics Engineers Inc., 2022) Praneeth, G.; Recharla, R.; Prakash, A.S.; Rashmi, M.; Guddeti, R.M.R.
Humans typically perform simple actions with hand gestures. If a computer can interpret gestures, then human-computer interaction can be enhanced. This paper proposes a hand gesture-based interface for controlling computer operations using deep learning and a custom dataset. © 2022 IEEE.

Item: Students' affective content analysis in smart classroom environment using deep learning techniques (Springer New York LLC, 2019) Gupta, S.K.; Ashwin, T.S.; Guddeti, R.M.R.
In the era of the smart classroom environment, students' affective content analysis plays a vital role, as it helps to foster the affective states that are beneficial to learning. Some techniques aim to improve the learning rate using students' affective content analysis in the classroom. In this paper, a novel max-margin face detection based method for students' affective content analysis using their facial expressions is proposed.
The affective content analysis includes analyzing four different moods of students, namely: High Positive Affect, Low Positive Affect, High Negative Affect, and Low Negative Affect. Engagement scores are calculated based on the four moods of students as predicted by the proposed method. Further, classroom engagement analysis is performed by considering the entire classroom as one group and computing the corresponding group engagement score. Expert feedback and the analyzed affect content videos are provided as feedback to the faculty member to improve the teaching strategy and hence the students' learning rate. The proposed smart classroom system was tested on more than 100 students of four different Information Technology courses and the corresponding faculty members at the National Institute of Technology Karnataka Surathkal, Mangalore, India. The experimental results demonstrate train and test accuracies of 90.67% and 87.65%, respectively, for mood classification. Furthermore, an analysis was performed of the incidence, distribution, and temporal dynamics of students' affective states, and promising results were obtained. © 2019, Springer Science+Business Media, LLC, part of Springer Nature.

Item: Affective database for e-learning and classroom environments using Indian students' faces, hand gestures and body postures (Elsevier B.V., 2020) Ashwin, T.S.; Guddeti, R.M.R.
Automatic recognition of students' affective states is a challenging task. These affective states are recognized using their facial expressions, hand gestures, and body postures. An intelligent tutoring system and a smart classroom environment can be made more personalized using students' affective state analysis, which is performed using machine or deep learning techniques. Effective recognition of affective states depends mainly on the quality of the database used.
However, very few standard databases exist for students' affective state recognition and analysis that work for both e-learning and classroom environments. In this paper, we propose a new affective database for both the e-learning and classroom environments using students' facial expressions, hand gestures, and body postures. The database consists of both posed (acted) and spontaneous (natural) expressions, with single and multiple persons in a single image frame, and contains more than 4000 manually annotated image frames with object localization. The classification was done manually using a gold standard study for both Ekman's basic emotions and learning-centered emotions, including neutral. The annotators reliably agree when discriminating among the recognized affective states, with Cohen's κ = 0.48. The created database is more robust as it considers various image variants such as occlusion, background clutter, pose, illumination, cultural & regional background, intra-class variations, cropped images, multipoint view, and deformations. Further, we analyzed the classification accuracy of our database using a few state-of-the-art machine and deep learning techniques. Experimental results demonstrate that the convolutional neural network based architecture achieved accuracies of 83% and 76% for detection and classification, respectively. © 2020 Elsevier B.V.

Item: Impact of inquiry interventions on students in e-learning and classroom environments using affective computing framework (Springer Science and Business Media B.V., 2020) Ashwin, T.S.; Guddeti, R.M.R.
Effective teaching strategies improve students' learning rate within academic learning time. Inquiry-based instruction is one of the effective teaching strategies used in classrooms. However, these teaching strategies have not been adapted to other learning environments such as intelligent tutoring systems, including auto tutors.
In this paper, we propose an automatic inquiry-based instruction teaching strategy, i.e., inquiry intervention using students' affective states. The proposed model contains two modules: the first module consists of the proposed framework for unobtrusive multi-modal prediction of students' affective states (teacher-centric attentive and in-attentive states) using facial expressions, hand gestures, and body postures. The second module consists of the proposed automated inquiry-based instruction teaching strategy to compare the learning outcomes with and without inquiry intervention using affective state transitions, for both an individual student and a group of students. The proposed system is tested in four different learning environments, namely: e-learning, flipped classroom, classroom, and webinar environments. Unobtrusive recognition of students' affective states is performed using deep learning architectures. After student-independent tenfold cross-validation, we obtained a students' affective state classification accuracy of 77% and an object localization accuracy of 81% using students' faces, hand gestures, and body postures. The overall experimental results demonstrate a positive correlation of r = 0.74 between students' affective states and their performance. The proposed inquiry intervention improved students' performance, with decreases of 65%, 43%, 43%, and 53% in overall in-attentive affective state instances using the inquiry interventions in the e-learning, flipped classroom, classroom, and webinar environments, respectively. © 2020, Springer Nature B.V.

Item: Multimodal behavior analysis in computer-enabled laboratories using nonverbal cues (Springer Science and Business Media Deutschland GmbH, 2020) Banerjee, S.; Ashwin, T.S.; Guddeti, R.M.R.
In the modern era, there is a growing need for surveillance to ensure the safety and security of people.
Real-time object detection is crucial for many applications such as traffic monitoring, security, search and rescue, vehicle counting, and classroom monitoring. Computer-enabled laboratories in a smart campus are generally equipped with video surveillance cameras. However, the existing literature shows that video surveillance data obtained from a smart campus is seldom used for unobtrusive behavioral analysis. Though there are several works on students' and teachers' behavior recognition using devices such as Kinect and handheld cameras, no existing work extracts video surveillance data and predicts the behavioral patterns of both students and teachers in real time. Hence, in this study, we unobtrusively analyze students' and teachers' behavioral patterns inside a teaching laboratory (considered an indoor scenario of a smart campus). We propose a deep convolution network architecture to classify and recognize objects in the indoor scenario, i.e., the teaching laboratory environment of the smart campus, with a modified Single-Shot MultiBox Detector approach. We used six different class labels for predicting the behavioral patterns of both students and teachers, and we created our own dataset with these six class labels for training the deep learning architecture. The performance evaluation demonstrates that the proposed method performs better, with an accuracy of 0.765 for classification and localization. © 2020, Springer-Verlag London Ltd., part of Springer Nature.

Item: Human identification system using 3D skeleton-based gait features and LSTM model (Academic Press Inc., 2022) Rashmi, M.; Guddeti, R.M.R.
Vision-based gait has emerged as the preferred biometric in smart surveillance systems due to its unobtrusive nature. Recent advancements in low-cost depth sensors have resulted in numerous 3D skeleton-based gait analysis techniques.
For spatial–temporal analysis, existing state-of-the-art algorithms use frame-level information as the timestamp. This paper proposes gait event-level spatial–temporal features and an LSTM-based deep learning model that treats each gait event as a timestamp to identify individuals from walking patterns observed in single- and multi-view scenarios. On four publicly available datasets, the proposed system is superior to state-of-the-art approaches under a variety of conventional benchmark protocols. The proposed system achieved a recognition rate greater than 99% at low ranks in the CMC test, making it suitable for practical applications. A statistical study of the gait event-level features demonstrated the discriminating capacity of the retrieved features in classification. Additionally, an ANOVA test performed on the findings from the K folds demonstrated the proposed system's significance for human identification. © 2021 Elsevier Inc.

Item: Deep learning-based multi-view 3D-human action recognition using skeleton and depth data (Springer, 2023) Ghosh, S.K.; Rashmi, M.; Mohan, B.R.; Guddeti, R.M.R.
Human Action Recognition (HAR) is a fundamental challenge that smart surveillance systems must overcome. With the rising affordability of capturing human actions with more advanced depth cameras, HAR has garnered increased interest over the years; however, the majority of these efforts have focused on single-view HAR. Recognizing human actions from arbitrary viewpoints is more challenging, as the same action is observed differently from different angles. This paper proposes a multi-stream Convolutional Neural Network (CNN) model for multi-view HAR using depth and skeleton data. We also propose a novel and efficient depth descriptor, Edge Detected-Motion History Image (ED-MHI), based on Canny edge detection and the Motion History Image. Additionally, the proposed skeleton descriptor, Motion and Orientation of Joints (MOJ), represents the action by using joint motion and orientation.
Experimental results on two human action datasets, NUCLA Multiview Action3D and NTU RGB-D, using a cross-subject evaluation protocol demonstrated that the proposed system exhibits superior performance compared to state-of-the-art works, with 93.87% and 85.61% accuracy, respectively. © 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.

Item: Exploiting skeleton-based gait events with attention-guided residual deep learning model for human identification (Springer, 2023) Rashmi, M.; Guddeti, R.M.R.
Human identification using unobtrusive visual features is a daunting task in smart environments. Gait is an adequate biometric feature when the camera cannot correctly capture the human face due to environmental factors. In recent years, gait-based human identification using skeleton data has been intensively studied using a variety of feature extractors and increasingly sophisticated deep learning models. Although skeleton data is susceptible to changes in covariate variables, resulting in noisy data, most existing algorithms employ a single feature extraction technique for all frames to generate frame-level feature maps. This results in degraded performance and additional features, necessitating increased computing power. This paper proposes a robust feature extractor that extracts a quantitative summary of gait event-specific information, thereby reducing the total number of features throughout the gait cycle. In addition, a novel attention-guided LSTM-based deep learning model with residual connections is proposed to learn the extracted features for gait recognition. The proposed approach outperforms state-of-the-art works on five publicly available datasets across various benchmark evaluation protocols and metrics. Further, the CMC test revealed that the proposed model obtained higher than 97% accuracy at lower-level ranks on these datasets.
© 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
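Several of the skeleton-based items above describe building feature vectors from inter-joint distances and angles (the first item states this explicitly). The following is a minimal illustrative sketch of that general idea, not any author's exact pipeline: the joint coordinates, joint pairs, and joint triples below are hypothetical placeholders.

```python
import math

def joint_distance(a, b):
    """Euclidean distance between two 3D joint positions."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def joint_angle(a, b, c):
    """Angle (radians) at joint b, formed by segments b->a and b->c."""
    v1 = [x - y for x, y in zip(a, b)]
    v2 = [x - y for x, y in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    # Clamp to guard against floating-point drift outside acos's domain.
    return math.acos(max(-1.0, min(1.0, dot / (n1 * n2))))

def frame_features(joints, pairs, triples):
    """Concatenate selected inter-joint distances and joint angles
    into one feature vector for a single skeleton frame."""
    feats = [joint_distance(joints[i], joints[j]) for i, j in pairs]
    feats += [joint_angle(joints[i], joints[j], joints[k]) for i, j, k in triples]
    return feats

# Toy 3-joint "arm" (hypothetical coordinates): shoulder, elbow, wrist.
joints = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
feats = frame_features(joints, pairs=[(0, 1), (1, 2)], triples=[(0, 1, 2)])
# Two unit-length segments meeting at a right angle at the elbow:
# feats -> [1.0, 1.0, pi/2]
```

Per-frame vectors like this would then be stacked over time and fed to a sequence or convolutional classifier; the papers above differ in exactly which joints, events, and models they use.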
