Faculty Publications

Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736

Publications by NITK Faculty

Search Results

Now showing 1 - 3 of 3
  • Item
    Deep learning-based multi-view 3D-human action recognition using skeleton and depth data
    (Springer, 2023) Ghosh, S.K.; Rashmi, M.; Mohan, B.R.; Guddeti, R.M.R.
    Human Action Recognition (HAR) is a fundamental challenge that smart surveillance systems must overcome. With the rising affordability of advanced depth cameras for capturing human actions, HAR has garnered increasing interest over the years; however, the majority of these efforts have focused on single-view HAR. Recognizing human actions from arbitrary viewpoints is more challenging, as the same action is observed differently from different angles. This paper proposes a multi-stream Convolutional Neural Network (CNN) model for multi-view HAR using depth and skeleton data. We also propose a novel and efficient depth descriptor, Edge Detected-Motion History Image (ED-MHI), based on Canny edge detection and the Motion History Image. Additionally, the proposed skeleton descriptor, Motion and Orientation of Joints (MOJ), represents each action using joint motion and orientation. Experimental results on two human action datasets, NUCLA Multiview Action3D and NTU RGB-D, using a cross-subject evaluation protocol demonstrate that the proposed system outperforms state-of-the-art works, with 93.87% and 85.61% accuracy, respectively. © 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
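    The abstract only names the ED-MHI descriptor; as a rough illustration of the idea (accumulating the motion of edge pixels over time), here is a minimal NumPy sketch. The gradient-threshold edge detector below is a simple stand-in for Canny, and all function names, thresholds, and the decay schedule are illustrative assumptions, not the paper's implementation.

    ```python
    import numpy as np

    def edge_map(frame, thresh=0.2):
        # Simple gradient-magnitude edge detector (a stand-in for Canny).
        gx = np.abs(np.diff(frame, axis=1, prepend=frame[:, :1]))
        gy = np.abs(np.diff(frame, axis=0, prepend=frame[:1, :]))
        return (gx + gy) > thresh

    def ed_mhi(frames, tau=10):
        # Edge Detected-Motion History Image sketch: a pixel whose edge
        # state changed between frames is set to tau; otherwise its
        # history value decays by 1 per frame, down to 0.
        history = np.zeros_like(frames[0], dtype=float)
        prev = edge_map(frames[0])
        for frame in frames[1:]:
            cur = edge_map(frame)
            moved = cur ^ prev  # edge pixels that appeared or vanished
            history = np.where(moved, tau, np.maximum(history - 1.0, 0.0))
            prev = cur
        return history
    ```

    Recently moving edge pixels end up near tau and static background decays to 0, which is what lets a single 2D image summarize a short action clip.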
  • Item
    Channel Pruning of Transfer Learning Models Using Novel Techniques
    (Institute of Electrical and Electronics Engineers Inc., 2024) Pragnesh, P.; Mohan, B.R.
    This research paper delves into the challenges associated with deep learning models, specifically focusing on transfer learning. Despite the effectiveness of widely used models such as VGGNet, ResNet, and GoogLeNet, their deployment on resource-constrained devices is impeded by high memory bandwidth and computational costs. To overcome these limitations, the study proposes pruning as a viable solution. Numerous parameters, particularly in fully connected layers, contribute minimally to computational costs, so we focus on pruning the convolution layers. The research explores and evaluates three innovative pruning methods: the Max3 Saliency pruning method, the K-Means clustering algorithm, and the Singular Value Decomposition (SVD) approach. The Max3 Saliency pruning method introduces a slight variation by using the three maximum values of the kernel, instead of all nine, to compute the saliency score. This method is the most effective, substantially reducing parameters and Floating Point Operations (FLOPs) for both the VGG16 and ResNet56 models. Notably, VGG16 demonstrates a remarkable 46.19% reduction in parameters and a 61.91% reduction in FLOPs. Using the Max3 Saliency pruning method, ResNet56 shows a 35.15% reduction in both parameters and FLOPs. The K-Means pruning algorithm is also successful, resulting in a 40.00% reduction in parameters and a 49.20% reduction in FLOPs for VGG16. In the case of ResNet56, the K-Means algorithm achieves a 31.01% reduction in both parameters and FLOPs. While the Singular Value Decomposition (SVD) approach provides a new set of values for the condensed channels, its overall pruning ratio is smaller than that of the Max3 Saliency and K-Means methods: SVD achieves a 20.07% parameter reduction and a 24.64% FLOPs reduction for VGG16, along with a 16.94% reduction in both parameters and FLOPs for ResNet56.
Compared with state-of-the-art methods, the Max3 Saliency and K-Means pruning methods performed better on FLOPs reduction metrics. © 2024 The Authors.
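    Following the abstract's description (score each 3×3 kernel by its three largest absolute values rather than all nine, then prune the lowest-scoring channels), a minimal NumPy sketch of Max3-style channel scoring might look as follows. The function names and the fixed pruning ratio are illustrative assumptions; the paper's actual scoring and pruning schedule may differ.

    ```python
    import numpy as np

    def max3_saliency(weights):
        # weights: conv weights of shape (out_channels, in_channels, 3, 3).
        # Per 3x3 kernel, sum the three largest |values| (instead of all
        # nine), then aggregate over input channels per output channel.
        out_ch, in_ch = weights.shape[0], weights.shape[1]
        flat = np.abs(weights).reshape(out_ch, in_ch, -1)
        top3 = np.sort(flat, axis=-1)[..., -3:]
        return top3.sum(axis=(1, 2))

    def prune_channels(weights, ratio=0.5):
        # Drop the fraction `ratio` of output channels with the lowest
        # Max3 saliency scores; return the kept weights and kept indices.
        scores = max3_saliency(weights)
        n_drop = int(len(scores) * ratio)
        keep = np.sort(np.argsort(scores)[n_drop:])
        return weights[keep], keep
    ```

    Using only the top three magnitudes makes the score insensitive to the many near-zero entries in a kernel, which is the intended variation over a plain L1-norm channel score.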
  • Item
    Enhancing Deep Compression of CNNs: A Novel Regularization Loss and the Impact of Distance Metrics
    (Institute of Electrical and Electronics Engineers Inc., 2024) Pragnesh, P.; Mohan, B.R.
    Transfer learning models tackle two critical problems in deep learning. First, for small datasets, they reduce the problem of overfitting. Second, for large datasets, they reduce computational cost, as fewer iterations are required to train the model. Standard transfer learning models such as VGGNet, ResNet, and GoogLeNet require significant memory and computational power, limiting their use on devices with limited resources. This research paper contributes to overcoming this problem by compressing the transfer learning model using channel pruning. Today, computational cost is more significant than memory cost, and although convolution layers contain fewer parameters than fully connected layers, they contribute more to computational cost; thus, we focus on pruning the convolution layers. Total loss is a combination of prediction loss and regularization loss, where regularization loss is the sum of the magnitudes of the parameter values. The training process aims to reduce total loss, which requires reducing the regularization loss as well. Therefore, training not only minimizes prediction error but also manages the magnitudes of the model's weights: important weights are kept at higher values to keep the prediction loss low, while unimportant weights can be reduced to decrease regularization loss. Regularization thus adjusts the magnitudes of parameters at varying rates, depending on their importance. Magnitude-based pruning methods then select parameters by their magnitude, which improves the effectiveness of the pruning process. Standard L1 and L2 regularization focus on individual parameters, aiding unstructured pruning; structured pruning, however, requires group regularization. To address this, we introduce a novel group regularization loss designed specifically for structured channel pruning.
This new regularization loss optimizes the pruning process by focusing on entire groups of parameters belonging to a channel rather than on individual ones, making structured pruning more efficient and targeted. The Custom Standard Deviation (CSD) is calculated by summing the absolute differences between each parameter value and the mean value. To evaluate the parameters of a given channel, both the L1 norm and the CSD are computed, and the novel regularization loss for a channel in the convolutional layer is defined as the ratio of the L1 norm to the CSD (L1Norm/CSD). This approach groups the regularization loss over all parameters within a channel, making the pruning process more structured and efficient. The custom regularization loss further improves pruning efficiency, enabling a 46.14% reduction in parameters and a 61.91% decrease in FLOPs. This paper also employs the K-Means algorithm for similarity-based pruning and evaluates three distance metrics: Manhattan, Euclidean, and Cosine. Results indicate that pruning with the K-Means algorithm using Manhattan distance leads to a 35.15% reduction in parameters and a 49.11% decrease in FLOPs, outperforming the Euclidean and Cosine distances with the same algorithm. © 2013 IEEE.
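    The abstract gives the per-channel loss explicitly as L1Norm/CSD, with CSD the sum of absolute deviations from the channel mean. A minimal NumPy sketch of that quantity, summed over a layer's output channels, follows; the function name, the epsilon guard against a zero CSD, and the tensor layout are assumptions for illustration.

    ```python
    import numpy as np

    def group_reg_loss(weights, eps=1e-8):
        # weights: conv weights of shape (out_channels, in_channels, k, k).
        # Per output channel: L1 norm = sum of |w|; CSD (Custom Standard
        # Deviation) = sum of |w - mean(w)|; channel loss = L1 / CSD.
        # eps guards against division by zero for constant channels.
        flat = weights.reshape(weights.shape[0], -1)
        l1 = np.abs(flat).sum(axis=1)
        csd = np.abs(flat - flat.mean(axis=1, keepdims=True)).sum(axis=1)
        return float((l1 / (csd + eps)).sum())
    ```

    Because the ratio is computed over a whole channel, gradient pressure during training acts on the channel as a group, which is what makes the resulting low-magnitude channels removable by structured pruning.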