Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506

  • Item
    Compression of Convolution Neural Network Using Structured Pruning
    (Institute of Electrical and Electronics Engineers Inc., 2022) Pragnesh, T.; Mohan, B.R.
    Deep Neural Networks (DNNs) currently solve many real-life problems with excellent accuracy. However, designing a compact neural network and training it from scratch faces two challenges. First, as in many problems, datasets are relatively small; the model starts to overfit and has low validation accuracy. Second, training from scratch requires substantial computational resources. Many developers therefore use transfer learning, starting from a standard model such as VGGNet with pre-trained weights. The pre-trained model has been trained on a similar problem of high complexity. For example, for the image classification problem, one can use VGG16, ResNet, AlexNet, or GoogLeNet. These pre-trained models are trained on the ImageNet dataset, with millions of images across 1000 different classes. Such pre-trained models are enormous, and their computational cost during inference is huge, making them unusable in many real-life situations where the model must be deployed on resource-constrained devices. Thus, much work is going on to compress standard pre-trained models to achieve the required accuracy at minimum computational cost. There are two types of pruning techniques: (i) unstructured pruning, parameter-based pruning that removes individual parameters, and (ii) structured pruning, which removes a set of parameters that perform a specific operation, such as activation neurons and convolution operations. This paper focuses on structured pruning, as it directly results in compression and faster execution. There are two strategies for structured pruning: (i) the saliency-based approach, where we compute the impact of parameters on the output and remove parameters with minimum value, and (ii) the similarity-based approach, where we find redundant features and remove one of each redundant pair such that pruning causes minimal change in the output.
    In this paper, we combine both approaches: in the initial iterations, we perform pruning based on saliency, and in later iterations, we perform pruning based on the similarity-aware approach. We observe that this combined approach leads to better pruning results. © 2022 IEEE.
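The two-stage strategy this abstract describes can be sketched in a few lines. This is a minimal NumPy illustration, not the paper's implementation: it assumes L1 norm as the saliency score and pairwise cosine similarity as the redundancy measure, both common choices that the abstract does not pin down.

```python
import numpy as np

def saliency_prune(filters, keep):
    """Rank conv filters by L1-norm saliency; keep the `keep` strongest."""
    scores = np.abs(filters).sum(axis=(1, 2, 3))   # one score per filter
    idx = np.argsort(scores)[::-1][:keep]          # highest-saliency filters
    return filters[np.sort(idx)]

def similarity_prune(filters, keep):
    """Greedily drop one filter from the most similar (redundant) pair."""
    while filters.shape[0] > keep:
        flat = filters.reshape(filters.shape[0], -1)
        unit = flat / np.linalg.norm(flat, axis=1, keepdims=True)
        cos = unit @ unit.T                        # pairwise cosine similarity
        np.fill_diagonal(cos, -1.0)                # ignore self-similarity
        i, j = np.unravel_index(np.argmax(cos), cos.shape)
        filters = np.delete(filters, j, axis=0)    # drop one of the pair
    return filters

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 3, 3, 3))              # 8 conv filters of shape 3x3x3
w = saliency_prune(w, keep=6)                      # initial iterations: saliency
w = similarity_prune(w, keep=4)                    # later iterations: similarity
print(w.shape)                                     # (4, 3, 3, 3)
```

In practice each pruning step would be followed by fine-tuning before the next iteration; the sketch only shows the filter-selection logic.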
  • Item
    Kernel-Level Pruning for CNN
    (Springer Science and Business Media Deutschland GmbH, 2023) Pragnesh, T.; Mohan, B.R.
    Deep learning solves many real-life problems with excellent accuracy, but designing a model from scratch faces two challenges. The first is that the dataset size for many applications is relatively small, which leads to overfitting. The second is that the computational cost of training is very high when the dataset is huge. Most developers prefer transfer learning, where we choose a standard pre-trained model like VGGNet, ResNet, or GoogLeNet. These pre-trained models are trained on a similar problem with a huge dataset. For example, for the image classification problem, most developers choose a model trained on the ImageNet dataset, which has 1000 images for each of 1000 different classes, i.e., 1000 × 1000 images of size 224 × 224 each. Pre-trained models are extensive and computationally expensive at inference time, making them challenging to deploy in real-life applications. The recent trend in research is to compress deep neural networks to reduce computational cost and memory requirements. In this paper, we focus on kernel-level pruning. We achieve a pruning sparsity of 30 to 40% with a nominal drop in accuracy of at most 7%. © 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
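Kernel-level pruning sits between unstructured and filter-level pruning: the unit removed is one 2-D kernel (a single input-channel slice of a filter). A minimal NumPy sketch, assuming L1 norm as the kernel-importance score (a common choice; the abstract does not specify the criterion):

```python
import numpy as np

def kernel_prune(weights, sparsity):
    """Zero out the lowest-L1-norm kernels (2-D slices) of a conv layer."""
    out_c, in_c, kh, kw = weights.shape
    norms = np.abs(weights).sum(axis=(2, 3))       # L1 norm of each kernel
    k = int(sparsity * out_c * in_c)               # number of kernels to drop
    thresh = np.sort(norms, axis=None)[k]          # kernels below this are pruned
    mask = (norms >= thresh)[:, :, None, None]     # keep-mask, broadcast to 4-D
    return weights * mask, mask

rng = np.random.default_rng(1)
w = rng.standard_normal((16, 8, 3, 3))             # 16 filters x 8 input channels
pruned, mask = kernel_prune(w, sparsity=0.3)
print(1 - mask.mean())                             # fraction of kernels zeroed, ~0.3
```

Unlike channel pruning, the zeroed kernels here leave the layer shape intact, so realizing a speed-up requires sparse convolution support or a later restructuring step.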
  • Item
    Comparative Study of Pruning Techniques in Recurrent Neural Networks
    (Springer Science and Business Media Deutschland GmbH, 2023) Choudhury, S.; Rout, A.K.; Pragnesh, T.; Mohan, B.R.
    In recent years, there has been drastic development in the field of neural networks. They have evolved from simple feed-forward neural networks to more complex architectures such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs). CNNs are used for tasks such as image recognition, where sequence is not essential, while RNNs are useful when order is important, such as in machine translation. By increasing the number of layers in the network, we can improve the performance of the neural network (Alford et al. in Pruned and structurally sparse neural networks, 2018 [1]). However, this also increases the complexity of the network, and training will require more power and time. By introducing sparsity into the architecture of the neural network, we can tackle this problem. Pruning is one process through which a neural network can be made sparse (Zhu and Gupta in To prune, or not to prune: exploring the efficacy of pruning for model compression, 2017 [2]). Sparse RNNs can be easily implemented on mobile devices and resource-constrained servers (Wen et al. in Learning intrinsic sparse structures within long short-term memory, 2017 [3]). We investigate the following methods to induce sparsity in RNNs: RNN pruning and automated gradual pruning. We also investigate how these pruning techniques impact the model’s performance and provide a detailed comparison between the two. We further experiment with pruning input-to-hidden and hidden-to-hidden weights. Based on the results of our pruning experiments, we conclude that it is possible to reduce the complexity of RNNs by more than 80%. © 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
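Automated gradual pruning, as introduced in the cited Zhu and Gupta paper [2], ramps the target sparsity from an initial value to a final value over training with a cubic schedule, applying magnitude pruning at each pruning step. A minimal NumPy sketch of that schedule applied to an RNN's hidden-to-hidden weight matrix (the matrix shape and step counts here are illustrative, not from the paper):

```python
import numpy as np

def gradual_sparsity(step, s_i, s_f, t0, n, dt):
    """Zhu & Gupta cubic schedule: sparsity ramps from s_i to s_f over n steps."""
    t = min(max(step, t0), t0 + n * dt)
    frac = 1.0 - (t - t0) / (n * dt)
    return s_f + (s_i - s_f) * frac ** 3

def magnitude_prune(w, sparsity):
    """Zero the smallest-magnitude weights until `sparsity` is reached."""
    k = int(sparsity * w.size)
    if k == 0:
        return w
    thresh = np.sort(np.abs(w), axis=None)[k - 1]
    return np.where(np.abs(w) > thresh, w, 0.0)

rng = np.random.default_rng(2)
w_hh = rng.standard_normal((64, 64))   # hidden-to-hidden weights of an RNN cell
for step in range(0, 1001, 100):       # prune every 100 "training" steps
    s = gradual_sparsity(step, s_i=0.0, s_f=0.8, t0=0, n=10, dt=100)
    w_hh = magnitude_prune(w_hh, s)
print(round((w_hh == 0).mean(), 2))    # final sparsity: 0.8
```

In a real training loop the weight updates between pruning steps let the network recover from each round of removal, which is what makes the gradual schedule outperform one-shot pruning.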
  • Item
    Comparing Different Sequences of Pruning Algorithms for Hybrid Pruning
    (Institute of Electrical and Electronics Engineers Inc., 2023) Pragnesh, T.; Mohan, B.R.
    Most developers face two significant issues while designing the architecture of a neural network. First, the available dataset for many real-life problems is relatively small, leading to overfitting. Second, when a dataset is large enough, the computational cost of training the model on it is enormous. Thus, most developers use transfer learning with a standard model like VGGNet, ResNet, or GoogLeNet. These standard models are memory- and computation-expensive during inference, making them infeasible to deploy on resource-constrained devices. The recent research trend is to compress the standard models used for transfer learning to reduce memory and computing costs. In a CNN, approximately 10% of the parameters are in the convolution layers yet contribute about 90% of the computational cost, while 90% of the parameters are in the dense layers and contribute only 10% of the computational cost. This paper focuses on structured pruning of parameters in the convolution layers to reduce computational cost. Here we explore and compare the following pruning techniques: 1) channel pruning with a quantitative score, 2) kernel pruning with a quantitative score, 3) channel pruning with a similarity score, and 4) kernel pruning with a similarity score. Finally, we try several combinations of these pruning techniques to form a hybrid pruning. © 2023 IEEE.
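The four technique combinations in this abstract vary along two axes: the pruning granularity (whole output channel vs. single kernel) and the scoring criterion (a quantitative magnitude score vs. a similarity-based redundancy score). A hypothetical NumPy sketch of a unified scoring function covering all four cases, assuming L1 norm for the quantitative score and cosine similarity for the similarity score (the paper's exact scores are not specified in the abstract):

```python
import numpy as np

def score(weights, level, criterion):
    """Score prunable units of a conv layer with shape (out_c, in_c, kh, kw).

    level:     "channel" (one unit per output filter) or "kernel" (one unit
               per 2-D input-channel slice).
    criterion: "quantitative" (L1 norm; prune low scores) or "similarity"
               (redundancy; prune low scores, i.e. the most duplicated units).
    """
    if level == "channel":
        units = weights.reshape(weights.shape[0], -1)   # one row per filter
    else:  # "kernel"
        units = weights.reshape(-1, weights.shape[2] * weights.shape[3])
    if criterion == "quantitative":
        return np.abs(units).sum(axis=1)                # L1-norm importance
    # similarity criterion: a unit is redundant if some other unit is close
    unit = units / np.linalg.norm(units, axis=1, keepdims=True)
    cos = unit @ unit.T                                 # pairwise cosine sim
    np.fill_diagonal(cos, -1.0)
    return -cos.max(axis=1)                             # low score = redundant

rng = np.random.default_rng(3)
w = rng.standard_normal((4, 2, 3, 3))                   # 4 filters, 2 in-channels
ch_q = score(w, "channel", "quantitative")              # 4 channel scores
kr_s = score(w, "kernel", "similarity")                 # 8 kernel scores
print(ch_q.shape, kr_s.shape)
```

Sequencing these calls in different orders, pruning the lowest-scoring units after each pass, is one way the hybrid combinations compared in the paper could be assembled.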