Please use this identifier to cite or link to this item: https://idr.nitk.ac.in/jspui/handle/123456789/7493
Title: A Hardware Accelerator Based on Quantized Weights for Deep Neural Networks
Authors: Sreehari, R.
Deepu, V.
Arulalan, M.R.
Issue Date: 2019
Citation: Lecture Notes in Electrical Engineering, 2019, Vol. 545, pp. 1079-1091
Abstract: This paper describes the implementation of a systolic array-based hardware accelerator for multilayer perceptrons (MLPs) on an FPGA. Full-precision hardware implementations of neural networks increase resource utilization, making it difficult to fit large networks on an FPGA, and they also consume considerable power. Neural networks are implemented with numerous multiply-and-accumulate (MAC) units, whose multipliers are expensive in terms of power. Algorithms have been proposed that quantize the weights and eliminate the need for multipliers in a neural network without significantly compromising classification accuracy; these algorithms replace MAC units with simple accumulators. Quantized weights minimize weight storage requirements, and quantizing the inputs and constraining the activations along with the weights simplifies the adder and further reduces resource utilization. A systolic array-based neural network architecture has been implemented on an FPGA and modified according to the BinaryConnect and TernaryConnect algorithms, which quantize the weights into two and three levels, respectively. The final variant of the architecture was designed and implemented with quantized inputs, the TernaryConnect algorithm, and activations constrained to +1 and −1. All implementations were verified with the MNIST data set, and the classification accuracy of the hardware implementations was found to be comparable with that of their software counterparts. The designed hardware accelerator achieved a 7.5x reduction in flip-flop utilization compared to the basic hardware implementation with high-precision weights, inputs, and conventional MAC units; power consumption was halved, and the critical-path delay decreased by a factor of three. Thus, larger neural networks can be implemented on FPGAs that run at high frequencies with less power. © 2019, Springer Nature Singapore Pte Ltd.
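
The central idea of the abstract, that ternary weights turn every MAC into a plain accumulator, can be illustrated with a minimal software sketch. The code below is not the authors' RTL or their exact quantization rule; the function names (quantize_ternary, mac_free_dot) and the threshold value are hypothetical, chosen only to show why a weight in {-1, 0, +1} requires no multiplier.

    # Minimal sketch of TernaryConnect-style quantization (hypothetical
    # names; the paper's exact quantization rule may differ).
    import numpy as np

    def quantize_ternary(w, threshold=0.5):
        """Map full-precision weights to {-1, 0, +1}.

        `threshold` is an assumed hyperparameter, not taken from the paper.
        """
        q = np.zeros_like(w, dtype=np.int8)
        q[w > threshold] = 1
        q[w < -threshold] = -1
        return q

    def mac_free_dot(x, q):
        """Dot product with ternary weights: only add/subtract, no multiply.

        In hardware, this is the step that lets each MAC unit be replaced
        by a simple accumulator plus a sign mux.
        """
        acc = 0
        for xi, qi in zip(x, q):
            if qi == 1:
                acc += xi   # weight +1: accumulate the input
            elif qi == -1:
                acc -= xi   # weight -1: subtract the input
            # weight 0: contributes nothing, so skip
        return acc

    # Usage: compare against the full-precision multiply-accumulate.
    rng = np.random.default_rng(0)
    w = rng.normal(size=8)
    x = rng.normal(size=8)
    q = quantize_ternary(w)
    print("ternary weights:", q)
    print("multiplier-free dot:", mac_free_dot(x, q))

The same observation explains the reported resource savings: when the inputs and activations are additionally constrained to +1 and −1, the accumulation reduces to conditional increments and decrements, which simplifies the adder datapath as the abstract describes.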
URI: http://idr.nitk.ac.in/jspui/handle/123456789/7493
Appears in Collections: 2. Conference Papers

Files in This Item:
There are no files associated with this item.

