Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506

Design and Implementation of Reconfigurable Neural Network Accelerator
(Institute of Electrical and Electronics Engineers Inc., 2022) Shenoy, M.S.; Ramesh Kini, M.
General-purpose CPUs are slow and inefficient for computationally intensive applications such as neural networks. Specialized hardware that can perform a large number of multiply-accumulate operations rapidly and efficiently is preferable for such applications. The Reconfigurable Neural Network Accelerator (RNNA) architecture presented here is suited to a variety of neural network applications. Computational resource requirements vary by application; hence, mapping an application to the available set of resources requires reconfigurability. The fundamental unit of the RNNA is composed of a variety of Multiply-Accumulate (MAC) units, registers, and Address Generation Units (AGUs). Compared with computation on a single MAC array, the RNNA with four MAC arrays reduces the time required by approximately 75%. The RNNA was implemented and tested on the Nexys4 DDR Artix-7 FPGA board with a clock frequency of up to 60 MHz and a power consumption of 0.243 W. © 2022 IEEE.
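As a rough illustration of the four-array figure quoted in the abstract above, the sketch below models an idealized case in which every MAC unit completes one multiply-accumulate per cycle and the work divides evenly across arrays. The workload size (1024 MACs), the array width (16 units), and the function names are hypothetical and are not taken from the paper. The model also ignores memory access, address generation, and reconfiguration overhead, so it shows only why four arrays can approach a 75% reduction, not how the RNNA achieves it.

```python
import math


def mac_cycles(total_macs: int, macs_per_array: int, num_arrays: int) -> int:
    """Idealized cycle count: each MAC unit finishes one operation per cycle."""
    macs_per_cycle = macs_per_array * num_arrays
    return math.ceil(total_macs / macs_per_cycle)


def dot_product_partitioned(weights, inputs, num_arrays):
    """Split a dot product into contiguous chunks, one per array,
    accumulate each chunk separately, then sum the partial results."""
    n = len(weights)
    chunk = math.ceil(n / num_arrays)
    partials = []
    for a in range(num_arrays):
        acc = 0
        # An array whose chunk starts past the end simply contributes 0.
        for i in range(a * chunk, min((a + 1) * chunk, n)):
            acc += weights[i] * inputs[i]  # one multiply-accumulate
        partials.append(acc)
    return sum(partials)


# Hypothetical workload: 1024 MAC operations, 16 MAC units per array.
single = mac_cycles(total_macs=1024, macs_per_array=16, num_arrays=1)
quad = mac_cycles(total_macs=1024, macs_per_array=16, num_arrays=4)
reduction = 1 - quad / single  # fraction of time saved under the ideal model
```

Under these assumptions the four-array configuration needs a quarter of the cycles, which corresponds to the roughly 75% reduction the abstract reports. Any measured figure would also depend on overheads this model leaves out.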
Implementation of Reconfigurable Deep Learning Accelerator (RDLA) on PolarFire SoC
(IEEE Computer Society, 2023) Shenoy, M.S.; Ramesh Kini, M.
General-purpose CPUs are slow and inefficient for neural networks and other computationally demanding applications. To run such applications, it is better to build specialized hardware capable of performing many multiply-accumulate operations quickly and efficiently. The Reconfigurable Deep Learning Accelerator (RDLA) architecture has been developed for a wide range of neural network applications. The fundamental unit of the RDLA is composed of a variety of Multiply-Accumulate (MAC) units, registers, and Address Generation Units (AGUs). The RDLA was implemented and tested on the PolarFire SoC with a clock frequency of up to 62.5 MHz for data processing. This paper reports results from testing with different images on a custom four-layer MNIST model, with an accuracy of 97.49% and a power consumption of 1.85 W. © 2023 IEEE.
