FPGA based Simulation Acceleration of on-Chip Networks

Khyamling

Please use this identifier to cite or link to this item: https://idr.nitk.ac.in/jspui/handle/123456789/17034

Title:	FPGA based Simulation Acceleration of on-Chip Networks
Authors:	Khyamling
Supervisors:	Talawar, Basavaraj.
Keywords:	Department of Computer Science & Engineering;Network-on-chip (NoC);Field Programmable Gate Arrays (FPGAs);Simulation framework;Simulation Acceleration;Performance Analysis;DSP48E1;Block RAM;Adaptive Routing
Issue Date:	2021
Publisher:	National Institute of Technology Karnataka, Surathkal
Abstract:	As the number of processing cores in a System-on-Chip (SoC) increases, the traditional bus-based interconnect becomes the major bottleneck to achieving the performance required by modern applications. Further, bus-based communication may not provide the bandwidth and latency required by systems with intensive parallel communication. An efficient interconnection architecture is required to achieve high performance and scalability in many-core SoCs. The Network-on-Chip (NoC) architecture has emerged as the most promising interconnection architecture for modern Chip Multiprocessor (CMP) and Multi/Many-Processor System-on-Chip (MPSoC) systems. The components in these systems (cores, accelerators, memory blocks, and peripherals) are interconnected using one or more NoCs composed of links and routers. The choice of router parameters and NoC topologies can have a significant impact on the overall performance of heterogeneous many-core systems. Evaluation methodologies for NoCs in future computing systems with a large number of interconnected components rely heavily on analytical models and simulations. Analytical models allow fast modeling of large-scale NoCs, but with significant inaccuracy. Fast and flexible NoC simulator frameworks that deliver a high level of accuracy are needed for modeling large-scale NoC-based heterogeneous many-core systems. Detailed software simulators used for design space exploration of NoCs provide better accuracy than analytical models. However, software simulators are slow when simulating large-scale NoCs that interconnect many components. This thesis presents the optimization of a software-based NoC simulator and a Field Programmable Gate Array (FPGA) based NoC simulation acceleration framework to address the issues of simulation speed, accuracy, and flexibility.
Initial work in the thesis involves profiling the Booksim2.0 software simulator, as it is used extensively for the design and evaluation of NoC architectures. Booksim2.0 is profiled with various NoC design parameters and memory configurations to analyze its performance. The performance analysis of Booksim2.0 is based on cache misses, memory usage, and hotspots. Profiling helped in applying focused software optimization techniques to the simulator. Further, Booksim2.0 was parallelized using OpenMP and SIMD constructs to improve its overall performance. Going beyond software optimization, an FPGA-based NoC simulation acceleration framework called YaNoC is proposed to explore the impact of microarchitectural parameters on the performance of the NoC. YaNoC supports design space exploration of custom topologies with custom routing, along with standard minimal routing for conventional NoCs. YaNoC is used to study NoC architectures of a CMP under various traffic patterns; the results show that YaNoC utilizes fewer FPGA resources and is faster than other state-of-the-art FPGA-based NoC simulation acceleration platforms. The next challenge was to optimize the resources consumed by YaNoC. The FPGA fabric provides hard resources such as Block RAM (BRAM) and DSP48E1 units along with specialized interconnect. Most state-of-the-art FPGA-based simulators use only soft logic for modeling NoCs, leaving the hard blocks unutilized. The input buffer and crossbar functionality of the NoC routers is embedded onto the hard blocks, namely Xilinx BRAM and DSP48E1 units, thereby reducing the dependence on soft logic. A pure configurable logic block implementation and a hard block based implementation of the NoC router exhibit identical latency and performance behaviour. Using hard units for the design of NoCs results in a high-performance, low-cost design compared to state-of-the-art frameworks.
Next, the design of an FPGA-based parameterized framework called P-NoC, with configurable Topology, Router, and Traffic modules for performance evaluation and design space exploration, is presented. P-NoC enables the designer to choose from a variety of architectural parameters, such as input buffers, virtual channels, routing algorithms, traffic patterns, and topology, for exploring NoC designs. P-NoC also supports a flexible communication model and traffic generation. In the last piece of work, an FPGA-based NoC using a low-latency router with look-ahead bypass (LBNoC) is proposed. The LBNoC design targets optimized area with improved network performance. Techniques such as single-cycle router bypass, an adaptive routing module, parallel Virtual Channel (VC) and switch allocation, and combined virtual cut-through and wormhole switching are employed in designing the optimized LBNoC router. The LBNoC architecture consumes fewer hardware resources, reduces average packet latency, and achieves a speedup over state-of-the-art NoC architectures.
URI:	http://idr.nitk.ac.in/jspui/handle/123456789/17034
Appears in Collections:	1. Ph.D Theses

Files in This Item:

File	Description	Size	Format
Final_Thesis_Print_Khyamling.pdf		2.9 MB	Adobe PDF
