Journal Articles
Permanent URI for this collectionhttps://idr.nitk.ac.in/handle/123456789/19884
Browse
6 results
Search Results
Item Extending BookSim2.0 and HotSpot6.0 for power, performance and thermal evaluation of 3D NoC architectures(Elsevier B.V., 2019) Halavar, B.; Pasupulety, U.; Talawar, B.With the increase in number and complexity of cores and components in Chip-Multiprocessors (CMP) and Systems-on-Chip (SoCs), a highly structured and efficient on-chip communication network is required to achieve high-performance and scalability. Network-on-Chip (NoC) has emerged as a reliable communication framework in CMPs and SoCs. Many 2-D NoC architectures have been proposed for efficient on-chip communication. Cycle accurate simulators model the functionality and behaviour of NoCs by considering micro-architectural parameters of the underlying components to estimate performance, power and energy characteristics. Employing NoCs in three-dimensional integrated circuits (3D-ICs) can further improve performance, energy efficiency, and scalability characteristics of 3D SoCs and CMPs. Minimal error estimation of energy and performance of NoC components is crucial in architecture trade-off studies. Accurate modeling of re:Horizontal and vertical links by considering micro-architectural and physical characteristics reduces the error in power and performance estimation of 3D NoCs. Additionally, mapping the temperature distribution in a 3D NoC reduces estimation error. This paper presents the 3D NoC modelling capabilities extended in two existing state-of-the-art simulators, viz., the 2D NoC Simulator - BookSim2.0 and the thermal behaviour simulator - HotSpot6.0. With the extended 3D NoC modules, the simulators can be used for power, performance and thermal measurements through micro-architectural and physical parameters. The major extensions incorporated in BookSim2.0 are: Through Silicon Via power and performance models, 3D topology construction modules, 3D Mesh topology construction using variable X, Y, Z radix, tailored routing modules for 3D NoCs. The major extensions incorporated in HotSpot6.0 are: parameterized 2D router floorplan, 3D router floorplan including Through Silicon Vias (TSVs), power and thermal distribution models of 2D and 3D routers. Using the extended 3D modules, performance (average network latency), and energy efficiency metrics (Energy-Delay Product) of variants of 3D Mesh and 3D Butterfly Fat Tree topologies have been evaluated using synthetic traffic patterns. Results show that the 4-layer 3D Mesh is 2.2 × better than 2-layer 3D Mesh and 4.5 × better than 3D BFT variants in terms of network latency. 3D Mesh variants have the lowest Energy Delay Product (EDP) compared to 3D BFT variants as there is an 80% reduction in link lengths and up to 3 × more TSVs. Another observation is that the EDP of the 4-layer 3D BFT (with transpose traffic) is 1.5 × the EDP of the 4-layer 3D Mesh (with transpose traffic). Further optimizations towards a tailored 3D BFT for transpose traffic could reduce this EDP gap with the 4-layer 3D Mesh. From the 3D NoC heat maps, it was found that the edge routers in the floorplan of the tested 3D Mesh and 3D BFT topologies have the least ambient temperature. © 2019Item LBNoc: Design of low-latency router architecture with lookahead bypass for network-on-chip using FPGA(Association for Computing Machinery acmhelp@acm.org, 2020) Parane, K.; Prabhu Prasad, B.M.; Talawar, B.An FPGA-based Network-on-Chip (NoC) using a low-latency router with a look-ahead bypass (LBNoC) is discussed in this article. The proposed design targets the optimized area with improved network performance. The techniques such as single-cycle router bypass, adaptive routing module, parallel Virtual Channel (VC), and Switch allocation, combined virtual cut through and wormhole switching, have been employed in the design of the LBNoC router. The LBNoC router is parameterizable with the network topology, traffic patterns, routing algorithms, buffer depth, buffer width, number of VCs, and I/O ports being configurable. A table-based routing algorithm has been employed to support the design of custom topologies. The input buffer modules of NoC router have been mapped on the FPGA Block RAM hard blocks to utilize resources efficiently. The LBNoC architecture consumes 4.5% and 27.1% fewer hardware resources than the ProNoC and CONNECT NoC architectures. The average packet latency of the LBNoC NoC architecture is 30% and 15% lower than the CONNECT and ProNoC architectures. The LBNoC architecture is 1.15× and 1.18× faster than the ProNoC and CONNECT NoC frameworks. © 2020 Association for Computing Machinery.Item P-NoC: Performance Evaluation and Design Space Exploration of NoCs for Chip Multiprocessor Architecture Using FPGA(Springer, 2020) Parane, K.; Prabhu Prasad, B.M.; Talawar, B.The network-on-chip (NoC) has emerged as an efficient and scalable communication fabric for chip multiprocessors (CMPs) and multiprocessor system on chips (MPSoCs). The NoC architecture, the routers micro-architecture and links influence the overall performance of CMPs and MPSoCs significantly. We propose P-NoC: an FPGA-based parameterized framework for analyzing the performance of NoC architectures based on various design decision parameters in this paper. The mesh and a multi-local port mesh (ML-mesh) topologies have been considered for the study. By fine-tuning various NoC parameters and synthesizing on the FPGA, identify that the performance of NoC architectures are influenced by the configuration of router parameters and the interconnect. Experiments show that the flit width, buffer depth, virtual channels parameters have a significant impact on the FPGA resources. We analyze the performance of the NoCs on six traffic patterns viz., uniform, bit shuffle, random permutation, transpose, bit complement and nearest neighbor. Configuring the router and the interconnect parameters, the ML-mesh topology yields 75% lesser utilization of FPGA resources compared to the mesh. The ML-mesh topology shows an improvement of 33.2% in network latency under localized traffic pattern. The mesh and ML-mesh topologies have 0.53× and 0.1× higher saturation throughput under nearest neighbor traffic compared to uniform random traffic. © 2020, Springer Science+Business Media, LLC, part of Springer Nature.Item An Efficient FPGA-Based Network-on-Chip Simulation Framework Utilizing the Hard Blocks(Birkhauser, 2020) Prabhu Prasad, B.M.; Parane, K.; Talawar, B.In multi-processor system-on-chips, on-chip interconnection plays a significant role. The type of on-chip architecture being used in an application decides the performance of that application. Hence, a quick and versatile network-on-Chip (NoC) simulator, particularly for the larger designs, is essential to explore and find the best suitable NoC configuration for individual applications. An FPGA-based NoC simulation framework has been proposed in this work. The crossbar switch of the NoC router with buffers and five ports has been embedded in the wide multiplexers of the DSP48E1 slices. The distinctive feature of dynamic mode functionality of the DSP48E1 slices every clock cycle depending on the control signals of multiplexer plays a crucial role in incorporating the crossbar functionality. A substantial decrease in the configurable logic blocks (CLBs) utilization of NoC topologies on the FPGA has been observed by embedding the functionality of the crossbar on the DSP48E1 slices. Since there is a reduction in the use of CLB resources employing the crossbar based on DSP48E1, topologies of larger sizes can be simulated. 6 × 6 Mesh topology with the DSP crossbar implementation consumes 36% fewer lookup tables (LUTs) and 40% fewer flip flops than the Mesh topology with CLB-based crossbar implementation. 41% fewer LUTs and 23% fewer slices are consumed by the proposed work with respect to the state-of-the-art CONNECT NoC generation tool. Compared to DART, a reduction of 86% and 80% in LUTs and slices has been observed with respect to the proposed work. Hoplite-DSP implements the unidirectional Torus topology with no buffers considering the deflective routing algorithm. The proposed work targets Mesh-based topologies with buffers and bidirectional ports with XY and look-ahead routing algorithms. © 2020, Springer Science+Business Media, LLC, part of Springer Nature.Item FPGA friendly NoC simulation acceleration framework employing the hard blocks(Springer, 2021) Prasad, B.M.P.; Parane, K.; Talawar, B.A major role is played by Modeling and Simulation platforms in development of a new Network-on-Chip (NoC) architecture. The cycle accurate software simulators tend to become slow when simulating thousands of cores on a single chip. FPGAs have become the vehicle for simulation acceleration due to the properties of parallelism. Most of the state-of-the-art FPGA based NoC simulators utilize soft logic only for modeling the NoCs, leaving out the hard blocks to be unutilized. In this work, the FIFO Buffer and Crossbar switch functionalities of the NoC router have been embedded in the Block RAM (BRAMs) and the DSP48E1 slices with large multiplexer respectively. Employing the proposed techniques of mapping the NoC router components on the FPGA hard blocks, an NoC simulation acceleration framework based on the FPGA is presented in this work. A huge reduction in the use of the Configurable Logic Blocks (CLBs) has been observed when the FIFO buffer and Crossbar components of the NoC topology’s router micro-architecture are embedded in FPGA hard blocks. Our experimental results show that the topologies implemented employing the proposed FPGA friendly mapping of the NoC router components on the hard blocks consume 43.47% fewer LUTs and 41.66% fewer FFs than the topologies with CLB implementation. To optimize the latency of the NoC under consideration, a control unit called “buf_empty_checker” has been employed. A reduction in average latency has been observed compared to the CLB based topology implementation employing the proposed mapping. The proposed work consumes 10.88% fewer LUTs than the CONNECT NoC generation tool. Compared to DART, a reduction of 73.38% and 66.55% in LUTs and FFs has been observed with respect to the proposed work. The average packet latency of the proposed NoC architecture is 24.8% and 19.1% lesser than the CONNECT and DART architectures. © 2021, The Author(s), under exclusive licence to Springer-Verlag GmbH, AT part of Springer Nature.Item LBF-NoC: Learning-Based Framework to Predict Performance, Power and Area for Network-On-Chip Architectures(World Scientific, 2022) Kumar, A.; Talawar, B.Extensive large-scale data and applications have increasing requests for high-performance computations which is fulfilled by Chip Multiprocessors (CMP) and System-on-Chips (SoCs). Network-on-Chips (NoCs) emerged as the reliable on-chip communication framework for CMPs and SoCs. NoC architectures are evaluated based on design parameters such as latency, area, and power. Cycle-accurate simulators are used to perform the design space exploration of NoC architectures. Cycle-accurate simulators become slow for interactive usage as the NoC topology size increases. To overcome these limitations, we employ a Machine Learning (ML) approach to predict the NoC simulation results within a short span of time. LBF-NoC: Learning-based framework is proposed to predict performance, power and area for Direct and Indirect NoC architectures. This provides chip designers with an efficient way to analyze various NoC features. LBF-NoC is modeled using distinct ML regression algorithms to predict overall performance of NoCs considering different synthetic traffic patterns. The performance metrics of five different (Mesh, Torus, Cmesh, Fat-Tree and Flattened Butterfly) NoC architectures can be analyzed using the proposed LBF-NoC framework. BookSim simulator is employed to validate the results. Various architecture sizes from 2×2 to 45×45 are used in the experiments considering various virtual channels, traffic patterns, and injection rates. The prediction error of LBF-NoC is 6% to 8%, and the overall speedup is 5000× to 5500× with respect to BookSim simulator. © 2022 World Scientific Publishing Company.
