Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506

On the Cache Behavior of SPLASH-2 Benchmarks on ARM and ALPHA Processors in Gem5 Full System Simulator
(Institute of Electrical and Electronics Engineers Inc., 2014) Vikas, B.; Talawar, B.
Today, cache size and the number of levels in the cache hierarchy play an important role in improving computer performance. Using gem5 full-system simulation, we measure the variation in memory bandwidth, system bus throughput, and L1 and L2 cache misses while running the SPLASH-2 benchmarks on ARM and ALPHA processors. Our results show that L1 cache misses decrease as the L1 cache size is varied from 16KB to 64KB. L1 cache misses are independent of L2 cache size once the program data resides in the L2 cache. The memory bandwidth and system bus throughput decrease as the L1 and L2 cache sizes increase. © 2014 IEEE.
A Crossbar Interconnection Network in DNA
(Institute of Electrical and Electronics Engineers Inc., 2015) Talawar, B.
DNA computers provide exciting challenges and opportunities in the fields of computer architecture, neural networks, autonomous micromechanical devices, and chemical reaction networks. The advent of digital abstractions such as seesaw gates holds many opportunities for computer architects to realize complex digital circuits using DNA strand displacement principles. The paper presents a realization of a single-bit, 2×2 crossbar interconnection network built using seesaw gates. The functional correctness of the implemented crossbar was verified using a chemical reaction simulator. © 2015 IEEE.
Analysis of ring topology for NoC architecture
(Institute of Electrical and Electronics Engineers Inc., 2016) Kamath, A.; Saxena, G.; Talawar, B.
In recent years, Networks-on-Chip (NoCs) have provided a solution for interconnecting various heterogeneous intellectual properties (IPs) on a System on Chip (SoC) in an efficient, flexible and scalable manner. Virtual channels in the buffers associated with each core help introduce parallelism between packets and improve the performance of the network. However, allocating a uniform buffer size to these channels is not always suitable. Network efficiency can be improved by allocating buffers variably, based on the traffic patterns and the node requirements. In this paper, we use a ring topology as the underlying architecture for the NoC. The percentage of packet drops is used as the parameter for comparing the performance of different architectures. Through simulations carried out in SystemC, we illustrate the impact of including virtual channels and variable buffers on network performance. Our results show that varied buffer allocation led to better performance and fairness in the network than uniform allocation. © 2015 IEEE.
Cache analysis and software optimizations for faster on-chip network simulations
(Institute of Electrical and Electronics Engineers Inc., 2016) Parane, K.; Prabhu Prasad, B.M.; Talawar, B.
Fast simulations are critical in reducing time to market for CMPs and SoCs. Several simulators have been used to evaluate the performance and power consumption of Networks-on-Chip (NoCs). Researchers and designers rely upon these simulators for design space exploration of NoC architectures. Our experiments show that simulating large NoC topologies takes from hours to several days to complete. To speed up the simulations, it is necessary to investigate and optimize the hotspots in the simulator source code. Among the several simulators available, we choose Booksim2.0, as it is extensively used in the NoC community. In this paper, we analyze the cache and memory system behaviour of Booksim2.0 to accurately monitor input-dependent performance bottlenecks. Our measurements show that cache and memory usage patterns vary widely based on the input parameters given to Booksim2.0. Based on these measurements, the cache configuration with the fewest misses has been identified. We also employ thread parallelization and vectorization to improve the overall performance of Booksim2.0. The OpenMP programming model and SIMD are used for parallelizing and vectorizing the more time-consuming portions of Booksim2.0. Speedups of 2.93× and 3.97× were observed for the Mesh topology with a 30 × 30 network size by employing thread parallelization and vectorization, respectively. © 2016 IEEE.
YaNoC: Yet another network-on-chip simulation acceleration engine using FPGAs
(IEEE Computer Society, 2018) Parane, K.; Talawar, B.; Prabhu Prasad, P.
In this paper, we present an FPGA-based NoC simulation framework, YaNoC, that supports the creation of standard and custom topologies, design of routing algorithms, generation of various synthetic traffic patterns, and exploration of a full set of microarchitectural parameters. The framework supports all standard minimal routing algorithms for conventional NoCs and implements table-based routing to support the creation of new routing algorithms. A custom topology called Diagonal Mesh (DMesh) has been evaluated using table-based routing and a modified version of the XY routing algorithm. The Mesh and DMesh topologies saturate at injection rates of 45% and 55%, respectively. We find that the table-based routing implementation consumes 0.98× fewer hardware resources than conventional XY routing. We observed a speedup of 2548× compared to the Booksim software simulator. YaNoC achieves speedups of 2.54× and 25× with respect to the CONNECT and DART FPGA-based NoC simulators, respectively. © 2018 IEEE.
Floorplan based performance evaluation of 3D variants of mesh and BFT networks-on-chip
(Institute of Electrical and Electronics Engineers Inc., 2018) Halavar, B.; Talawar, B.
Networks-on-Chip (NoCs) have emerged as a reliable communication framework in CMPs and SoCs, enabling an increase in the number and complexity of cores. Many 2D NoC architectures have been proposed for efficient on-chip communication. Cycle-accurate simulators model the functionality and behavior of NoCs by considering micro-architectural parameters of the underlying components to estimate performance metrics. Using 3D IC technology in NoCs, with through-silicon vias (TSVs) as vertical interconnects, can lead to improved communication latency and power compared to their 2D counterparts. In this paper, we explore the design space of 3D variants of the Mesh and Butterfly Fat Tree (BFT) NoCs using floorplan-driven wire and TSV lengths. We analyse the performance of 2D and 3D variants of the Mesh and BFT topologies by injecting a uniform traffic pattern. Our experiments show that, in terms of average network latency, a 4-layer 3D Mesh delivers better on-chip communication performance than the other 3D variants. In the 4-layer 3D Mesh, on-chip communication performance is improved by up to 2.2× compared to the 2D Mesh and 4.5× compared to the 4-layer 3D BFT. © 2018 IEEE.
Thermal Aware Design for Through-Silicon Via (TSV) based 3D Network-on-Chip (NoC) Architectures
(Institute of Electrical and Electronics Engineers Inc., 2018) Pasupulety, U.; Halavar, B.; Talawar, B.
Through-Silicon Vias (TSVs) are a type of on-chip interconnect used for communication between multiple layers of circuit elements in a 3D IC. Multiple TSVs form a vertical link connecting inter-layer elements in 3D Network-on-Chip (NoC) architectures. Microarchitectural parameters such as length, width, pitch, and operating frequency influence the total power consumed and heat dissipated by TSVs. Effective extraction of heat between layers is a significant challenge in 3D NoCs. Accurately modelling the power of the TSVs and the thermal profile of 3D NoCs enables designers to perform trade-off studies during the design phase. In this work, we evaluate the thermal behaviour of 2-layer 3D Mesh and CMesh NoC architectures. We extended HotSpot to support the inclusion of a router-TSV circuit element as part of the 3D NoC floorplan. For the 3D Mesh, the thermal behaviour was analyzed for the naive arrangement as well as for a proposed thermally aware design of the router-TSV element. Additionally, the thermal effect of multiple cores sharing a single router-TSV in a CMesh architecture was investigated. Our experiments show that the average of the maximum temperatures of all the routers in the 4x8x2 thermally aware 3D Mesh is 3% lower than in the naive 3D Mesh design. Also, the average of the maximum temperatures of all the routers in a 3D CMesh is 7% higher than in the naive 3D Mesh and 9% higher than in the thermally aware 3D Mesh design. © 2018 IEEE.
Trace-Driven Simulation and Design Space Exploration of Network-on-Chip Topologies on FPGA
(Institute of Electrical and Electronics Engineers Inc., 2018) Sangeetha, G.S.; Radhakrishnan, V.; Prabhu Prasad, P.; Parane, K.; Talawar, B.
Networks-on-Chip (NoCs) are becoming an extremely important part of present and future electronic technology. They are extensively used in Multiprocessor Systems-on-Chip and in Chip Multiprocessors. Using an NoC drastically reduces the backend wiring in an SoC. Further, SoCs with an NoC interconnect operate at a higher frequency, mainly because the hardware required for switching and routing is simplified. NoC researchers have relied on performance and power simulators to study the different factors of an NoC, such as the algorithm in place, the topology, the buffer management and location schemes, and the flow control and routing, among others. In this paper, we present a trace-driven NoC architecture that gives the user access to realistic details about the resource utilization of NoC architectures and their individual components. This includes exploration of various NoC design decision parameters by modeling them on an FPGA. The paper also presents the performance of these architectures, obtained by conducting trace-driven simulations using benchmarks such as PARSEC. Different topologies with different routing algorithms are considered in the experiments. © 2018 IEEE.
FPGA based NoC Simulation Acceleration Framework Supporting Adaptive Routing
(Institute of Electrical and Electronics Engineers Inc., 2018) Parane, K.; Prabhu Prasad, B.M.; Talawar, B.
In this paper, we present a fast and parameterized FPGA-based Network-on-Chip (NoC) simulation acceleration framework with an automated HDL generation engine. The framework supports NoC architecture design parameters such as topology, routing algorithm, link width, buffer size, flow control and traffic pattern. The parameterized, high-performance and lightweight nature of the proposed framework makes it an ideal choice for NoC research studies. Mesh-based topologies have been considered in the experiments. A congestion-aware adaptive routing algorithm has been proposed alongside conventional XY routing. Also, parameters such as buffer depth, traffic pattern and flit width have been varied to observe the effect on NoC behavior. The adaptive routing algorithm for Mesh-based topologies has negligible FPGA area overhead compared to conventional XY routing. With the adaptive routing algorithm, the average packet latency is reduced by 55% under the transpose traffic pattern when compared to the XY routing algorithm. A speedup of 2548× has been observed compared to the Booksim software simulator. The proposed framework is 2.54× and 25× faster than the CONNECT and DART FPGA-based simulators, respectively. © 2018 IEEE.
Accurate Performance Analysis of 3D Mesh Network on Chip Architectures
(Institute of Electrical and Electronics Engineers Inc., 2018) Halavar, B.; Talawar, B.
With the increase in the number and complexity of cores and components in CMPs and SoCs, a highly structured and efficient on-chip communication network is required to achieve high performance and scalability. Networks-on-Chip (NoCs) have emerged as a reliable communication framework in CMPs and SoCs. Many 2D NoC architectures have been proposed for efficient on-chip communication. In this paper, we explore the design space of 3D NoCs using floorplan-driven wire lengths and link delay estimation. We analyse the performance and cost of the 2D Mesh and two 3D variants of the Mesh topology by injecting two synthetic traffic patterns for varying buffer space; floorplan-based delays were considered in the experiments. Our experiments show that, for injection rates from 0.02 to 0.2, the average network latency of a 4-layer 3D Mesh is reduced by up to 54% compared to its 2D counterpart. On-chip communication performance improved by up to 2.2× and 3.1× in the 4-layer 3D Mesh compared to the 2D Mesh with uniform and transpose traffic patterns, respectively. © 2018 IEEE.
