Browsing by Author "Prabhu, P.B.M."
Item High-Performance NoCs Employing the DSP48E1 Blocks of the Xilinx FPGAs (IEEE Computer Society, 2019) Prabhu, P.B.M.; Parane, K.; Talawar, B.
The hard multiplexers of the Xilinx DSP48E1 slices are used to implement the crossbar switch of buffered five-port Network-on-Chip (NoC) routers. This is possible because the operating mode of a DSP48E1 slice can be changed dynamically every clock cycle through its multiplexer control signals. As a result, the soft-logic (LUT + FF) utilization of the FPGA implementation of a 6×6 Mesh topology is significantly reduced: the DSP-based crossbar implementation consumes 36% fewer LUTs and 40% fewer FFs than the LUT-based crossbar implementation. The DSP-based implementation also consumes 38% less power. The proposed work uses 41% fewer LUTs than the state-of-the-art CONNECT NoC generation tool. For an 8×8 Mesh topology, the DSP48E1-based crossbar reduces latency by 31% and 38% relative to the LUT-based crossbar under the Uniform and Transpose traffic patterns, respectively. It also improves saturation throughput by 1.4× and 1.6× over the LUT-based implementation under the Uniform and Transpose traffic patterns, respectively. © 2019 IEEE.
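As a rough illustration of the crossbar idea in the abstract, the sketch below models a five-port crossbar as one multiplexer per output port, with select signals that may change every cycle. It is a behavioral model only and does not represent how the authors map the switch onto DSP48E1 slices; the port names and the `crossbar` function are hypothetical, chosen for this example.

```python
from typing import List, Optional, Sequence

# Hypothetical port ordering for a five-port mesh router (an assumption,
# not taken from the paper).
PORTS = ("local", "north", "east", "south", "west")


def crossbar(
    inputs: Sequence[Optional[int]],
    selects: Sequence[Optional[int]],
) -> List[Optional[int]]:
    """Model one clock cycle of a 5x5 crossbar.

    Output port i behaves like a multiplexer: it forwards
    inputs[selects[i]], or stays idle (None) when selects[i] is None.
    """
    if len(inputs) != len(PORTS) or len(selects) != len(PORTS):
        raise ValueError("expected one input and one select per port")
    outputs: List[Optional[int]] = []
    for sel in selects:
        outputs.append(None if sel is None else inputs[sel])
    return outputs


if __name__ == "__main__":
    flits = [10, 11, 12, 13, 14]           # one flit waiting at each input port
    selects = [4, None, 0, 1, None]        # per-output mux control for this cycle
    for name, flit in zip(PORTS, crossbar(flits, selects)):
        print(name, flit)
```

Changing `selects` between calls corresponds to driving different multiplexer control signals on successive clock cycles.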
