Browsing by Author "Prabhu, P.B.M."


High-Performance NoCs Employing the DSP48E1 Blocks of the Xilinx FPGAs
(2019) Prabhu, P.B.M.; Parane, K.; Talawar, B.
The hard multiplexers of the Xilinx DSP48E1 slices have been employed to support the functionality of the crossbar switch of buffered five-port Network-on-Chip (NoC) routers. This is possible because the operating mode of a DSP48E1 slice can be changed dynamically every clock cycle through its multiplexer control signals. As a result, a significant reduction in the soft logic (LUT+FF) utilization of the FPGA implementation of the 6×6 Mesh topology has been observed. The DSP-based crossbar implementation of the 6×6 Mesh topology consumes 36% fewer LUTs and 40% fewer FFs than the LUT-based crossbar implementation. 38% less power consumption has been observed in the DSP-based implementation. The proposed work utilizes 41% fewer LUTs than the state-of-the-art CONNECT NoC generation tool. Latency reductions of 31% and 38% have been achieved by the proposed DSP48E1-based crossbar implementation over the LUT-based crossbar implementation of the 8×8 Mesh topology under the Uniform and Transpose traffic patterns. The proposed DSP48E1-based implementation also achieves saturation throughput improvements of 1.4× and 1.6× over the LUT-based implementation under the Uniform and Transpose traffic patterns, respectively. © 2019 IEEE.
High-Performance NoCs Employing the DSP48E1 Blocks of the Xilinx FPGAs
(IEEE Computer Society, 2019) Prabhu, P.B.M.; Parane, K.; Talawar, B.