Title: FPGA Implementation of Tri-matrix Multiplication Accelerator using Circulant Matrices for Kalman Filters
Authors: Pudi, M.; Srihari, P.; Pardhasaradhi, B.
Date: 2026-02-06
Year: 2022
Source: MysuruCon 2022 - 2022 IEEE 2nd Mysore Sub Section International Conference, 2022
DOI: https://doi.org/10.1109/MysuruCon55714.2022.9972592
URI: https://idr.nitk.ac.in/handle/123456789/29832

Abstract: Low-power and low-area implementations of the Kalman filter (KF) algorithm are essential for both civilian and military applications. In the KF, the tri-matrix multiplication (PGP and PGP^T) consumes many cycles and requires transposition buffer hardware. Hence there is a strong need for an accelerator module that computes the tri-matrix multiplication in fewer cycles and without the transposition buffer module. This work presents an algorithm for direct or transposed tri-matrix multiplication with no timing penalty and no extra transposition buffer hardware unit. For an N-dimensional matrix multiplication (PG), the data is stored in circulant matrix form across N BRAMs for ease of write/read, and the result is also stored in circulant form to enable chained operation. The complexities of the intermediate output (PG) and the tri-matrix multiplication (PGP) are O(N^2) and O(2N^2), respectively. The KF algorithm is considered with different state vectors, and the tri-matrix multiplications are accelerated on a NEXYS 4 DDR Artix-7 FPGA. The critical operating frequencies for the 2-D constant velocity (CV) and constant acceleration (CA) models are 200 MHz and 151 MHz, respectively. In contrast, the 3-D CV and CA models operate at 151 MHz and 115 MHz, respectively. © 2022 IEEE.

Keywords: accelerator; chained operation; circulant matrix; FPGA; tri matrix multiplication
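
The circulant storage idea in the abstract can be illustrated with a small software model. The abstract does not give the exact address mapping, so the sketch below assumes one common choice: element (i, j) is placed in bank (i + j) mod N at address i. Under that assumption, every row and every column touches each of the N banks exactly once, so a transposed operand can be read directly without a separate transposition buffer. The names `CirculantStore`, `multiply`, and `tri_product` are hypothetical and are not taken from the paper; this is a behavioural sketch in Python, not the authors' hardware design.

```python
class CirculantStore:
    """N x N matrix spread over N banks (assumed mapping, not from the paper):
    element (i, j) is stored in bank (i + j) % N at address i."""

    def __init__(self, matrix):
        self.n = len(matrix)
        self.banks = [[0] * self.n for _ in range(self.n)]
        for i, row in enumerate(matrix):
            for j, value in enumerate(row):
                self.banks[(i + j) % self.n][i] = value

    def row(self, i):
        # Row i: one element from each bank, all at address i.
        return [self.banks[(i + j) % self.n][i] for j in range(self.n)]

    def col(self, j):
        # Column j: one element from each bank, at addresses 0..N-1.
        return [self.banks[(i + j) % self.n][i] for i in range(self.n)]

    def banks_hit_by_col(self, j):
        # Bank indices touched by a column read, sorted for easy checking.
        return sorted((i + j) % self.n for i in range(self.n))

    def to_rows(self):
        return [self.row(i) for i in range(self.n)]


def multiply(a, b, transpose_b=False):
    """Return A*B, or A*B^T when transpose_b is set, as a new CirculantStore.

    The transposed case simply reads rows of B where the direct case
    reads columns, so no reordering step is modelled."""
    fetch = b.row if transpose_b else b.col
    out = [
        [sum(x * y for x, y in zip(a.row(i), fetch(k))) for k in range(a.n)]
        for i in range(a.n)
    ]
    return CirculantStore(out)


def tri_product(p, g, transposed=False):
    """Chain two multiplications: (P*G)*P, or (P*G)*P^T when transposed."""
    return multiply(multiply(p, g), p, transpose_b=transposed)


if __name__ == "__main__":
    P = CirculantStore([[1, 2], [3, 4]])
    G = CirculantStore([[0, 1], [1, 0]])
    print(tri_product(P, G).to_rows())
    print(tri_product(P, G, transposed=True).to_rows())
```

Because `multiply` returns its result in the same banked layout, the output of the first product can feed the second directly, which mirrors the chained operation the abstract describes. Cycle counts and BRAM port behaviour are not modelled here.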