Conference Papers
Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506
2 results
Search Results
Item: Optimizing Split Algorithm Performance: A Heuristic Method for Enhanced Tensor Product Matrix Computations (Institute of Electrical and Electronics Engineers Inc., 2024)
Bhowmik, B.; Kumar, S.; Raju, S.R.; Prakash, A.; Mense, O.
Optimizing tensor product matrix computations is critical for enhancing computational efficiency in high-performance applications. Traditional algorithms, like the Split algorithm, often struggle due to the unique properties of each matrix involved. This paper presents a novel heuristic method that optimizes the selection of cutting points and matrix arrangement, significantly reducing redundant calculations and minimizing memory usage. The proposed approach adapts to the varying characteristics of tensor products, improving performance across different computational scenarios. By enhancing floating-point operation efficiency and CPU utilization, the method delivers substantial speed and efficiency gains, particularly in large-scale tensor product matrix operations, offering a robust solution for complex computational tasks. © 2024 IEEE.

Item: Performance Analysis and Predictive Modeling of MPI Collective Algorithms in Multi-Core Clusters: A Comparative Study (Institute of Electrical and Electronics Engineers Inc., 2025)
Reddy, M.R.V.S.R.S.; Raju, S.R.; Girish, K.K.; Bhowmik, B.
Efficient communication is the foundation of parallel computing systems, enabling seamless coordination across multiple processors for optimal performance. At the core of this communication lies the Message Passing Interface (MPI), a crucial framework designed to facilitate data exchange between processors through collective operations. However, these MPI operations often face challenges, including fluctuating process counts, varying message sizes, and increased communication overhead. These issues can significantly impact execution times and scalability, leading to potential bottlenecks in large-scale systems.
To address these concerns, this paper provides an in-depth evaluation of key MPI collective algorithms - Flat Tree, Chain, and Binary Tree - by examining their performance under varying configurations. By analyzing execution times and communication overhead, the study reveals the trade-offs inherent in each algorithm, offering insights into strategies for reducing communication costs. Through this analysis, we aim to provide valuable guidance to improve the efficiency and scalability of parallel computing, particularly in high-performance systems where communication efficiency is critical. © 2025 IEEE.
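The first abstract above concerns tensor (Kronecker) product matrix computations. As a minimal illustration of the underlying operation — not the paper's Split algorithm or its heuristic — the following pure-Python sketch computes a Kronecker product; the function name and matrices are illustrative choices, not taken from the paper:

```python
def kron(a, b):
    """Kronecker (tensor) product of two matrices given as lists of lists.

    Each entry a[i][j] is expanded into the block a[i][j] * b, so the
    result has len(a)*len(b) rows and len(a[0])*len(b[0]) columns.
    """
    rb, cb = len(b), len(b[0])
    return [
        [a[i // rb][j // cb] * b[i % rb][j % cb]
         for j in range(len(a[0]) * cb)]
        for i in range(len(a) * rb)
    ]

A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]
print(kron(A, B))  # 4x4 result built from 2x2 blocks
```

Because intermediate results grow multiplicatively, the order in which a chain of such products is evaluated (and where it is "cut") strongly affects memory and redundant work — which is the kind of choice the paper's heuristic is designed to optimize.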
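The second abstract compares Flat Tree, Chain, and Binary Tree collectives. As a rough intuition for why their trade-offs differ with process count, here is a toy step-count model for a broadcast; it assumes uniform link cost and one send per rank per step (a doubling, binomial-style tree), which is a simplification and not the paper's measured model:

```python
import math

def flat_tree_steps(p):
    """Root sends the message to each of the other p-1 ranks in turn."""
    return p - 1

def chain_steps(p):
    """Message is relayed rank-to-rank along a chain of p ranks."""
    return p - 1

def tree_steps(p):
    """Doubling tree: each step, every rank holding the message
    forwards it to one new rank, so coverage doubles per step."""
    return math.ceil(math.log2(p)) if p > 1 else 0

for p in (4, 16, 64):
    print(f"p={p}: flat={flat_tree_steps(p)}, "
          f"chain={chain_steps(p)}, tree={tree_steps(p)}")
```

Under this model the tree needs only logarithmically many steps while flat tree and chain grow linearly; in practice a chain can still win for large messages through pipelining, which is why the paper's empirical comparison across message sizes and process counts matters.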
