Conference Papers
Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506
6 results
Search Results
Item
Optimizing Split Algorithm Performance: A Heuristic Method for Enhanced Tensor Product Matrix Computations
(Institute of Electrical and Electronics Engineers Inc., 2024)
Bhowmik, B.; Kumar, S.; Raju, S.R.; Prakash, A.; Mense, O.
Optimizing tensor product matrix computations is critical for enhancing computational efficiency in high-performance applications. Traditional algorithms, like the Split algorithm, often struggle due to the unique properties of each matrix involved. This paper presents a novel heuristic method that optimizes the selection of cutting points and matrix arrangement, significantly reducing redundant calculations and minimizing memory usage. The proposed approach adapts to the varying characteristics of tensor products, improving performance across different computational scenarios. Enhancing floating-point operation efficiency and CPU utilization delivers substantial speed and efficiency gains, particularly in large-scale tensor product matrix operations, offering a robust solution for complex computational tasks. © 2024 IEEE.

Item
An Integrated MPI and OpenMP Approach for Plasma Dynamics Simulations
(Institute of Electrical and Electronics Engineers Inc., 2024)
Prakash, Y.M.; Girish, K.K.; Verma, L.; Kumar, S.; Bhowmik, B.
Plasma dynamics is the behavior of two or more charged species in response to electric or magnetic fields. In high-performance computing (HPC) applications, simulating it requires accurate parallel implementations, effective inter-process communication, and scalability with respect to workload. This paper points out the limitations of current approaches to plasma dynamics problems and discusses the use of MPI continuation tasks and their performance enhancement with OpenMP methods.
Within the framework of the Vlasov-Poisson system, we develop the theory of MPI continuations and describe techniques suited to their use, which make it possible to combine communication with computation efficiently; this is difficult in most cases, especially in multidimensional simulations. The results give better insight into how to increase the level of parallelism and reduce compute time, which in turn supports the formulation of more effective high-performance strategies and a better understanding of parallelism in plasma simulations using the MPI standard. © 2024 IEEE.

Item
Optimizing Data Movement in Heterogeneous Computing: A LASSA-based Approach for Efficient Nucleation List Precomputation
(Institute of Electrical and Electronics Engineers Inc., 2025)
Bhowmik, B.; Girish, K.K.; Pandey, H.; Prabhanjans, P.
In the rapidly evolving landscape of heterogeneous computing, the efficiency of data movement between CPUs and GPUs can make or break system performance. Despite advancements in parallel processing, existing methods for managing data transfers - particularly in GPU offloading scenarios - suffer from significant inefficiencies. These inefficiencies are particularly evident in nucleation list precomputation for non-equilibrium solidification models, where redundant data movements and complex dynamic work-sharing in OpenMP lead to significant performance overhead. To tackle this issue, this paper proposes a novel solution that integrates the Location-Aware Heap Static Single Assignment (LASSA) algorithm into the compilation process. This approach identifies and eliminates redundant memory copy operations, optimizing data transfers and reducing overhead. The findings reveal a dramatic performance boost, with up to a 9.6-fold increase in efficiency.
By addressing the specific challenges of nucleation list precomputation, this work provides valuable insights into optimizing data movement in heterogeneous computing environments, paving the way for enhanced performance in parallel programming models. © 2025 IEEE.

Item
Taskgraph Framework: A Competitive Alternative to the OpenMP Thread Model
(Institute of Electrical and Electronics Engineers Inc., 2025)
Chavan, S.; Nile, P.; Kumar, S.; Bhowmik, B.
OpenMP is the predominant standard for shared memory systems in high-performance computing (HPC), offering a tasking paradigm for parallelism. However, existing OpenMP implementations, like GCC and LLVM, face computational limitations that hinder performance, especially for large-scale tasks. This paper presents the Taskgraph framework, a novel solution that overcomes the limitations of traditional task dependency graphs (TDGs). Unlike conventional TDGs, which require costly reconstruction for dynamic program structures, the Taskgraph framework uses a taskgraph clause with a list of variables, enabling real-time adaptation without complete reconstruction. This approach significantly reduces overhead, making the Taskgraph model highly efficient for tasks with minimal dependencies, offering a competitive alternative to the OpenMP thread model, and enhancing efficiency and adaptability in dynamic HPC environments. © 2025 IEEE.

Item
Efficient Parallel Algorithm for Detecting Longest Flow Paths in Flow Direction Grids
(Institute of Electrical and Electronics Engineers Inc., 2025)
Jayarukshi, K.; Agarwal, S.; Girish, K.K.; Goudar, S.; Bhowmik, B.
High-performance computing (HPC) has transformed the capacity to address complex computational tasks across various scientific fields by enabling the efficient processing of large datasets and intricate simulations.
In hydrological modeling, a critical task is identifying the longest flow channel within a catchment, which is essential for understanding water flow patterns and managing resources. However, existing geographic information system (GIS) algorithms for flow path identification often suffer from inefficiencies and inaccuracies. To address these challenges, this paper introduces innovative parallel methods utilizing Open Multi-Processing (OpenMP), a widely-used API that supports multi-platform shared-memory parallel programming. This approach optimizes the analysis of flow direction data, resulting in faster and more accurate identification of flow channels. The results demonstrate that the proposed method outperforms current approaches, offering substantial improvements in both performance and precision. These advancements have the potential to significantly enhance hydrological modeling practices and water resource management. © 2025 IEEE.

Item
Exploring Hidden Behaviors in OpenMP Multi-threaded Applications for Anomaly Detection in HPC Environments
(Springer Science and Business Media Deutschland GmbH, 2025)
Bhowmik, B.; Girish, K.K.; Mishra, P.; Mishra, R.
In high-performance computing (HPC), multi-threaded applications using OpenMP face complex challenges in identifying hidden performance issues, often due to resource conflicts, software inefficiencies, and hardware anomalies. These subtle issues can significantly degrade performance and reduce system reliability. This paper introduces an innovative approach designed to address these concealed issues in OpenMP multi-threaded applications. The proposed method integrates a Random Forest classifier with anthropomorphic diagnosis to effectively identify and diagnose performance-affecting problems. The approach has demonstrated a remarkable ability to detect 90% of performance-affecting issues that are often obscured within complex HPC environments.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
