Conference Papers


Optimizing Performance of OpenMP Parallel Applications through Variable Classification
(Institute of Electrical and Electronics Engineers Inc., 2024) Kumar, S.; Talib, M.
OpenMP provides a versatile framework for parallel computing, allowing developers to efficiently transform sequential programs into parallel applications for shared-memory architectures. One of the central challenges in this transformation is accurately identifying the appropriate parallel constructs and clauses, which are critical both for maximizing performance and for ensuring the correctness of the resulting parallel code. A particularly intricate part of this process is classifying variables according to their data-sharing semantics: the firstprivate, private, lastprivate, shared, and reduction clauses. Manual classification is labor-intensive and becomes increasingly error-prone as a program's scale and complexity grow. Although various tools have been developed to assist with variable classification, they often rely on extensive data-dependence analyses and rigid classification schemes, which limits their effectiveness on large-scale programs with complex scoping requirements. This paper presents a novel, cost-effective approach that automates variable classification in OpenMP parallelization and improves its accuracy. By reducing the manual effort required and improving the precision of parallel construct insertion, the approach aims to significantly improve the performance of parallel applications, thereby advancing the utility and accessibility of OpenMP for a wide range of computational tasks. © 2024 IEEE.
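
For readers unfamiliar with the data-sharing clauses named in this abstract, the following is a minimal hand-written sketch in C, not the paper's method or tool. The function name `scale_and_sum` and its parameters are hypothetical, chosen only for illustration. It shows the kind of per-variable decision that classification has to make for a single loop; `default(none)` forces every variable to be classified explicitly.

```c
#include <stddef.h>

/* Hypothetical example: scale each input element, store it, and
 * return the sum of the scaled values. The last scaled value is
 * written to *last_value. Names are illustrative assumptions. */
double scale_and_sum(const double *in, double *out, int n,
                     double factor, double *last_value)
{
    double sum = 0.0;   /* reduction: per-thread partial sums are combined */
    double tmp = 0.0;   /* private: scratch value, recomputed every iteration */
    double last = 0.0;  /* lastprivate: value of the final iteration is kept */

    /* shared:       in, out (each iteration touches a distinct element)
     * firstprivate: factor, n (read-only, initialized from the caller)
     * The loop index i is private automatically. */
    #pragma omp parallel for default(none) \
        shared(in, out) firstprivate(factor, n) private(tmp) \
        lastprivate(last) reduction(+:sum)
    for (int i = 0; i < n; ++i) {
        tmp = in[i] * factor;
        out[i] = tmp;
        last = tmp;
        sum += tmp;
    }

    *last_value = last;
    return sum;
}
```

Compiled without OpenMP support, the pragma is ignored and the loop runs sequentially with the same result. Misclassifying any one variable (for example, leaving `tmp` shared) would introduce a data race, which is why manual classification scales poorly.
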
Optimizing Split Algorithm Performance: A Heuristic Method for Enhanced Tensor Product Matrix Computations
(Institute of Electrical and Electronics Engineers Inc., 2024) Bhowmik, B.; Kumar, S.; Raju, S.R.; Prakash, A.; Mense, O.
Optimizing tensor product matrix computations is critical for computational efficiency in high-performance applications. Traditional algorithms, such as the Split algorithm, often struggle because of the unique properties of each matrix involved. This paper presents a novel heuristic method that optimizes the selection of cutting points and the arrangement of matrices, significantly reducing redundant calculations and minimizing memory usage. The proposed approach adapts to the varying characteristics of tensor products, improving performance across different computational scenarios. By improving floating-point operation efficiency and CPU utilization, the method delivers substantial speed and efficiency gains, particularly in large-scale tensor product matrix operations, offering a robust solution for complex computational tasks. © 2024 IEEE.
An Integrated MPI and OpenMP Approach for Plasma Dynamics Simulations
(Institute of Electrical and Electronics Engineers Inc., 2024) Prakash, Y.M.; Girish, K.K.; Verma, L.; Kumar, S.; Bhowmik, B.
Plasma dynamics describes the behavior of two or more charged species in response to electric or magnetic fields. In high-performance computing (HPC) applications, simulating it demands accurate parallel implementations, effective inter-process communication, and scalability with respect to workload. This paper identifies the limitations of current approaches to plasma dynamics problems and discusses the use of MPI continuation tasks and how OpenMP methods enhance their performance. Within the framework of the Vlasov-Poisson system, we develop the theory of MPI continuations and describe techniques for using them effectively, which makes it possible to overlap communication with computation efficiently, a difficult task in most cases and especially in multidimensional simulations. The results offer better insight into how to increase the level of parallelism and reduce computation time, which in turn supports the formulation of more effective high-performance strategies and a better understanding of parallelism in plasma simulations using the MPI standard. © 2024 IEEE.
Taskgraph Framework: A Competitive Alternative to the OpenMP Thread Model
(Institute of Electrical and Electronics Engineers Inc., 2025) Chavan, S.; Nile, P.; Kumar, S.; Bhowmik, B.
OpenMP is the predominant standard for shared-memory systems in high-performance computing (HPC), offering a tasking paradigm for parallelism. However, existing OpenMP implementations, such as those in GCC and LLVM, face computational limitations that hinder performance, especially for large-scale tasks. This paper presents the Taskgraph framework, a novel solution that overcomes the limitations of traditional task dependency graphs (TDGs). Unlike conventional TDGs, which require costly reconstruction for dynamic program structures, the Taskgraph framework uses a taskgraph clause with a list of variables, enabling real-time adaptation without complete reconstruction. This approach significantly reduces overhead, making the Taskgraph model highly efficient for tasks with minimal dependencies, offering a competitive alternative to the OpenMP thread model, and enhancing efficiency and adaptability in dynamic HPC environments. © 2025 IEEE.