Conference Papers

Permanent URI for this collectionhttps://idr.nitk.ac.in/handle/123456789/28506

Browse

Search Results

Now showing 1 - 4 of 4

Recent trends in software and hardware for GPGPU computing: A comprehensive survey
(2010) Bayyapu, B.; Raghavendra, P.S.
With the growth of Graphics Processor (GPU) programmability and processing power, graphics hardware has become a compelling platform for computationally demanding tasks in a wide variety of application domains. This state of art paper gives the technical motivations that underlie GPU computing and describe the hardware and software developments that have led to the recent interest in this field. Â©2010 IEEE.
CSPR: Column only SPARSE matrix representation for performance improvement on GPU architecture
(2011) Bayyapu, B.; Raghavendra, P.S.
General purpose computation on graphics processing unit (GPU) is prominent in the high performance computing era of this time. Porting or accelerating the data parallel applications onto GPU gives the default performance improvement because of the increased computational units. Better performances can be seen if application specific fine tuning is done with respect to the architecture under consideration. One such very widely used computation intensive kernel is sparse matrix vector multiplication (SPMV) in sparse matrix based applications. Most of the existing data format representations of sparse matrix are developed with respect to the central processing unit (CPU) or multi cores. This paper gives a new format for sparse matrix representation with respect to graphics processor architecture that can give 2x to 5x performance improvement compared to CSR (compressed row format), 2x to 54x performance improvement with respect to COO (coordinate format) and 3x to 10 x improvement compared to CSR vector format for the class of application that fit for the proposed new format. It also gives 10% to 133% improvements in memory transfer (of only access information of sparse matrix) between CPU and GPU. This paper gives the details of the new format and its requirement with complete experimentation details and results of comparison. Â© 2011 Springer-Verlag.
A GPU framework for sparse matrix vector multiplication
(Institute of Electrical and Electronics Engineers Inc., 2014) Bayyapu, B.; Guddeti, G.R.M.; Raghavendra, P.S.
The hardware and software evolutions related to Graphics Processing Units (GPUs), for general purpose computations, have changed the way the parallel programming issues are addressed. Many applications are being ported onto GPU for achieving performance gain. The GPU execution time is continuously optimized by the GPU programmers while optimizing pre-GPU computation overheads attracted the research community in the recent past. While GPU executes the programs given by a CPU, pre-GPU computation overheads does exists and should be optimized for a better usage of GPUs. The GPU framework proposed in this paper improves the overall performance of the application by optimizing pre-GPU computation overheads along with GPU execution time. This paper proposes a sparse matrix format prediction tool to predict an optimal sparse matrix format to be used for a given input matrix by analyzing the input sparse matrix and considering pre-GPU computation overheads. The sparse matrix format predicted by the proposed method is compared against the best performing sparse matrix formats posted in the literature. The proposed model is based on the static data that is available from the input directly and hence the prediction overhead is very small. Compared to GPU specific sparse format prediction, the proposed model is more inclusive and precious in terms of increasing overall application's performance. Â© 2014 IEEE.
Predicting an optimal sparse matrix format for SpMV computation on GPU
(IEEE Computer Society help@computer.org, 2014) Bayyapu, B.; Guddeti, G.R.M.; Raghavendra, P.S.
Many-threaded architecture based Graphics Processing Units (GPUs) are good for general purpose computations for achieving high performance. The processor has latency hiding mechanism through which it hides the memory access time in such a way that when one warp (group of 32 threads) is computing, the other warps perform memory bound access. But for memory access bound irregular applications such as Sparse Matrix Vector Multiplication (SpMV), memory access times are high and hence improving the performance of such applications on GPU is a challenging research issue. Further, optimizing SpMV time on GPU is an important task for iterative applications like jacobi and conjugate gradient. However, there is a need to consider the overheads caused while computing SpMV on GPU. Transforming the input matrix to a desired format and communicating the data from CPU to GPU are non-trivial overheads associated with SpMV computation on GPU. If the chosen format is not suitable for the given input sparse matrix then desired performance improvements cannot be achieved. Motivated by this observation, this paper proposes a method to chose an optimal sparse matrix format, focusing on the applications where CPU to GPU communication time and pre-processing time are nontrivial. The experimental results show that the predicted format by the model matches with that of the actual high performing format when total SpMV time in terms of pre-processing time, CPU to GPU communication time and SpMV computation time on GPU, is taken into account. The model predicts an optimal format for any given input sparse matrix with a very small overhead of prediction within an application. Compared to the format to achieve high performance only on GPU, our approach is more comprehensive and valuable. This paper also proposes to use a communication and pre-processing overhead optimizing sparse matrix format to be used when these overheads are non trivial. Â© 2014 IEEE.

Conference Papers

Browse

Filters

Settings

Sort By

Results per page

Search Results