Publication Details
Fast Linear Algebra on GPU
Smrž Pavel, doc. RNDr., Ph.D. (DCGM)
GPU; parallel reduction; linear algebra; BLAS; OpenCL; CUDA
GPUs have been successfully used to accelerate many mathematical functions and libraries. A common limitation of those libraries is the minimal size of the primitives that must be handled in order to achieve a significant speedup over their CPU versions. This minimal size requirement can prove prohibitive for many applications. It can be relaxed by batching operations so that there is a sufficient amount of data to perform the calculation with maximal efficiency on the GPU. This paper describes a fast OpenCL implementation of two basic vector functions - vector reduction and vector scaling. Its performance is analyzed by running benchmarks on two of the most common GPUs in use - Tesla and Fermi GPUs from NVIDIA. The reported experimental results show that our implementation significantly outperforms CUBLAS, the current state-of-the-art GPU-based basic linear algebra library.
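The batching idea in the abstract can be illustrated with a small CPU-side sketch. This is not the paper's OpenCL code; it is a plain Python illustration, and the function names (`tree_reduce`, `batched_reduce`, `batched_scale`) are hypothetical. It assumes the usual pairwise (tree) reduction pattern, in which every iteration of the inner loop would correspond to one GPU work-item, and it groups many small vectors into one batch so that a single call processes enough data.

```python
def tree_reduce(values):
    """Sum a vector with a pairwise (tree) reduction.

    Assumption: this mirrors the common GPU reduction pattern; the
    paper's actual kernel is not reproduced here.
    """
    buf = list(values)
    n = len(buf)
    if n == 0:
        return 0.0
    # Pad with zeros up to the next power of two so every step halves cleanly.
    size = 1
    while size < n:
        size *= 2
    buf.extend([0.0] * (size - n))
    stride = size // 2
    while stride > 0:
        # On a GPU, each value of i would be handled by a separate work-item.
        for i in range(stride):
            buf[i] += buf[i + stride]
        stride //= 2
    return buf[0]


def batched_reduce(batch):
    """Reduce many small vectors in one call (one result per vector)."""
    return [tree_reduce(v) for v in batch]


def batched_scale(batch, factors):
    """Scale each vector in the batch by its own factor."""
    return [[f * x for x in v] for v, f in zip(batch, factors)]


# Three small vectors that would be too short to process efficiently one by one.
batch = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0], [0.5]]
sums = batched_reduce(batch)
scaled = batched_scale(batch, [2.0, 0.5, 4.0])
print(sums)
print(scaled)
```

On a real device the batch would be dispatched as a single kernel launch, which is where the benefit of batching comes from; the sketch only shows the data layout and the reduction order.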
@inproceedings{BUT96982,
author="Lukáš {Polok} and Pavel {Smrž}",
title="Fast Linear Algebra on GPU",
booktitle="IEEE conference proceedings",
year="2012",
pages="1--6",
publisher="IEEE Computer Society",
address="Liverpool",
isbn="978-0-7695-4749-7",
url="https://www.fit.vut.cz/research/publication/10039/"
}