Computing Reviews

A high performance QDWH-SVD solver using hardware accelerators
Sukkari D., Ltaief H., Keyes D. ACM Transactions on Mathematical Software 43(1): 1-25, 2016. Type: Article
Date Reviewed: 10/20/16

To make effective use of modern computer hardware, algorithms have to be carefully designed with concurrency in mind. In particular, they need to efficiently utilize the multiple cores provided by the computer’s general-purpose central processing units (CPUs) and/or dedicated graphics processing units (GPUs), taking into account, for example, the cost of transferring data from main memory to GPUs.

The paper illustrates this point with a high-performance implementation of the QR-based dynamically weighted Halley algorithm for computing the singular value decomposition of a matrix (QDWH-SVD). The algorithm proceeds in three stages: polar decomposition, symmetric eigensolver, and matrix-matrix multiplication. These stages are implemented on top of available numerical libraries for dense linear algebra, such as Intel MKL or MAGMA, which are effectively accelerated by GPUs. The overall algorithm strives to minimize data transfers to and from the GPU. Thus, although the QDWH-SVD framework uses up to two times more floating-point operations than standard solvers, it is up to four times faster when using just a single GPU. Furthermore, a multi-GPU implementation achieves a further speedup of two with three GPUs.
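The three stages can be illustrated with a minimal CPU-only NumPy sketch. It is not the paper's GPU implementation: the function names qdwh_polar and qdwh_svd are hypothetical, the input is assumed to be square and nonsingular, and the lower bound on the smallest singular value is obtained from an explicit inverse, a simplification where a practical solver would use a condition estimator. The weight formulas follow the commonly published form of the QDWH iteration.

```python
import numpy as np

def qdwh_polar(A, tol=1e-10, max_iter=50):
    """Stage 1 (sketch): polar decomposition A = Up @ H via QR-based
    dynamically weighted Halley iterations. Assumes A is square and
    nonsingular (an assumption of this sketch)."""
    n = A.shape[0]
    alpha = np.linalg.norm(A, "fro")       # upper bound on ||A||_2
    X = A / alpha
    # Simplification: 1/||X^-1||_F is a lower bound on sigma_min(X).
    l = 1.0 / np.linalg.norm(np.linalg.inv(X), "fro")
    I = np.eye(n)
    for _ in range(max_iter):
        l = min(l, 1.0)                    # guard against rounding above 1
        l2 = l * l
        # Dynamic weights a, b, c computed from the current bound l.
        d = np.cbrt(4.0 * (1.0 - l2) / (l2 * l2))
        a = np.sqrt(1.0 + d) + 0.5 * np.sqrt(
            8.0 - 4.0 * d + 8.0 * (2.0 - l2) / (l2 * np.sqrt(1.0 + d)))
        b = (a - 1.0) ** 2 / 4.0
        c = a + b - 1.0
        # QR of the stacked matrix [sqrt(c) X; I] replaces an explicit inverse.
        Q, _ = np.linalg.qr(np.vstack([np.sqrt(c) * X, I]))
        Q1, Q2 = Q[:n], Q[n:]
        X_new = (b / c) * X + (a - b / c) / np.sqrt(c) * (Q1 @ Q2.T)
        l = l * (a + b * l2) / (1.0 + c * l2)
        converged = np.linalg.norm(X_new - X, "fro") <= tol
        X = X_new
        if converged:
            break
    H = X.T @ A
    return X, 0.5 * (H + H.T)              # symmetrize H against rounding

def qdwh_svd(A):
    """Stages 2 and 3 (sketch): eigendecomposition of H, then one
    matrix-matrix product to form the left singular vectors."""
    Up, H = qdwh_polar(A)
    w, V = np.linalg.eigh(H)               # stage 2: H = V diag(w) V^T
    order = np.argsort(w)[::-1]            # singular values in descending order
    w, V = w[order], V[:, order]
    U = Up @ V                             # stage 3: matrix-matrix multiplication
    return U, w, V.T
```

Calling qdwh_svd(A) returns factors with A ≈ U @ np.diag(s) @ Vt. In the sketch, each iteration costs one QR factorization and one matrix product, which hints at why the approach maps well onto GPU-accelerated dense linear algebra kernels despite its higher operation count.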

The paper is clearly written and systematically structured: it presents the design of the algorithm, analyzes its complexity, and examines its numerical accuracy, all in great detail. Various experiments evaluate the performance of the individual algorithms and compare them with classical solvers. Further work will focus on algorithmic improvements to reduce the number of floating-point operations and on designing a version of the framework for distributed memory systems.

Reviewer:  Wolfgang Schreiner Review #: CR144858 (1701-0063)

Reproduction in whole or in part without permission is prohibited.   Copyright 2024 ComputingReviews.com™