
Article

Title: Performance of parallel QR factorization methods on the NVIDIA Grace CPU Superchip (English)
Author: Břichňáč, Vít
Author: Šístek, Jakub
Language: English
Journal: Programs and Algorithms of Numerical Mathematics
Volume: Proceedings of Seminar. Hejnice, June 23-28, 2024
Issue: 2024
Year:
Pages: 29-40
Category: math
Summary: This article studies several algorithms for QR factorization based on hierarchical Householder reflectors organized into elimination trees, which are particularly well suited to tall-and-skinny matrices and allow parallelization. We examine the effect of various parameters on the performance of the tree-based algorithms. The work is accompanied by a custom implementation that utilizes a task-based runtime system (OpenMP or StarPU). The same algorithm is implemented in the PLASMA library. The performance evaluation is done on the recent NVIDIA Grace CPU Superchip. (English)
Keyword: QR factorization
Keyword: task-based programming
Keyword: NVIDIA Grace CPU
MSC: 65F05
DOI: 10.21136/panm.2024.03
Date available: 2025-06-02T07:38:57Z
Last updated: 2025-06-05
Stable URL: http://hdl.handle.net/10338.dmlcz/703227

Files

File: PANM_22-2024-1_6.pdf
Size: 467.9Kb
Format: application/pdf