Abstract
The objective of this paper is to extend, in the context of multicore architectures, the concepts of tile algorithms [Buttari et al., 2007] for Cholesky, LU, and QR factorizations to the family of two-sided factorizations. In particular, the bidiagonal reduction of a general, dense matrix is very often used as a preprocessing step for calculating the Singular Value Decomposition. Furthermore, in the Top500 list of June 2008, 98 percent of the fastest parallel systems in the world were based on multicores. This confronts the scientific software community with both a daunting challenge and a unique opportunity. The challenge arises from the disturbing mismatch between the design of systems based on this new chip architecturehundreds of thousands of nodes, a million or more cores, reduced bandwidth and memory available to coresand the components of the traditional software stack, such as numerical libraries, on which scientific applications have relied for their accuracy and performance. The many-core trend has even more exacerbated the problem, and it becomes critical to efficiently integrate existing or new numerical linear algebra algorithms suitable for such hardware. By exploiting the concept of tile algorithms in the multicore environment (i.e., high level of parallelism with fine granularity and high-performance data representation combined with a dynamic data-driven execution), the band bidiagonal reduction presented here achieves 94 Gflop/s on a 12,000 × 12,000 matrix with 16 Intel Tigerton 2.4 GHz processors. The main drawback of the tile algorithms approach for the bidiagonal reduction is that the full reduction cannot be obtained in one stage. Other methods have to be considered to further reduce the band matrix to the required form.
Original language | English (US) |
---|---|
Article number | 4967575 |
Pages (from-to) | 417-423 |
Number of pages | 7 |
Journal | IEEE Transactions on Parallel and Distributed Systems |
Volume | 21 |
Issue number | 4 |
DOIs | |
State | Published - Apr 2010 |
Externally published | Yes |
Bibliographical note
Funding Information:The authors thank Alfredo Buttari for his insightful comments, which greatly helped to improve the quality of this paper. This work was supported in part by grants from the US National Science Foundation (NSF) and the US Department of Energy (DoE).
Keywords
- Bidiagonal reduction
- Multicores
- Singular value decomposition
- Tile algorithms
ASJC Scopus subject areas
- Signal Processing
- Hardware and Architecture
- Computational Theory and Mathematics