Real and complex fast Fourier transforms on the Fujitsu VPP 500

Date

1994

Authors

Hegland, Markus

Abstract

Fast Fourier transforms parallelize well but need large amounts of communication. The transpose split algorithm concentrates all of this communication in one or two transposition steps. Different transposition algorithms can be used depending on data size and communication latency. A new transpose split algorithm for real and Hermitian data is presented for one-, two- and three-dimensional transforms. This algorithm is implemented on the Fujitsu VPP 500, a parallel processor with a moderate number of very fast vector processors connected by a crossbar switch. Each processor has a peak performance of 1.6 Gflop/s and can simultaneously read and write 400 MByte/s. Stride-one implementations of multiple FFTs with very long vector lengths on one node [Hegland, Numerische Mathematik, to appear, 1994] are combined with optimized transpositions. One third of peak performance was achieved on configurations with up to 11 processors.
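
As an illustration of the transpose split idea for a complex one-dimensional transform, the following is a minimal single-process sketch, not the paper's implementation. It assumes a length n = n1 * n2, replaces each short FFT with a naive DFT for brevity, and uses explicit transposes where a parallel machine would exchange data between processors. The names `dft`, `transpose` and `transpose_split_fft` are hypothetical, and the number and placement of transposes in the paper may differ.

```python
import cmath


def dft(x):
    """Naive O(n^2) DFT, standing in for a fast single-node FFT."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def transpose(a):
    """Matrix transpose; in a parallel setting this is the communication step."""
    return [list(row) for row in zip(*a)]


def transpose_split_fft(x, n1, n2):
    """Length n1*n2 DFT built from short row transforms and transposes."""
    n = n1 * n2
    assert len(x) == n
    # View the input as an n1 x n2 row-major matrix: a[j1][j2] = x[j1*n2 + j2].
    a = [x[j1 * n2:(j1 + 1) * n2] for j1 in range(n1)]
    # Transpose so that each length-n1 transform runs over a contiguous row.
    at = transpose(a)                      # at[j2][j1]
    b = [dft(row) for row in at]           # b[j2][k1]
    # Twiddle factors couple the two sets of short transforms.
    b = [
        [b[j2][k1] * cmath.exp(-2j * cmath.pi * j2 * k1 / n) for k1 in range(n1)]
        for j2 in range(n2)
    ]
    # Transpose again so that each length-n2 transform runs over a row.
    bt = transpose(b)                      # bt[k1][j2]
    c = [dft(row) for row in bt]           # c[k1][k2]
    # Output element k1 + n1*k2 is c[k1][k2]; a final transpose restores
    # natural order.
    ct = transpose(c)                      # ct[k2][k1]
    return [v for row in ct for v in row]


if __name__ == "__main__":
    x = [complex(j, -0.5 * j) for j in range(12)]
    err = max(abs(g - r) for g, r in zip(transpose_split_fft(x, 3, 4), dft(x)))
    print("max deviation from direct DFT:", err)
```

Between transposes, all arithmetic acts on whole rows independently, which is why the row transforms can be spread across processors without further communication.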

Keywords

fast vector processors, transposition algorithms, dimensional transforms

Type

Working/Technical Paper
