Chinese Academy of Sciences Institutional Repositories Grid
MPFFT: An Auto-Tuning FFT Library for OpenCL GPUs

Document type: Journal article

Authors: Li, Yan; Zhang, Yun-Quan; Liu, Yi-Qun; Long, Guo-Ping; Jia, Hai-Peng
Journal: JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY
Publication date: 2013
Volume: 28; Issue: 1; Pages: 90-105
Keywords: fast Fourier transform; GPU; OpenCL; auto-tuning
ISSN: 1000-9000
Abstract: Fourier methods have revolutionized many fields of science and engineering, such as astronomy, medical imaging, seismology, and spectroscopy, and the fast Fourier transform (FFT) is a computationally efficient method of computing a Fourier transform. The emerging class of high-performance computing architectures, such as GPUs, seeks to achieve much higher performance and efficiency by exposing a hierarchy of distinct memories to software. However, the complexity of GPU programming poses a significant challenge to developers. In this paper, we propose an automatic performance tuning framework for FFT on various OpenCL GPUs, and implement a high-performance library named MPFFT based on this framework. For power-of-two length FFTs, our library substantially outperforms the clAmdFft library on AMD GPUs and achieves performance comparable to that of the CUFFT library on NVIDIA GPUs. Furthermore, our library also supports non-power-of-two sizes. For 3D non-power-of-two FFTs, our library is 1.5x to 28x faster than FFTW with 4 threads and achieves a 20.01x average speedup over CUFFT 4.0 on a Tesla C2050.
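Indexed in: SCI
Language: English
Date available: 2014-12-16
Source URL: [http://ir.iscas.ac.cn/handle/311060/16959]
Collection: Institute of Software > Institute of Software Library > Journal Articles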
Recommended citation
GB/T 7714
Li, Yan, Zhang, Yun-Quan, Liu, Yi-Qun, et al. MPFFT: An Auto-Tuning FFT Library for OpenCL GPUs[J]. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2013, 28(1): 90-105.
APA: Li, Yan, Zhang, Yun-Quan, Liu, Yi-Qun, Long, Guo-Ping, & Jia, Hai-Peng. (2013). MPFFT: An Auto-Tuning FFT Library for OpenCL GPUs. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 28(1), 90-105.
MLA: Li, Yan, et al. "MPFFT: An Auto-Tuning FFT Library for OpenCL GPUs." JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 28.1 (2013): 90-105.

Ingest method: OAI harvesting

Source: Institute of Software


Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.