Chinese Academy of Sciences Institutional Repositories Grid
Bayer image parallel decoding based on GPU

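Document type: Conference paper

Authors: Hu, Rihui (1,2); Xu, Zhiyong (1); Wei, Yuxing (1); Sun, Shaohua (3)
Publication date: 2012
Conference: Proceedings of SPIE: Optoelectronic Imaging and Multimedia Technology II
Conference date: 2012
Volume: 8558
Pages: 85581T
Corresponding author: Hu, R. (ustc_hui@126.com)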
Abstract: In photoelectric tracking systems, Bayer images are traditionally decompressed on the CPU. This becomes too slow for large images, for example 2K×2K×16-bit. To accelerate Bayer image decoding, this paper introduces a parallel speedup method for NVIDIA graphics processing units (GPUs) that support the CUDA architecture. The decoding procedure is divided into three parts: a serial part, a task-parallel part, and a data-parallel part comprising inverse quantization, the inverse discrete wavelet transform (IDWT), and image post-processing. To reduce execution time, the task-parallel part is optimized with OpenMP, and the data-parallel part runs on the GPU as a CUDA program. The GPU optimizations include instruction optimization, shared memory access optimization, coalesced memory access, and texture memory optimization. In particular, rewriting the serial 2D (two-dimensional) IDWT as a parallel 1D IDWT significantly speeds up the transform. In experiments with a 1K×1K×16-bit Bayer image, the data-parallel part runs more than 10 times faster than the CPU-based implementation. Finally, a CPU+GPU heterogeneous decompression system was designed; experimental results show that it achieves a 3- to 5-fold speedup over the serial CPU method. © Copyright SPIE.
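The abstract's point about rewriting a 2D IDWT as 1D IDWTs can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not name the wavelet, so a one-level unnormalized Haar transform is assumed here, and all function names are hypothetical. The sketch shows only the structural idea, that a separable 2D inverse transform reduces to one set of 1D inverses over columns followed by one set over rows, where every column (or row) is independent of the others and could therefore be assigned to its own GPU thread or thread block.

```python
def haar_forward_1d(signal):
    """One-level unnormalized Haar transform (assumed wavelet, even length).

    Output layout: averages in the first half, differences in the second.
    """
    half = len(signal) // 2
    avgs = [(signal[2 * i] + signal[2 * i + 1]) / 2 for i in range(half)]
    diffs = [(signal[2 * i] - signal[2 * i + 1]) / 2 for i in range(half)]
    return avgs + diffs


def haar_inverse_1d(coeffs):
    """Invert haar_forward_1d: rebuild each sample pair from (avg, diff)."""
    half = len(coeffs) // 2
    out = [0.0] * (2 * half)
    for i in range(half):
        avg, diff = coeffs[i], coeffs[half + i]
        out[2 * i] = avg + diff
        out[2 * i + 1] = avg - diff
    return out


def transpose(matrix):
    """Swap rows and columns of a rectangular list-of-lists."""
    return [list(col) for col in zip(*matrix)]


def dwt_2d(image):
    """Forward 2D transform: 1D pass over rows, then over columns."""
    rows_done = [haar_forward_1d(row) for row in image]
    return transpose([haar_forward_1d(col) for col in transpose(rows_done)])


def idwt_2d_as_1d_passes(coeffs):
    """Inverse 2D transform expressed as independent 1D inverses.

    Pass 1 inverts every column, pass 2 inverts every row. Within a pass
    no 1D inverse depends on another, which is what makes the passes
    data-parallel.
    """
    cols_done = transpose([haar_inverse_1d(col) for col in transpose(coeffs)])
    return [haar_inverse_1d(row) for row in cols_done]


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
restored = idwt_2d_as_1d_passes(dwt_2d(image))
```

In a CUDA version of this idea, each list comprehension over rows or columns would typically become a kernel launch, with the transposed access pattern being the place where coalescing and shared memory matter; those details are outside this sketch.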
Indexed by: EI
Language: English
ISSN: 0277-786X
Source URL: http://ir.ioe.ac.cn/handle/181551/7695
Collection: Institute of Optics and Electronics_Optoelectronic Detection and Signal Processing Laboratory (Lab 5)
Author affiliations:
1. Institute of Optics and Electronics, Chinese Academy of Sciences, Chengdu, Sichuan, 610209, China
2. Graduate School, Chinese Academy of Sciences, Beijing, 100039, China
3. 750 Test Field of China Shipbuilding Industry Corporation, Kunming, Yunnan, 650051, China
Recommended citation
GB/T 7714
Hu, Rihui, Xu, Zhiyong, Wei, Yuxing, et al. Bayer image parallel decoding based on GPU[C]. In: Proceedings of SPIE: Optoelectronic Imaging and Multimedia Technology II. 2012.

Ingest method: OAI harvesting

Source: Institute of Optics and Electronics


Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.