Chinese Academy of Sciences Institutional Repository Grid System: Efficient parallel implementation of the lattice Boltzmann method on large clusters of graphic processing units

Efficient parallel implementation of the lattice Boltzmann method on large clusters of graphic processing units

Document type: Journal article


Authors	Xiong QinGang 1,2; Li Bo 1,2; Xu Ji 1,2; Fang XiaoJian 1,2; Wang XiaoWei 1; Wang LiMin 1; He XianFeng 1; Ge Wei 1
Journal	CHINESE SCIENCE BULLETIN
Publication date	2012-03-01
Volume	57; Issue: 7; Pages: 707-715
Keywords	asynchronous execution; compute unified device architecture; graphic processing unit; lattice Boltzmann method; non-blocking message passing interface; OpenMP
ISSN	1001-6538
Corresponding author	Wang, XW
Abstract	Many-core processors, such as graphic processing units (GPUs), are promising platforms for intrinsic parallel algorithms such as the lattice Boltzmann method (LBM). Although tremendous speedup has been obtained on a single GPU compared with mainstream CPUs, the performance of the LBM for multiple GPUs has not been studied extensively and systematically. In this article, we carry out LBM simulation on a GPU cluster with many nodes, each having multiple Fermi GPUs. Asynchronous execution with CUDA stream functions, OpenMP and non-blocking MPI communication are incorporated to improve efficiency. The algorithm is tested for two-dimensional Couette flow and the results are in good agreement with the analytical solution. For both the one- and two-dimensional decomposition of space, the algorithm performs well as most of the communication time is hidden. Direct numerical simulation of a two-dimensional gas-solid suspension containing more than one million solid particles and one billion gas lattice cells demonstrates the potential of this algorithm in large-scale engineering applications. The algorithm can be directly extended to the three-dimensional decomposition of space and other modeling methods including explicit grid-based methods.
WOS headings	Science & Technology
WOS category	Multidisciplinary Sciences
WOS research area	Science & Technology - Other Topics
WOS keywords	SIMULATION ; EQUATION ; GPUS ; FLOW ; MPI
Indexed by	SCI
Language	English
WOS accession number	WOS:000300771200001
Date made public	2013-10-26
Version	Published version
Source URL	[http://ir.ipe.ac.cn/handle/122111/4290]
Collection	Institute of Process Engineering_State Key Laboratory of Multiphase Complex Systems
Author affiliations	1. Chinese Acad Sci, Inst Proc Engn, State Key Lab Multiphase Complex Syst, Beijing 100190, Peoples R China; 2. Chinese Acad Sci, Grad Univ, Beijing 100049, Peoples R China
Recommended citation (GB/T 7714)	Xiong QinGang,Li Bo,Xu Ji,et al. Efficient parallel implementation of the lattice Boltzmann method on large clusters of graphic processing units[J]. CHINESE SCIENCE BULLETIN,2012,57(7):707-715.
APA	Xiong QinGang.,Li Bo.,Xu Ji.,Fang XiaoJian.,Wang XiaoWei.,...&Ge Wei.(2012).Efficient parallel implementation of the lattice Boltzmann method on large clusters of graphic processing units.CHINESE SCIENCE BULLETIN,57(7),707-715.
MLA	Xiong QinGang,et al."Efficient parallel implementation of the lattice Boltzmann method on large clusters of graphic processing units".CHINESE SCIENCE BULLETIN 57.7(2012):707-715.

Ingestion method: OAI harvesting

Source: Institute of Process Engineering
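
Illustrative note: the abstract refers to a one-dimensional decomposition of space in which neighbouring subdomains exchange boundary data. The sketch below is not the authors' GPU implementation. It is a minimal, serial Python illustration, assuming a periodic 1D lattice and a single population that streams by one cell per step. The function names (`stream_global`, `split`, `stream_decomposed`) are hypothetical. The halo values are copied between lists here, whereas on a cluster they would be sent with non-blocking messages while interior cells are updated.

```python
def stream_global(f, shift):
    """Periodic streaming on the undivided lattice: cell i receives f[i - shift]."""
    n = len(f)
    return [f[(i - shift) % n] for i in range(n)]


def split(f, n_parts):
    """Cut f into n_parts contiguous slabs of near-equal size."""
    base, extra = divmod(len(f), n_parts)
    slabs, start = [], 0
    for rank in range(n_parts):
        size = base + (1 if rank < extra else 0)
        slabs.append(f[start:start + size])
        start += size
    return slabs


def stream_decomposed(f, shift, n_parts):
    """Same streaming step, done slab by slab with one halo cell per side.

    Assumes shift is +1 or -1 and that every slab holds at least one cell.
    """
    assert shift in (-1, 1)
    assert 1 <= n_parts <= len(f)
    slabs = split(f, n_parts)
    out = []
    for rank, slab in enumerate(slabs):
        # Stand-in for the halo exchange with the periodic neighbours.
        left_halo = slabs[(rank - 1) % n_parts][-1]
        right_halo = slabs[(rank + 1) % n_parts][0]
        padded = [left_halo] + slab + [right_halo]
        # Slab cell j sits at padded index j + 1 and pulls from j + 1 - shift.
        out.extend(padded[j + 1 - shift] for j in range(len(slab)))
    return out


if __name__ == "__main__":
    lattice = list(range(10))
    print(stream_decomposed(lattice, 1, 3) == stream_global(lattice, 1))
```

Only the two halo cells of each slab depend on a neighbour, which is why the exchange can in principle overlap with the update of the interior cells.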
