Efficient parallel implementation of the lattice Boltzmann method on large clusters of graphic processing units
Document type: Journal article
Authors | Xiong QinGang1,2; Li Bo1,2; Xu Ji1,2; Fang XiaoJian1,2; Wang XiaoWei1; Wang LiMin1; He XianFeng1; Ge Wei1 |
Journal | CHINESE SCIENCE BULLETIN
Publication date | 2012-03-01 |
Volume | 57; Issue: 7; Pages: 707-715 |
Keywords | asynchronous execution; compute unified device architecture; graphic processing unit; lattice Boltzmann method; non-blocking message passing interface; OpenMP |
ISSN | 1001-6538 |
Corresponding author | Wang, XW |
Abstract (English) | Many-core processors, such as graphic processing units (GPUs), are promising platforms for intrinsic parallel algorithms such as the lattice Boltzmann method (LBM). Although tremendous speedup has been obtained on a single GPU compared with mainstream CPUs, the performance of the LBM for multiple GPUs has not been studied extensively and systematically. In this article, we carry out LBM simulation on a GPU cluster with many nodes, each having multiple Fermi GPUs. Asynchronous execution with CUDA stream functions, OpenMP and non-blocking MPI communication are incorporated to improve efficiency. The algorithm is tested for two-dimensional Couette flow and the results are in good agreement with the analytical solution. For both the one- and two-dimensional decomposition of space, the algorithm performs well as most of the communication time is hidden. Direct numerical simulation of a two-dimensional gas-solid suspension containing more than one million solid particles and one billion gas lattice cells demonstrates the potential of this algorithm in large-scale engineering applications. The algorithm can be directly extended to the three-dimensional decomposition of space and other modeling methods including explicit grid-based methods. |
WOS title words | Science & Technology |
Category [WOS] | Multidisciplinary Sciences |
Research area [WOS] | Science & Technology - Other Topics |
Keywords [WOS] | SIMULATION ; EQUATION ; GPUS ; FLOW ; MPI |
Indexed in | SCI |
Language | English |
WOS accession number | WOS:000300771200001 |
Release date | 2013-10-26 |
Version | Published version |
Source URL | [http://ir.ipe.ac.cn/handle/122111/4290] |
Collection | Institute of Process Engineering_State Key Laboratory of Multiphase Complex Systems |
Author affiliations | 1.Chinese Acad Sci, Inst Proc Engn, State Key Lab Multiphase Complex Syst, Beijing 100190, Peoples R China 2.Chinese Acad Sci, Grad Univ, Beijing 100049, Peoples R China |
Recommended citation GB/T 7714 | Xiong QinGang,Li Bo,Xu Ji,et al. Efficient parallel implementation of the lattice Boltzmann method on large clusters of graphic processing units[J]. CHINESE SCIENCE BULLETIN,2012,57(7):707-715. |
APA | Xiong QinGang.,Li Bo.,Xu Ji.,Fang XiaoJian.,Wang XiaoWei.,...&Ge Wei.(2012).Efficient parallel implementation of the lattice Boltzmann method on large clusters of graphic processing units.CHINESE SCIENCE BULLETIN,57(7),707-715. |
MLA | Xiong QinGang,et al."Efficient parallel implementation of the lattice Boltzmann method on large clusters of graphic processing units".CHINESE SCIENCE BULLETIN 57.7(2012):707-715. |
Ingest method: OAI harvesting
Source: Institute of Process Engineering
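Illustrative note: the abstract describes a one-dimensional decomposition of space in which most communication time is hidden behind computation. The sketch below is not the authors' GPU code. It is a toy, streaming-only lattice with two populations per cell (a hypothetical D1Q2 model chosen for brevity), written in plain Python with no MPI or CUDA calls. All function names are assumptions made for this example. It shows the general pattern only: each subdomain updates its interior cells without any neighbour data, which is the work that could overlap with in-flight halo messages, and then completes its edge cells once the halo values are available.

```python
# Toy sketch (assumed D1Q2 streaming-only lattice, periodic boundaries).
# Demonstrates 1D domain decomposition with halo exchange; the "messages"
# are plain list reads standing in for non-blocking sends and receives.

def stream_global(right, left):
    """Reference update on the undivided lattice."""
    n = len(right)
    new_right = [right[(i - 1) % n] for i in range(n)]  # right-movers shift +1
    new_left = [left[(i + 1) % n] for i in range(n)]    # left-movers shift -1
    return new_right, new_left


def split(values, parts):
    """Cut a list into `parts` equal contiguous subdomains."""
    size = len(values) // parts
    return [values[p * size:(p + 1) * size] for p in range(parts)]


def join(parts):
    """Concatenate subdomains back into one list."""
    return [v for part in parts for v in part]


def stream_decomposed(right_parts, left_parts):
    """One streaming step on a decomposed lattice."""
    parts = len(right_parts)
    # Halo "messages": each subdomain needs the last right-mover of its left
    # neighbour and the first left-mover of its right neighbour.
    halo_right = [right_parts[(p - 1) % parts][-1] for p in range(parts)]
    halo_left = [left_parts[(p + 1) % parts][0] for p in range(parts)]

    new_right_parts, new_left_parts = [], []
    for p in range(parts):
        r, l = right_parts[p], left_parts[p]
        m = len(r)
        # Interior update: uses only local data, so in a real code it could
        # run while the halo transfer is still in progress.
        new_r = [None] + [r[i - 1] for i in range(1, m)]
        new_l = [l[i + 1] for i in range(m - 1)] + [None]
        # Edge update: requires the halo values to have arrived.
        new_r[0] = halo_right[p]
        new_l[-1] = halo_left[p]
        new_right_parts.append(new_r)
        new_left_parts.append(new_l)
    return new_right_parts, new_left_parts
```

Running both versions for the same number of steps and comparing `join(...)` of the decomposed result with the global result is a quick check that the halo exchange is consistent. In a cluster implementation the list reads for the halos would be replaced by actual message passing and device-to-host copies.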