Implementation of Multi-GPU Based Lattice Boltzmann Method for Flow Through Porous Media

Published online by Cambridge University Press: 09 January 2015

Changsheng Huang
Affiliation:
School of Mathematics and Statistics, Huazhong University of Science and Technology, Wuhan 430074, China
Baochang Shi*
Affiliation:
School of Mathematics and Statistics, Huazhong University of Science and Technology, Wuhan 430074, China
Nanzhong He
Affiliation:
School of Mathematics and Statistics, Huazhong University of Science and Technology, Wuhan 430074, China
Zhenhua Chai
Affiliation:
School of Mathematics and Statistics, Huazhong University of Science and Technology, Wuhan 430074, China
*Email: hcshust@163.com (C. Huang), shibc@hust.edu.cn (B. Shi), nzhe@hust.edu.cn (N. He), hustczh@hust.edu.cn (Z. Chai)

Abstract

The lattice Boltzmann method (LBM) gains a substantial performance benefit from graphics processing unit (GPU) computing, so GPU and multi-GPU based LBM is a promising candidate for the study of large-scale fluid flows. However, multi-GPU based lattice Boltzmann algorithms have not been studied extensively, especially for simulations of flow in complex geometries. In this paper, by coupling with the message passing interface (MPI), we present an implementation of multi-GPU based LBM for fluid flow through porous media, together with optimization strategies based on the data structure and layout that noticeably reduce memory access and completely hide the communication time. The performance of the algorithm is then tested on a one-node cluster equipped with four Tesla C1060 GPU cards, where up to 1732 MFLUPS is achieved for the Poiseuille flow and a nearly linear speedup with the number of GPUs is observed.
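
The performance figure quoted above is given in MFLUPS, i.e. million fluid lattice updates per second, which counts only fluid nodes and therefore suits porous-media domains where many nodes are solid. The following is a minimal sketch of how such a figure and the corresponding multi-GPU efficiency might be computed. The function names and all input values are hypothetical illustrations, not taken from the paper.

```python
def mflups(fluid_nodes: int, time_steps: int, elapsed_seconds: float) -> float:
    """Million fluid lattice updates per second.

    Only fluid nodes are counted; solid (obstacle) nodes are excluded.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return fluid_nodes * time_steps / (elapsed_seconds * 1e6)


def parallel_efficiency(mflups_multi: float, mflups_single: float, n_gpus: int) -> float:
    """Speedup over a single GPU divided by the GPU count (1.0 means linear speedup)."""
    if n_gpus <= 0 or mflups_single <= 0:
        raise ValueError("n_gpus and mflups_single must be positive")
    return mflups_multi / (mflups_single * n_gpus)


# Hypothetical run: 1,000,000 fluid nodes, 1,000 time steps, 2.0 s wall time.
rate = mflups(1_000_000, 1_000, 2.0)
print(rate)  # 500.0

# Hypothetical throughputs: 400 MFLUPS on one GPU, 1500 MFLUPS on four GPUs.
print(parallel_efficiency(1500.0, 400.0, 4))  # 0.9375
```

An efficiency close to 1.0 corresponds to the "nearly linear speedup" described in the abstract; values well below 1.0 would suggest that communication or memory access is not fully hidden.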

Information

Type
Research Article
Copyright
Copyright © Global Science Press Limited 2015 
