We describe our multi-threading parallelization strategy for a general-purpose reservoir simulator (GPRS) built on a flexible Automatic Differentiation (AD) framework. Parallel Jacobian construction is achieved with a thread-safe extension of our AD library. For the linear solution, we use a two-stage Constrained Pressure Residual (CPR) preconditioning strategy that combines the parallel multigrid solver XSAMG with the Block Jacobi technique, in which Block ILU(0) is applied locally. The speedup for the full SPE10 problem (1.1M cells) is about 5.0X on a dual quad-core Nehalem node. We then discuss the GPU parallelization of Nested Factorization (NF). The Massively Parallel NF (MPNF) algorithm was first introduced by Appleyard et al. (2011): the 3D structured grid is divided into kernels, and each kernel is assigned a color such that no two neighbouring kernels share the same color. Parallelism is then exploited by solving all kernels of the same color concurrently. The most important aspects of our algorithm are: 1) coalesced memory access via a special ordering of the matrix elements, and 2) application of the twisted factorization technique, which further improves parallelism. On a 512-core Tesla M2090 GPU, the speedup for the full SPE10 problem is about 26X in single precision and 19X in double precision.
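
To illustrate the twisted factorization idea named in the abstract, the following is a minimal Python sketch for a single tridiagonal system. It is not the MPNF implementation: the function name `twisted_tridiagonal_solve`, the array layout, and the meeting-row parameter `m` are assumptions introduced here, and the abstract does not state how the technique is applied inside a kernel. The sketch only shows why the technique exposes extra parallelism: elimination proceeds from both ends of the system towards a meeting row, so the two sweeps are independent of each other and could run concurrently.

```python
def twisted_tridiagonal_solve(a, b, c, d, m=None):
    """Solve a tridiagonal system with a twisted (two-sided) factorization.

    Row i reads a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i];
    a[0] and c[n-1] are ignored. m is the row where the two sweeps meet
    (hypothetical parameter; defaults to the middle row).
    No pivoting is done, so the matrix is assumed diagonally dominant.
    """
    n = len(b)
    if m is None:
        m = n // 2
    bp = list(b)  # modified diagonal
    dp = list(d)  # modified right-hand side

    # Sweep 1: eliminate the sub-diagonal from the top down to row m-1.
    for i in range(1, m):
        w = a[i] / bp[i - 1]
        bp[i] -= w * c[i - 1]
        dp[i] -= w * dp[i - 1]

    # Sweep 2: eliminate the super-diagonal from the bottom up to row m+1.
    # It touches only rows > m, so it is independent of sweep 1.
    for i in range(n - 2, m, -1):
        w = c[i] / bp[i + 1]
        bp[i] -= w * a[i + 1]
        dp[i] -= w * dp[i + 1]

    # Meeting row: remove the couplings to both neighbours.
    if m > 0:
        w = a[m] / bp[m - 1]
        bp[m] -= w * c[m - 1]
        dp[m] -= w * dp[m - 1]
    if m < n - 1:
        w = c[m] / bp[m + 1]
        bp[m] -= w * a[m + 1]
        dp[m] -= w * dp[m + 1]

    # Substitute outwards from the meeting row; the two loops are
    # again independent of each other.
    x = [0.0] * n
    x[m] = dp[m] / bp[m]
    for i in range(m - 1, -1, -1):
        x[i] = (dp[i] - c[i] * x[i + 1]) / bp[i]
    for i in range(m + 1, n):
        x[i] = (dp[i] - a[i] * x[i - 1]) / bp[i]
    return x
```

Setting `m = n - 1` reduces the sketch to the ordinary top-down (Thomas) elimination, which has a single sequential sweep of length `n`; choosing `m` near the middle roughly halves the length of each dependent chain.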

