Third EAGE Workshop on High Performance Computing for Upstream
- Conference date: October 1-4, 2017
- Location: Athens, Greece
- Published: 01 October 2017
Scale Out vs. Scale Up for Ultra-Scale Reservoir Simulation
Authors: K. Mukundakrishnan, R. Gandham, K.P. Esler, D. Dembeck, J. Shumway and V. Natoli
Summary: It is an undisputed truth that the demand for computational performance for simulating very large models in upstream applications is ever increasing. Conceptually, this demand can be met in one of two ways: “scale-out” implies exploiting additional computational nodes, while “scale-up” implies increasing the computational power of each node, particularly its floating-point throughput and memory bandwidth. In practice, these two approaches provide opposite bounds on a spectrum of cluster designs, from the use of many relatively weak, “thin” nodes to a smaller number of powerful, “fat” nodes. The scale-out approach gained increasing dominance in HPC as scalability was preferred over absolute efficiency.
Over the past decade, however, energy efficiency has become the key performance limiter. For applications with significant communication requirements, including reservoir simulation, the use of scale-up fat nodes provides an opportunity to localize communications and minimize interconnect traffic, thereby increasing energy efficiency. However, harnessing fat nodes comprising several extremely high-performance GPUs to achieve performance for implicit simulations requires careful software design and novel algorithmic approaches.
We will first present the algorithmic and computational challenges faced and the approaches needed to efficiently utilize the massive parallelism offered by such scaled-up nodes.

Selecting a CPU for Reservoir Simulation Optimized for Cost, Energy and Performance
Authors: O. Al-Saadoon, M. Baddourah, A. Alturki and M. Al-Hagri
Summary: This paper presents guidelines for deciding which CPU in a product family will meet an organization’s compute requirements with a balance of power consumption, performance, cost and user experience. Validation of the model against actual experimental results is also shared.

Rapid Development of Seismic Imaging Applications Using Symbolic Math
Authors: N. Kukreja, M. Louboutin, M. Lange, F. Luporini and G. Gorman
Summary: In this talk, I will discuss our approach to the formulation and performance optimization of finite-difference methods for PDEs arising in FWI. Our framework consists of a stack of domain-specific languages and optimizing compilers. The mathematical specification of a finite-difference method is translated by a compiler, Devito, into C code, applying a sophisticated sequence of transformations. These include standard loop transformations, such as blocking and vectorization, as well as symbolic manipulations to reduce the unusually high arithmetic intensity of the stencils arising in forward and adjoint operators: common subexpression elimination, factorization, code motion and approximation of transcendental functions. I will show the impact of these transformations on standard Intel Xeon architectures as well as on Intel Knights Landing. Compelling evidence indicates that our stencil kernels are significantly bound by the L1 cache. I will conclude by discussing future challenges and goals of our work.
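For illustration, here is a minimal hand-written sketch, not actual Devito output, of the kind of C loop nest such a stack emits for a second-order acoustic update: the loops are cache-blocked and a loop-invariant coefficient is hoisted in the spirit of common subexpression elimination. The array layout, names and block size are our assumptions.

```c
#include <stddef.h>

#define BS 32  /* block size: an assumption, tuned per cache in practice */

/* u0/u1/u2: wavefield at t-1, t, t+1; vel: velocity; row-major nx*ny grid */
void acoustic_step(size_t nx, size_t ny, const float *restrict u0,
                   const float *restrict u1, float *restrict u2,
                   const float *restrict vel, float dt, float h)
{
    const float c = (dt * dt) / (h * h);   /* hoisted common subexpression */
    for (size_t xb = 1; xb + 1 < nx; xb += BS)        /* blocked loops */
        for (size_t yb = 1; yb + 1 < ny; yb += BS)
            for (size_t x = xb; x < xb + BS && x + 1 < nx; x++)
                for (size_t y = yb; y < yb + BS && y + 1 < ny; y++) {
                    size_t i = x * ny + y;
                    float lap = u1[i - ny] + u1[i + ny] + u1[i - 1]
                              + u1[i + 1] - 4.0f * u1[i];
                    u2[i] = 2.0f * u1[i] - u0[i]
                          + c * vel[i] * vel[i] * lap;
                }
}
```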
Achieving Computationally Scalable Parallelism in a Geostatistical Inversion Algorithm
Authors: A. Ephanov, R. Bornard and H. Debeye
Summary: We present a systematic approach to achieving computationally scalable parallelism in the context of geostatistical inversion. Our experience shows that efficient utilization of hardware requires recursive application of domain decomposition, ranging from a multi-process model on a cluster of workstations to multithreading on individual CPU cores. Actual run-times and the degree of realistically achievable scalable parallelism depend on a multitude of factors, including the project area size, the probabilistic model complexity, the coarse scale of a stochastic iteration of the Multigrid Monte Carlo scheme, and hardware specifications, to name a few. Overall, however, we were able to achieve close to optimal multithreaded scalability (where theoretically possible) on up to 32 CPU cores. We also observe that the efficiency of multithreading eventually becomes limited by memory bus bandwidth.

A Generic Multi-parameter FWI Framework Based on Symbolic Expressions
Authors: E. Bergounioux, C. Rivera, B. Duquet and M. Dolliazal
Summary: Extending full waveform inversion to multiple parameters is the natural step towards achieving its full capability, i.e. extracting high-resolution information about the subsurface properties of the Earth from seismic data. Elastic and visco-elastic FWI are becoming computationally affordable at large scale using the latest HPC technologies, which introduces the option to invert for more parameters such as shear-wave velocity and attenuation. FWI code design is a key point in allowing geophysicists to test new inversion parameterizations with minimal code development. In this study we present a framework based on the abstraction of the inverted parameters. We describe how our framework allows us to use different wave-equation propagators with different optimization schemes in a transparent and flexible way. We also show how using a symbolic expression parser to automatically compute the required chain rules can greatly reduce the effort needed to implement and test new parameterizations while preserving the computational efficiency of the code. We further present how this symbolic parser can be used to easily implement master-slave relationships between inverted and passive parameters. Finally, we present FWI tests using different types of parameterizations on the SEAM synthetic model.
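As a minimal illustration of the chain rules such a parser must generate (our notation, not necessarily the authors' formulation): for an inversion parameterization p related to the physical model m by m = m(p), the gradient must be mapped as

```latex
\nabla_p J \;=\; \Big(\frac{\partial m}{\partial p}\Big)^{\!T}\,\nabla_m J ,
\qquad\text{e.g.}\qquad
I_p = \rho\, v_p \;\Rightarrow\;
\frac{\partial J}{\partial v_p} = \rho\,\frac{\partial J}{\partial I_p},
\quad
\frac{\partial J}{\partial \rho} = v_p\,\frac{\partial J}{\partial I_p}.
```

In this picture, a master-slave link between an inverted and a passive parameter is plausibly just such a fixed functional relationship whose derivative the parser supplies as well.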
HPC Cloud for O&G
By D. Klahr
Summary: This paper presents an evaluation of HPC cloud offerings for O&G pure HPC workflows (reservoir simulation and seismic imaging).

HPC Containers with Singularity
Authors: P. Souza, G.M. Kurtzer, C. Gomez-Martin and P.M. Cruz e Silva
Summary: The new container technology Singularity focuses on compute portability. This work covers the creation of portable containers using multiple resources, showing how simple it is to deploy a complex MPI+CUDA+InfiniBand seismic application across multiple supercomputers with only a regular user account.

Containerizing Parallel MPI-based HPC Applications
Authors: A. Schonewille and A. Bukhamsin
Summary: Software container technology based on Docker is a lightweight packaging and virtualization technology. These containers are used to package and run an application with all its dependencies in a portable image with minimum requirements. Because of the complexity and dependencies of HPC (high-performance computing) applications, this concept can be used to pre-package HPC applications. Although Docker is already used in cloud computing, HPC implementations are lagging due to what we believe is the nature of most HPC applications: inter-process communication and the shortcomings of spawning containerized MPI-based applications on compute resources. This paper shows our solution for containerizing HPC MPI applications and running them across multiple hosts connected with an InfiniBand interconnect.

Performance Analysis of a Hybridizable Discontinuous Galerkin Solver for the 3D Helmholtz Equations in a Geophysical Context
Authors: M. Bonnasse-Gahot, H. Calandra, J. Diaz and S. Lanteri
Summary: In our work, we consider discontinuous Galerkin methods (DGm) to solve the 3D elastic equations in the frequency domain. In 3D, the large size of the linear system represents a challenge even with the use of high-performance computing (HPC).
Our solution is to develop a new class of DGm, the hybridizable discontinuous Galerkin method (HDGm). It consists of expressing the unknowns of the initial problem in terms of the trace of the numerical solution on each face of the mesh cells.
We first compared the computational performance, in terms of CPU time and memory consumption, of the HDGm with that of a classical DGm and of a classical finite element method (FEm).
Then, since the global matrix of the HDGm is very sparse, we also compared the performance of two solvers for the 3D elastic wave propagation problem with HDGm: the parallel sparse direct solver MUMPS (MUltifrontal Massively Parallel sparse direct Solver) and the hybrid solver MaPHyS (Massively Parallel Hybrid Solver), which combines direct and iterative methods.
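Schematically (our notation, not the authors'), the hybridization eliminates cell-interior unknowns in favor of a trace variable living only on the faces:

```latex
u_K \;=\; A_K^{-1}\bigl(f_K - B_K\,\lambda|_{\partial K}\bigr)
\quad\text{on each cell } K,
\qquad\Longrightarrow\qquad
\mathbb{K}\,\lambda \;=\; g \quad\text{on the mesh skeleton},
```

after which the interior solution is recovered locally, cell by cell; the much smaller global system in the trace is what MUMPS or MaPHyS must then solve.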
Better Productivity and Portable Finite Difference Wave Equation Propagators Using Directive Based Programming
Authors: G. Hugues and H. Calandra
Summary: Heterogeneous architectures such as GPUs have demonstrated over the past decade that they are highly efficient and provide better performance than conventional CPUs for seismic depth imaging. GPUs require the use of specific language extensions that are considered low-level programming models, such as CUDA. Unfortunately, CUDA is not standard and is not portable across vendors and hardware. OpenACC is a standard, high-level, directive-based programming model that aims to simplify GPU programming. Despite offering a more productive way to develop parallel code on GPUs, OpenACC has been considered far less efficient than low-level programming models like CUDA or OpenCL. To understand the performance gap between these models, we implemented finite-difference wave-equation propagators using OpenACC. In this paper, we demonstrate that OpenACC can achieve up to 70% of the efficiency of the best CUDA implementation. From this experience, we developed an optimization methodology based on increased arithmetic intensity and applied it to the acoustic and elastic wave equations in both isotropic and anisotropic media.
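A minimal sketch of the directive pattern involved (our code, not the authors' kernel; the compile line is typical for PGI/NVHPC compilers):

```c
/* 3D second-order Laplacian-style update offloaded with OpenACC.
 * Build e.g.: pgcc -acc -ta=tesla -O3 fd.c
 * In a real propagator the data region would wrap the whole time
 * loop so arrays stay resident on the GPU between steps. */
void fd_step(int nx, int ny, int nz, const float *restrict u,
             float *restrict v, float c0, float c1)
{
    #pragma acc data copyin(u[0:nx*ny*nz]) copy(v[0:nx*ny*nz])
    {
        #pragma acc parallel loop collapse(3)
        for (int x = 1; x < nx - 1; x++)
            for (int y = 1; y < ny - 1; y++)
                for (int z = 1; z < nz - 1; z++) {
                    long i = ((long)x * ny + y) * nz + z;
                    v[i] = c0 * u[i]
                         + c1 * (u[i - 1] + u[i + 1]
                               + u[i - nz] + u[i + nz]
                               + u[i - (long)ny * nz]
                               + u[i + (long)ny * nz]);
                }
    }
}
```

Raising arithmetic intensity, as the abstract describes, then amounts to fusing more physics into each such loop nest so that more flops are performed per byte moved.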
From CPU to GPU in Two Days: 3D Elastic Orthorhombic Modeling with OpenACC
Authors: V. Kazei, N. Masmoudi, J-W. Oh, C. Tzivanakis and T. Alkhalifah
Summary: Wavefield modeling is necessary in modern seismic imaging applications such as reverse time migration and full-waveform inversion. When the medium has complex structures such as salt bodies or carbonate reservoirs, finite-difference methods (FDM) are typically used for wavefield simulation (extrapolation). FDM allows us to simulate a multitude of realistic wave phenomena, but in some cases it makes our applications computationally intensive. When large numbers of sources and receivers are considered, a large number of wavefield extrapolations must be executed in the process of inversion. To accelerate 3D wavefield simulation in elastic orthorhombic anisotropic media we rely on GPU technology. With the OpenACC PGI compiler we create a pool of automatically managed memory that is shared between the CPU and GPU, achieving data management with minimal code modifications. We collapse the tightly nested loops used for the velocity and stress updates, which improves the execution time of the whole code by about ten percent. Compared to a 16-core dual-socket Haswell server, we report speedups of 1.15x on a K80 GPU and 2.32x on a Pascal Tesla P100 GPU.
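A sketch of the managed-memory variant described here (our code and names, not the authors'): with unified memory enabled at compile time, no explicit data directives are needed, and collapse exposes the whole nest as one parallel iteration space.

```c
/* Compile with managed memory, e.g.: pgcc -acc -ta=tesla:managed
 * so the CPU and GPU share one automatically migrated pool and the
 * loop needs no data clauses. Names and the update term are
 * illustrative assumptions. */
void stress_update(int nx, int ny, int nz, float *restrict sxx,
                   const float *restrict vx, float dt, float dx)
{
    /* collapse(3) flattens the tight nest into one large iteration
     * space so the GPU has enough parallelism to saturate */
    #pragma acc parallel loop collapse(3)
    for (int x = 1; x < nx - 1; x++)
        for (int y = 1; y < ny - 1; y++)
            for (int z = 1; z < nz - 1; z++) {
                long i = ((long)x * ny + y) * nz + z;
                /* illustrative first-order velocity-stress term */
                sxx[i] += dt / dx * (vx[i] - vx[i - (long)ny * nz]);
            }
}
```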
High-Performance Seismic Modeling with Finite-Difference Using Spatial and Temporal Cache Blocking
Authors: V. Etienne, T. Tonellot, T. Malas, H. Ltaief, S. Kortas, P. Thierry and D. Keyes
Summary: The time-domain finite-difference method (TD-FDM) has been used in geophysics for decades for modeling and imaging. It is used intensively in applications that require accurate solutions of the wave equation, such as reverse time migration (RTM) and full waveform inversion (FWI). In this study, we investigate how spatial and temporal cache blocking techniques can speed up TD-FDM computation on multi-core architectures. We conducted our analysis on the Shaheen II supercomputer at the King Abdullah University of Science and Technology (KAUST) and present current and achievable performance using a Cache-Aware Roofline Model (CARM). We briefly discuss the implementations and benefits of spatial and temporal cache blocking individually, and we provide preliminary results, which pave the way for achieving the TD-FDM’s maximum efficiency.
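For the spatial part, a hedged sketch of what cache blocking looks like on a 3D stencil sweep (our code; tile sizes are placeholders to be tuned per architecture): the y/z plane is tiled to fit in cache while the x dimension streams. Temporal blocking, not shown, additionally advances several time steps over each tile before moving on, reusing the cached data.

```c
#define TY 16   /* tile sizes: assumptions, tuned per cache level */
#define TZ 64

void stencil_sweep(int nx, int ny, int nz,
                   const float *restrict in, float *restrict out)
{
    for (int yb = 1; yb < ny - 1; yb += TY)
        for (int zb = 1; zb < nz - 1; zb += TZ)
            for (int x = 1; x < nx - 1; x++)           /* stream x */
                for (int y = yb; y < yb + TY && y < ny - 1; y++)
                    for (int z = zb; z < zb + TZ && z < nz - 1; z++) {
                        long i = ((long)x * ny + y) * nz + z;
                        out[i] = in[i] + 0.1f *
                            (in[i - 1] + in[i + 1]
                           + in[i - nz] + in[i + nz]
                           + in[i - (long)ny * nz] + in[i + (long)ny * nz]
                           - 6.0f * in[i]);
                    }
}
```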
Optimizing Performance of a TTI RTM Finite Difference Kernel for x86 Instruction Set Architectures
Authors: G. Skinner, A. St-Cyr, M. Bosmans and D. vanEijkeren
Summary: We examine the performance of a 3D finite-difference seismic reverse time migration (RTM) kernel on systems with Intel® Xeon® processors and Intel® Xeon Phi™ x200 processors. Using identical tuned TTI RTM source code, a system with one Intel Xeon Phi 7250 processor outperforms a two-socket system with Intel Xeon E5-2697 v4 processors by a factor of 1.5.

Methods to Overlap Communication with Computation
Authors: N. Kayum, A. Baddourah and O. Hajjar
Summary: In this work, Intel® MPI technology and its benchmark codes/applications are used to obtain a better understanding of communication-computation overlap (CCO). We apply non-blocking point-to-point exchanges to mask the required communication time with computation time. Intel® MPI is accompanied by an open-source MPI benchmark package which includes a non-blocking collective operation benchmark. The benchmark measures communication time versus the computation time needed to produce a given percentage of overlap. We begin by modifying the Intel non-blocking collective operation benchmark to cater to the message sizes and operations used in an in-house parallel reservoir simulator. The findings serve as a guide for identifying CCO locations in our code and for maximizing the progression of communication to achieve further overlap. In this paper, we share the benchmark modifications made for preliminary analysis of MPI exchange behavior, the results of using asynchronous versus manual progression, how those results informed the overlapping changes made in our simulator code, and the performance benefits of the modifications.
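The point-to-point pattern described reduces to the classic recipe below (a minimal sketch; ranks, tags and the compute routines are our placeholders):

```c
#include <mpi.h>

void compute_interior(void);   /* work that does not need the halo */
void compute_boundary(void);   /* work that must wait for the halo */

void exchange_and_compute(float *send, float *recv, int n,
                          int left, int right, MPI_Comm comm)
{
    MPI_Request req[2];
    /* 1. post the non-blocking exchange */
    MPI_Irecv(recv, n, MPI_FLOAT, left,  0, comm, &req[0]);
    MPI_Isend(send, n, MPI_FLOAT, right, 0, comm, &req[1]);
    /* 2. overlap: communication progresses while we compute */
    compute_interior();
    /* 3. only now block on completion, then finish the rest */
    MPI_Waitall(2, req, MPI_STATUSES_IGNORE);
    compute_boundary();
}
```

How much of step 2 actually overlaps in practice depends on asynchronous versus manual progression, which is exactly what the benchmark study quantifies.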
Improving the Truncated Spike Algorithm via Neumann Series Approximations
Authors: S. Rodriguez Bernabeu, E.J. Sánchez, M. Hanzich and S. Fernández
Summary: The large memory overhead of LU decomposition during the factorization stage of the truncated SPIKE algorithm is a common bottleneck. For large banded diagonally dominant linear systems, the matrix structure can be leveraged to avoid these LU decompositions. Specifically, Neumann series can be used to approximate the required inverses with lower memory consumption, thus avoiding excessive growth of the reduced linear system. Convergence of the Neumann series is ensured via a simple scaling of the input matrix. We present results showing the achieved accuracy and providing bounds for memory consumption and FLOP count.
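The identity being exploited is standard linear algebra (our rendering): after scaling, a diagonally dominant block can be written A = I - N with ||N|| < 1, so its inverse admits a truncated series

```latex
A^{-1} = (I - N)^{-1} = \sum_{k=0}^{\infty} N^{k}
       \;\approx\; I + N + N^{2} + \cdots + N^{p},
\qquad \|N\| < 1,
```

which can be applied matrix-free, trading the memory footprint of an LU factorization for a few extra sparse products.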
Adaptive Parallelization of the Algorithm for Electromagnetic Logging Data Simulation in a 2D Formation Model
Authors: D.Yu Kushnir, N.M. Tropin, G.V. Dyatlov and A. Dashevsky
Summary: We developed an OpenMP parallel version of the solver for numerical simulation of deep and extra-deep resistivity logging data. In this version, nested parallelization is implemented. Depending on the simulation task and CPU resources, we propose an algorithm for choosing the parallelization scheme that provides the fastest simulation.
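A minimal sketch of such two-level OpenMP nesting (our code; the function name and thread counts are placeholders that the proposed selection algorithm would choose at runtime):

```c
#include <omp.h>

void solve_chunk(int task, int member, int team_size); /* placeholder kernel */

/* Outer team over independent simulation tasks, inner team
 * cooperating inside each solve; the (outer, inner) split is the
 * tunable parallelization scheme. */
void run_tasks(int ntasks, int outer, int inner)
{
    omp_set_max_active_levels(2);          /* enable nested regions */
    #pragma omp parallel for num_threads(outer) schedule(dynamic)
    for (int t = 0; t < ntasks; t++) {
        #pragma omp parallel num_threads(inner)
        solve_chunk(t, omp_get_thread_num(), inner);
    }
}
```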
Dynamic Host-Aware I/O Writers Assignment Among MPI Ranks in High Performance Computing Systems
Authors: O. Hajjar, M. Baddourah, A. Alturki and R. Al-Harbi
Summary: We present a new dynamic approach to writer selection that maps writing tasks dynamically and adaptively to MPI processes at runtime. The objective of this method is not to deduce the optimal number of writers that should be used. Instead, the method determines and recommends a better mapping that balances the I/O workload across hosts and maximizes overall I/O performance during the lifespan of the simulation run.
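The host-awareness itself is easy to express in MPI; below is a static sketch of the building block (our code, not the paper's algorithm, whose contribution is making the assignment dynamic and load-balanced at runtime):

```c
#include <mpi.h>

/* Split the communicator by shared-memory node, then elect one
 * writer rank per host. A dynamic scheme would re-rank writers by
 * measured I/O load instead of always picking node rank 0. */
int pick_writer(MPI_Comm comm)
{
    MPI_Comm node;
    int node_rank;
    MPI_Comm_split_type(comm, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &node);
    MPI_Comm_rank(node, &node_rank);
    int is_writer = (node_rank == 0);      /* one writer per host */
    MPI_Comm_free(&node);
    return is_writer;
}
```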
ExSeisPIOL: A Seismic Parallel I/O Library for Increasing Developer Productivity
Authors: C.O. Broin, R. Short, M. Fishe, S. Delaney, S. Dagg, G. O’Brien, J.-T. Acquaviva and M. Lysaght
Summary: We have developed a SEG-Y-compatible seismic parallel I/O library (ExSeisPIOL) to address heavy data loads for seismic applications on emerging extreme-scale HPC systems, with a particular focus on seismic imaging. Efficient access to seismic files on HPC systems, including those that exploit emerging HPC technologies such as hierarchical storage, is a non-trivial development task.
ExSeisPIOL has the twin design goals of increased productivity and high performance. The library supports computational geophysicists during the development phase by reducing the time spent on I/O-related optimisation and debugging, while retaining incumbent formats where applicable. In particular, we have placed considerable focus on abstracting away the details of seismic formats as well as low-level interfaces, such as MPI-IO. In addition, API calls are limited to cases where there is an identified industry use case, so that the API is kept minimal. Common pre-processing operations, such as file sorting, are also provided within the library, with bindings for C and C++.
On the performance front, we have placed considerable attention on data-side optimisations, where we have demonstrated that the resulting performance improvements can deliver benefits across the entire installed base of existing industry-standard seismic applications.

Advanced IO Implementation & Performance for Seismic Applications
Authors: P.Y. Aquilanti, M. Hugues, S. Jha and H. Calandra
Summary: Oil & Gas companies face an increasing challenge in extracting performance from parallel IO filesystems for checkpointing and writing results. This will increasingly become an issue due to the complexity introduced by the integration of more sophisticated seismic imaging equations to sustain the need for the more detailed images demanded by the next exploration and production challenges. While current and future generations of High Performance Computing (HPC) systems are evolving toward increased computing power, IO bandwidth remains relatively constant. Hence, the need to manage IO efficiently at scale will become a strict requirement.
We present an implementation and optimization study of ADIOS in the context of seismic imaging. We exhibit a performance study made on a proxy application and on an RTM with several computing kernels on different HPC systems in the context of checkpointing. We show that using ADIOS can provide good performance, manage IO more efficiently and reduce metadata contention. We highlight that advanced IO libraries such as ADIOS provide an opportunity to overcome the challenge of maintaining performance with high-level IO interfaces at scale.

IBM Data Cognitive Systems Strategy and Directions – Innovations for HPC, HPDA and Machine & Deep Learning Merging, in Oil and Gas
By P. Vezolle
Summary: High-performance computing technologies and challenges continue to evolve. The explosion and power of cognitive computing and artificial intelligence open new ways to design the next generation of oil and gas applications and infrastructures. The wide range of applications and the common growing need for computing power must be addressed by future hybrid and heterogeneous system solutions, as well as by disruptive technologies like quantum and neurosynaptic computing.

Machine Learning Ecosystem for the O&G Industry
Authors: P. Demichel and E. Orlotti
Summary: HPC and Big Data are progressing towards new challenges in an era of exploding machine data. Computing power is a given; data management is the challenge. Memory-driven computing is going to change systems architectures dramatically, and HPE is working on the building blocks that will enable this new paradigm of computing. The talk will present the innovative technologies and the roadmap that will lead to this new ecosystem.

The Three Silver Bullets of HPC for Oil and Gas
By A. Jones
Summary: High-performance computing (HPC), or supercomputing, is a powerful tool for research and production use. However, HPC inhabits a complex and rapidly evolving technology landscape. This presents challenges for those seeking to use HPC, in terms of picking the right technologies and exploiting them effectively. Active research programs and technology companies around the world continue to propose solutions for various aspects of these HPC difficulties.
Many of these proposed solutions are hyped as ‘silver bullets’ that promise to solve major challenges with how we use HPC or to deliver disruptive improvements. This talk explores three of these ‘silver bullets’ to examine whether they will live up to their hype and what problems they will solve. The three silver bullets are: GPUs, cloud computing, and new programming languages and domain-specific languages.
Merely identifying the silver bullets is not enough; this talk will explore how each silver bullet affects oil and gas use cases of HPC. How can oil and gas users of HPC take advantage of these silver bullets? How will they drive the skills and software methods needed by developers?

Big Data Role in the Upstream Business Research
Authors: S.L. Nimmagadda and A. Aseev
Summary: Offshore petroleum is the main focus of the current research, with Big Data and high-performance computing motivations in the study area. We undertake a joint exploration study focusing on the Romanian offshore Black Sea (western) basin, using volumes and varieties of datasets at Big Data scale. The exploration datasets include more than 175 2D seismic lines, 30 km2 of 3D seismic data, and information on more than 8 exploration wells, including check-shot and VSP data as well as existing petrophysical and production data.
Big Data opportunities are explored in the current upstream business research by proposing data modelling, visualization and data interpretation schemes. In spite of data quality issues in the study area, several isochrons, isochores and other geological data are integrated, and from these, depositional models are drawn to minimize risk in exploration and field development plans. The conclusions are based on structural and strati-structural interpretation, organic geochemistry and the identification of new opportunity areas. Several data models, visualization and interpretation artefacts can handle the volumes and varieties at Big Data scale, minimizing the risk involved in the upstream business in the investigated area. Several new opportunities are identified in the shelf, slope and deep marine areas.

On a Robust Data Modelling Approach for Managing the Fractured Reservoirs in an Onshore Colombian Oil & Gas Field
Authors: S.L. Nimmagadda, L. Chavez, J. Castaneda and A. Lobo
Summary: The limits of the elements and processes of the petroleum systems of Colombian sedimentary basins are neither fully known nor interpreted without ambiguity, which obscures reservoir complexity and hampers data integration in the upstream business. This is partly due to poor understanding of the datasets and poorly articulated data modelling, visualization and interpretation artefacts in complex geological regimes. We propose an ontology-based multidimensional warehouse repository approach, with ontology constructs and models for the various data sources acquired from multiple domains of the upstream business. We choose several data volumes and a variety of multidimensional data attributes and their fact instances for interpreting seismically integrated geological horizons. Structure and fracture attribute map views are computed to ascertain the density of fractures and their orientations, calibrating the fracture signatures with production data existing within the interpreted faulted compartments. Field development plans are assessed based on new knowledge obtained from domain ontology descriptions, exploring connections among multi-stacked fractured reservoirs. Although we find no structural bearing on the accumulations of oil and gas in the study area, fracture density and orientation appear to have a definite bearing on production. The integrated framework minimizes the ambiguity involved in interpreting fractures, their density and their orientations in the study area.

Three Dimensional Parallel Sobel Seismic Fault Detection
Authors: A. Al-Naeem, S. Al-Dossary and Z. Zhou
Summary: The Sobel filter is a discrete differentiation operator widely used in seismic image processing algorithms for automatic fault detection and extraction. The filter approximates the local gradient by combining derivatives of the amplitude between neighboring traces along the x, y and z directions. However, the 3D Sobel seismic fault detection algorithm runs very slowly and is computationally intensive: even on a dual-socket Sandy Bridge Xeon 8-core 2.6 GHz system, it requires more than one minute to process a 560×390×320 seismic volume.
Here we present multiple parallel computing designs leveraging shared memory (OpenMP), distributed memory (MPI) and many-core graphics processing units (GPUs) to reduce the total execution time of 3D Sobel seismic fault detection; experiments demonstrate a significant speedup over the serial CPU version. For a 250 MB poststack seismic volume, the MPI parallel algorithm was six times faster than the serial version.
We first introduce the new 3D Sobel seismic fault detection algorithm and the parallel implementations, and then show the experimental results.
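For reference, the shared-memory variant reduces to a loop nest like this hedged sketch (our code, compiled with -fopenmp; we use plain central differences where the authors' 3D Sobel kernel also smooths over the orthogonal plane):

```c
#include <math.h>

/* Gradient-magnitude pass over an nx*ny*nz amplitude volume a,
 * writing the result into g. */
void sobel3d(int nx, int ny, int nz,
             const float *restrict a, float *restrict g)
{
    #pragma omp parallel for collapse(2)
    for (int x = 1; x < nx - 1; x++)
        for (int y = 1; y < ny - 1; y++)
            for (int z = 1; z < nz - 1; z++) {
                long i = ((long)x * ny + y) * nz + z;
                float gx = a[i + (long)ny * nz] - a[i - (long)ny * nz];
                float gy = a[i + nz] - a[i - nz];
                float gz = a[i + 1] - a[i - 1];
                g[i] = sqrtf(gx * gx + gy * gy + gz * gz);
            }
}
```

The MPI and GPU designs the abstract mentions would partition the volume across ranks or map the same nest onto device threads, respectively.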