Fifth EAGE Workshop on High Performance Computing for Upstream
- Conference date: September 6-8, 2021
- Location: Online
- Published: 06 September 2021
Leveraging GPUs for matrix-free optimization with PyLops
Authors: M. Ravasi
Summary: The use of Graphics Processing Units (GPUs) for scientific computing has become mainstream in the last decade. Applications ranging from deep learning to seismic modelling have benefitted from the increase in computational efficiency compared to their equivalent CPU-based implementations. Since many inverse problems in geophysics rely on similar core computations – e.g. dense linear algebra operations, convolutions, FFTs – it is reasonable to expect similar performance gains if GPUs are also leveraged in this context. In this paper we discuss how we have been able to take PyLops, a Python library for matrix-free linear algebra and optimization originally developed for single-node CPUs, and create a fully compatible GPU backend with the help of CuPy and cuSignal. A benchmark suite of our core operators shows that an average 65x speed-up can be achieved when running computations on a V100 GPU. Moreover, by careful modification of the inner workings of the library, end users can obtain such a performance gain at virtually no cost: minimal code changes are required when switching between the CPU and GPU backends, mostly consisting of moving the data vector to the GPU device prior to solving an inverse problem with one of PyLops’ solvers.
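The backend switch described above can be illustrated with a minimal sketch (ours, not the authors' benchmark code), assuming pylops >= 2.0 and a CUDA-capable CuPy installation; the operator, sizes, and solver are chosen arbitrarily:

```python
# Minimal sketch: the same matrix-free solve on CPU (NumPy) and GPU (CuPy).
import numpy as np
import cupy as cp
import pylops
from pylops.optimization.basic import cgls

n = 1000
A = np.random.randn(n, n).astype(np.float32)
x_true = np.ones(n, dtype=np.float32)

# CPU backend: operator and data live in NumPy.
Op = pylops.MatrixMult(A)
y = Op @ x_true
x_cpu = cgls(Op, y, niter=100)[0]

# GPU backend: the only user-visible change is moving the arrays to the
# device; the operator construction and solver calls are unchanged.
Op_gpu = pylops.MatrixMult(cp.asarray(A))
y_gpu = Op_gpu @ cp.asarray(x_true)
x_gpu = cgls(Op_gpu, y_gpu, niter=100)[0]  # result is a CuPy array
```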
HPC in The Cloud MVP
Authors: J. Pontvianne, D. Klahr and D. Cooper
Summary: As part of Total’s computing strategy, a minimum viable product (MVP) was conducted in 2019 and 2020, the goal of which was to evaluate the feasibility of deploying High Performance Computing (HPC) workflows in the cloud. While many industries have begun shifting business workloads to the cloud, HPC still remains on-premises for most companies, including many of our peers in the energy industry. We decided to perform full seismic and reservoir studies that are representative of our production workload. This MVP is a continuation of a Request For Information (RFI) in 2017 and a Proof Of Concept (POC) in 2018. The outcome is to provide recommendations on whether we can consider the cloud (fully, partially or not at all) in our HPC procurements. We have evaluated the following claims of the cloud providers:
- flexibility and economic elasticity, such as on-demand deployment (pay-as-you-go).
- application-specific provisioning (right-sizing resources).
- life cycle, allowing us quick access to cutting-edge technologies.
- scalability, for quick expansion of resources.
- locality and availability, with global datacenters (e.g. covering disaster recovery).
Toward an application of quantum computing in geophysics
Authors: M. Dukalski
Summary: Quantum computing offers a theoretical speed-up (in terms of computational complexity) when performing certain computational tasks. Successful inclusion of quantum computing units in existing HPC solutions is contingent on identifying appropriate and realistic use cases, of which, to date, there are very few candidates. This is particularly true for the upstream business, because there are few quantum algorithms, each coming with a number of caveats that ought to be studied carefully. We suggest a potential simple use case in geophysics and 1D inverse scattering theory, and provide a detailed analysis of the requirements that need to be met in order to achieve the promised speed-up.
Performance Evaluation of Stencil Calculation in RTM Code
Summary: The performance of the stencil operation, which is widely used in geoscience, is discussed. The stencil operation is characterized by a high bytes-per-flop (B/F) demand, so it can be accelerated by providing higher memory bandwidth per core/processor and by reducing the number and frequency of memory accesses. NEC SCA can reduce memory accesses, and the NEC VE20 processor of SX-Aurora TSUBASA provides much higher memory bandwidth than other processors. Because of these advantages, the VE20 processor with SCA delivers up to nine times higher performance than modern x86 processors.
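The high B/F demand is easy to see in a generic sketch (ours, unrelated to the NEC SCA code): a 7-point stencil reads seven neighbours and performs only a handful of flops per output point, so throughput is limited by memory bandwidth rather than arithmetic.

```python
# Generic 3D 7-point stencil: ~8 flops per point against 7 reads and 1 write,
# i.e. a high bytes-per-flop ratio, which is why memory bandwidth dominates.
import numpy as np

def stencil_7pt(u, c=0.1):
    """One explicit update of the interior points of a 3D grid."""
    out = u.copy()
    out[1:-1, 1:-1, 1:-1] = u[1:-1, 1:-1, 1:-1] + c * (
        u[:-2, 1:-1, 1:-1] + u[2:, 1:-1, 1:-1] +
        u[1:-1, :-2, 1:-1] + u[1:-1, 2:, 1:-1] +
        u[1:-1, 1:-1, :-2] + u[1:-1, 1:-1, 2:] -
        6.0 * u[1:-1, 1:-1, 1:-1])
    return out

u = np.random.rand(128, 128, 128).astype(np.float32)
u = stencil_7pt(u)
```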
Optimizing HPC Parameters for Reverse Time Migration
Authors: R. Sampath
Summary: Reverse Time Migration (RTM) is a key application used in seismic imaging and accounts for a significant portion of HPC resource utilization in the Oil & Gas exploration industry. The number of nodes used per shot and the corresponding domain decomposition can have a significant impact on RTM performance and project cycle time. In this work we describe a method to automatically select the optimal number of nodes and the best domain decomposition for each shot in an RTM project. In addition to the computational savings and reduction in cycle time, the method presented here also improves the user experience for seismic processors, as they only need to focus on tuning the parameters that affect the results and do not have to worry about the HPC parameters. This becomes even more important as compute environments become more heterogeneous and projects try to use all available compute resources in order to reduce cycle time. Also, in a pay-per-use model, it is helpful to be able to predict the compute costs for a project.
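The abstract does not spell out the selection method, but the general idea can be sketched with an invented cost model (every function, constant, and formula below is hypothetical, purely for illustration): predict the runtime of each candidate decomposition and pick the one that minimizes node-hours.

```python
# Hypothetical sketch: enumerate decompositions, score each with a toy
# compute + halo-exchange model, and keep the cheapest in node-hours.
import itertools

def predicted_runtime(nx, ny, nz, nsteps, px, py, pz,
                      flops_per_node=1e13, cell_flops=500.0, halo_cost=1e-8):
    # Toy performance model: compute time for one subdomain plus a per-step
    # halo-exchange penalty proportional to the subdomain's face area.
    cells = (nx / px) * (ny / py) * (nz / pz)
    compute = nsteps * cells * cell_flops / flops_per_node
    faces = 2 * ((nx/px)*(ny/py) + (ny/py)*(nz/pz) + (nx/px)*(nz/pz))
    return compute + nsteps * faces * halo_cost

def best_decomposition(nx, ny, nz, nsteps, max_nodes=64):
    candidates = []
    for px, py, pz in itertools.product(range(1, 9), repeat=3):
        nodes = px * py * pz
        if nodes <= max_nodes:
            t = predicted_runtime(nx, ny, nz, nsteps, px, py, pz)
            candidates.append((t * nodes, nodes, (px, py, pz)))  # node-hours
    return min(candidates)

print(best_decomposition(1000, 1000, 500, nsteps=5000))
```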
HPC workload management for full resource utilization
Authors: N. Bienati, L. Bortot, C. Fortini and J. Panizzardi
Summary: In industrial HPC applications, maximizing overall performance is a complex task: besides optimization aspects strictly related to numerical algorithms, many other aspects must be taken into account. In particular, beyond single-kernel execution, it may be necessary to focus also on workflow execution, and it can be important to optimize the execution of a heterogeneous project workload with dynamically changing priorities. Here we discuss how we faced this challenge in the context of running large seismic imaging projects.
Cloud Elasticity Combined with Innovative Assisted History Match Accelerates Reservoir Risk Assessment
Authors: C. Cosson, T. Taha, P. Ward, S. Tadepalli and D. Tishechkin
Summary: This paper discusses the added value of a cloud environment in an innovative risk assessment solution involving the batch run of a geomodelling application and a flow simulator. The project was conducted jointly by Emerson and AWS. The first phase of the collaboration focused on assessing the effect of cloud parameters on the performance of the software and the cost of completing typical activities. It was then applied to a case study using the Volve oil field on the Norwegian continental shelf to draw comparisons with current on-premises installations and evaluate the solution from a reservoir management perspective. The study first showed that while the optimal cloud configuration for history matching depends on global parameters, it must be fine-tuned for each reservoir model. A method to reduce simulation time and cloud costs was established. The study then showed that use of the cloud has a positive impact on operations: results are available within hours instead of days, and same-day evaluation leads to faster decisions and improved operational efficiency. Moreover, enhanced stochastic analyses are now possible for those without high-performance on-premises clusters.
Performance Characterization of a Vector Architecture for Seismic Applications
Authors: V. Etienne, A. Momin, L. Gatineau and S. Momose
Summary: Explicit time-domain finite-difference (TD-FD) methods are widely used in seismic exploration. They are at the heart of wave-equation-based geophysical algorithms such as Reverse Time Migration and Full Waveform Inversion. Due to the ever-increasing amount of acquired seismic data and the need for higher resolution to optimize oil production, it is crucial to deploy TD-FD on High Performance Computing (HPC) platforms. In this work, we explore the performance reachable on vector architectures. The study is done on a traditional scalar CPU, to establish a performance baseline, and on a vector solution of a kind that was heavily used in the past by the O&G industry.
GEOSX: a multiphysics, multiscale, reservoir simulator for HPC
Authors: H. Gross
Summary: GEOSX is an open-source, exascale-ready, multiphysics simulator for geological formations, currently developed by Lawrence Livermore National Laboratory, Stanford University, and Total. GEOSX is designed to address several types of complex simulation use cases, including geological storage of carbon dioxide (CCUS). Numerical simulations of such operations require coupling between mass/energy transfers and rock geomechanics over large formations and for long simulation periods. GEOSX provides such simulation solutions for mixed-architecture systems: through the use of two libraries called RAJA and CHAI, the code is not tied to a single target architecture. Here, we present the essential functionalities and underlying building blocks of GEOSX, and show examples of use cases. We discuss challenges encountered along the way, especially those related to multiphysics modeling. Last, we provide all references required to access and download the sources and build GEOSX (under the LGPL 2.1 license).
Nonlinear Preconditioning for Two-phase Flows
Summary: Using a classical Newton-Krylov method to solve the nonlinear systems arising from two-phase flows in porous media often suffers from slow convergence or failure in the line search. We propose two nonlinear elimination preconditioning strategies to handle this issue by performing a subspace correction to remove the local strong nonlinearities. Numerical experiments show that the proposed methods are more robust and faster than the existing method with respect to some physical and numerical parameters, and scalable to thousands of processes.
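The flavor of nonlinear elimination can be shown on a toy system (our sketch of the general idea, not the authors' formulation): one equation carries a stiff exponential nonlinearity, and solving it exactly inside the residual leaves a well-behaved reduced problem for the outer solve.

```python
# Toy nonlinear elimination: eliminate the strongly nonlinear unknown x1
# analytically, leaving a mild scalar problem in x0.
import numpy as np
from scipy.optimize import brentq, newton_krylov

def residual(x):
    # Full 2x2 system; the exp(5*x1) term makes plain Newton line searches
    # struggle from poor initial guesses.
    return np.array([x[0] + x[1] - 3.0,
                     np.exp(5.0 * x[1]) - x[0]])

def eliminated_residual(x0):
    # Subspace correction: for a given x0 > 0, solve the second equation
    # exactly for x1 = log(x0)/5, then evaluate the remaining equation.
    return x0 + np.log(x0) / 5.0 - 3.0

x0 = brentq(eliminated_residual, 1e-6, 10.0)   # well-behaved reduced solve
x1 = np.log(x0) / 5.0
print("eliminated:", x0, x1, residual(np.array([x0, x1])))

# Plain Newton-Krylov on the full system, for comparison.
try:
    print("full:", newton_krylov(residual, np.array([1.0, 1.0])))
except Exception as exc:
    print("plain Newton-Krylov failed:", exc)
```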
Up-to-date assessment of 3D frequency-domain full waveform inversion based on the sparse multifrontal solver MUMPS
Authors: P. R. Amestoy, J.-Y. L’Excellent, C. Puglisi, A. Buttari, T. Mary, M. Gerest, L. Combe and S. Operto
Summary: Efficient frequency-domain Full Waveform Inversion (FWI) can be applied to long-offset/wide-azimuth stationary-recording seabed acquisitions carried out with ocean-bottom cables (OBC) and ocean-bottom nodes (OBN), since the wide angular illumination provided by these surveys allows the inversion to be limited to a few discrete frequencies. In the frequency domain, the forward problem is a boundary value problem requiring the solution of large, sparse linear systems with multiple right-hand sides. In this study, we revisit the potential of the massively parallel sparse multifrontal solver MUMPS to efficiently perform the multi-source forward problem of 3D visco-acoustic FWI. The execution time and memory consumption of the solver are further improved by exploiting the low-rank properties of the sub-blocks of the dense frontal matrices, the sparsity of the right-hand sides (seismic sources), and work in progress on the use of mixed-precision arithmetic. We revisit a 3D OBC case study from the North Sea in the 3.5 Hz-13 Hz frequency band using between 10 and 70 nodes of the Jean-Zay supercomputer of IDRIS and show that, even without exploiting low-rank properties, problems involving 50 million unknowns, and probably more, can be tackled today with this technology.
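The structure of this forward problem — factorize once, then solve for many sources — can be sketched at toy scale with SciPy standing in for MUMPS (a generic 2D illustration, not the study's 3D visco-acoustic code):

```python
# One sparse LU factorization of a 2D Helmholtz matrix, reused for many
# right-hand sides (seismic sources); a multifrontal direct solver plays
# this role at 3D production scale.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n, h = 100, 10.0                      # toy 2D grid, 10 m spacing
k2 = (2 * np.pi * 5.0 / 2000.0) ** 2  # (omega / c)^2 at 5 Hz, c = 2000 m/s

# Finite-difference Helmholtz operator: -Laplacian - k^2 I
lap1d = sp.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n)) / h**2
A = -(sp.kron(sp.eye(n), lap1d) + sp.kron(lap1d, sp.eye(n))) - k2 * sp.eye(n * n)

lu = spla.splu(A.tocsc())             # factorize once

nsrc = 16                             # then back-substitute per source
sources = np.zeros((n * n, nsrc))
sources[np.random.choice(n * n, nsrc, replace=False), np.arange(nsrc)] = 1.0
wavefields = lu.solve(sources)
print(wavefields.shape)               # (10000, 16)
```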
Hybridized discretizations for seismic wave simulations
Summary: We demonstrate three hybridized discretizations that can be stably applied to the wave propagation problem. By confining the most demanding discretization to small subdomains, these techniques have the potential to significantly reduce the computational resources required to perform routine tasks in seismic studies.
GPU accelerated FWI using the Open Concurrent Computing Abstraction (OCCA)
Authors: A. St-Cyr, S. Reker, S. Frijters, S. Chawdhary, A. Panda, S. Banerjee, H. Knibbe and M. Muruganantham
Summary: Adapting high-performance software to various architectures while maintaining performance is a challenging endeavor, all the more so in our industry, where HPC drives exploration and production activities. Seismic exploration is impossible without seismic imaging and velocity model building, which both rely entirely on supercomputers. In this work, we describe how we ported our existing proprietary seismic libraries to GPUs and how this single effort will carry over to many other architectures.
Application of the vectorization library NSIMD to the EFISPEC3D kernel
Authors: G. Quintin, S. Jubertie, F. De Martin and K. Péou
Summary: We show in this work that using the NSIMD vectorization library allowed us to obtain better performance on the EFISPEC3D kernel, a spectral-finite-element method for solving the forward seismic wave propagation problem. Moreover, the same code can be compiled without modification to target different SIMD extensions with different vector sizes, without degrading performance.
Toward High Performance Asynchronous RTM with Temporal Blocking and Buffered I/O
Summary: During the forward and backward modeling in Reverse Time Migration (RTM), stencil computations constitute one of the main computationally intensive components. Their classic implementation, based on Spatial Blocking (SB), is subject to performance limitations on modern multicore architectures for several reasons, including non-uniform memory access, memory bandwidth starvation, load imbalance, and limited data locality. The Multicore Wavefront Diamond-tiling Temporal Blocking technique (MWD-TB), introduced in (Malas, PhD thesis 2015; Malas et al., SIAM SciCo 2015; Malas et al., ACM Trans 2017), aims at reducing the memory bandwidth requirement of stencil computations by increasing cache reuse across successive time steps. The authors in (Akbudak et al., IJHPCA 2020) integrate the MWD-TB technique into the modeling phase, and the authors in (Qu et al., KAUST Tech Report 2020) eventually embed it into the full RTM, using in-memory I/O snapshotting operations for the imaging condition, illustrated with the Salt3D dataset. In this paper, we further enable Out-Of-Core (OOC) I/O snapshotting operations on the Lustre parallel file system using the buffering strategy from MLBS (Alturkestani et al., EuroPar 2020). We present preliminary results using the Marmousi 3D dataset.
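The principle of temporal blocking can be shown in one dimension (a plain illustration of the idea; the paper's MWD-TB uses multicore diamond tiling, which is considerably more involved): step a small tile several time steps while it stays cache-resident, paying for a halo of redundant reads instead of one full memory sweep per step.

```python
# 1D 3-point stencil, periodic boundaries: naive sweeps vs. one temporal
# block of T steps per tile. A tile needs a halo of width T on each side.
import numpy as np

def step(u):
    return (np.roll(u, 1) + u + np.roll(u, -1)) / 3.0

def naive(u, nsteps):
    for _ in range(nsteps):
        u = step(u)
    return u

def temporally_blocked(u, nsteps, tile=64):
    n, T = len(u), nsteps
    out = np.empty_like(u)
    for s in range(0, n, tile):
        e = min(s + tile, n)
        chunk = u[np.arange(s - T, e + T) % n]  # tile + halo of width T
        for _ in range(T):                      # T local steps; the halo
            chunk = (chunk[:-2] + chunk[1:-1] + chunk[2:]) / 3.0  # shrinks
        out[s:e] = chunk                        # tile interior is exact
    return out

u = np.random.rand(4096)
assert np.allclose(naive(u, 8), temporally_blocked(u, 8))
```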
Leveraging DAOS file system for seismic data storage
Authors: M. Moawad, A. Nasr, O. Marzouk, K. ElAmrawi, P. Thierry, J. Lombardi and M. Chaarawi
Summary: The DAOS-SEIS mapping layer is introduced to the seismic community. Built on the evolving DAOS technology, it addresses some of the seismic I/O bottlenecks caused by the SEG-Y data format by leveraging graph theory together with DAOS object-based storage to design and implement a new seismic data format natively on top of the DAOS storage model, in order to accelerate data access, provide in-storage compute capabilities to process data in place, and remove the constraints of the serial SEG-Y file. The DAOS-SEIS API is built on top of the DAOS file system (DFS), and seismic data is accessed and manipulated through this API after accessing the root seismic DFS object. The mapping layer uses graph theory and object storage to separate the acquisition geometry, represented by the trace headers, from the time-series data samples.
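As a purely hypothetical sketch of the header/sample split (the class and method names below are invented and are not the DAOS-SEIS API), the mapping can be pictured as an index from header values to trace objects, so gathers are resolved without touching unrelated sample data:

```python
# Toy in-memory stand-in for header-indexed, object-based trace storage.
import numpy as np
from collections import defaultdict

class ToySeismicStore:
    def __init__(self):
        self.samples = {}                                  # trace_id -> samples
        self.index = defaultdict(lambda: defaultdict(list))
        # index[header_key][header_value] -> [trace_id, ...]

    def put_trace(self, trace_id, headers, data):
        self.samples[trace_id] = data
        for key, value in headers.items():                 # header graph edges
            self.index[key][value].append(trace_id)

    def gather(self, key, value):
        """Fetch all traces sharing a header value (e.g. one shot gather)
        without scanning the samples of unrelated traces."""
        return [self.samples[t] for t in self.index[key][value]]

store = ToySeismicStore()
for i in range(10):
    store.put_trace(i, {"shot": i // 5, "receiver": i % 5},
                    np.zeros(1000, dtype=np.float32))
print(len(store.gather("shot", 0)))  # -> 5
```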
Opensource RTM using DPC++ programming model
Authors: A. Ayyad, A. Nasr, E. Nasr, I. Mounir, O. El-Maihy, M. Samier, M. El-Sherbiny, Z. Osama, K. Elamrawi, S. Gogar and P. Thierry
Summary: In this work we present oneAPI, an open-source specification, and the DPC++ programming language. We explain why DPC++ is an efficient language for programming different devices from different vendors, briefly introduce Reverse Time Migration (RTM) as a use case, and demonstrate how its finite-difference kernel is implemented in DPC++.
Improving GPU throughput of reservoir simulations using NVIDIA MPS and MIG
Authors: R. Gandham, Y. Zhang, K. Esler and V. Natoli
Summary: In this paper we demonstrate that the overall simulation throughput of full-GPU reservoir simulators can be further improved significantly, without any modifications to the software, using NVIDIA’s Multi-Process Service (MPS) and Multi-Instance GPU (MIG) infrastructure. For models with just a few thousand cells, a 7x throughput increase is achieved, while for problems with a million cells a 60% improvement is achieved using MPS. Furthermore, when using either MPS or MIG, the smaller models can achieve 80% of the peak achievable performance of larger models. In the context of uncertainty quantification workflows, these performance improvements are significant.