Sixth EAGE High Performance Computing Workshop
- Conference date: September 19-21, 2022
- Location: Milan, Italy
- Published: 19 September 2022
Speeding up the 2+2+1 Method via GPU Computing for Estimating Local Travel Time Operators in Nonlinear Beamforming
Authors: Y. Sun, I. Silvestrov and A. Bakulin
Summary: Nonlinear beamforming is an effective method for enhancing the quality of noisy seismic data. It uses local traveltime operators to describe local wavefronts, and then stacks neighboring traces guided by these operators to enhance data quality. The 2+2+1 method is a pragmatic solver for estimating local traveltime operators from input data, but its efficiency is unsatisfactory when the solution space is large. We speed up the 2+2+1 method using graphics processing unit (GPU) computing with the Compute Unified Device Architecture (CUDA) programming model. We introduce our GPU-based 2+2+1 algorithm and demonstrate its efficiency improvement on a field data example. A speed-up factor of ∼10 is obtained compared to the CPU version of the 2+2+1 method.
Hybrid Classical-Quantum Computing in Geophysical Inverse Problems: The Case of Quantum Annealing for Residual Statics Estimation
Authors: S.G. Van der Linde, M. Dukalski, M. Möller, N.M.P. Neumann, F. Phillipson and D. Rovetta
Summary: Recent progress in geophysics can be attributed to developments in heterogeneous HPC architectures, and one of the next major leaps is forecast to come from quantum computers. It is, however, very difficult to find the right combination of hardware, algorithms and use-case. This is especially true for applications that must be simultaneously relevant and operating at scales where problems become difficult to solve by classical means. Maximizing stack power for improved near-surface characterization and velocity model building, an NP-hard combinatorial optimization problem, appears to fit naturally a particular type of quantum computing known as quantum annealing. We present the quantum-native formulation of this problem. Furthermore, to improve the probability of success, we embed it in a hybrid classical-quantum workflow. We present results of controlled experiments run on a 5000-qubit machine and discuss the impact of different classical-to-quantum problem reformulations.
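The quantum-native formulation referred to here is a QUBO: minimize x^T Q x over binary x. The toy instance below is solved by exhaustive enumeration as a stand-in for the annealer; in the residual-statics setting each binary variable would encode a candidate static shift (e.g. one-hot per shot or receiver) and Q would be assembled from trace cross-correlations so that low energy corresponds to high stack power. The matrix values are arbitrary illustrations, not field data.

```python
import itertools
import numpy as np

# Illustrative QUBO instance (upper-triangular convention).
Q = np.array([
    [-3.0,  2.0,  0.0],
    [ 0.0, -2.0,  2.0],
    [ 0.0,  0.0, -1.0],
])

def qubo_energy(Q, x):
    return float(x @ Q @ x)

# Exhaustive search over all binary states, standing in for the sampler.
best_x, best_e = None, np.inf
for bits in itertools.product([0, 1], repeat=Q.shape[0]):
    x = np.array(bits)
    e = qubo_energy(Q, x)
    if e < best_e:
        best_x, best_e = x, e
print(best_x, best_e)
```

On real problem sizes the enumeration is replaced by handing the same Q matrix to the quantum (or hybrid) sampler.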
An Explicit Solution Suited for HPC to Calculate the Implicit FDs
Summary: The finite-difference solution of the second-order acoustic wave equation is a fundamental algorithm in seismic exploration. In contrast to explicit finite-difference (EFD) schemes, which usually suffer from the so-called "saturation effect", implicit FD schemes can achieve spectral-like resolution with a relatively short operator length. Unfortunately, these implicit schemes are not widely implemented because band matrices need to be solved implicitly, which is not well suited to high-performance computing (HPC). We introduce an explicit solution that overcomes this limitation by applying causal and anti-causal integrations, and the new solution can be proven equivalent to the traditional implicit LU decomposition solution. We also compare the accuracy of the new schemes with traditional EFD schemes up to the 32nd order, and the results indicate that the optimized implicit FD scheme is more accurate. The computational cost of the newly proposed scheme is that of a standard 8th-order EFD scheme plus two causal and anti-causal integrations, which can be computed recursively.
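The key trick, replacing the implicit band solve with causal and anti-causal integrations, can be sketched for a constant-coefficient tridiagonal operator: factor its symbol into two stable first-order recursions and sweep once forward and once backward. The coefficient below is illustrative, not the paper's optimized operator.

```python
import numpy as np

# Solve (I + beta*(shift + shift^-1)) y = d without an LU/Thomas solve.
# The symbol beta*z + 1 + beta/z factors as (-beta/r)(1 - r/z)(1 - r*z),
# where r is the stable root of beta*t^2 + t + beta = 0 (requires |beta| < 1/2).
beta = 0.4
r = (-1.0 + np.sqrt(1.0 - 4.0 * beta**2)) / (2.0 * beta)

def solve_recursive(d):
    n = d.size
    u = np.empty(n)
    v = np.empty(n)
    u[0] = d[0]
    for i in range(1, n):                    # causal sweep
        u[i] = d[i] + r * u[i - 1]
    v[-1] = u[-1]
    for i in range(n - 2, -1, -1):           # anti-causal sweep
        v[i] = u[i] + r * v[i + 1]
    return (-r / beta) * v

# Reference: dense tridiagonal solve.
n = 201
d = np.zeros(n)
d[n // 2] = 1.0                              # spike far from the boundaries
A = np.eye(n) + beta * (np.eye(n, k=1) + np.eye(n, k=-1))
y_ref = np.linalg.solve(A, d)
y_rec = solve_recursive(d)
print(np.max(np.abs(y_rec[20:-20] - y_ref[20:-20])))
```

For input supported away from the boundaries, the two sweeps agree with the dense tridiagonal solve to rounding error, which illustrates the claimed equivalence with the implicit LU solution.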
From Seismic Imaging to Wind Turbine Modelling: The Benefits of Vector Computing
Authors: V. Etienne, L. Gatineau and M. Ikuta
Summary: The reduction of greenhouse gas emissions to limit the global temperature increase is a major challenge for our society. In response, most governments have set up roadmaps to reduce their carbon footprint (mainly CO2 emissions) in the near future and have committed to international agreements on climate change such as COP21. To reach these objectives, O&G companies are expected to play a pivotal role. At present, we witness significant investments from the oil majors to develop renewable energies, with the aim of complementing their portfolios alongside the traditional exploitation of fossil resources.
In this work, we discuss the computing requirements of traditional O&G applications. Based on this knowledge, we investigate how the development of renewable energies could impact the HPC ecosystem through the introduction of novel algorithms. For each application, we discuss its compute footprint and explain how it can be handled efficiently on the NEC SX-Aurora TSUBASA, referred to in the following as the Vector Engine (VE), an architecture specially designed for HPC workloads.
3D Onshore Shallow Velocity Model Building Using Full Waveform Inversion with GPUs and Static Task Distribution
Authors: Y.S. Kim, M. Dmitriev and H.J. AlSalem
Summary: Full Waveform Inversion (FWI), as a wave-based solution, can establish an accurate shallow velocity model. An accurate velocity model is essential for generating the high-fidelity seismic images used to find reservoirs. Successful application to a 3D onshore dataset, however, depends on the quality of the low-frequency components of the seismic data. To overcome this problem, a two-stage waveform inversion workflow has been carried out to obtain a high-resolution shallow velocity model from the 3D land dataset. In the first stage of the workflow, we update the velocity model to minimize the time difference of the first arrivals between field and synthetic data. In the second stage, we perform traveltime-oriented FWI to update the model by minimizing phase differences of seismic events between field and modeled data. Most of the functions used in the waveform inversion are written in the Compute Unified Device Architecture (CUDA) programming language to further optimize performance and reduce computing time. In addition, we use static task distribution in our waveform inversion implementation instead of the dynamic task distribution that has proven a powerful scheduler in our reverse time migration program.
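The first-stage misfit, the first-arrival time difference between field and synthetic data, is commonly measured as the lag maximizing the cross-correlation of the two traces. A minimal sketch with hypothetical Ricker-wavelet traces:

```python
import numpy as np

dt = 0.004
n = 256
t = np.arange(n) * dt

def ricker(t, t0, f=15.0):
    """Ricker wavelet centred at t0 with peak frequency f."""
    a = (np.pi * f * (t - t0)) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

field = ricker(t, 0.30)
synth = ricker(t, 0.26)          # model too fast by 40 ms

# Lag of maximum cross-correlation = estimated traveltime difference.
lags = np.arange(-n + 1, n)
xcorr = np.correlate(field, synth, mode="full")
shift = lags[np.argmax(xcorr)] * dt
print(round(shift, 3))
```

With the synthetic arrival 40 ms early, the estimated shift is 0.04 s; the velocity update would then reduce this residual toward zero.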
Error-Bounded Lossy Compression in Reverse Time Migration
Authors: M. Dmitriev, T. Tonellot, H.J. AlSalem and S. Di
Summary: In reverse time migration (RTM), the source and receiver wavefields are correlated in opposite temporal directions. The source wavefield must therefore be saved to disk or memory at all imaging time steps. For production 3D seismic surveys, this leads to enormous storage requirements and can significantly impact I/O throughput and the overall performance of RTM. To address these issues, we use snapshot data compression. We comprehensively evaluate three state-of-the-art error-bounded lossy compressors (ZFP, SZ3 and Bitcomp) for the RTM application, using metrics such as compression ratio and compression and decompression throughput. Finally, we analyze the impact of compression errors on snapshot reconstruction and the quality of the final stacked image.
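To make the evaluation concrete, here is a toy error-bounded lossy compressor: uniform quantization with step 2·tol (which guarantees a maximum absolute error of at most tol) followed by a lossless backend. ZFP, SZ3 and Bitcomp are far more sophisticated; this stand-in only illustrates the metrics being compared, compression ratio and reconstruction error.

```python
import zlib
import numpy as np

def compress(snapshot, tol):
    # Quantization step 2*tol bounds the absolute reconstruction error by tol.
    q = np.round(snapshot / (2.0 * tol)).astype(np.int32)
    return zlib.compress(q.tobytes()), q.shape

def decompress(blob, shape, tol):
    q = np.frombuffer(zlib.decompress(blob), dtype=np.int32).reshape(shape)
    return q * (2.0 * tol)

# Hypothetical "snapshot": a smooth-ish 2D field.
rng = np.random.default_rng(0)
snapshot = np.cumsum(rng.normal(size=(64, 64)), axis=1)
tol = 1e-2

blob, shape = compress(snapshot, tol)
recon = decompress(blob, shape, tol)

ratio = snapshot.nbytes / len(blob)
max_err = np.abs(recon - snapshot).max()
print(ratio, max_err)
```

The same two numbers, plus (de)compression throughput, are what the abstract reports for the real codecs.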
GPU Accelerated Computing Towards a Fast and Scalable Seismic Wave Modelling in SEISCOPE SEM46 Code
Authors: J. Cao, R. Brossier, E. Cabrera, J. De la Puente, L. Métivier and A. Tarayoun
Summary: Modelling of seismic wave propagation is widely used in the study and imaging of the Earth's interior. Especially for reverse time migration and full waveform inversion, an accurate and efficient seismic wave modelling solver plays a key role in handling complex and large-scale problems. In addition to developing novel modelling algorithms from a mathematical standpoint, it is necessary to study how to implement existing methods on modern heterogeneous high-performance computing (HPC) platforms that include at least one type of accelerator. Utilizing GPUs as accelerators has been shown to be attractive in geophysical applications. To benefit from this multi-threaded architecture, we explore its implementation in the spectral element modelling (SEM) engine of our full waveform modelling and inversion code SEM46. Based on the features of the SEM algorithm and the Cartesian-based structured mesh in SEM46, we investigate GPU kernels with three different parallel prototypes, with particular focus on modelling accuracy and computational efficiency. The memory constraint of the 3D implementation is addressed by domain decomposition over multiple GPUs with CUDA-aware MPI to achieve direct GPU-to-GPU communication. The resulting GPU solver exhibits a high speedup and excellent scaling over multiple GPUs, making it promising for large-scale realistic 3D problems.
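The domain decomposition with direct GPU-to-GPU halo exchange can be mimicked on the CPU: split the grid, pad each subdomain with a one-point halo, exchange the boundary values, and check that a per-subdomain stencil matches the global one. In the real code the copies would be CUDA-aware MPI sends and receives; everything below is a plain-numpy stand-in.

```python
import numpy as np

n = 16
u = np.sin(np.linspace(0.0, 3.0, n))

def laplacian_interior(f):
    # 3-point second difference on interior points only.
    return f[:-2] - 2.0 * f[1:-1] + f[2:]

# Global reference on interior points.
ref = laplacian_interior(u)

# Split into two subdomains, each padded with a one-point halo.
half = n // 2
left = np.empty(half + 1)
right = np.empty(half + 1)
left[:-1] = u[:half]
right[1:] = u[half:]
left[-1] = right[1]    # halo exchange: neighbour's first owned point
right[0] = left[-2]    # halo exchange: this side's last owned point

pieces = np.concatenate([laplacian_interior(left), laplacian_interior(right)])
print(np.allclose(pieces, ref))
```

After the exchange, the concatenated per-subdomain results reproduce the global stencil exactly, which is the correctness condition any multi-GPU decomposition must satisfy.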
HPC Implementations on SEM46: A 3D Modeling and Inversion Code for Anisotropic Visco-Elastic Coupled Acoustic Media
Authors: A. Tarayoun, R. Brossier, J. Cao, S. Jauré, S. Laforêt and L. Métivier
Summary: Full Waveform Inversion (FWI) is a high-resolution seismic imaging technique dedicated to the reconstruction of the mechanical properties of the Earth's subsurface. With the development of hybrid HPC platforms and upcoming exascale architectures, FWI implementations need to be adapted to exploit such hardware. In this study we present the main HPC-related implementations of SEM46, a code developed in the SEISCOPE project and designed for crustal-scale exploration. It moves towards an anisotropic visco-elastic and acoustic engine at an affordable computational cost thanks to several key elements: a flexible Cartesian-based deformed mesh; two Message Passing Interface (MPI) parallelism levels, one over seismic sources and one over domain decomposition; several strategies to store and/or recompute the incident fields needed to build the gradient, allowing adaptation to the characteristics of the architecture; and an optimized modeling kernel for the product of the displacement vector with the stiffness matrix. Different optimizations are investigated, making it possible to significantly improve computational performance. They combine vectorization and rearrangement of specific loops to decrease computation time, as well as optimizations such as on-the-fly computation and a fixed-point coding approach to reduce memory bandwidth pressure.
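The fixed-point coding idea, trading mantissa bits for memory bandwidth, can be sketched as int16 codes with one floating-point scale per block, halving traffic versus float32 storage. This is only an illustration of the principle; the actual coding scheme in SEM46 may differ in its details.

```python
import numpy as np

def encode_fixed(block):
    # Assumes the block is nonzero; real code would special-case silent blocks.
    scale = float(np.abs(block).max()) / 32767.0
    return np.round(block / scale).astype(np.int16), scale

def decode_fixed(codes, scale):
    return codes.astype(np.float32) * np.float32(scale)

# Hypothetical wavefield block.
field = np.random.default_rng(1).normal(size=1024).astype(np.float32)
codes, scale = encode_fixed(field)
recon = decode_fixed(codes, scale)

rel_err = float(np.abs(recon - field).max() / np.abs(field).max())
print(codes.nbytes, rel_err)
```

The codes occupy half the bytes of the float32 block, and the worst-case relative error is about 0.5/32767, far below what single-precision gradient accumulation typically requires.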
Performance and Best Practices to Run Finite Difference Kernel in the Cloud using Devito
Authors: M. Hugues and S. Tadepalli
Summary: As the energy industry transitions to hydrocarbon alternatives for automotive use and electricity production, fossil energy continues to be needed in those segments as well as for solvents, plastics and consumer goods. The discovery of oil- and gas-bearing formations is an increasing challenge as easily accessible resources have been depleted; it requires going deeper into the Earth's crust and imaging more complex geological structures of the subsurface. Seismic imaging is key to understanding subsurface velocities and is one of the most demanding workloads in high-performance computing. The need for high-resolution images has led to higher-frequency processing and more complex wave equations, and compute and storage requirements have grown accordingly. Cloud computing is an attractive technology that provides quick access to additional compute and storage capacity for new algorithm development or production projects. In this paper, we present architecture best practices and performance recommendations for finite-difference kernel methods such as RTM and FWI. Devito is used to illustrate performance and runtime guidance on the latest AMD Milan and Intel Ice Lake instances. We show performance using different compilers and flags, as well as MPI, OpenMP and hybrid configurations on single and multiple instances.
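The kind of kernel these benchmarks exercise is a leapfrog acoustic update; below is a plain-numpy 2nd-order-in-time, 4th-order-in-space version. In Devito the equivalent update would be generated from a symbolic equation rather than written by hand, so this is a stand-in for timing discussions only.

```python
import numpy as np

def step(u_prev, u_curr, c2dt2_h2):
    """One leapfrog update of the 2D acoustic wave equation (interior only).

    c2dt2_h2 is the dimensionless factor c^2 * dt^2 / h^2.
    """
    # 4th-order Laplacian with 1D weights (-1, 16, -30, 16, -1)/12.
    lap = (-60.0 * u_curr[2:-2, 2:-2]
           + 16.0 * (u_curr[1:-3, 2:-2] + u_curr[3:-1, 2:-2]
                     + u_curr[2:-2, 1:-3] + u_curr[2:-2, 3:-1])
           - (u_curr[:-4, 2:-2] + u_curr[4:, 2:-2]
              + u_curr[2:-2, :-4] + u_curr[2:-2, 4:])) / 12.0
    u_next = u_curr.copy()
    u_next[2:-2, 2:-2] = (2.0 * u_curr[2:-2, 2:-2] - u_prev[2:-2, 2:-2]
                          + c2dt2_h2 * lap)
    return u_next

# Propagate a point source for a few steps (illustrative sizes).
n = 64
u0 = np.zeros((n, n))
u1 = np.zeros((n, n))
u1[n // 2, n // 2] = 1.0
for _ in range(10):
    u0, u1 = u1, step(u0, u1, 0.2)
print(np.abs(u1).max())
```

A benchmark harness would time many such steps per grid size, which is exactly the metric the compiler-flag and MPI/OpenMP comparisons report.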
ROCSten: The Accelerated Stencil Library for AMD Instinct™ GPUs
Summary: The ROCSten library is a set of highly tuned stencil patterns designed to tackle different memory-bound problems. It takes advantage of the quantitative improvements provided by the AMD Instinct™ MI200 microarchitecture: the AMD Instinct™ MI250 can achieve strong performance at reasonable power consumption thanks to features such as the unique interconnect between GCDs and the largest HBM2e memory per GPU, enabling it to tackle the largest and most demanding problems.
ROCSten is anticipated to provide the most commonly used stencil patterns for many domains, such as Oil & Gas, with hints for coupling the stencil computation with wave updates. Our preliminary results show great potential for the AMD Instinct™ MI250 compared to the latest competing high-end accelerator: ROCSten can achieve ∼2x on the AMD Instinct MI210 compared to the Nvidia A100 80GB.
Accelerating and Optimizing Oil and Gas Exploration Planning Using Quantum Inspired Classical Computing or Vector Annealing
Authors: D. Pathania, S. Momose, T. Nishimura and M. Ikuta
Summary: Oil and gas exploration and drill-sequence planning is a difficult combinatorial (NP-hard) optimization problem that is infeasible to solve exactly with classical computers today. Solving such a challenge can yield tremendous benefits in terms of improved success ratios and reduced cost and resources. These problems are expected to be solved by quantum computers through quantum annealing (QA). However, current quantum hardware offers only a limited number of qubits, which makes a pure QA solution infeasible for large-scale real-world problems. To overcome this limitation, the Vector Engine (VE) accelerator can simulate 100K fully connected qubits per PCIe card. Whereas QA exploits quantum fluctuations, simulated annealing (SA), or vector annealing (VA), emulates the quantum effect with a meta-heuristic approach on a classical computer. Compared with other SA solutions on classical computers, VA on the VE supports a larger number of qubits and is able to accelerate the computation. In this presentation we showcase not only the capability to solve combinatorial optimization problems, but also how users can accelerate performance to obtain a real-world applicable solution.
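Vector annealing runs, in vectorized form on the VE, essentially the Metropolis loop below: propose single-bit flips on a QUBO state and accept them under a decreasing temperature. The instance, cooling schedule and sizes are toy values, not the drilling-sequence problem.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 12
Q = rng.normal(size=(n, n))
Q = (Q + Q.T) / 2.0                           # symmetric toy QUBO matrix

def energy(x):
    return float(x @ Q @ x)

x = rng.integers(0, 2, size=n)                # random binary start
e = energy(x)
best_x, best_e = x.copy(), e
n_steps = 4000
for t in range(n_steps):
    T = 2.0 * (1.0 - t / n_steps) + 1e-3      # linear cooling schedule
    i = rng.integers(n)
    x_new = x.copy()
    x_new[i] ^= 1                             # single-bit flip proposal
    e_new = energy(x_new)
    # Metropolis acceptance: always take improvements, sometimes uphill moves.
    if e_new < e or rng.random() < np.exp((e - e_new) / T):
        x, e = x_new, e_new
        if e < best_e:
            best_x, best_e = x.copy(), e
print(best_x, best_e)
```

The VE version evaluates many flips and many replicas in parallel per vector instruction, which is where the claimed acceleration over scalar SA comes from.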
CPU Kernels Energy: Maximization of the Throughput in a Power-Capped Environment
Authors: F. Pautre, A. Mrabet, P. Thierry, L. Saugé, P. Demichel, J. Blanc and V. Arslan
Summary: Component roadmaps show a drastic increase in peak power and TDP for the current and next generations of hardware. This research and talk focus on the power-capping challenge. The main goal of this presentation is to demonstrate that highly optimized and tuned codes from the HPC community have a new opportunity to better balance performance and energy consumption. The first benefit is naturally a reduction in the energy cost of running a specific workload, but our main interest is the maximization of the aggregate throughput of the datacenter.
Performance Preserving Portability across GPU Architectures
Authors: S. Banerjee, A. Panda, S. Reker, S. Frijters, D. Cha, V. Aggarwal and A. St-Cyr
Summary: For developers, one of the challenges in maintaining a code base is the interoperability, or portability, of the application across different platforms and GPU architectures. In this report, we explore tools and methodologies that enable porting existing software to different GPU architectures with minimal changes, while keeping performance nearly the same, if not better.
Open Benchmarking Platform for Data Inversion Methods
Authors: G. Gorman, F. Luporini, A. St-Cyr, A. Loddoch, A. Souza, K. Hester, P. Witte, F. Dupros and M. Araya
Summary: Taking inspiration from success stories such as the SEAM Open Data, we bring together expertise from multiple institutions to devise a platform that provides a trustworthy representation of the challenges in real-life seismic inversion.
Lightning Talk: Acceleration of Reservoir Navigation With Edge Parallel Computing
Authors: A. Cheryauka and T. Geerits
Summary: Real-time mapping of formations, detection of structural anomalies and evaluation of near-well reservoir properties can be achieved through logging while drilling (LWD). Hydrocarbon deposits typically have a quasi-layered structure; however, complex situations with strong 3-D features such as faults or salt domes are becoming increasingly common. Here, we extend the Green's function methodology to compute fully 2-3D scalar wave fields. This finds application in elastodynamic forward and inverse scattering, where, as a first approximation, the elastic coupling between P- and S-waves is often ignored. However, we anticipate that our computational approach will equally benefit borehole acoustics (e.g., LWD borehole acousto-elastodynamic scattering), reservoir acoustics (e.g., distributed acoustic sensing) and reservoir seismics (e.g., surface seismic surveys and vertical seismic profiling). In this paper, we explore fully multi-dimensional geological models and accelerate the physics simulations with edge parallel computing. Along the way, high-performance LWD navigation can achieve safe, fast and confident well placement. We have developed highly efficient and scalable computational means for modeling acoustic responses in multi-dimensional reservoirs. The results demonstrate nearly two orders of magnitude of acceleration for small-to-medium problems through edge computing. Multi-physics LWD, inverse reservoir mapping and monitoring, as well as a variety of AI/ML tasks, can benefit from our platform-agnostic approach.
ML Based Dispersion Filter for Finite-Differences
Authors: A. Agnihotri, N. Kumar, A. Chandran and A. St-Cyr
Summary: The architecture of choice for ML workloads is currently the GPU. Moreover, many of our more traditional workloads, such as full waveform inversion (FWI) and least-squares reverse time migration (LSRTM), are now able to leverage GPU computing. While pure ML algorithms show good promise for solving our imaging problems, a significant amount of research lies ahead before they move to production. A shortcut is to combine traditional modeling algorithms with the established strengths of ML-based ones.
In this work, we propose such a marriage. We show that it is possible to remove any type of dispersion (in time and space) present in finite-difference (FD) solutions by using a carefully crafted DNN. FD methods are mainly employed for wave propagation in FWI and LSRTM. This enables the use of a very low-order solver (4th order) to generate high-quality (high-order) solutions.
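The dispersion such a network is trained to remove can be quantified analytically. Ignoring the time discretization, the phase-velocity error of the 4th-order second-derivative stencil (-1, 16, -30, 16, -1)/12 follows directly from its Fourier symbol:

```python
import numpy as np

# Normalized wavenumber k*h from long wavelengths up to Nyquist (k*h = pi).
kh = np.linspace(0.01, np.pi, 200)

# Symbol of the 4th-order stencil, which approximates (k*h)^2.
sym = (30.0 - 32.0 * np.cos(kh) + 2.0 * np.cos(2.0 * kh)) / 12.0

# Semi-discrete phase-velocity ratio c_num / c at each wavenumber.
ratio = np.sqrt(sym) / kh
print(float(ratio[0]), float(ratio[-1]))
```

The ratio is essentially 1 at long wavelengths and drops to roughly 0.735 at the Nyquist wavenumber: this wavenumber-dependent slowdown is the artifact a dispersion-removal network would learn to undo.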
I/O Requirements of Common HPC Seismic Applications and How to Minimize I/O Bottleneck
By R. Gautam
Summary: I/O from seismic algorithms is challenging to manage, and ML workloads have demanding I/O patterns that differ from those of seismic algorithms. The imbalance between compute and I/O resources is real. We investigated the I/O profiles of our most I/O-demanding applications and, based on this analysis, implemented a way to alleviate file system bottlenecks.
Performance-Aware Build System for HPC and AI Workloads
Authors: P. Souza Filho, G. Renaud, J. Sparks and M. Alt
Summary: We propose to extend an existing CI/CT/CD build system with an auto-benchmark step at the end of the pipeline. The auto-benchmark step executes micro-benchmarks and the application with pre-defined datasets and parameters on selected on-premises and cloud resources, recording the performance of each run. The collected performance information can be plugged into a third-party application catalog or, more interestingly, into a third-party job submission portal/API, which can recommend or automatically select the optimal target machine and configuration for a given workload. We demonstrate how we applied this auto-benchmark process to a finite-difference wave-propagation kernel and how the collected data can assist the decision on where and how to run it.
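The auto-benchmark step reduces to a small harness: run each candidate configuration on a pinned workload, record wall time, and let the portal pick the fastest. The configurations and the kernel below are stand-ins for the real micro-benchmarks and target machines.

```python
import time

def kernel(n):
    """Stand-in compute workload for the micro-benchmark."""
    return sum(i * i for i in range(n))

# Hypothetical configurations mapped to workload sizes.
configs = {"small": 10_000, "medium": 50_000, "large": 200_000}

# Run each configuration once and record its wall time.
records = {}
for name, n in configs.items():
    t0 = time.perf_counter()
    kernel(n)
    records[name] = time.perf_counter() - t0

# Downstream, a portal/API would use these records to recommend a target.
fastest = min(records, key=records.get)
print(fastest, records)
```

In the real pipeline these records would be stored per machine and per dataset, so the recommendation can account for both hardware and workload.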
Toward Full Cluster Resource Utilization
Authors: N. Bienati, L. Bortot and J. Panizzardi
Summary: With real-world HPC applications, it is unlikely that all the resources of a cluster node can be fully used simultaneously by a single application. However, if a single application completely exhausts one of the node's resources, it may prevent other applications from exploiting the remaining resources, which are then likely to remain partially unused.
Moreover, since most seismic imaging algorithms are embarrassingly parallel, in a multi-user, multi-project seismic imaging scenario the challenge is often to execute efficiently a very large number of asynchronous and loosely coupled tasks. This combines with the nature of modern full-wave imaging and velocity analysis, whose memory requirements and compute-unit occupancy can both span several orders of magnitude.
In this presentation we discuss the recommended actions that can be taken to maximize cluster resource utilization when the majority of the workload has these characteristics.
Hybrid Workflows for Large-Scale Scientific Applications
Authors: I. Colonnelli and M. Aldinucci
Summary: Large-scale scientific applications are facing an irreversible transition from monolithic, high-performance-oriented codes to modular and polyglot deployments of specialised (micro-)services. The reasons behind this transition are many: the coupling of standard solvers with deep learning techniques, the offloading of data analysis and visualisation to the Cloud, and the advent of specialised hardware accelerators.
Topology-aware Workflow Management Systems (WMSs) play a crucial role in this transition. In particular, topology-awareness allows an explicit mapping of workflow steps onto heterogeneous locations, enabling automated execution on top of hybrid architectures (e.g., cloud+HPC or classical+quantum). In addition, topology-aware WMSs can offer non-functional requirements out of the box, e.g. orchestration of components' life cycles, secure and efficient data transfers, fault tolerance, and cross-cluster execution of urgent workloads. Augmenting interactive Jupyter Notebooks with distributed workflow capabilities allows domain experts to prototype and scale applications using the same technological stack, while relying on a feature-rich and user-friendly web interface.
This abstract showcases how these general methodologies can be applied to a typical geoscience simulation pipeline based on the Full Waveform Inversion (FWI) technique. In particular, a prototypical Jupyter Notebook is executed interactively on the Cloud: preliminary data analyses and post-processing are executed locally, while the computationally demanding optimisation loop is scheduled on a remote HPC cluster.