Summary

This study presents an energy-aware benchmark of 3D orthogonal Full Waveform Inversion (FWI) on the Shaheen III-CPU supercomputer, with the aim of supporting sustainable high-performance computing (HPC) practices. FWI is a computationally intensive seismic imaging method, and its scalability needs careful evaluation in power-constrained environments. The benchmark covers 28 configurations, combining seven power caps (200–400 W) with four OpenMP thread counts (48–192). Each configuration was assessed by measuring execution time and energy usage (in watt-hours), which exposes the trade-offs between speed and energy efficiency.
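As a minimal sketch of how an energy figure in watt-hours can be derived for one run, the snippet below integrates sampled node power over time with the trapezoidal rule. The function name `energy_wh` and the sample values are hypothetical; the summary does not say how power was sampled or integrated, so this illustrates the unit conversion only, not the study's measurement procedure.

```python
def energy_wh(times_s, power_w):
    """Integrate sampled power (W) over time (s) and return energy in Wh.

    Assumes the two sequences have equal length and that times_s is
    sorted in increasing order. Uses the trapezoidal rule.
    """
    if len(times_s) != len(power_w):
        raise ValueError("times_s and power_w must have the same length")
    joules = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        joules += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return joules / 3600.0  # 1 Wh = 3600 J


# Illustrative input only: a constant 300 W draw sampled over one hour.
sample_times = [0.0, 1800.0, 3600.0]
sample_power = [300.0, 300.0, 300.0]
print(energy_wh(sample_times, sample_power))  # 300.0
```

With a constant draw, the result reduces to average power times runtime in hours, which is a quick sanity check for any measured trace.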

The results show that increasing the thread count improves performance only when the power cap is sufficiently high (≥280 W). Beyond 144 threads, however, gains diminish or reverse, especially at lower power caps, because of parallel efficiency limitations. Notably, the configuration with maximum resources (192 threads at 400 W) did not yield the best performance per watt. Instead, mid-range configurations (e.g., 144 threads at 280–320 W) provided the best balance between runtime and energy consumption.
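The distinction between the fastest configuration and the most energy-efficient one can be made concrete with a small ranking sketch. The metrics used here, energy-to-solution and the energy-delay product, are common choices in energy-aware HPC studies but are an assumption on our part; the summary does not state which metric the authors used. The configuration labels and numbers are placeholders, not results from the study. In a real sweep the keys would be (thread count, power cap) pairs.

```python
def rank_configs(runs):
    """Return the best configuration under three criteria.

    runs maps a configuration label to (runtime_s, energy_wh).
    """
    fastest = min(runs, key=lambda k: runs[k][0])
    least_energy = min(runs, key=lambda k: runs[k][1])
    # Energy-delay product: penalizes both slow and power-hungry runs.
    best_edp = min(runs, key=lambda k: runs[k][0] * runs[k][1])
    return {"fastest": fastest, "least_energy": least_energy, "best_edp": best_edp}


# Placeholder measurements (runtime in s, energy in Wh), invented for illustration.
example_runs = {
    "config_a": (2500.0, 140.0),
    "config_b": (1000.0, 80.0),
    "config_c": (900.0, 110.0),
}
print(rank_configs(example_runs))
```

In this made-up example the fastest run is `config_c`, while `config_b` wins on both energy and energy-delay product, which is the kind of divergence the summary describes.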

The findings underscore that, for energy-constrained scientific computing, maximizing hardware usage does not guarantee the best performance; tuning for power efficiency is essential. These insights can guide the selection of sustainable configurations for seismic imaging and other compute-intensive applications on modern HPC systems.

/content/papers/10.3997/2214-4609.2025643011
2025-10-06
2026-02-11