Designing a supercomputer that meets the needs of future applications and workloads within a given power envelope, while accounting for a rapidly evolving technology landscape, is not an easy task.

In this context, performance prediction can serve many different purposes, from designing a new microarchitecture or memory hierarchy to defining the interconnect and storage systems of the future.

Several tools already exist for analyzing different aspects of application characterization and performance prediction. So far, however, they have rarely been connected, because they differ in precision and resolution.

Based on a first approximation of the application behavior, mostly involving memory bandwidth (BW) and floating-point (FP) demands, we demonstrate that realistic performance predictions can be easily obtained at the application level for single- and multi-node configurations.
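As a rough illustration of what such a first approximation can look like, the sketch below bounds runtime by the slower of the floating-point work and the memory traffic, in the spirit of a roofline-style estimate. It is not the authors' model: the function name, its parameters, and the single `parallel_efficiency` factor used for multi-node scaling are all assumptions made for this example.

```python
def predict_time(flops, bytes_moved, peak_flops, peak_bw,
                 nodes=1, parallel_efficiency=1.0):
    """Estimate runtime in seconds from FP and memory-bandwidth demands.

    All names and the scaling rule are illustrative assumptions:
      flops               total floating-point operations of the run
      bytes_moved         total bytes exchanged with main memory
      peak_flops          sustained FP rate of one node (FLOP/s)
      peak_bw             sustained memory bandwidth of one node (bytes/s)
      nodes               number of nodes sharing the work evenly
      parallel_efficiency fraction in (0, 1] standing in for communication
                          and load-imbalance losses
    """
    if nodes < 1:
        raise ValueError("nodes must be >= 1")
    if not 0.0 < parallel_efficiency <= 1.0:
        raise ValueError("parallel_efficiency must be in (0, 1]")

    # Time needed if only arithmetic limited the run.
    compute_time = flops / (peak_flops * nodes)
    # Time needed if only memory traffic limited the run.
    memory_time = bytes_moved / (peak_bw * nodes)

    # The slower resource sets the bound; efficiency losses stretch it.
    return max(compute_time, memory_time) / parallel_efficiency


if __name__ == "__main__":
    # Hypothetical kernel: 1 TFLOP of work, 4 TB of memory traffic,
    # on nodes sustaining 1 TFLOP/s and 100 GB/s.
    single = predict_time(1e12, 4e12, 1e12, 1e11)
    multi = predict_time(1e12, 4e12, 1e12, 1e11,
                         nodes=4, parallel_efficiency=0.8)
    print(f"1 node:  {single:.2f} s")
    print(f"4 nodes: {multi:.2f} s")
```

With these made-up numbers the kernel is memory-bound, so adding FP capability alone would not shorten the predicted runtime; only more bandwidth or more nodes would.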




