We propose a runtime method to partially or fully override the MPI library inside a container with a version optimized for the target machine. Our approach requires neither a rebuild or update of the container image nor a match between the host and container operating systems. To demonstrate the performance difference, we ran a high-order 3D stencil on two nodes with two MPI processes per node (PPN), comparing the original container using Intel MPI against overridden containers using Cray MPI and MVAPICH2 tuned for the Slingshot fabric of the target machine.
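As an illustration only, the following minimal Python sketch shows one common way a runtime override of this kind can be arranged. It assumes the host MPI library directory is bind-mounted into the container and placed first on the dynamic linker search path, so that an ABI-compatible library is picked up in place of the bundled one (Intel MPI, Cray MPI, and MVAPICH2 all follow the MPICH ABI). This mechanism is an assumption and may differ from the method proposed here. The function name `plan_mpi_override` and every path shown are hypothetical.

```python
def plan_mpi_override(host_mpi_lib, container_mount="/opt/host-mpi/lib",
                      container_ld_path=""):
    """Plan a runtime MPI override (hypothetical helper, not the paper's tool).

    Returns a (bind_spec, ld_library_path) pair:
      bind_spec        "source:destination" string mapping the host MPI
                       library directory to a path inside the container
      ld_library_path  search path with the mounted directory first, so the
                       dynamic linker resolves the MPI library from the host
                       copy before the one shipped in the image
    """
    bind_spec = f"{host_mpi_lib}:{container_mount}"
    # Keep the container's existing entries, dropping empty ones.
    existing = [p for p in container_ld_path.split(":") if p]
    ld_library_path = ":".join([container_mount] + existing)
    return bind_spec, ld_library_path


# Example with made-up paths: a host MPI install overriding a bundled one.
bind_spec, ld_path = plan_mpi_override(
    "/opt/cray/mpich/lib",                   # hypothetical host location
    container_ld_path="/opt/intel/mpi/lib",  # hypothetical in-image location
)
print(bind_spec)
print(ld_path)
```

The two strings would then be handed to whichever container runtime is in use, as a bind mount and as the value of `LD_LIBRARY_PATH` inside the container. The exact options for doing so depend on the runtime and are not shown.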