Monday, July 27 • 8:00am - 12:00pm
Tutorial: Optimization and Tuning of MPI and PGAS Applications using MVAPICH2 and MVAPICH2-X Libraries

The MVAPICH2 software, supporting the latest MPI 3.0 standard, delivers the best performance, scalability and fault tolerance for high-end computing systems and servers using InfiniBand, 10/40 GigE/iWARP and RoCE networking technologies. The MVAPICH2-X software package provides support for hybrid MPI+PGAS (UPC, OpenSHMEM and CAF) programming models with a unified communication runtime. The MVAPICH2 and MVAPICH2-X software libraries (http://mvapich.cse.ohio-state.edu) power several supercomputers in the XSEDE program, including Gordon, Keeneland, Lonestar4, Trestles and Stampede. These libraries are used by more than 2,350 organizations in 75 countries to extract the potential of these emerging networking technologies for modern systems. As of April '15, more than 248,000 downloads have taken place from the project's site. The libraries also power several supercomputers on the TOP500 list, such as Stampede, Tsubame 2.5 and Pleiades.
A large number of XSEDE users rely on these libraries on a daily basis to run their MPI and PGAS applications. However, many of these users and the corresponding system administrators are not fully aware of all the features, optimizations and tuning techniques associated with these libraries. This tutorial aims to address these concerns. Further, as accelerators such as GPUs and MICs are commonly available on XSEDE resources, we present design support and optimization techniques for such systems. We will start with an overview of the MVAPICH2 and MVAPICH2-X libraries and their features. Next, we will cover installation guidelines, runtime optimizations and tuning flexibility in depth. We will then give an overview of configuration and debugging support in MVAPICH2 and MVAPICH2-X, followed by support for GPU- and MIC-enabled systems. Advanced optimization and tuning of MPI applications using the new MPI-T feature (as defined by the MPI-3 standard) in MVAPICH2 will also be discussed. The performance impact of the various features and optimization techniques will be discussed in an integrated fashion. Finally, we present a case study of application redesign to take advantage of hybrid MPI+PGAS programming models.

Majestic C

