22-28 July 2018
Kellogg Hotel and Conference Center
EST timezone

Lattice QCD on modern GPU systems

Jul 28, 2018, 9:45 AM
Big Ten A (Kellogg Hotel and Conference Center)

219 S Harrison Rd, East Lansing, MI 48824
Plenary Algorithms and Machines Plenary


Dr Mathias Wagner (NVIDIA)


In the 10 years since the creation of the QUDA library for Lattice QCD on NVIDIA GPUs, the hardware and software features of GPU systems have evolved dramatically. Not only has the raw Dslash kernel performance on a single GPU improved by more than one order of magnitude, but modern GPUs are also often deployed in "Fat Nodes" with up to 8 GPUs. We report on the techniques that QUDA implements to achieve high performance on these modern GPU architectures by exploiting features of modern NVIDIA GPUs such as Unified Memory, GPUDirect, and NVLink connections between GPUs and to IBM Power CPUs. We discuss the impact of these optimizations and present scaling results for QUDA on DGX-1-based clusters and Summit. Finally, we give an outlook on future directions. In particular, we preview strong-scaling and programmability improvements from using NVSHMEM, an OpenSHMEM implementation for GPUs, as well as QUDA on NVSwitch-based systems like DGX-2 with 16 fully interconnected GPUs.

Primary author

Dr Mathias Wagner (NVIDIA)


Dr Kate Clark (NVIDIA)
