22-28 July 2018
Kellogg Hotel and Conference Center
EST timezone

Three Dirac operators on two architectures

Jul 23, 2018, 5:10 PM
103 (Kellogg Hotel and Conference Center)


Kellogg Hotel and Conference Center

219 S Harrison Rd, East Lansing, MI 48824
Algorithms and Machines


Dr Stephan DURR (University of Wuppertal)


A simple-minded approach to implementing three discretizations of the Dirac operator (Brillouin, Wilson, staggered) on two architectures (KNL and core_i7) is presented. The idea is to use a high-level compiler along with OpenMP parallelization and SIMD pragmas, but to stay away from cache-line optimization and/or assembly tuning. The implementation is for Nv right-hand sides, and this extra index is used to fill the SIMD pipeline. On one KNL node, single-precision performance figures for Nc=3, Nv=12 read 640 Gflop/s, 320 Gflop/s, and 520 Gflop/s for the three discretization schemes, respectively.
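The loop structure described in the abstract can be illustrated with a minimal sketch in C. This is not the author's code: it assumes a toy one-dimensional periodic nearest-neighbour operator with no colour or spin structure, and the names `apply_toy_hop`, `NV`, and `mass` are hypothetical. It only shows the general pattern of threading over lattice sites with OpenMP while the innermost loop over the Nv right-hand sides is handed to the compiler's vectorizer through a SIMD pragma.

```c
#include <assert.h>
#include <stddef.h>

/* Number of right-hand sides; 12 mirrors the Nv=12 quoted in the abstract. */
#define NV 12

/*
 * Toy 1D periodic operator (an assumption for illustration only):
 *   out[x][v] = mass * in[x][v] - 0.5 * (in[x+1][v] + in[x-1][v])
 * Fields are stored flat with the right-hand-side index v running fastest,
 * so that the inner loop touches contiguous memory.
 */
void apply_toy_hop(size_t n, float mass,
                   const float *restrict in, float *restrict out)
{
    assert(n > 0);

    /* Outer loop over lattice sites: distributed across OpenMP threads. */
    #pragma omp parallel for
    for (long x = 0; x < (long)n; x++) {
        size_t sx = (size_t)x;
        size_t xp = (sx + 1) % n;      /* forward neighbour, periodic  */
        size_t xm = (sx + n - 1) % n;  /* backward neighbour, periodic */

        /* Inner loop over right-hand sides: this index fills the SIMD lanes. */
        #pragma omp simd
        for (int v = 0; v < NV; v++) {
            out[sx * NV + v] = mass * in[sx * NV + v]
                             - 0.5f * (in[xp * NV + v] + in[xm * NV + v]);
        }
    }
}
```

Compiled without OpenMP support the pragmas are ignored and the function runs serially with the same result. With an OpenMP-enabled compiler (for example `-fopenmp` on GCC) the same source is threaded and vectorized without any hand-written intrinsics or assembly, which is the spirit of the approach the abstract describes.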

Primary author

Dr Stephan DURR (University of Wuppertal)
