Snowmass Computational Frontier Workshop

US/Central
Benjamin Nachman (LBNL), Oliver Gutsche (Fermi National Accelerator Laboratory), Steven Gottlieb (Indiana Univ.)
Description

Every half-decade or so the US high energy physics community engages in a planning process that looks ahead five to ten years to prioritize possible future directions and projects.  There used to be a meeting lasting several weeks in Snowmass, Colorado for this exercise.  Although we no longer have a long meeting there, the name Snowmass has stuck.  The previous plan was called Snowmass 2013, and we are now working on Snowmass 2021, which will culminate in a large meeting July 11-20 in Seattle and a report later that fall.  Details can be found on the wiki at snowmass21.org.

The planning is organized by "Frontiers," and we would like to introduce the Computational Frontier.  It is important that experiments and groups doing large scale computations be well represented in the Computational Frontier.  The main page in the wiki for this frontier is here:

 https://snowmass21.org/computational/start

The work within this frontier is organized into seven topical groups:

CompF1: Experimental Algorithm Parallelization

CompF2: Theoretical Calculations and Simulation

CompF3: Machine Learning

CompF4: Storage and processing resource access (Facility and Infrastructure R&D)

CompF5: End user analysis

CompF6: Quantum computing

CompF7: Reinterpretation and long-term preservation of data and code

Each topical group has its own mailing list and slack channel.  Details can be found at the link above for the Computational Frontier, where you will also find links to pages with details about each topical group.

We are pleased to invite the community to our kick-off Computational Frontier meeting, which will take place virtually on August 10 and 11, 2020.  This site serves as the website for the workshop.  At the meeting, each topical group will present its charge and its plans for gathering input from the community. We hope you will attend.

The Zoom connection details for the plenary and parallel sessions are pinned in the #comp_frontier_topics channel on the Snowmass2021 Slack (instructions for joining are at the bottom of https://snowmass21.org).

Registration
  • Monday, 10 August
  • Tuesday, 11 August
    • Community Feedback
      • 54
        Network Requirements and Computing Model R&D for the HL-LHC Era

        Summary: Recent estimates of network capacity requirements for the HL-LHC era indicate that they cannot be met through technology evolution and price/performance improvements alone within a constant budget. An in-depth consideration of the HL-LHC Computing Model is thus needed, and an R&D program to formulate, design, and prototype the new Model is recommended. This program could take advantage of current development projects that provide the capability to set up and allocate end-to-end network paths with bandwidth guarantees, and to coordinate the use of network resources with computing and storage resources.
        2020 Update on the Outlook for Network Requirements:
        In January, at the 43rd LHCOPN/LHCONE meeting at CERN, the LHC experiments expressed the need for Terabit/sec links by the start of HL-LHC operations in 2027-28, preceded by the usual Computing and Storage (and Network) challenges starting during LHC Run 3 (2021-24). This was reinforced by the requirements presented by the DOMA project, which “foresees requiring 1 Tbps links by HL-LHC (ballpark) to support WLCG needs. This is for the network backbones and larger sites…”
        The quoted network capacity requirements are an order of magnitude greater than what is available today through the present national and transoceanic networks based on 100GE links. As discussed at the LHCONE meeting, in the GNA-G Leadership group meeting that followed, and in the HEPIX Techwatch technology tracking group, these requirements cannot be accommodated solely through technology evolution within a constant budget. As a result, the further development of managed end-to-end services for the LHC and other science programs, and the associated plans presented in this note, could be of pivotal importance. Work in this direction should also be guided by DOMA statements that “caching/latency hiding will be important. DOMA is exploring XCache as a mechanism, which provides latency hiding and support for diskless sites (with regional data lakes). Production of AODs (using RAW) will be a network driver, especially regionally. Effectively the ‘site’ is expanded to encompass a ‘region’.”
        • It was agreed in subsequent discussions that the HEPIX Technology Watch WG and/or the Global Network Advancement (GNA-G) leadership group, formed in the fall of 2019, can help define how much of this requirement can be satisfied through technology evolution by 2027, and by 2024 in the preparatory phase.
        • The rest will require a change in paradigm, including end-to-end services spanning sites and networks, together with orchestration, as is being developed in projects such as SENSE, SANDIE and NOTED (described below). Ongoing discussions should continue to conceptualize and define what the required new class(es) of service entail.
        • An important part of this is the persistent testbed being deployed by SENSE in collaboration with AutoGOLE and other collaborating projects. Deployment starts from the current SENSE testbed sites, with extensions to UCSD, CERN, Starlight in Chicago, and a few other sites in the US and overseas.

        Speaker: Prof. Harvey Newman (Caltech)
      • 10:08
        Switch speaker
      • 55
        High-performance computing for global fits of parton distributions

        I summarize computational needs for the determination of parton distribution functions at (N)NNLO accuracy. Our experience with the latest CT18 global analysis of NNLO PDFs indicates the need for a benchmarked infrastructure for accurate and fast determination of PDFs from QCD data. Reducing the current PDF uncertainties to meet the targets of the HL-LHC EW precision program and BSM searches will require substantial computing resources and coordination within the community. I will discuss some ideas along these lines that have been discussed within the EF06 topical group.

        Speaker: Pavel Nadolsky (Southern Methodist University)
      • 10:18
        Switch speaker
      • 56
        Jas4pp - a Data-Analysis Framework for Physics and Detector Studies

        https://www.snowmass21.org/docs/files/summaries/CompF/SNOWMASS21-CompF5-001.pdf

        Speaker: Sergei Chekanov (ANL)
      • 10:28
        Switch speaker
      • 57
        Measuring Python adoption by CMS physicists using GitHub API

        In the pipeline from detector to published physics results, the last step, "end-user analysis," is the most diverse. It can even be hard to discover what tools are being used, since the work is highly decentralized among students and postdocs, many of whom are working from their home institutes (or their homes).

        However, GitHub offers a window into CMS physicists' analysis tool preferences. For the past 7 years, CMSSW has been hosted on GitHub, and GitHub's API allows us to query the public repositories of users who have forked CMSSW, a sample dominated by CMS physicists and consisting of 19,400 user-created (non-fork) repositories.

        In these 7 years, we see a clear reduction in the use of C++ and an increase in the use of Python and Jupyter notebooks. 2019 marks the first year in which CMS physicists created more Python repositories (excluding Jupyter) than C or C++ repositories. Finally, we can also search the code in these repositories for substrings that quantify the adoption of specific physics, plotting, and machine learning packages.

        Understanding how physicists do their work can help us make more informed decisions about software development, maintenance, and training.

        Speaker: Jim Pivarski (Fermilab)
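
        As an illustration of the kind of query described in the abstract above, the following Python sketch (not the speaker's actual code) uses the GitHub REST API to sample users who have forked cms-sw/cmssw and to tally the primary languages of their non-fork repositories. The requests library, the GITHUB_TOKEN environment variable, and the shallow two-page sampling are assumptions made for brevity; a real survey would paginate through all forks and respect the API rate limits.

        import os
        from collections import Counter

        import requests

        API = "https://api.github.com"
        HEADERS = {"Authorization": f"token {os.environ['GITHUB_TOKEN']}"}  # assumed personal access token

        def forkers(pages=2):
            """Yield logins of users who forked CMSSW (first few pages only, for illustration)."""
            for page in range(1, pages + 1):
                r = requests.get(f"{API}/repos/cms-sw/cmssw/forks",
                                 headers=HEADERS,
                                 params={"per_page": 100, "page": page})
                r.raise_for_status()
                for fork in r.json():
                    yield fork["owner"]["login"]

        def language_counts(user):
            """Count the primary language of each non-fork repository owned by `user`."""
            r = requests.get(f"{API}/users/{user}/repos",
                             headers=HEADERS, params={"per_page": 100})
            r.raise_for_status()
            return Counter(repo["language"] for repo in r.json()
                           if not repo["fork"] and repo["language"])

        if __name__ == "__main__":
            totals = Counter()
            for user in forkers():
                totals += language_counts(user)
            print(totals.most_common(10))  # rough language popularity among CMSSW forkers
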
      • 10:38
        Switch speaker
      • 58
        Information Technologies for the HL-LHC in 2030 and Beyond

        Meeting the computing, storage, and communications challenges of the HL-LHC era will rely extensively on emerging technologies that are currently in various stages of conception and pre-specification, and so are not yet on the computing roadmap or, more broadly, the experimental roadmap. I will briefly introduce the physics and technology barriers: nanoscale feature sizes for computation and storage, the energy required for signaling much beyond 1 Tbps, and the application pulls ranging from the Internet of Things to exascale computing to the developing 6G wireless standard, expected to emerge circa 2030 with 1 Tbps links. I will point to current authoritative information sources such as the IEEE International Roadmap for Devices and Systems (IRDS), which discuss the challenges and the visionary approaches being taken to meet them and to make the transition to the Beyond CMOS era of 2030-2040.

        Speaker: Prof. Harvey Newman (Caltech)
      • 10:48
        Switch speaker
      • 59
        The Great Beyond at the Exascale: Dynamical Simulations of the Frezzotti-Rossi model

        We give a very short outlook on the computational aspects of dynamical simulations for the study of the Frezzotti-Rossi model of elementary particle mass generation. Having recently demonstrated via lattice simulations that the non-perturbative mechanism exists, we now plan to investigate the compelling theoretical case that within this framework, we will be able to relate all elementary particle masses to a unique energy scale. More concretely, we hope to relate the Higgs mass to the W and/or top quark masses and further, to predict with 20-30% accuracy the scale of new physics to guide future experimental efforts, all without the shortcomings of technicolor and other composite Higgs models.

        We expect that the simulations required to achieve these goals will be about an order of magnitude more complex and expensive than current state-of-the-art lattice QCD simulations, but we have a roadmap involving quenched and partially quenched setups to proceed via several milestones to our final set of dynamical ensembles, which will certainly require exascale supercomputing resources and a plethora of algorithmic innovations.

        If we succeed, this will be the first time since Wilson that lattice field theory is a tool driving discovery and not "merely" a computational approach to non-perturbative aspects of the Standard Model.

        Speaker: Dr Bartosz Kostrzewa (High Performance Computing & Analytics Lab, University of Bonn, Germany)
      • 10:58
        Switch speaker
      • 60
        BSM Global Fits and GAMBIT

        In this lightning talk, I will introduce the GAMBIT (Global and Modular BSM Inference Tool) framework, a tool for doing global fits of particle physics models to a range of experimental results, including those from colliders, astrophysical and terrestrial dark matter searches, cosmology, neutrino experiments, and precision measurements. I will also briefly discuss the fits that have been undertaken with the code and computational challenges that we have encountered in those efforts.

        Speaker: Prof. Jonathan Cornell (Weber State University)
      • 11:08
        Switch speaker
      • 61
        Graph Data Structures and Graph Neural Networks in High Energy Physics with the ExaTrkX Project

        We present a set of techniques studied by the ExaTrkX collaboration for classification and regression of large scale high energy physics data. Using graph structures and geometric machine learning, we observe excellent performance with particle tracking algorithms on silicon trackers and high-granularity calorimeters for HL-LHC, as well as LArTPCs for neutrino experiments. Promising future research directions include jet reconstruction, particle identification, and particle flow algorithms. We argue that these techniques are viable solutions to the scaling problem of traditional track finding algorithms in the era of experiments such as the HL-LHC, and present results of performance scaling against collision event size. We also argue for the use of heterogeneous solutions, such as distributing training and inference across CPUs, GPUs and TPUs.

        Speaker: Daniel Murnane (Lawrence Berkeley National Laboratory)
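
        For readers unfamiliar with the approach, the minimal sketch below illustrates an edge-classifying graph neural network of the kind described in the abstract above, with detector hits as nodes and candidate track segments as edges. It is an assumption-laden illustration rather than ExaTrkX code: the PyTorch Geometric layers, the three-feature hit representation, and the edge-scoring head are placeholders.

        import torch
        from torch import nn
        from torch_geometric.nn import GCNConv  # assumes PyTorch Geometric is installed

        class EdgeClassifier(nn.Module):
            def __init__(self, node_features=3, hidden=64):
                super().__init__()
                # Two rounds of graph convolution to build node (hit) embeddings.
                self.conv1 = GCNConv(node_features, hidden)
                self.conv2 = GCNConv(hidden, hidden)
                # Score each edge from the concatenated embeddings of its two endpoints.
                self.edge_head = nn.Sequential(nn.Linear(2 * hidden, hidden),
                                               nn.ReLU(),
                                               nn.Linear(hidden, 1))

            def forward(self, x, edge_index):
                h = torch.relu(self.conv1(x, edge_index))
                h = torch.relu(self.conv2(h, edge_index))
                src, dst = edge_index
                return self.edge_head(torch.cat([h[src], h[dst]], dim=-1)).squeeze(-1)

        # Toy usage: 100 hits with three spatial features and 300 candidate edges.
        x = torch.randn(100, 3)
        edge_index = torch.randint(0, 100, (2, 300))
        scores = torch.sigmoid(EdgeClassifier()(x, edge_index))  # per-edge "same track" probability
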
      • 11:18
        Switch speaker
      • 62
        Particle Physics and Machine Learning in Education

        The strong and growing role of machine learning (ML) in particle physics is well established and appropriate given the complex detectors and large data sets at the foundational layer of our science. Increasingly, Physics departments are offering curricula to their undergraduate and graduate students that focus on the intersection of data science, machine learning and physics. In this talk, we provide some perspective on the potential role of particle physics in ML education and present some of the opportunities and challenges in the form of open questions for our community to explore.

        Speaker: Mark Neubauer (University of Illinois at Urbana-Champaign)
    • 11:30
      Break
    • CompF1: Experimental Algorithm Parallelization
      Conveners: Giuseppe Cerati (Fermilab), Katrin Heitmann (Argonne National Laboratory), Walter Hopkins (Argonne National Laboratory)
    • CompF2: Theoretical Calculations and Simulation
      Conveners: Daniel Elvira (Fermilab), Peter Boyle (Brookhaven National Laboratory), Ji Qiang (LBNL)
      • 66
        Theoretical Calculations and Simulation Panel Discussion (CompF2)

        Panel Members

        Event Generators - Hugh Gallagher, Stephen Mrenna
        Accelerator Modelling - Eric Stern
        Detector Modelling - Vincent Pascuzzi, Krzysztof Genser
        Theory (Lattice) - Andreas Kronfeld
        Theory (Perturbative) - Andreas von Manteuffel
        Cosmic Simulations - Salman Habib

        Moderator Questions

        Classical computing:

        Q1. For each field, please explain what science you want to do. Do you expect to be able to achieve your goals with projected computing resources?

        Q2. For each field, please estimate the fraction of computing cycles spent in algorithms that are highly parallel and viable to port to parallel architectures (vector, accelerator, etc.).

        Q3. Do you currently use HTC, HPC, or a mix of computing resources? (How) do you expect this to change in the future?

        Q4. How much human effort is required to support software development or adaptation for new machines?
        (e.g.
        How big is the US effort? Is it part of an international effort?
        How does this compare and fit in? Size of code?
        Number of FTE-years to port to accelerators? Language considerations such as OpenMP offload, SYCL, or CUDA?
        Any difficulties? Plan for long-term code and user support?)

        Q5. Do you need DOE computing lab expert support for (software) R&D, or funding such as ECP or SciDAC?
        (e.g. How much? Do you have collaborations with applied math people? Is there any need for advanced numerical methods?)

        Q6: What will your requirements for data storage be?
        (e.g. Volume? Bandwidth? Distribution? Integrity guarantees? Life cycle? Data sharing?)

        Machine Learning

        Q1 For each field, do you expect to use machine learning in your main algorithms 10 years from now? What application benefits do you expect from ML in your area?

        Q2 Please describe the degree to which you expect to use machine learning in 10 years. What level of certainty do you have?

        Q3 Are you able to use commercial ML packages, like TensorFlow, Baidu, Theano, Torch, or do you need custom software? Do you need a programme of education in ML methods?

        Quantum Computing

        Q1 For each area, do you expect to engage with the development of quantum computing as a scientific activity?

        Q2 For each area, do you expect quantum computers to help solve your computational problems in the next 10 years? 20 years? Are quantum algorithms understood?

        Q3 Is there activity or engagement with quantum algorithm programming?

    • CompF3: Machine Learning
      Conveners: Daniel Whiteson (UC Irvine), Kazuhiro Terao (SLAC National Accelerator Laboratory), Phiala Shanahan (Massachusetts Institute of Technology)
    • CompF4: Storage and processing resource access (Facility and Infrastructure R&D)
      Conveners: Frank Wuerthwein (UCSD), Robert Gardner (University of Chicago), Wahid Bhimji (NERSC, Berkeley Lab)
    • CompF5: End user analysis

      If you're looking for connection information, register to receive it by email.

      If you run into any issues, contact us through our slack channel #compf05-useranalysis or mailing list.

      Email listserv@fnal.gov, no subject, message body: subscribe SNOWMASS-COMPF05-USERANALYSIS Firstname Lastname
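
      For those who prefer to script the subscription, the sketch below sends the same message using Python's standard library. The sender address and SMTP relay are placeholders to replace with your own; it is offered only as an illustration of the instructions above.

      import smtplib
      from email.message import EmailMessage

      msg = EmailMessage()
      msg["To"] = "listserv@fnal.gov"
      msg["From"] = "you@example.edu"               # placeholder: your email address
      msg["Subject"] = ""                           # no subject, per the instructions above
      msg.set_content("subscribe SNOWMASS-COMPF05-USERANALYSIS Firstname Lastname")

      with smtplib.SMTP("smtp.example.edu") as s:   # placeholder: your institute's SMTP relay
          s.send_message(msg)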

      Conveners: Amy Roberts (CU Denver), Gavin Davies (University of Mississippi), Peter Onyisi (University of Texas at Austin)
    • CompF6: Quantum computing
      Conveners: Gabriel Perdue (Fermilab), Martin Savage (INT), Travis Humble (Oak Ridge National Laboratory)
      • 84
        Quantum simulation and hardware co-design
        Speaker: Raphael Pooser (Oak Ridge National Laboratory)
      • 85
        Quantum computing for event generators
        Speaker: Christian Bauer (LBNL)
      • 86
        Quantum algorithms for quantum sensing

        Comments from Andrew Sornborger

        Speaker: Andrew Sornborger
      • 87
        NISQ-era Quantum Devices for HEP
        Speaker: Norbert Linke (Joint Quantum Institute, University of Maryland)
      • 88
        Algorithm Development for beyond NISQ-era devices
        Speaker: Nathan Wiebe
      • 89
        Quantum Networks for HEP

        Nicholas A. Peters, Michael McGuigan, Panagiotis Spentzouris

        Speaker: Nick Peters
      • 90
        Issues in HEP relevant to QML, decoherence, and quantum foundations

        Andreas Albrecht, Andrew Sornborger, Patrick Coles

        Speaker: Andreas Albrecht
      • 91
        Search Strategies for new particles with SRF cavities
        Speaker: Alexander Romanenko (Fermilab)
    • CompF7: Reinterpretation and long-term preservation of data and code
      Conveners: Kyle Cranmer (NYU), Matias Carrasco Kind (NCSA/University of Illinois), Mike Hildreth (Notre Dame University)
    • 13:30
      Break
    • Community Feedback: Panel Discussion
      • 98
        Panel Discussion

        We are planning a panel discussion with the topical groups of the Computational Frontier and the audience about the next steps in the Snowmass process for software and computing.

        We want to ask the panelists and the audience what they have learned in the sessions so far about topics that span several topical groups or even frontiers, and to discuss which topics need follow-up, for example in the form of smaller workshops or dedicated discussion meetings.

        We hope this will help us plan the next steps on the way to our Snowmass report.

    • 15:00
      Break
    • Summary and Outlook
      • 99
        "Experimental Algorithm Parallelization" Topical Working Group
        Speakers: Giuseppe Cerati (Fermilab), Katrin Heitmann (Argonne National Laboratory), Walter Hopkins (Argonne National Laboratory)
      • 15:40
        Speaker switch
      • 100
        "Theoretical Calculations and Simulation" Topical Working Group
        Speakers: Daniel Elvira (Fermilab), Peter Boyle (Brookhaven National Laboratory), Ji Qiang (LBNL)
      • 15:51
        Speaker switch
      • 101
        "Machine Learning" Topical Working Group
        Speakers: Daniel Whiteson (UC Irvine), Kazuhiro Terao (SLAC National Accelerator Laboratory), Phiala Shanahan (Massachusetts Institute of Technology)
      • 16:02
        Speaker switch
      • 102
        "Storage and processing resource access (Facility and Infrastructure R&D)"" Topical Working Group
        Speakers: Frank Wuerthwein (UCSD), Robert Gardner (University of Chicago), Wahid Bhimji (NERSC, Berkeley Lab)
      • 16:13
        Speaker switch
      • 103
        "End user analysis" Topical Working Group
        Speakers: Amy Roberts (CU Denver), Gavin Davies (University of Mississippi), Peter Onyisi (University of Texas at Austin)
      • 16:24
        Speaker Switch
      • 104
        "Quantum computing" Topical Working Group
        Speakers: Gabriel Perdue (Fermilab), Martin Savage (INT), Travis Humble (Oak Ridge National Laboratory)
      • 16:35
        Speaker Switch
      • 105
        "Reinterpretation and long-term preservation of data and code" Topical Working Group
        Speakers: Kyle Cranmer (NYU), Matias Carrasco Kind (NCSA/University of Illinois), Mike Hildreth (Notre Dame University)
      • 16:46
        Speaker Switch
      • 106
        Snowmass 2021 Computational Frontier
        Speakers: Benjamin Nachman (LBNL), Oliver Gutsche (Fermi National Accelerator Laboratory), Steven Gottlieb (Indiana Univ.)