Mar 18 – 22, 2021
Stony Brook, NY
US/Eastern timezone

Towards an Interpretable Data-driven Trigger System for High-Throughput Physics Facilities

Mar 19, 2021, 12:20 PM
Stony Brook, NY

Online [US/EST Timezone]


David Miller (University of Chicago)


Data-intensive science increasingly relies on real-time processing and machine learning workflows to filter and analyze the extreme volumes of data being collected. This is especially true at the intensity frontier of particle physics. At the LHC, data filtering algorithms, or trigger algorithms, drive the data curation process: they funnel event records with certain features into predefined categories based on the labels the algorithms extract. Designing, implementing, monitoring, and using these trigger algorithms is resource-intensive and can leave significant blind spots. The menu of trigger algorithms, involving roughly 100 data filters, is designed manually from domain knowledge. In this presentation, we introduce a new data-driven approach for designing and optimizing high-throughput data filtering and trigger systems such as those in use at physics facilities like the LHC. Concretely, our goal is to replace the current hand-designed trigger system with a data-driven one that has minimal run-time cost while preserving the distribution of the output. We draw on key insights from interpretable predictive modeling and cost-sensitive learning to account for non-local inefficiencies in the current paradigm and to construct a cost-effective data filtering and trigger model that does not compromise physics coverage. We next plan to extend this model into a self-driving, continuously learning trigger algorithm that will allow us to discover new physics without extensive knowledge of the parameter space.

Primary author

David Miller (University of Chicago)
