Foundations of a Fast, Data-Driven, Machine-Learned Simulator


We introduce a novel strategy for machine-learning-based fast simulators, which is the first that can be trained in an unsupervised manner using observed data samples to learn a predictive model of detector response and other difficult-to-model transformations. Across the physical sciences, a barrier to interpreting observed data is the lack of knowledge of a detector's imperfect resolution, which transforms and obscures the unobserved latent data. Modeling this detector response is an essential step for statistical inference, but closed-form models are often unavailable or intractable, requiring the use of computationally expensive, ad-hoc numerical simulations. Using particle physics detectors as an illustrative example, we describe a novel strategy for a fast, predictive simulator called Optimal Transport based Unfolding and Simulation (OTUS), which uses a probabilistic autoencoder to learn this transformation directly from observed data, rather than from existing simulations. Unusually, the probabilistic autoencoder's latent space is physically meaningful, such that the decoder becomes a fast, predictive simulator for a new latent sample, and has the potential to replace Monte Carlo simulators. We provide proof-of-principle results for Z-boson and top-quark decays, but stress that our approach can be widely applied to other physical science fields.
