# **ONLINE MACHINE LEARNING BASED EVENT SELECTION FOR COMET PHASE-I**

Yuki Fujii (Monash University), M. Miyataki, Y. Nakazawa, L. Pinchbeck, H. Yoshida, K. Ueno, M. J. Lee

The 23rd International Workshop on Neutrinos from Accelerators

5th August 2022, Salt-Lake city, Utah





### INTRODUCTION

- Everyone wants a fast, high-efficiency and high-purity trigger
  - Plus cost-effectiveness, redundancy, quick realisation, etc.
  - In general, it directly determines the signal efficiency (= experimental sensitivity) of our experiments
  - We want more data, more physics, more, more & more...
  - Several solutions
    - Trigger-less (offline trigger w/ GPUs), hardware-level vetoes, extremely fast data pipeline + gigantic data storage...
- This talk is based on <u>arXiv:2010.16203</u> (Y. Nakazawa *et al.*) + new studies mainly done by M. Miyataki







## **COMET EXPERIMENT PHASE-I**

- Searching for $\mu$-e conversion with a sensitivity of $O(10^{-15})$ in its Phase-I

  - ► Requires ~10<sup>18</sup> total stopping muons per 15<sup>2</sup> days  $\rightarrow$  10<sup>10</sup>  $\mu$ /sec
- Many soft secondary particles are expected inside the detectors
- See Sam Dekkers' talk for more details



Y. Fujii, NuFact2022, Salt-Lake city, Utah



#### > Muon beam produced by impinging the 8 GeV proton beam onto the graphite target

### 8 GeV, 3.2 kW Proton Beam

- Quick realisation to achieve ×100 better sensitivity than the current upper limit
- First 90° of the transport solenoid
- Using a set of Cylindrical Detectors (CyDet) to avoid the direct muon beam
- Direct beam profile measurement using the StrECAL prototype

COMET Phase-I technical design report, PTEP, Vol 2020, Issue 3, March 2020, 033C01, https://doi.org/10.1093/ptep/ptz125





### **CYLINDRICAL DETECTOR (CYDET)**



► CDC: C. Wu *et al.*, <u>DOI:10.1016/j.nima.2021.165756</u>

Signal electrons' trajectories fully contained inside the volume

► CTH: Y. Fujii *et al.*, <u>DOI:10.5281/zenodo.6781368</u>






#### ~5,000 wires, 20 full-stereo layers for momentum measurement, typical drift time < 400 ns

- 2 layers of 64-segmented plastic scintillator rings at both ends of the CDC for the timing measurement
- Suppress accidental events and low momentum particles by taking a four-fold coincidence





## **TRIGGER REQUIREMENTS**

- Strong fake trigger suppression
  - Expected 4-fold coincidence rate is ~90 kHz from fake events in CTH
    - $\rightarrow$ DAQ system requires <13 kHz trigger rate (bottleneck = data processing rate)
  - At least 1/7 further suppression is needed while keeping the high signal acceptance
- Fast online event selection
  - Less than 7 $\mu$sec latency is allowed (limited by the online buffer size)
- Flexibility
  - Timely modification must be possible when the situation changes (BG rate, etc.)
  - Multiple triggers (by-products, calibrations, BG enriched, etc.)
- Stability
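The rate and latency requirements above reduce to simple arithmetic. A minimal sketch, assuming the latency budget is counted in cycles of a 40 MHz clock (the function names are illustrative only, not part of any COMET software):

```python
def required_suppression(fake_rate_hz: float, max_rate_hz: float) -> float:
    """Factor by which the fake trigger rate must be reduced."""
    if max_rate_hz <= 0:
        raise ValueError("max_rate_hz must be positive")
    return fake_rate_hz / max_rate_hz


def latency_budget_cycles(latency_ns: int, clock_mhz: int) -> int:
    """Whole clock cycles that fit inside the allowed latency (integer maths)."""
    return latency_ns * clock_mhz // 1000


# ~90 kHz of fake 4-fold coincidences must fit into a <13 kHz DAQ limit.
print(f"required suppression: x{required_suppression(90e3, 13e3):.1f}")
# 7 usec latency budget, counted on an assumed 40 MHz clock.
print(f"latency budget: {latency_budget_cycles(7000, 40)} cycles")
```

The ratio 90/13 is about 6.9, which is where the "at least 1/7" suppression comes from.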



## **COMET CENTRAL TRIGGER SYSTEM IN PHASE-I**

#### ► FC7 + FCT

- Make a final trigger decision based on CDC trigger info + CTH trigger info + accelerator info
- Distribute the trigger signal & a 40MHz common clock to all readout and trigger modules



























## **COTTRI SYSTEM (2)**

#### ► COTTRI CDC FE

- Purely digital processing board utilising an FPGA (Kintex-7) and Multi-Gigabit data Transfer technologies (MGT link)
- 10 boards cover 100 CDC readout boards, corresponding to 4,800 wires
- Perform hit classification to identify signal-like hits against other proton/low-e hits
- Send this information to the COTTRI merger board through the MGT link






#### 10 layers PCB








## **COTTRI SYSTEM (3)**

#### ► COTTRI MB

- FPGA: Kintex-7
- Receive the Nhit data from 10 COTTRI FEs (DisplayPort ×10) and count the total Nhit
- Perform event classification for the CDC-trigger decision
- Send the CDC trigger info to the central trigger system (FC7) via MGT

[Figure: COTTRI MB V1 board photo and block diagram: 10 COTTRI FEs, Nhit data receiver, total Nhit counter, CDC-trigger sender, central system]













## **DECISION TREE BASED HIT CLASSIFICATION (2)**

Actual implementation

- Perform hit classification by configuring look-up tables (LUTs) with GBDT weighting tables
- One COTTRI CDC FE covers 10 RECBEs = 480 wires; each wire gives 6-bit data (2-bit ADC + neighbouring ADCs) as input, and the decision tree's score is a 6-bit output (larger = signal-like)
- Only one or two clock cycles for the score calculation
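Why a LUT makes the score calculation so fast can be sketched in software. The example below is a minimal illustration, not the actual firmware: it assumes the "neighbouring ADCs" are the left and right wires of the same layer, and `toy_score` is a hypothetical stand-in for a trained GBDT response.

```python
def pack_address(adc_self: int, adc_left: int, adc_right: int) -> int:
    """Pack three 2-bit ADC codes into one 6-bit LUT address."""
    for code in (adc_self, adc_left, adc_right):
        if not 0 <= code <= 3:
            raise ValueError("ADC codes must fit in 2 bits (0..3)")
    return (adc_self << 4) | (adc_left << 2) | adc_right


def toy_score(adc_self: int, adc_left: int, adc_right: int) -> float:
    """Hypothetical stand-in for a trained GBDT response in [0, 1]."""
    own = {0: 0.0, 1: 1.0, 2: 0.6, 3: 0.2}[adc_self]
    neighbours = (min(adc_left, 1) + min(adc_right, 1)) / 2
    return 0.7 * own + 0.3 * neighbours


def build_lut(score_fn) -> list:
    """Tabulate score_fn over all 64 addresses as 6-bit scores (0..63)."""
    lut = []
    for address in range(64):
        adc_self = (address >> 4) & 0b11
        adc_left = (address >> 2) & 0b11
        adc_right = address & 0b11
        lut.append(round(score_fn(adc_self, adc_left, adc_right) * 63))
    return lut


def classify_hits(adc_codes, lut):
    """Score every wire of one layer; wires beyond the edges count as ADC 0."""
    padded = [0] + list(adc_codes) + [0]
    return [lut[pack_address(padded[i], padded[i - 1], padded[i + 1])]
            for i in range(1, len(padded) - 1)]


LUT = build_lut(toy_score)                     # done once, offline
scores = classify_hits([0, 1, 1, 3, 0], LUT)   # one table lookup per wire
print(scores)
```

All the model evaluation happens offline when the table is filled; online, each wire needs only a single memory lookup, which is consistent with a one or two clock cycle score calculation.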




## **TRIGGER FULL CHAIN TEST**

- Electronics full-chain test with a partial CDC
  - CTH FE for GBDTs







### SUMMARY

- GBDT based online hit classification was proposed to strongly suppress the non-trajectory fake trigger
  - See details in Y. Nakazawa's PhD thesis
- Achieved a 96% signal efficiency with less than 13 kHz fake trigger rate, from the original rate of ~90 kHz, based on the simulation study
- A COTTRI system has been designed and the full chain test was performed successfully with the GBDT's LUTs already implemented
- Obtained 3.2 $\mu$sec latency, much shorter than the requirement of 7.5 $\mu$sec
- We want more!
  - Further BG suppression $\rightarrow$ Wider timing window (= larger signal acceptance), new by-product trigger, sustainable data management, etc.



## **NEURAL NETWORK BASED EVENT CLASSIFICATION (1)**

- NNs can be alternative (or additive) to the cut-based event classification after the GBDT hit classifier
  - Pros
    - Excellent pattern recognition capability, especially with deep neural networks
    - Various software packages available for quick model evaluation
    - Much faster than the arithmetic calculations in general
  - Cons
    - Difficult model conversion from networks to the real firmware $\rightarrow$ New tools available (hls4ml)
    - Heavy resource usage (DSP/LUT/BRAM) for the calculation $\rightarrow$ Sparse networks with model quantisations
    - Calibrations(?) uncertainty estimation(?) $\rightarrow$ Not to be covered today





## **MODEL CONSTRUCTION (1)**

#### General workflow of the NN development for FPGA using hls4ml



https://fastmachinelearning.org/hls4ml/concepts.html




## MODEL CONSTRUCTION (2)

- What do we (users) do (in general)?
  - 1. Data preparation and formatting
  - 2. Model selection
  - 3. Parameter tuning (# of layers, sparseness, resolutions, etc.)
    - Grid scanning, built-in/customised tuners, etc.
  - 4. Performance evaluation
    - Accuracy, latency, stability, etc.
  - 5. Resource check
    - Select your FPGA chip and see whether the resources are available
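Steps 3 and 5 can be combined in a simple grid scan. The sketch below is only an illustration under stated assumptions: the layer sizes, sparsity values and the multiplier budget are hypothetical, the cost model simply counts non-zero weights of a fully parallel MLP, and the training and accuracy evaluation of each candidate (step 4) is left out.

```python
from itertools import product


def count_multipliers(layer_sizes, sparsity):
    """Non-zero weights of a fully parallel MLP after pruning."""
    dense = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    return round(dense * (1.0 - sparsity))


def grid_scan(n_inputs, n_outputs, hidden_options, depth_options,
              sparsity_options, multiplier_budget):
    """Return sorted (cost, hidden, depth, sparsity) tuples within budget."""
    candidates = []
    for hidden, depth, sparsity in product(hidden_options, depth_options,
                                           sparsity_options):
        layers = [n_inputs] + [hidden] * depth + [n_outputs]
        cost = count_multipliers(layers, sparsity)
        if cost <= multiplier_budget:
            candidates.append((cost, hidden, depth, sparsity))
    return sorted(candidates)


# Hypothetical scan: 40 inputs, 2 outputs, budget of 500 multipliers.
candidates = grid_scan(40, 2, [8, 16, 32], [1, 2], [0.0, 0.5, 0.9], 500)
for cost, hidden, depth, sparsity in candidates:
    print(f"{cost:4d} multipliers: {depth} x {hidden} nodes, sparsity {sparsity}")
```

In a real scan each surviving candidate would then be trained and compared on accuracy and latency before the final resource check with the synthesis tools.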




## MODEL CONSTRUCTION (3)

- As a first test, we made sets of toy MC signal/background events for NN training/testing
  - 5% noise hits randomly distributed, with/without the arch (signal-like) pattern
- A quantised and sparse multilayer perceptron (QMLP) was tentatively chosen
  - A few hyper-parameters were tuned roughly using a Keras built-in Bayesian optimiser
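What "quantised and sparse" means for the arithmetic can be shown with a tiny integer-only forward pass. This is a minimal sketch, not the model used in the study: the weights, layer sizes and the 4 fractional bits are all hypothetical.

```python
FRAC_BITS = 4  # hypothetical fixed-point resolution (4 fractional bits)


def quantise(value, frac_bits=FRAC_BITS):
    """Round a real number to a fixed-point integer."""
    return round(value * (1 << frac_bits))


def qdense(inputs, weights, biases, frac_bits=FRAC_BITS):
    """Integer dense layer; zero (pruned) weights need no multiplication."""
    outputs = []
    for row, bias in zip(weights, biases):
        acc = bias << frac_bits          # align the bias with the product scale
        for x, w in zip(inputs, row):
            if w != 0:                   # sparse: skip pruned connections
                acc += x * w
        outputs.append(acc >> frac_bits)  # rescale back to frac_bits
    return outputs


def relu(values):
    """Rectified linear activation on integers."""
    return [max(0, v) for v in values]


# Hypothetical sparse weights: most entries are pruned to zero.
hidden_w = [[quantise(w) for w in row]
            for row in [[0.5, 0.0, 0.0, -0.25], [0.0, 0.75, 0.0, 0.0]]]
hidden_b = [quantise(b) for b in [0.0, -0.5]]
out_w = [[quantise(1.0), quantise(1.0)]]
out_b = [quantise(0.0)]

hits = [quantise(x) for x in [1, 0, 1, 1]]   # toy hit pattern
hidden = relu(qdense(hits, hidden_w, hidden_b))
output = qdense(hidden, out_w, out_b)
print(output[0] / (1 << FRAC_BITS))  # 0.25
```

Only three multiplications are performed in the hidden layer instead of eight, and everything stays in small integers, which is what keeps the DSP and LUT usage low on the FPGA.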





Resource usage @Kintex-7 xc7k355T-FFG9 (table truncated in extraction)

| BRAM | DSP | FF | LUT |
|------|-----|----|-----|
| 0    | 0   | 5  |     |

Latency estimated to be 26 clock cycles = 130 ns @200 MHz










## **FIRMWARE DEVELOPMENT (1)**

#### Structure of the "test" firmware










## FIRMWARE DEVELOPMENT (2)

#### ► Firmware simulation with Vivado

[Figure: Vivado simulation waveform, time axis 0 to 100 ns, with an interval annotated as 125 ns at MLP_DONE. Signal values at the cursor:]

| Signal                 | Value          |
|------------------------|----------------|
| COTTRI_VALID           | 1              |
| CLK200MSYS             | 1              |
| AP_START               | 1              |
| AP_RESET               | 0              |
| SCORE[239:0]           | 0000420440c200 |
| MLP_DONE               | 1              |
| MLP_IDLE               | 0              |
| MLP_READY              | 1              |
| CONST_SIZE_IN_1_VALID  | 1              |
| CONST_SIZE_OUT_1_VALID | 1              |
| SIGNAL_OUT_VALID       | 1              |
| BG_OUT_VALID           | 1              |
| SIGNAL_OUT[15:0]       | 00000011110010 |
| BG_OUT[15:0]           | 00000000101010 |
| CONST_SIZE_IN_1[15:0]  | 40             |
| CONST_SIZE_OUT_1[15:0] | 2              |






## HARDWARE TEST (1)

- ► Actual NN firmware module (QMLP) was implemented into the COTTRI MB
  - ➤ Write MC signal/BG data pattern into FE via UDP protocol & send them to MB via **2.4 Gbps** MGT link
  - > NN classification performed inside the FPGA & outputs were checked by using Vivado ILA debug core






## HARDWARE TEST (2)

### > NN firmware implementation results (just obtained in the last week!)

[Figure: Vivado ILA waveform (hw\_ila\_1), ILA status idle, samples 1,006 to 1,012, updated at 2022 Jul 20 19:17:23. Annotations: input data, data in COTTRI, outputs of QMLP. Signal values at the cursor:]

| Signal                 | Value          |
|------------------------|----------------|
| NUM_OF_VALID_FE[7:0]   | 01             |
| COTTRI_VALID           | 0              |
| DpRxDataOut[239:0]     | 04004204108100 |
| INPUTDATA_DEBUG[239:0] | 0000420440c200 |
| COTTRI_SYS_RST         | 0              |
| MLP_IDLE               | 0              |
| MLP_READY              | 1              |
| MLP_DONE               | 1              |
| SIGNAL_OUT_VALID       | 1              |
| BG_OUT_VALID           | 1              |
| CONST_SIZE_IN_1_VALID  | 1              |
| CONST_SIZE_OUT_1_VALID | 1              |
| SIGNAL_OUT[15:0]       | 00000011110010 |
| BG_OUT[15:0]           | 00000000010101 |
| CONST_SIZE_IN_1[15:0]  | 40             |
| CONST_SIZE_OUT_1[15:0] | 0002           |






Received data from COTTRI FE

Output of Signal Classifier ~0.95

Output of BG Classifier ~0.08

Both output values are consistent with both simulation and offline outputs
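The quoted classifier outputs can be cross-checked against the bit patterns seen on the ILA (SIGNAL_OUT = 00000011110010, BG_OUT = 00000000010101). The sketch below assumes an unsigned fixed-point format with 8 fractional bits; this format is not stated on the slides and is chosen only because it reproduces the quoted ~0.95 and ~0.08.

```python
def decode_unsigned_fixed(bits: str, frac_bits: int) -> float:
    """Interpret a binary string as an unsigned fixed-point number."""
    return int(bits, 2) / (1 << frac_bits)


# Bit patterns as displayed by the ILA; 8 fractional bits is an assumption.
signal = decode_unsigned_fixed("00000011110010", frac_bits=8)
background = decode_unsigned_fixed("00000000010101", frac_bits=8)
print(f"signal = {signal:.3f}, background = {background:.3f}")
# signal = 0.945, background = 0.082
```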









- Comparisons for 10 signal/BG events
  - The online classifier shows similar but worse performance compared to the offline MLP models
  - More events to be checked
- This is a very preliminary test in order to establish the workflow for NN implementation on an FPGA
  - More resources available
  - Further optimisations available







### **SUMMARY AND PROSPECTS**

- A fast and highly efficient trigger is essential in the COMET Phase-I experiment
  - Better trigger, more physics
- Online machine learning algorithms inside FPGAs are being developed
  - GBDT based hit classification was developed and the simulation study showed 96% signal efficiency + 13 kHz trigger rate with a very short net latency, 3.2 $\mu$sec
- Additional NN based event classification was proposed and the development has begun
  - Potential increase of the signal sensitivity by a factor of two
  - A sparse QMLP model can be realised with very low FPGA resources
  - We established the workflow, and the NN-based firmware was designed, generated and tested with a real FPGA board




Thank you!

# BACK UP



### ► QMLP model structure

#### Very sparse model was chosen for the first trial

[Figure: QMLP layer diagram; visible layer labels: Dense (q_dense_1), Dense (Input_Dense), Activation (relu1)]





