

# ASIC Group

Location in Instrumentation Division



# Instrumentation Division **Radiation Detectors Platform**

key capabilities enabling BNL Science Programs:

★ Application Specific Integrated Circuits (ASICs)

accelerator applications























https://www.bnl.gov/instrumentation/

# ASIC Group

People, Tools, Collaborations, Areas of Activities and Research Interests



# **ASIC People and CAD/EDA Tools**

Expertise in low-noise, low power, large mixed-signal designs

- 7 full-time ASIC designers (5 PhDs + 2 in PhD program, including industrial background)
- ongoing targeted hiring to satisfy needs across multiple programs
- open to hosting guests/visitors and grad/post-grad students
- working hand-in-hand with in-house TDAQ, PCB, sensors and other groups

### Design tools and methodologies

- industry-standard tools from Cadence, Siemens (Mentor), etc. (analog-on-top or digital-on-top flows)
- analog: full custom flow (VSE, VLE, ADE/Spectre, AMS, PVS, XACT3D-PEX)
- digital RTL2GDS: functional simulation, logic synthesis, automated P&R, parasitics extraction, static timing analysis (XCELIUM/GENUS/INNOVUS/QUANTUS QRC/TEMPUS)
- library characterization: custom standard cell libraries for designs for extreme environments (cryogenics and radiation)
- verification: IR drop (VOLTUS/VOLTUS-fi), functional (SV), physical (PVS, Calibre DRC/ERC/LVS)
- device modeling: TCAD, FEM solvers, transistor model parameter extraction (Silvaco ATHENA-ATLAS-VICTORY, Maxwell, UTMOST4)
- foundry PDKs: TSMC CMOS 350nm, 250nm, 180nm, 130nm, 65nm; GF CMOS and BiCMOS SiGe 130nm, 90nm; plus specialized processes: monolithic CIS on HR, sensors co-design, High-Voltage, etc.
- access to foundries via: MOSIS, CERN-IMEC Foundry Services, IMEC and directly
- packaging: in-house custom and through commercial sources







Chip photographs: LArASIC\_P4 (180 nm), ALFE2 (130 nm), AVG\_DEV (65 nm)



# ASIC Collaborations, Areas of Activities and Research Interests

#### **Collaborations:**

- Universities: SMU, UMich, UPenn, MIT, Georgia Tech, Columbia, USF, UIUC
- National Laboratories: FNAL, LBNL, ORNL, NRL
- Industry: several industrial partners + more collaborators
- International: CERN, OMEGA, KIT Karlsruhe, AGH Krakow, UBonn

Figure captions: cryostat at IO operating down to 4 K / probe station at CFN; RF readout of qubits or quantum sensors

Support for fast-developing scales and functionalities of modern microelectronics:

- meet customers' needs using established processes (130nm, 65nm)
- reach for emerging technologies for R&D (28nm, specialized processes)

#### Areas of Activities and Research Interests:

#### Low-noise and low-power

- custom analog front-end matched to a specific sensor
- front-end circuits optimized for amplitude & time resolution
- data-driven, event-driven or zero-suppressed readout methodologies

Figure: QRFIC P1, layout of 4 K RF test structures in CryoCMOS with 5.12 GHz center frequency (VCO and QVCO for PLL with CML divider and interfaces)



#### Cryogenic operation

- readouts for noble-liquid TPCs: R&D on RO electronics for DUNE's VD TPC, nEXO (LAr, LXe)
- long-lifetime reliability
- development & maintenance of SPICE-type model parameters and characterization (.lib) of standard cell libraries
- RF electronics for quantum sensors (4 K, ≤ 1 K)



V. Manthena et al., "A 1.2-V 6-GHz Dual-Path Charge-Pump PLL Frequency Synthesizer for Quantum Control and Readout in CMOS 65-nm Process", 2020 11th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), 2020, pp. 0570-0576

# ASIC Collaborations, Areas of Activities and Research Interests

#### Radiation-Hardness

immunity to TID, NIEL and SEE effects:

- process (inherent to process)
- design (achieved through proper design techniques)
- development of methodologies for radiation hardness
- development of methodologies for SEE immunity
- exploration of next-gen CMOS and BiCMOS processes for HEP

#### Hybrid-pixel detectors

spectroscopic detectors for X-ray detection (BES, BER, NASA)

#### Lightweight detectors, 3D-IC and HDI

- edgeless and gapless, highly granular pixel detectors with extended functionalities
- development of large-area sensors for the EIC vertex and tracking layers
- event-driven and neuromorphic-suitable arrayed readouts

#### Embedded AI and neuromorphic processing

- data-science-driven co-design methodologies for FE ASICs
- matrix processing with new electron devices: memristors
- libraries from AI tools to Verilog and SPICE, 3rd-gen. NN (SNN)





design flow for the synthesis methodology for FE embedded ANNs



block diagram of the analog in-pixel front-end circuit.



# ASIC developments - examples

**DUNE 3-ASIC system** 

FE for LAr Calorimeter in ATLAS

**Event Driven Readout for Pixel Detectors** 

Data Streaming or Sparsified Readout



### **DUNE 3-ASIC LAr TPC Readout**

- Integrated electronics in the cryostat (giant TPC with ~5 m drift distance) in liquid Ar (80 K, not accessible for the lifetime of the DUNE experiment → HCE reliability)
- Waveforms are digitized at 2 MHz and read out without zero suppression
- Electronic circuits are mounted near the sense wires:
  - Amplifier and Shaper: 16 channels
  - ADC: 16 channels
  - Data Merger and Serializer: 2 × 1.25 Gbps
- 3-ASIC readout for the DUNE far detector:
  - Front-End: LArASIC (180 nm) by BNL
  - Time-interleaved ADC: ColdADC (65 nm) by LBNL, FNAL, BNL
  - Data concentrator/transceiver: COLDATA by FNAL, BNL digital implementation and P&R
- One 10 kTon FD-1HD detector has:
  - 3000 × 128-channel Front End Mother Boards with 24000 × FE ASICs, 24000 × ADC ASICs, 6000 COLDATA ASICs
  - a total of 12000 1.28 Gbps links (9.2 Tbps of waveform data)
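The channel and bandwidth figures above can be cross-checked with a few lines of arithmetic. This is a sketch only: the counts and rates are the ones quoted on the slide, while `BITS_PER_SAMPLE = 12` is an assumption (not stated here) chosen because it reproduces the quoted 9.2 Tbps.

```python
# Numbers quoted on the slide for one 10 kTon FD-1HD module
FEMBS = 3000                    # front-end mother boards
CHANNELS_PER_FEMB = 128
CHANNELS_PER_ASIC = 16          # LArASIC and ColdADC are both 16-channel
SAMPLE_RATE_HZ = 2_000_000      # 2 MHz digitization, no zero suppression
LINKS = 12_000
LINK_RATE_BPS = 1_280_000_000   # 1.28 Gbps per link

# Assumption (not on the slide): 12 bits are transmitted per sample
BITS_PER_SAMPLE = 12

channels = FEMBS * CHANNELS_PER_FEMB
asics_per_type = channels // CHANNELS_PER_ASIC
waveform_bps = channels * SAMPLE_RATE_HZ * BITS_PER_SAMPLE
link_capacity_bps = LINKS * LINK_RATE_BPS
utilization = waveform_bps / link_capacity_bps

print(f"channels               : {channels}")
print(f"FE (or ADC) ASICs      : {asics_per_type}")
print(f"raw waveform data      : {waveform_bps / 1e12:.3f} Tbps")
print(f"aggregate link capacity: {link_capacity_bps / 1e12:.2f} Tbps")
print(f"payload fraction       : {utilization:.0%}")
```

Under these assumptions the script gives 384000 channels, 24000 ASICs of each type and 9.216 Tbps of raw waveform data, about 60% of the 15.36 Tbps aggregate link capacity; the remainder is presumably available for encoding and framing overhead.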






### **DUNE 3-ASIC LAr TPC Readout**



#### Last modification:

- Removed the "ledge" effect (saturation) that caused dead time
- Improved BGR reference to avoid ~10% "no-startup" chips
- Improved linearity and range of calibration DAC
- Added strength to input pad ESD protections
- Solved the reset-quiescent current mismatch problem at LNT
- Improved stability of the output single-to-differential converter
- Applied recommended DRC rules where possible



LArASIC P5/P5B

A 250-wafer LArASIC production run for DUNE (~75k P5 and 75k P5B chips) is under way

# LArASIC P5/P5B Yield QA/QC testing

LArASIC MPW met all the DUNE requirements → fabricated ~1800 P5 and 1800 P5B chips (eng. run) for ProtoDUNE II

| LArASIC chip     | Temp. | Chips tested     | Good chips (all channels normal)       | Yield    |
|------------------|-------|------------------|----------------------------------------|----------|
| P5               | RT    | 49               | 49                                     | 100 %    |
| P5B              | RT    | 1642             | 1635*                                  | ~99.57 % |
| P5B              | LNT   | 317              | 317                                    | 100 %    |

P5B has improved input ESD protection compared to P5.

\*In each of the two chips, only 1 out of 16 channels is non-functional
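The yields in the table follow directly from the tested and good counts. The sketch below recomputes them and, as an addition that is not part of the slides, attaches a one-sided feel for the statistical uncertainty using the standard Wilson score lower bound at roughly 95% confidence; the function names are illustrative.

```python
import math

# (chip, temperature, tested, good) as listed in the table
RESULTS = [
    ("P5", "RT", 49, 49),
    ("P5B", "RT", 1642, 1635),
    ("P5B", "LNT", 317, 317),
]

def yield_percent(good, tested):
    """Plain yield in percent."""
    return 100.0 * good / tested

def wilson_lower_bound(good, tested, z=1.96):
    """Lower edge of the Wilson score interval for a binomial proportion."""
    p = good / tested
    denom = 1.0 + z * z / tested
    centre = p + z * z / (2.0 * tested)
    margin = z * math.sqrt(p * (1.0 - p) / tested + z * z / (4.0 * tested * tested))
    return (centre - margin) / denom

for chip, temp, tested, good in RESULTS:
    print(f"{chip:>3} {temp:>3}: yield {yield_percent(good, tested):6.2f} %, "
          f"Wilson lower bound {100 * wilson_lower_bound(good, tested):6.2f} %")
```

The lower bound makes explicit that a 100% yield measured on 49 chips is a much weaker statement than 99.57% measured on 1642 chips.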



#### **LArASIC** performance with differential interface








Low Noise

**High Linearity** 

Low Crosstalk

### **FE for LAr Calorimeter in ATLAS**



### **LAr Calorimeter in ATLAS**

#### Requirements for FEB2 Preamplifier – Shaper (PA):

- 4 Channel input, 9 Channel output (4 x LG/HG + Trigger sum of 4 channels)
- Input impedance and dynamic range programmability (25 Ohm, 10 mA; 50 Ohm, 2 mA)
- Input impedance tuning < 2.5 % steps
- Peaking time tuning (15 ± 5 ns, 1 ns steps)
- Preamplifier DC level tuning 200, 2.3 V ± 50 mV
- DC output tuning,  $600 \text{ mV} \pm 360 \text{ mV}$ , 30 mV steps

Two Front-End designs were carried out and compared







#### Choice of ALFE PA:

- ALFE PA was selected for the FEB2 due to its excellent noise performance and power supply noise rejection, non-linearity < 0.2 % (HG) over the full DR, and no change in performance under irradiation up to 1 Mrad (beyond specifications)
- 48,832 ASICs will be required to populate the LAr Calorimeter's FEB2


# **Event Driven Readout (EDWARD)**

Introduction

The poster introduces an efficient system for collecting sparse data originating in multiple sources that operate asynchronously, ultimately sending data to the central data acquisition system in such a way that there is no direct relationship between spatial position of the channel and the order of the channels to be transmitted. The protocol and hardware architecture were developed for ASICs destined for reading out 1D or 2D multichannel radiation sensors that can be micro-strip or pixelated radiation sensors. The presented system can be used to read out both digital and analog data from the channels. It is done via shared digital data buses and analog wires.

#### In-channel logic



- This logic is present in each channel; its function is to manage readout transactions between the channel and the global periphery.
- When the data-ready flag 'rdy' is set by the back-end electronics (e.g. peak found, ADC conversion done), the controller block issues the read request 'req' immediately.
- When 'req' is active, the readout phaser block is sensitive to the transition to the active logic state on the channel acknowledge input 'ack'.
- This transition can be described as receiving an acknowledge token with an assigned expiration time, after which 'ack' switches back to the inactive state. The first token initiates the readout transaction.



A single transaction may consist of multiple readout phases in which different data (including data from adjacent channels) may be transmitted sequentially and uninterrupted by requests

The maximum number of phases is determined by the number of flip-flops in the phaser chain. However, the actual number of phases can be dynamically reduced by various 'cfg' configurations.

- Only one bit of the readout control 'rdo' is active during each transaction. This active bit is used to enable the corresponding bank of tristate buffers and transmission gates.
- After the last phase is processed, the done flag 'dne' is set; as a result, the next token initiates the reset procedure for the in-channel logic, during which 'req' is

**Default bus state** 

The 'rqo' output from the arbitration tree is effectively the logical sum of requests from all channels. This signal, however, is not synchronized in any way with the acknowledge tokens: a request may arrive after a token has expired, or so late that the remaining active time of the token in the channel is too short to start a transaction. For this reason, a mechanism is needed to distinguish data read out from a channel from an empty bus state. This has been implemented as a network of pull-ups and pull-downs that define the empty-data pattern. This pattern can then be discarded on-chip by a peripheral circuit or off-chip in the acquisition system.



Fig.1 Block diagram of event driven readout system

#### **Synchronization**

The data are latched inside the output periphery by the clock 'clk'. Latching the data synchronizes the readout with the data acquisition system. Data are latched before the generation of each new token, yielding a new set of latched data for each token. Data can be sent serially off the chip. The serialization clock is therefore used to generate the readout tokens through appropriate division, and the duty cycle and frequency of the divided clock 'clko' can be chosen with considerable latitude.

D.S. Gorni, et al., "Event driven readout architecture with non-priority arbitration for radiation detectors", 2022 JINST 17 C04027

#### **TWEPP 2021**

#### **Arbitration tree**



Upon receiving read request signals 'reqX', an arbitration cell selects one of them and routes the acknowledge token 'acki', which arrives from the arbitration cell above it in the tree, in the direction from which the selected request came. Routing is done by gating the acknowledge signal with the grant signals (gnt0, gnt1) generated by the arbiter. The arbiter decides which of the two read request signals is selected, with no priority between them. When two read request signals arrive simultaneously, one of them is selected at random.

Almost all arbitration cells must decide not only which of the two read request signals can be serviced, but also whether new read request signals arrive while the acknowledge signal is active. The latter raises the need to arbitrate between the read request signals and the acknowledge signals, leading to the general concept of a readout control system with arbitration that operates without distributing any system clock.
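The token routing described above can be illustrated with a small behavioral model. This is a software analogy only, assuming a balanced binary tree over a power-of-two number of channels; a pseudo-random generator stands in for the arbiter's resolution of simultaneous requests, and the function names `rqo` and `route_token` are illustrative rather than taken from the design. Timing, token expiration and the request-versus-acknowledge arbitration are not modeled.

```python
import random

def rqo(requests):
    """Logical OR of all channel requests, as seen at the root of the tree."""
    return any(requests)

def route_token(requests, rng):
    """Route one acknowledge token from the root to a requesting channel.

    requests: list of booleans, one per channel (length a power of two).
    Returns the index of the channel that receives the token, or None
    when no channel is requesting.
    """
    if not rqo(requests):
        return None
    lo, hi = 0, len(requests)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        req0 = any(requests[lo:mid])   # request from the first branch
        req1 = any(requests[mid:hi])   # request from the second branch
        if req0 and req1:
            gnt0 = rng.random() < 0.5  # no priority: pick either branch
        else:
            gnt0 = req0
        if gnt0:
            hi = mid
        else:
            lo = mid
    return lo

rng = random.Random(2021)
requests = [False] * 8
requests[0] = requests[6] = True
print([route_token(requests, rng) for _ in range(10)])
```

Only requesting channels ever receive the token, and when two branches compete neither is systematically favored, which is the non-priority property the poster emphasizes.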

#### **Design and results**



Fig. 6 presents the layout of a pixel matrix consisting of 64 channels. The physical design was implemented using automatic P&R tools and the TSMC 65 nm standard cell library, with added designs for Seitz arbiters. The squares shown in the figure are placeholders for the analog



Transistor-level simulation results are shown in Fig. 7. During a transaction each channel sends its address (6 bits) and each group sends its address (8 bits). The merged value is observed on the digital bus. Config '00' results in one readout phase and '01' in two phases. It is worth noting how the token is passed from one channel to another after a transaction is done: the token is reused and no dead time is observed.

### **All-Digital Platform for Pixel Detectors**

#### EDWARD - Event Driven With Access and Reset Decoder

1. receives notification about channel ready to be read out (rdy signal),
2. sends request (req signal) to access shared bus,
3. transmits request signal to synchronization unit (rqo signal) with simultaneous arbitration if there are multiple requests,
4. transmits acknowledge token (acki signals) to channel (ack signal) that wins arbitration = granting permission for exclusive access to bus,
5. lets channel drive its data to bus,
6. defines access time frame to channel and, if necessary, lets several data packets from same channel through uninterruptedly in multiple phases,
7. switches immediately, without dead time, between channels if there is still at least one readout request after completing current readout,
8. establishes default bus state if no channel is currently being read out.
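The sequence above can be sketched at the level of the shared bus, one word per token. The model below is a simplification under stated assumptions: each transaction has a single phase, the bus word places the 8-bit group address above the 6-bit channel address (the actual merging order is not given here), the default bus state is taken to be an all-ones pattern reserved for "no data", and a random pick stands in for the arbitration tree. All names are hypothetical.

```python
import random

CHANNEL_BITS = 6   # 64 channels per group
GROUP_BITS = 8
# Assumption: all-ones is the default (empty) bus pattern and is never
# used as a valid group/channel address.
EMPTY = (1 << (CHANNEL_BITS + GROUP_BITS)) - 1

def pack_word(group, channel):
    """Merge group and channel addresses into one bus word (assumed layout)."""
    return ((group & 0xFF) << CHANNEL_BITS) | (channel & 0x3F)

def unpack_word(word):
    """Split a bus word back into (group, channel)."""
    return word >> CHANNEL_BITS, word & 0x3F

def read_out(requesting, group, n_tokens, rng):
    """Return the bus word latched for each of n_tokens successive tokens."""
    pending = set(requesting)          # channels with 'req' active
    bus = []
    for _ in range(n_tokens):
        if pending:
            winner = rng.choice(sorted(pending))  # non-priority arbitration
            pending.remove(winner)                # channel resets after readout
            bus.append(pack_word(group, winner))
        else:
            bus.append(EMPTY)                     # default bus state
    return bus

def discard_empty(bus):
    """What the periphery or DAQ would do: drop the empty pattern."""
    return [unpack_word(word) for word in bus if word != EMPTY]

bus = read_out({3, 17, 42}, group=2, n_tokens=5, rng=random.Random(1))
print([hex(word) for word in bus])
print(discard_empty(bus))
```

Pending channels are served on consecutive tokens with no idle token in between, and the tokens that find no request simply latch the empty pattern, which is filtered out downstream.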



Universal All-Digital Platform for Implementation of Configuration-Testability-Readout Functionalities within Pixel Detectors

32 × 32 pixels matrix obtained by tiling 4 × 4 basic groups that is suitable for tiling into still larger matrix sizes. All pins are placed on one side for easy connections to peripheral circuitry logic



8 × 8 pixels base group layout for a 100 × 100 μm<sup>2</sup> pixel detector. Each brown square is space left for the AFE (size = 90 × 90 μm<sup>2</sup>) that is added on top.

# Data Streaming or Sparsified Readout

- Detector readout concepts in multiple experiments are based on a very similar principle of:
  - pulse shaping signals with Front-End shaping filters,
  - digitization of waveforms (or digitization of detected signal extrema),
  - performing signal processing in the digital domain (extraction of additional features: amplitude, ToA, occurrence of pileups, etc.),
  - zero suppression or compression of data and conditioning for transmission,
  - transmission off the detector on high-speed, serial links.
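The digital-domain steps in this chain (feature extraction and zero suppression) can be sketched on an already digitized waveform. This is a minimal illustration, assuming a simple fixed threshold and taking the first above-threshold sample as the time of arrival; real front-ends use more refined estimators, and the function names are hypothetical.

```python
def _summarize(samples, start, stop):
    """Features of one above-threshold region samples[start:stop]."""
    window = samples[start:stop]
    peak = max(window)
    return {
        "toa": start,                          # first sample above threshold
        "peak": peak,                          # amplitude estimate
        "peak_index": start + window.index(peak),
        "samples": window,                     # what survives zero suppression
    }

def extract_hits(samples, threshold):
    """Zero-suppress a waveform and extract per-pulse features."""
    hits = []
    start = None
    for i, value in enumerate(samples):
        if value > threshold and start is None:
            start = i                          # rising threshold crossing
        elif value <= threshold and start is not None:
            hits.append(_summarize(samples, start, i))
            start = None
    if start is not None:                      # pulse still open at the end
        hits.append(_summarize(samples, start, len(samples)))
    return hits

waveform = [0, 1, 0, 2, 9, 14, 8, 1, 0, 0, 3, 11, 6, 0]
hits = extract_hits(waveform, threshold=5)
kept = sum(len(hit["samples"]) for hit in hits)
for hit in hits:
    print(hit)
print(f"kept {kept} of {len(waveform)} samples")
```

Sending only the `toa` and `peak` fields corresponds to a sparsified readout, while sending the retained `samples` corresponds to zero-suppressed waveform streaming.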

#### Streaming readout

#### VMM3 features:

- 64 input channels / ASIC
- digital output for amplitude and ToA (but measured in analog way)
- low power <10 mW/ch
- $\tau_p$ = 25, 50, 100, 200 ns (both polarities), and gain 0.5, 1, 3, 4.5, 6, 9, 12, 16 mV/fC
- TAC slope 60, 100, 350, 650 ns
- neighbor logic, SPI for configuration
- buffering latency FIFO (up to 64 events)
- 8b/10b encoding



Architecture of the VMM. Most of the block diagram describes one of the identical 64 channels of the chip (The VMM readout system, BNL-213684-2020-JAAM)

VMM: legacy successful BNL design in GF 130 (former IBM 8RF-DM) process

VMM3/3a developed 2015-2017, 10M FETs; VMM3a is the production version

- Suitable for a variety of applications
- Nevertheless, a development version with upgraded functionalities is currently under discussion

### Scalable Next-Gen. Detector Front-End

- Optimized two-ASIC solution:
  - separates high-sensitivity analog from mixed-signal and digital circuits,
  - allows optimal allocation of functionality with fewer risks,
  - speeds up timeline through independent, parallel development paths,
  - does not significantly increase packaging complexity vs. a "single ASIC",
  - under active development for upcoming experiments such as nEXO



### **Two-ASIC Solution for nEXO**

nEXO light readout SiPM Interposer and Electronics Daughterboard Interconnections - Components



1. The nEXO photo-detection RO is similar to the DUNE TPC RO, with the ADC and data transmission combined in one chip, while settling on a separate Front-End, a modified LArASIC, for reading out SiPMs
2. The two-ASIC solution seems to be optimal: it allows the best use of fabrication processes, separates analog and digital functionalities, and allows independent optimization of both
3. The two-ASIC solution also allows maximal flexibility, i.e. one universal ADC-transmission ASIC can be coupled to a variety of front-end ASICs

# Streaming vs. Data Push





A digital Front-End ASIC can be designed to allow both strategies, adapting to data rates, the power that can be dissipated in the fiducial volume, confidence in processed data, etc.

### **Two-ASIC Solution for nEXO**



1. Pushing most processing into the digital domain, including
   - a. finding of extremum and time of arrival
   - b. handling pileups
   - c. extraction of additional pulse features
2. Emphasis on power consumption
3. Use of a low 1.2 V supply voltage for transmitter and receiver circuits
4. Configurable transmission modes with data push as baseline (<50 Mbps), but also allowing sending unprocessed data (~500 Mbps)
5. Simplified interfaces: one link up and one link down
6. Combining all signals encoded into one down-link receiver
7. Configurability of embedded processing
Main design components: FE buffers and multiplexer, time-interleaved ADC, DSP (AI or FIR filters), transmission encoder, serializer, X Gbps line driver, CDR-PLL, command and slow control interpreter, embedded testability, LDOs, etc.



# **Building Block #1: Fast, LP ADC**

#### Low-power 12-bit hybrid ADC design in 65 nm CMOS:

- Overall: 8-bit SAR (MSB) + 5-bit digital slope (LSB).
- SAR/digital slope boundary uses 1 redundant bit for robustness, thus resulting in 12-bit resolution.
- Asynchronous successive approximation (SAR) converter:
  - Fully-differential charge redistribution architecture.
  - Split capacitor DAC (C<sub>unit</sub> = 20 fF) using an energy-efficient merge-and-split (MS) switching scheme.
  - Uses one redundant conversion cycle (9 cycles for 8 bits) to obtain robustness to comparator noise and V<sub>REF</sub> settling error.
- Asynchronous digital slope (DS) converter:
  - Asynchronous (self-timed) delay line-based architecture to ensure low power and robustness to PVT variations.
  - DS capacitors ( $C_0 = C_{unit}/8 = 2.5 \text{ fF}$ ) are laid out within the SAR DAC to minimize SAR-DS gain mismatch and simplify calibration.
- Additional features:
  - On-chip reference buffer w/ cancellation of switching transients.
  - On-chip digital calibration of comparator offset; supports off-chip calibration of DAC capacitor mismatch.





#### Preliminary performance specifications

| Parameter                     | Value              |
|-------------------------------|--------------------|
| Sampling rate                 | Up to 50 MS/s      |
| Output resolution             | 12-bit             |
| Full-scale voltage            | $1.7 V_{pp}$       |
| Effective no. of bits (ENOB)  | 11.0               |
| Core power (at 50 MS/s)       | 820 μW             |
| Reference buffer power        | 1.15 mW            |
| Walden FOM (including buffer) | 19.5 fJ/conv.-step |
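The Walden figure of merit is conventionally defined as power divided by (2^ENOB × sampling rate), giving energy per conversion step. The sketch below applies that standard definition to the rounded table values; the function name is illustrative.

```python
def walden_fom(power_w, enob, sample_rate_hz):
    """Walden figure of merit in joules per conversion step."""
    return power_w / (2 ** enob * sample_rate_hz)

CORE_POWER_W = 820e-6
BUFFER_POWER_W = 1.15e-3
ENOB = 11.0
SAMPLE_RATE_HZ = 50e6

fom_core = walden_fom(CORE_POWER_W, ENOB, SAMPLE_RATE_HZ)
fom_total = walden_fom(CORE_POWER_W + BUFFER_POWER_W, ENOB, SAMPLE_RATE_HZ)

print(f"core only        : {fom_core * 1e15:.1f} fJ/conv.-step")
print(f"including buffer : {fom_total * 1e15:.1f} fJ/conv.-step")
```

With the rounded inputs this gives about 19.2 fJ per conversion step including the buffer, close to the quoted 19.5; the small difference may simply reflect rounding of the ENOB or power values in the table.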

### **ADC - Details**

Block diagram of the ADC: reference buffer with band-gap input ($V_{BG}$, $V_{REF}$), offset-cancellation DACs (2x, for SAR and DS), SAR converter (split DAC, DT-CMP, asynchronous SAR controller), DS converter (DAC for DS, CT-CMP, asynchronous DS controller), digital error correction (DEC) logic, calibration registers, and main controllers.

- Includes four major operating modes:
  - Normal operation
  - Discrete-time comparator (DT-CMP) calibration
  - Continuous-time comparator (CT-CMP) calibration
  - DAC capacitor mismatch calibration
- All digital logic is realized inside deep N-wells to provide isolation from on-chip analog circuits.
- Uses separate analog and digital power supplies (VDD = 1.2 V, VDDA = 1.8 V).
  - The analog supply is used by the band-gap reference (V<sub>BG</sub>) and reference buffer.
- Delay tuning bits within the controllers can be externally set (2-bit resolution).



# **ADC - Test Chip**





- The 1 mm<sup>2</sup> test chip integrates the ADC with accessory circuits (band-gap voltage reference, clock receiver, sampling clock generator, output serializer, serial programming interface, and power-on reset).
- Layout area of the ADC core (not including reference buffer) = 320 μm x 160 μm.
- Chip layout and verification is complete; queued for fabrication in mid-June.



### **Building Block #2: LP Line Driver with User-Configurable Pre-Emphasis**

Back to the definition: construction of pre-emphasis with DLLs and ganged SST drivers

- ~10% adjustability of the 'a', 'b', 'c', 'd', 'e' variables with respect to UI is desired
- for the nEXO Taiflex cable, the best settings were found: 'a' ⇒ 10% of UI; 'b' ⇒ 90% of UI; 'c', 'd', 'e' ⇒ 0.5\*VDD



- a: Duration of pre-emphasis prior to an edge transition
- b: Duration of pre-emphasis after an edge transition
- c: Normal logic level (no pre-emphasis)
- d: Pre-emphasized level prior to edge transition
- e: Pre-emphasized level after edge transition





### **Line Driver**

- three-tap Finite Impulse Response (FIR) filter is constructed with adjustable delays and weights for each
- adjustable delay is selected via an 8-Stage Delay-Locked Loop (DLL) and Digital Interpolators (DIs)
- False Lock Detector used for DLL to ensure no degradation in performance
- weights are configured by selecting the number of Source-Series Terminated (SST) drivers enabled in an array
- data is fed into Voltage-Controlled Delay Lines (VCDLs) controlled by the DLL to create delayed copies that are selected via the DIs
- SST drivers are implemented with selectable resistance to optimize output impedance for either low power or closer impedance matching.

| Parameter                     | Value        |
|-------------------------------|--------------|
| Data Rate                     | Up to 2 Gb/s |
| Min. Relative Tap Delay       | 1/16*UI      |
| Min. Tap Amplitude            | 75 mV        |
| Non-Driver Power (at 1Gb/s)   | 2.4 mW       |
| Total System Power (at 1Gb/s) | 7.2 mW       |
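The three-tap FIR pre-emphasis with adjustable delays and weights can be illustrated numerically. This is a behavioral sketch only: the waveform is oversampled at 16 points per UI to match the minimum relative tap delay of UI/16, and the tap delays and weights below are arbitrary illustrative values, not the settings used for the driver or the nEXO cable.

```python
SAMPLES_PER_UI = 16   # resolution matching the UI/16 minimum tap delay

def nrz(bits):
    """Oversampled NRZ waveform with levels -1.0 and +1.0."""
    return [1.0 if bit else -1.0 for bit in bits for _ in range(SAMPLES_PER_UI)]

def three_tap_fir(wave, taps):
    """Weighted sum of delayed copies of the waveform.

    taps: list of (delay_in_samples, weight). The first and last samples
    are held at the waveform edges.
    """
    n = len(wave)
    out = []
    for k in range(n):
        acc = 0.0
        for delay, weight in taps:
            idx = min(max(k - delay, 0), n - 1)
            acc += weight * wave[idx]
        out.append(acc)
    return out

# Illustrative taps: pre-cursor 0.25 UI ahead of the main tap,
# post-cursor 1 UI behind it; |weights| sum to 1 to bound the swing.
TAPS = [(0, -0.1), (4, 0.7), (20, -0.2)]

wave = nrz([0, 0, 0, 1, 1, 1])
out = three_tap_fir(wave, TAPS)
for k in (40, 49, 60, 90):
    print(k, round(out[k], 3))
```

For this rising edge the output sits at the normal low level, dips briefly before the transition, overshoots for one UI after it, and then settles to the normal high level, which corresponds to the pre-emphasis durations and levels that the driver makes configurable.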





Eye opening: 0.72 UI and 202.6 mV (no RX equalization)

SST Driver Array

# Building block #3: Efficient data readout architectures – slides 14-15

# Building block #4: LC PLL

- Potential specifications for a universal PLL/CDR for digital FE ASIC
  - VCO operating frequency ~6 GHz,
  - Programmable divider to support a variety of clock frequencies and data rates,
  - Loop bandwidth in CDR mode > 2 MHz,
  - Loop bandwidth in PLL mode > 20 kHz,
  - Extended to Cryogenic Temperature Range of operation,
  - Guaranteed operation up to 500 Mrad,
  - Immune to SEE





# Summary

- Presented a few areas of BNL's ASIC development related to HEP and NP
- Many subjects were not even touched:
  - AI/ML embedded in FE
  - MAPS for the EIC
  - Pixel detectors for X-ray detection
  - Projects for other funding sources

Work of BNL's current ASIC team: Soumyajit Mandala, Sandeep Miryala, Venkata Narasimha Manyam, Giovanni Pinaroli, Nick St. John, Dominik Gorni, Grzegorz Deptuch

and many other designers





At Brookhaven National Laboratory, we all play a part in tackling the most important questions that face our nation and world today. With world-class facilities and experts in a variety of fields, we've created a legacy of seven Nobel Prize winning discoveries and countless breakthrough innovations. And we keep this legacy going every day by hiring those who are excited by innovation and pursue curiosity with passion.

#### **Current Job Openings**

We are currently recruiting for positions within our Instrumentation Division

We are looking for Research Staff in ASIC design and sensor instrumentation:

Junior and Senior ASIC Designer Engineer/Scientist (Electronic Engineer/Scientist/Physicist) to develop ASICs for particle detectors, high resolution X- and gamma-ray spectrometers, high-rate photon counters and imagers, data processors, hybrid and monolithic pixel detectors and more.

He/She will be developing state-of-the-art integrated circuits in modern CMOS/BiCMOS and OPTO technologies.

Visit our career site: https://jobs.bnl.gov/ for more information regarding careers @ BNL

Brookhaven National Laboratory offers an excellent benefits package, tuition reimbursement, and a competitive salary.

#### Our benefits include, but are not limited to:

Medical/Dental/Vision, generous vacation and holiday plan, Retirement savings, Swimming Pool, Weight Room, Tennis Courts, plus many other perks.

Meet us at the IEEE NSS&MIC Conference or email us to learn more:
Dr. Gabriella Carini (carini@bnl.gov) or Dr. Grzegorz Deptuch (gdeptuch@bnl.gov)

