Weekly CCE-IOS tele-conference

US/Central
Peter van Gemmeren (ANL), Rob Ross (ANL)
BlueJeans Link: https://bluejeans.com/102100194

Rob Ross, Peter van Gemmeren, Doug Benjamin, Chris Jones, Paolo Calafiura, Philippe Canal, Matthieu Dorier, Shane Snyder, Suren Byna, Torre Wenaus, Rob Latham, Saba Sehrish


Topics for future calls:
- RobL: HPC I/O, how we think about it
- Chris: CMS production workflows, or "I/O usage in CMS multi-threaded framework"
- Philippe/Jakob: More details on ROOT, including RNTuple
- Shane: Darshan, what it is
- Matthieu and Saba: HEPnOS and Mochi, what they are
- Doug/Torre: ATLAS Simulation w/ and w/out EventService, including (or additionally) Fast-Simulation, Fast-Chain

- ?: ROOT and its use in ATLAS, CMS, and DUNE (?)
- ?: What is IRIS-HEP doing re: alternative data formats?
- ?: DAOS?


Milestones:
First quarter:
- documentation of patterns
- get to know one another

Second quarter:
- performance of HEP experiment benchmarks
  - using Cori for ATLAS, maybe, or maybe on the Grid...
  - ATLAS Simulation w/out EventService
  - ATLAS EventService Simulation (fine-grained (event-wise) processing)
- instrument ROOT I/O patterns

Experiment use cases:
- Because IRIS-HEP is covering analysis, we should focus on "production" workflows.
  - simulation
    - full simulation (easy), fast simulation (hard); to be presented and discussed.
  - reconstruction
  - derivation -- when they write the physics products -- maybe?
- nail down 3 (or maybe 4) specific use cases
  - not "look at everything"
  - Q: what's the appropriate CMS one?
    - Chris: reconstruction (maybe): something we want to do well
  - something from DUNE?

---

PVG: HEP Experiment and ROOT I/O

files have "compressed baskets" of a tree

1. read compressed baskets
2. decompress baskets -- a basket typically holds data from multiple events/entries
3. deserialize into an object; this creates either "persistent state" or "transient state"
3a. if the result is persistent state, convert it to transient state (T/P conversion)
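
The read steps above can be sketched in a few lines. This is a toy illustration only, not ROOT's file format or API: the record layout (`ENTRY`), the helper names, and the MeV-to-GeV conversion are all assumptions made up for the example, and zlib stands in for whatever compression is actually configured.

```python
import struct
import zlib

# Hypothetical persistent layout: (event id: uint32, energy in MeV: int32).
ENTRY = struct.Struct("<Ii")

def write_basket(entries):
    """Serialize several entries and compress them together as one basket."""
    payload = b"".join(ENTRY.pack(eid, e) for eid, e in entries)
    return zlib.compress(payload)

def read_basket(blob):
    # Step 2: decompress; one basket holds data from multiple entries.
    payload = zlib.decompress(blob)
    # Step 3: deserialize into persistent-state records.
    return [ENTRY.unpack_from(payload, off)
            for off in range(0, len(payload), ENTRY.size)]

def to_transient(persistent):
    # Step 3a: T/P conversion (here: integer MeV -> float GeV).
    eid, energy_mev = persistent
    return {"event": eid, "energy_gev": energy_mev / 1000.0}

# Step 1 would read this blob from a file; here it is built in memory.
basket = write_basket([(1, 13500), (2, 7250), (3, 990)])
events = [to_transient(p) for p in read_basket(basket)]
print(events)
```

An experiment without T/P conversion would stop after step 3 and use the deserialized objects directly.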

Most of this is done by ROOT. CMS doesn't do any T/P conversion.

Compression:
- lossless
- reducing fidelity (e.g. by type conversion) is separate from compression; it is done in serialization
  - sometimes it goes further and is aware of value ranges and such; this is also done in serialization, via bit packing.
  - but this is unusual.
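
A minimal sketch of what range-aware bit packing could look like, to make the distinction from lossless compression concrete. The function names, the 10-bit width, and the [-pi, pi] range are assumptions for illustration; this is not how ROOT or any experiment implements it.

```python
import math

def quantize(value, lo, hi, nbits):
    """Map a float in [lo, hi] to an nbits-wide unsigned integer (lossy)."""
    levels = (1 << nbits) - 1
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) / (hi - lo) * levels)

def dequantize(code, lo, hi, nbits):
    """Recover an approximate float from its integer code."""
    levels = (1 << nbits) - 1
    return lo + code / levels * (hi - lo)

def pack(codes, nbits):
    """Pack fixed-width codes back to back into bytes."""
    acc = 0
    for c in codes:
        acc = (acc << nbits) | c
    return acc.to_bytes((nbits * len(codes) + 7) // 8, "big")

def unpack(data, nbits, count):
    """Inverse of pack: extract count codes of nbits each."""
    acc = int.from_bytes(data, "big")
    mask = (1 << nbits) - 1
    return [(acc >> (nbits * (count - 1 - i))) & mask for i in range(count)]

# Hypothetical example: angles known to lie in [-pi, pi], stored in 10 bits each.
LO, HI, NBITS = -math.pi, math.pi, 10
angles = [0.0, 1.5, -3.0]
codes = [quantize(a, LO, HI, NBITS) for a in angles]
packed = pack(codes, NBITS)
restored = [dequantize(c, LO, HI, NBITS) for c in unpack(packed, NBITS, len(angles))]
```

Three values fit in 4 bytes instead of 24 as doubles, at the cost of a bounded rounding error; a lossless compressor can still be applied to the packed bytes afterwards.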

Serialization:
- decomposition is the job of the Streamer
- every class has a Streamer
- ROOT writes class descriptions with data
- the StreamerInfo list is used to decode an object
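
The idea of writing class descriptions alongside the data can be illustrated with a toy self-describing format. This is only a sketch under stated assumptions: the JSON header, the `write_with_description` / `read_with_description` helpers, and the `Hit` class are hypothetical and bear no relation to ROOT's actual on-disk encoding.

```python
import json
import struct

def write_with_description(name, fields, rows):
    """fields: list of (member name, struct format char).
    Returns bytes holding a class description followed by the data."""
    header = json.dumps({"class": name, "fields": fields}).encode()
    fmt = "<" + "".join(code for _, code in fields)
    body = b"".join(struct.pack(fmt, *row) for row in rows)
    return struct.pack("<I", len(header)) + header + body

def read_with_description(blob):
    """Decode objects using only the description stored in the blob."""
    (hlen,) = struct.unpack_from("<I", blob, 0)
    desc = json.loads(blob[4:4 + hlen])
    fmt = "<" + "".join(code for _, code in desc["fields"])
    names = [n for n, _ in desc["fields"]]
    objects = [dict(zip(names, rec))
               for rec in struct.iter_unpack(fmt, blob[4 + hlen:])]
    return desc["class"], objects

blob = write_with_description("Hit", [("id", "I"), ("charge", "f")],
                              [(7, 1.5), (8, -2.0)])
cls, hits = read_with_description(blob)
print(cls, hits)
```

The reader needs no compiled knowledge of `Hit`; everything required to decode the members travels with the data.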

- splitting into TBranches - decides how member data is mapped onto branches
  - can put in a single branch or split across many branches
  - structs of arrays vs. arrays of structs
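
The single-branch vs. split layouts can be sketched as follows. The `tracks` data and member names (`pt`, `eta`, `nhits`) are hypothetical, and plain `struct` packing stands in for real branch storage.

```python
import struct

# Hypothetical objects: (pt, eta, nhits) per track.
tracks = [(1.0, 0.5, 3), (2.0, -0.25, 4), (4.0, 0.75, 5)]

# Unsplit: one branch, whole objects stored one after another (array of structs).
single_branch = b"".join(struct.pack("<ffi", *t) for t in tracks)

# Split: one branch per data member (struct of arrays).
pt, eta, nhits = zip(*tracks)
split_branches = {
    "pt": struct.pack(f"<{len(pt)}f", *pt),
    "eta": struct.pack(f"<{len(eta)}f", *eta),
    "nhits": struct.pack(f"<{len(nhits)}i", *nhits),
}

# Reading only pt touches a single branch when split...
pt_from_split = list(struct.unpack(f"<{len(tracks)}f", split_branches["pt"]))
# ...but requires decoding every whole object when unsplit.
pt_from_single = [rec[0] for rec in struct.iter_unpack("<ffi", single_branch)]
```

Here reading `pt` from the split layout touches 12 bytes, while the unsplit layout requires all 36; splitting also groups similar values together, which tends to help compression.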

T/P Conversion -- ATLAS specific
- not ROOT specifically
- some experiments use simpler persistent state objects to capture more complex transient classes
  - also helps with schema evolution or custom/domain specific compression

CMS started with a policy that the files generated by the framework should be easily readable without a lot of extra stuff, thus no T/P conversion. File format is meant to be directly readable.


Suren: typical # of baskets, all read?
PVG: Varies widely between different products: 10s to 1000s of branches. ATLAS and CMS are similar. To be followed up.

    • 11:00 - 11:10
      Recap Kick Off meeting (10m)
      Speaker: Dr Peter van Gemmeren (ANL)
    • 11:10 - 11:30
      Discuss projects (20m)
      Discussion of topics and getting volunteers to present in future meetings
      Speaker: All
    • 11:30 - 11:50
      HEP Experiment and ROOT I/O (20m)
      Discussion starter
      Speakers: All, Dr Peter van Gemmeren (ANL)