ROOT I/O Workshop, Early Spring 2018

Europe/Zurich
Brian Bockelman (University of Nebraska-Lincoln), Danilo Piparo (CERN), Philippe Canal (FERMILAB)
Description

Workshop to discuss the current bottlenecks in ROOT I/O and potential solutions.
See the presentations of the previous in-person workshop and the previous Vidyo workshop.

New Vidyo room:
Name: ROOT_IO_Workshop
Description: February 2018 ROOT IO workshop
Extension: 10401125

Attendees: Brian, Chris Jones, Peter V.G., Jim P., Philippe C., David M., Guilherme A., Danilo P., Liz S., Maciej Szymanski, Mikolaj Krzewicki, Matevz, Xavi, Axel, Andrei, Mihaela, Marcin, Oksana, Enric, Peter H., + at least one more.

Peter and Brian pointed out problems with AsyncPrefetching: either deadlocks or corrupted buffers.

Peter: We are very interested in using AsyncPrefetching and are therefore willing to help with the debugging.

Brian: We ought to have a miss cache that then loads all missing baskets.

PR 240 should be mergeable. It needs a rebase and a retest.

Brian: If we extend the default, we should add a way to auto-disable it if the I/O operations are fast enough. Maybe keep an exponential moving average of the I/O time and disable if it is below 1 ms. Maybe decide once per cluster.
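The auto-disable idea above could be sketched roughly as follows. This is not ROOT code: the class name `CacheAutoDisable`, the smoothing factor `alpha`, and the method names are assumptions made for illustration. Only the 1 ms threshold and the once-per-cluster decision come from the discussion.

```python
class CacheAutoDisable:
    """Hypothetical helper: tracks an exponential moving average (EMA)
    of I/O latency and decides, once per cluster, whether the cache
    should stay enabled."""

    def __init__(self, threshold_s=1e-3, alpha=0.2):
        self.threshold_s = threshold_s  # 1 ms, the value floated in the discussion
        self.alpha = alpha              # smoothing factor (assumed, not from the minutes)
        self.ema_s = None               # no measurement yet
        self.enabled = True

    def record(self, latency_s):
        """Fold one measured I/O latency (in seconds) into the EMA."""
        if self.ema_s is None:
            self.ema_s = latency_s
        else:
            self.ema_s = self.alpha * latency_s + (1 - self.alpha) * self.ema_s

    def end_of_cluster(self):
        """Re-evaluate the decision at a cluster boundary only."""
        if self.ema_s is not None:
            self.enabled = self.ema_s >= self.threshold_s
        return self.enabled


# Example: three sub-millisecond reads within one cluster.
policy = CacheAutoDisable()
for latency in (0.0004, 0.0003, 0.0005):
    policy.record(latency)
keep_cache = policy.end_of_cluster()  # EMA stays below 1 ms, so the cache is disabled
```

Because the decision is taken only at cluster boundaries, a single fast or slow read inside a cluster cannot flip the setting on its own.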

Philippe: If there is even one long I/O operation for a given file, you may want to keep the TTreeCache (TC) on even if some (most) operations are faster.

Brian: Now that Prefill exists, should we redo the training for each file?

Philippe: The penalty can be large for a small selection on a low-bandwidth link. Maybe if we keep more statistics we can make an informed decision (e.g. no retraining on low-bandwidth links).

Peter: If there was a cache miss, that is a good indication that you should retrain.

Brian: Change “drop-behind” behavior

Peter: Yes, David Clark introduced this feature.

Brian: Should we also optimize for more than one tree per file?

All: This is really a framework-level use case.

David M: Is Oksana already in contact with the ATLAS I/O performance investigators? If not, then we ought to put them in contact. There is a meeting at CERN on that topic in the first week of March.

Jim: Compared to Parquet, ROOT ‘loses’ on the size of booleans when uncompressed (8 bits vs 1 bit per value). ROOT also stores more metadata (to allow multiple schemas in the same file).

Jim: Conclusion: Parquet is actually very similar to ROOT. It produces smaller files but is slower.
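The boolean size gap can be illustrated with a small sketch, assuming one layout that spends a full byte per boolean and another that bit-packs eight booleans per byte. The function names are hypothetical and neither format's actual on-disk encoding is reproduced here.

```python
def bytes_one_per_value(flags):
    """Encode each boolean as a full byte (0 or 1)."""
    return bytes(1 if f else 0 for f in flags)


def bytes_bit_packed(flags):
    """Pack eight booleans per byte, least significant bit first."""
    out = bytearray((len(flags) + 7) // 8)
    for i, f in enumerate(flags):
        if f:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


flags = [i % 3 == 0 for i in range(1000)]
unpacked = bytes_one_per_value(flags)  # 1000 bytes
packed = bytes_bit_packed(flags)       # 125 bytes, an 8x reduction before compression
```

The factor of eight applies only to uncompressed data. A general-purpose compressor typically narrows the gap, since the byte-per-value layout is highly redundant.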

Chris: Need to avoid the repetitive writing of the partial TTree.

Peter: We need to add asynchronous prefetching to the I/O Program of Work (POW).

Peter: In TBufferMerger we need a way to know which entry number we are at.

    • 4:00 PM - 4:05 PM
      Introduction 5m
      Speaker: Mr Philippe Canal (FERMILAB)
    • 4:05 PM - 4:25 PM
      Enhance TTreeCache defaults 20m
      Speakers: Dr Brian Bockelman (University of Nebraska-Lincoln), Oksana Shadura
      Slides
    • 4:25 PM - 4:35 PM
      LZ4, ZStd updates 10m
      Speaker: Oksana Shadura
      Slides
    • 4:35 PM - 4:50 PM
      Parquet Data Format Performance 15m
      Speaker: Dr Jim Pivarski (Fermilab)
      Slides
    • 4:50 PM - 5:05 PM
      CMS Update 15m
      Speakers: Dr Christopher Jones (Fermilab), Daniel Riley
      Slides
    • 5:05 PM - 5:25 PM
      ROOT I/O Program Of Work 20m
      Speaker: Mr Philippe Canal (FERMILAB)
    • 6:20 PM - 6:35 PM
      Discussions 15m