ROOT I/O Workshop, Early Spring 2018

Europe/Zurich
Brian Bockelman (University of Nebraska-Lincoln), Danilo Piparo (CERN), Philippe Canal (FERMILAB)
Description

Workshop to discuss the current bottlenecks in ROOT I/O and any potential solution.
See the presentations of the previous in person workshop and the  previous vidyo workshop

New Vidyo room:
Name: ROOT_IO_Workshop
Description: February 2018 ROOT IO workshop
Extension: 10401125
Auto-join URL
Useful links: Phone numbers

Attendees: Brian, Chris Jones, Peter V.G., Jim P, Philippe C., David M., Guilherme A, Danilo P, Liz S., Maciej Szymanskim Mikolaj Krzewickim, Matevz, Xavi, Axel, Andrei, Mihaela, Marcin, Oksana, Enric, Peter H., + at least one more.

Peter and Brian are pointing problems with the AsyncPrefetching either dead-locks or corrupted buffers.

Peter: we are very interested in using this AsyncPrefetching and thus helping with the debugging.

Brian: we ought to have a miss-cache that then load all missing baskets.

PR 240 should be able to be merged in. Need rebase and retest.

Brian: If we extend the default we should add a way to auto-disable it if the I/O operation are fast enough.  Maybe keeping a Exponential Moving Average and if below 1ms disable.  Maybe decide once a cluster.

Philippe:  If there is ’one’ long I/O for a given file, you may want to keep the TC on even if some (most) operation are faster.

Brian: With the Prefill now exists, should we redo the training for each file?

Philippe: The penalty can be large for small selection on low bandwidth link.  Maybe if we are keeping more statistic we can make an inform decision (don’t do retraining for low-bandwidth)

Peter: If there was mis-cache, this is a good indication you should do retraining.

Brian: Change “drop-behind” behavior

Peter: Yes, David Clark introduced this feature.

Brian: Should we also optimize for more than one tree per file?

all: this is really a framework level use case.

David M: Is Oksana already in contact with the ATLAS I/O performance inverstigators?  If not, then we ought to put them in contact.  There is meeting at CERN regarding that the first week of March.

Jim: Compared to parquet, Root ‘lose’ in the size of boolean when uncompressed (8 vs 1 bytes).  Also more meta-data in ROOT (to allow multiple schema in same file).

Jim: Conclusion parquet is actually very similar to ROOT,  it produces smaller files but slower.

Chris: Need to avoid the repetitive writing of the partial TTree.

Peter: we need to add asynchronous prefetch to the I/O POW.

Peter: In the TBufferMerger we need to have a way to know which entry number we are at.

There are minutes attached to this event. Show them.