To connect via Zoom: Meeting ID 831-443-820
Password distributed with meeting announcement
(See instructions for setting Zoom default to join a meeting with audio and video off: https://larsoft.org/zoom-info/)
PC, Mac, Linux, iOS, Android: https://fnal.zoom.us/j/831443820
Phone:
https://fnal.zoom.us/zoomconference?m=SvP8nd8sBN4intZiUh6nLkW0-N16p5_b
H.323:
162.255.37.11 (US West)
162.255.36.11 (US East)
213.19.144.110 (EMEA)
See https://fnal.zoom.us/ for more information
At Fermilab: no in-person presence at the lab for this meeting
Erica: Release and project report
Herb noted one of his PRs is missing. May be on the fork? Will investigate.
It is a bug fix, so it can go into this release if it can be recovered and approved quickly
Erica: 2021 LArSoft Work Plan summary
Hans noted that photon simulation in G4 is already capable of running on GPUs; it is just a matter of a build switch. We should look into that.
Erica: This would be a hybrid solution, given that existing production platforms are grid-based. Mike has worked on allowing access to GPU from the grid. Hoping to see this operate at production scale
Mike noted that his solution is directed at machine learning; it would be more difficult to do what Hans is suggesting.
Mike/Hans should talk at some point to better understand what would be needed to make it work
Krzysztof mentioned that the next version of G4 will support execution on accelerators, so moving toward HPC may not be as difficult with G4 as we might initially believe.
Erica: This is a direction we believe we need to go, so we will be interested to learn how to do this.
We would then want to find an experiment interested in pursuing one or both of these options, and we will collaborate with them on that.
Kyle Knoepfel: Concurrent cache support
Intro
art has supported concurrent events since June 2018
Many experiment algorithms not designed with multi-threading / concurrency in mind
In pursuing MT upgrades in LArSoft, the need for a concurrent caching system for conditions information became apparent
Unlike CMS, art does not have a dedicated conditions system
Has led experiments to pursue their own solutions
Closest art has is concept of "producing" services
This work is intended to provide a different solution
Previous idea: Producing services
Can insert data products in a serialized context immediately after the principal has been created (see the sketch after this list)
DB queries can be made in a controlled fashion
Access to data products is thread-safe, so users need not be concerned about thread-safety
For simple and small conditions info, this is a good approach
Downsides
Potentially memory-expensive, unless a caching mechanism is developed
Significant breaking change for configurations
Shift in the mental model of what data products are for
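For reference, a minimal sketch of what a producing service can look like. This is a hypothetical example: the MyCalib payload is invented, and the exact base-class hooks and macro should be checked against the art documentation.

    // Hypothetical producing service that inserts a (made-up) MyCalib
    // product into each event just after the event principal is created.
    #include "art/Framework/Core/ProducingService.h"
    #include "art/Framework/Principal/Event.h"
    #include "fhiclcpp/ParameterSet.h"

    #include <memory>

    struct MyCalib {  // invented conditions payload
      double pedestal;
    };

    class CalibService : public art::ProducingService {
    public:
      explicit CalibService(fhicl::ParameterSet const&)
      {
        produces<MyCalib, art::InEvent>();  // declare the product up front
      }

    private:
      void postReadEvent(art::Event& e) override
      {
        // A DB query could be made here, in a serialized context.
        e.put(std::make_unique<MyCalib>(MyCalib{1.5}));
      }
    };

    DEFINE_ART_PRODUCING_SERVICE(CalibService)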
Can the framework adopt a conditions system like CMS's?
Largely, no. It would require significant analysis to determine what implementation, interface, and scheduling adjustments would be necessary
The art framework is "feature frozen"
Small framework-agnostic features have been implemented, but large-scale development has been halted
A less efficient but framework-agnostic concurrent caching utility could be developed
Assumptions
Must support associative list of user-defined key-value pairs
Insertion, retrieval and (perhaps implicit) erasure of entries + any locking needed
Access shall be const/immutable (so no locks needed after retrieval)
Once access to an entry has been granted, no locking should be needed to use it
Implementation cannot remove a cache entry if it is being used by any thread
Retrieval by key or quantity that can be transformed to at most one key
Implementation
Template in hep_concurrency (already in art, based on TBB's concurrent containers)
Used as, for example: hep::concurrency::cache<...>
[Described the lookup interface with examples; a rough sketch follows]
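A rough reconstruction of that kind of usage, with invented key and value types; emplace and shrink_to_fit are named in these notes, while the lookup-function name entry_for is an assumption.

    #include "hep_concurrency/cache.h"  // header path assumed

    void lookup_example()
    {
      // Map a (made-up) run number to a calibration constant.
      hep::concurrency::cache<unsigned, double> calibs;

      // Insertion is done via emplacement (see "Inserting" below).
      calibs.emplace(42u, 1.37);

      // Retrieval by key; 'entry_for' is an assumed name for the lookup
      // function, which returns a handle (see "Cache handles").
      auto h = calibs.entry_for(42u);
      if (h) {                 // a valid handle converts to true
        double const& v = *h;  // const access; no locking needed after retrieval
      }
    }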
Cache handles
Provides access to a cache entry
const access to the key, the value, and the cache entry's sequence number
Valid vs invalid
convertible to boolean true or false, respectively
Dereferencing invalid handle results in exception
Valid handles can be copied and moved. (The moved-from handle becomes invalid)
Can be compared
"==" and "!=", depending on whether they point to same entry, or diff entries.
Are reference counted.
A cache entry will not be deleted as long as at least one valid handle points to it
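Continuing the sketch above, the handle semantics just described would look roughly like this (entry_for and the key() accessor are assumed names):

    #include <cassert>
    #include <utility>

    void handle_semantics(hep::concurrency::cache<unsigned, double>& calibs)
    {
      auto h1 = calibs.entry_for(42u);  // assumed lookup name, as above
      auto h2 = h1;                     // valid handles can be copied
      assert(h1 == h2);                 // same entry => compares equal

      auto h3 = std::move(h1);          // moving invalidates the source...
      assert(!h1);                      // ...so h1 now converts to false
      // *h1;                           // dereferencing it would throw

      // h2 and h3 are counted references: the cache will not delete this
      // entry while either of them remains alive.
      double const& value = *h3;        // const access to the value
      auto const& key = h3.key();       // accessor name assumed
    }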
Cache entries
Explicit call needed to drop unused entries
Can keep last N most recent entries
Recency is determined by a "sequence number" corresponding to when the entry was inserted into the cache
To avoid unnecessary locking, the cache includes an auxiliary data structure that cannot shrink during concurrent processing
If serialized execution can be guaranteed, the shrink_to_fit() function may be called, removing all unused entries from the cache and from the auxiliary structure
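In sketch form (shrink_to_fit appears in these notes; drop_unused and drop_unused_but_last are assumed names for the explicit-drop and keep-last-N operations):

    void maintenance(hep::concurrency::cache<unsigned, double>& calibs)
    {
      // Explicitly drop entries that no valid handle refers to.
      calibs.drop_unused();            // name assumed

      // Or keep only the N most recently inserted unused entries; recency
      // comes from the entries' sequence numbers.
      calibs.drop_unused_but_last(2);  // name assumed

      // Only when serialized execution is guaranteed: also shrink the
      // auxiliary structure, removing all unused entries.
      calibs.shrink_to_fit();
    }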
Inserting
Done via emplacement: cache.emplace(...)
emplace may be called concurrently, but be mindful of the efficiency and thread-safety issues in creating its arguments
Talk to scisoft-team if concerned about this being a problem
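As an illustration of the caveat (query_database is a hypothetical stand-in for an expensive conditions lookup; entry_for as above):

    double query_database(unsigned run);  // hypothetical expensive DB query

    void fill(hep::concurrency::cache<unsigned, double>& calibs, unsigned run)
    {
      // The argument is evaluated on the calling thread, so the expensive
      // query runs even if 'run' is already cached:
      calibs.emplace(run, query_database(run));

      // Checking first avoids the repeated query on a hit. Two threads can
      // still both miss and both query; emplace itself remains thread-safe,
      // but the duplicated work is the efficiency concern noted above.
      if (!calibs.entry_for(run)) {
        calibs.emplace(run, query_database(run));
      }
    }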
Next plans
Will be released concurrently with the art 3.07 suite
Expect this to be most useful to art service authors
Please let us know if you have concerns or suggestions