DUNE Workflow/load Development 22nd April 2021
----------------------------------------------

https://indico.fnal.gov/event/48828/

Present: Andrew McNab (AM) (chair/notes), Ken Herner (KH), Elisetta Pennacchio (EP), Doug Benjamin (DB), Steve Timm (ST)

Apologies: Chris Brew, Raja Nandakumar

Intro+news
----------
- Welcome Doug (DB) from BNL

GlideInWMS/HTCondor
-------------------
- ST: The DUNE Global Pool frontend now has its off-site web exemption and
  we have successfully run user jobs off-site using it. So it is ready for
  more people to test submission using the duneglobalops account on
  dunegpschedd01.fnal.gov

  DB is ready to test submission from BNL: needs an HTCondor schedd, and a
  collector set up to be able to direct jobs to the global pool.

  ST: How should we have users submitting to the new pool? Same as before,
  just X.509? Tokens?

  KH: Modify jobsub to direct DUNE users' jobs to the DUNE global pool? Or
  move straight to jobsub lite for it?

  AM: Should be driven by the timescales for jobsub lite and the global pool
  being ready for prime time.

  DB: How should tokens come into this?

  ST: GlideInWMS and the factories are compatible with tokens. Some sites
  will be tokens-only in 5-6 months. OSG would like to stop accepting X.509
  by February 2022.

User tools (inc jobsub)
-----------------------
- Nothing to add to above

Workflow/production
-------------------
- CB is producing a diagram of the outline document
  https://docs.google.com/document/d/10n0kZbaEc_PPVspmfCK207dqhDnJeDSJbmx8Ue4Dlss/edit?usp=sharing

HEPCloud
--------
- ST: We will have early access to a pre-exascale GPU machine at NERSC, so
  if you have any suitable workloads please propose them.

Pilot Factories
---------------
- Will switch global pool to OSG factories soon

Other topics (ETF, HEP-SCORE, ...)
----------------------------------
- Andrew's talk at WLCG HEP-SCORE task force yesterday
  https://indico.cern.ch/event/1030671/
  We have a proof-of-concept DUNE HEP-SCORE benchmark using one of the DUNE
  CI tests (reco-fd).
  Some fixes in the HEP-SCORE framework are needed from their side.
  Not able to use an input data file from the dune.osgstorage.org cvmfs
  repository. Probably due to the CVMFS shrinkwrap procedure?
  Need to agree the set of CI tests to include in the next iteration.
  Task force has asked us to expose simulation, reco and analysis
  separately if we can, since it allows them to compare different
  experiments' sets of applications.

AOB
---
DB: Will there be an association between data locality and job locality?

ST: This is something which is still being decided.

AM: It is part of the outline we're working on:
https://docs.google.com/document/d/10n0kZbaEc_PPVspmfCK207dqhDnJeDSJbmx8Ue4Dlss/edit?usp=sharing

DB: Sites will need an estimate of how much network to provide.

AM: Thinking about network quotas at sites for DUNE, which we would use to
avoid matching too many network-intensive jobs at the same time at a site.
Also need to provide estimates for sites so they can plan, yes.

Meetings continue in this slot, 8am CT Thursdays.
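
Illustration of the per-site network quota idea AM raised under AOB: a minimal sketch only, assuming each site advertises a single network quota for DUNE and each job carries an estimate of its network usage. All names here (Site, Job, network_quota_mbps, est_network_mbps, try_match, release) are hypothetical and do not correspond to any existing GlideInWMS, HTCondor or jobsub interface.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    network_quota_mbps: float      # hypothetical: quota agreed with the site
    allocated_mbps: float = 0.0    # network already claimed by matched jobs


@dataclass
class Job:
    job_id: str
    est_network_mbps: float        # hypothetical: estimated network usage


def try_match(job: Job, site: Site) -> bool:
    """Match the job to the site only if it fits in the remaining quota."""
    if site.allocated_mbps + job.est_network_mbps > site.network_quota_mbps:
        return False
    site.allocated_mbps += job.est_network_mbps
    return True


def release(job: Job, site: Site) -> None:
    """Return the job's share of the quota when it finishes."""
    site.allocated_mbps = max(0.0, site.allocated_mbps - job.est_network_mbps)


if __name__ == "__main__":
    site = Site("EXAMPLE_SITE", network_quota_mbps=100.0)
    jobs = [Job("a", 60.0), Job("b", 60.0), Job("c", 30.0)]
    for job in jobs:
        print(job.job_id, try_match(job, site))
```

In this sketch a network-heavy job that would exceed the quota is simply not matched at that site until running jobs finish and release their share. The same per-job estimates could be summed to give sites the planning figures DB asked about.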