Weekly operations meeting for sites doing DUNE computing
DUNE Ops meeting July 20
Open tickets
Fermilab: 30K protodune-sp files pending write to tape; have asked for a SAM-Enstore sync
Protodune-sp is done running.
EOS filled up over the weekend; ticket opened with CERN. The Rucio DN was no good and was not in VOMS
All fixed now
Still open ticket with CCIN2P3
dCache issues after downtime
NFS clients badly messed up
Some dunegpvm nodes may need to be rebooted again
Residual from dCache downtime
Network didn’t update the main router ACLs, which left some parts inaccessible from offsite
And firewall was messed up
Now both fixed
Trying to get enstore functionality working on dunegpvm01
———————————
Protodune ops
NP04 draining cryostat as of this morning, transferring argon to np02
Only cold box and noise runs coming from np04 from here on out
Expect NP02 data taking to start up in earnest in late August
————————
Production operation items —Ken Herner
Big thing was the dCache downtime
Issue with xrootd transfers offsite—now fixed
CBPF onboarding: still dealing with CA certificates in a nonstandard location.
No keepup processing ran in the last few days because np04 ops didn’t update the run spreadsheet
——————————
Data management operations—Wenlong Yuan
NIKHEF onboarded to Rucio
SRR information is now available for all UK sites but two (those running StoRM or dCache)
All WLCG sites should be publishing this SRR
Petr: ATLAS used SRR for its most basic consistency check
Robert: are we feeding this info back to Rucio? Wenlong: not yet
Steve: cleanup of np04data at CERN is in progress; can finally put some analysis products there.
——
Site roundtable—
Tom LeCompte (Argonne): want to hook up to ALCF
Some discussion about where Fermilab HEPCloud is in that process.
TIFR still working on getting their host cert
—————
ETF/CRIC
Kirby: is there any uptime report for ETF yet?
Steve: will take action item to show ETF screen and Monit screen
Raja’s DN for testing has been added to FERRY with /dune/Role=ETF
Still have to enable the testing for FNAL with that special proxy on the ETF end.
Adding the role to gpce03/04 at FNAL is in progress.
Any other business
Heidi—
Need perfSONAR discussion
Need to actually measure our bandwidth
Need to agree to limit our bandwidth.
Need to bring in Terry Froy, Phil DeMar
MultiOne, etc.
When is the meeting on the ESnet use case? 4 PM today
A Storage Resource Reporting (SRR) record is a JSON object that describes a storage system. This incl
https://www.dcache.org/manuals/Book-5.2/srr.shtml
http://italiangrid.github.io/storm/documentation/how-to/how-to-publish-json-report/
https://ggus.eu/index.php?mode=ticket_info&ticket_id=145679
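As an illustration of the kind of basic consistency check raised in the data management item, here is a minimal Python sketch that reads an SRR-style JSON record and compares a share's reported usage with a catalogue figure. The field names (storageservice, storageshares, totalsize, usedsize, vos) are an assumption about the SRR layout, and the sample record, the catalogue number and the 5% tolerance are hypothetical. This is not how ATLAS or DUNE actually performs the check.

```python
import json

# Hypothetical SRR-style record. Field names are assumed, values are made up.
SAMPLE_SRR = """
{
  "storageservice": {
    "name": "example-se",
    "implementation": "dCache",
    "storageshares": [
      {"name": "dune", "totalsize": 1000, "usedsize": 400, "vos": ["dune"]},
      {"name": "atlas", "totalsize": 5000, "usedsize": 4500, "vos": ["atlas"]}
    ]
  }
}
"""


def shares_for_vo(srr_text, vo):
    """Return the storage shares in an SRR record that list the given VO."""
    doc = json.loads(srr_text)
    shares = doc["storageservice"].get("storageshares", [])
    return [s for s in shares if vo in s.get("vos", [])]


def is_consistent(share, catalogue_bytes, tolerance=0.05):
    """Check that the share's reported usage is plausible.

    Fails if usedsize exceeds totalsize, or if usedsize differs from the
    catalogue figure by more than the given relative tolerance.
    """
    used = share["usedsize"]
    if used > share["totalsize"]:
        return False
    if catalogue_bytes == 0:
        return used == 0
    return abs(used - catalogue_bytes) / catalogue_bytes <= tolerance


if __name__ == "__main__":
    for share in shares_for_vo(SAMPLE_SRR, "dune"):
        # 410 stands in for a byte count taken from a file catalogue.
        print(share["name"], is_consistent(share, catalogue_bytes=410))
```

In practice the record would be fetched from the URL a site publishes rather than embedded as a string, and the catalogue figure would come from the data management system.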
From Stu Fuess to Everyone: (9:08 AM)
Here's a perfSonar link I have... maybe something can be learned of capabilities here: http://psonar7.fnal.gov/toolkit/