DUNE Global Computing Operations

US/Central
https://fnal.zoom.us/j/636941598

Andrew McNab, Heidi Schellman (Oregon State), Kenneth Herner (Fermilab), Michael Kirby (FNAL), Steven Timm (Fermilab), Stuart Fuess (Fermilab), Peter Clarke (University of Edinburgh)
Description

Weekly operations meeting for sites doing DUNE computing

DUNE Ops meeting July 20

 

Open tickets

 

Fermilab: 30K ProtoDUNE-SP files pending write to tape; have asked for a SAM-Enstore sync-up

Protodune-sp is done running.

 

EOS filled up over the weekend; ticket opened with CERN. The Rucio DN was no good and was not in VOMS.

All fixed now

 

Still open ticket with CCIN2P3

 

dCache issues after downtime

NFS clients badly messed up

Some dunegpvm nodes may need to be rebooted again

 

Residual from dCache downtime

Network didn’t update the main router ACLs, which left some parts inaccessible from offsite

And firewall was messed up

 

Now both fixed

Trying to get enstore functionality working on dunegpvm01

 

———————————

Protodune ops

 

NP04 draining cryostat as of this morning, transferring argon to np02

 

Only cold box and noise runs coming from np04 from here on out

Expect np02 data taking to start up in earnest in late August

 

————————

 

Production operation items —Ken Herner

Big thing was the dCache downtime

Issue with xrootd transfers offsite—now fixed

 

CBPF onboarding: still dealing with CA certificates in a nonstandard place.

 

 

No keepup ran in the last few days because np04 ops didn’t update the run spreadsheet

 

——————————

Data management operations—Wenlong Yuan

 

NIKHEF onboarded to Rucio

 

SRR information is now available for all UK sites but two (those running StoRM or dCache)

 

All WLCG sites should be publishing this SRR

 

Petr—ATLAS used SRR for most basic consistency check

 

Robert—are we feeding this info back to Rucio—Wenlong—not yet

 

Steve—cleanup of np04data stuff @ CERN in progress, can put some analysis product there finally.

 

——

 

Site roundtable—

 

Tom LeCompte, Argonne: want to hook up to ALCF

Some discussion about where Fermilab HEPCloud is in that process.

 

TIFR still working on getting their host cert

—————

 

ETF/CRIC

 

Kirby—is there any uptime report for ETF yet—

Steve—will take action item to show ETF screen and Monit screen

Raja’s DN for testing has been added to FERRY with /dune/Role=ETF

Still have to enable the testing for FNAL with that special proxy on the ETF end. Adding the role to gpce03/04 at FNAL is in progress.

 

 

Any other business

Heidi—

Need perfSONAR discussion

Need to actually measure our bandwidth

Need to agree to limit our bandwidth.

Need to bring in Terry Froy, Phil DeMar

MultiOne, etc.

 

When is the meeting on the ESnet use case? 4 PM today

 

A Storage Resource Reporting (SRR) record is a JSON object that describes a storage system. This incl

https://www.dcache.org/manuals/Book-5.2/srr.shtml

http://italiangrid.github.io/storm/documentation/how-to/how-to-publish-json-report/

https://ggus.eu/index.php?mode=ticket_info&ticket_id=145679
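
As a rough illustration of the basic consistency check mentioned above, here is a minimal Python sketch that reads an SRR JSON document, sums the space reported for one VO, and compares it with a catalog figure. The field names (storageservice, storageshares, totalsize, usedsize, vos) follow the WLCG SRR layout as we understand it and should be checked against the dCache and StoRM documentation linked above. The sample document, the function names, and the 5% tolerance are all made up for illustration.

```python
import json

# Hypothetical SRR document. Field names are assumed from the WLCG SRR
# layout; the values are invented for this example.
SAMPLE_SRR = """
{
  "storageservice": {
    "name": "example-se",
    "implementation": "dCache",
    "storageshares": [
      {"name": "dune-disk", "totalsize": 1000, "usedsize": 400, "vos": ["dune"]},
      {"name": "atlas-disk", "totalsize": 5000, "usedsize": 4500, "vos": ["atlas"]}
    ]
  }
}
"""


def summarize_shares(srr_text, vo):
    """Sum total and used bytes over all storage shares assigned to one VO."""
    doc = json.loads(srr_text)
    shares = doc["storageservice"].get("storageshares", [])
    total = 0
    used = 0
    for share in shares:
        if vo in share.get("vos", []):
            total += share.get("totalsize", 0)
            used += share.get("usedsize", 0)
    return {"total": total, "used": used, "free": total - used}


def is_consistent(srr_used, catalog_used, tolerance=0.05):
    """Return True if the catalog usage is within a relative tolerance of SRR usage."""
    if srr_used == 0:
        return catalog_used == 0
    return abs(srr_used - catalog_used) / srr_used <= tolerance


if __name__ == "__main__":
    summary = summarize_shares(SAMPLE_SRR, "dune")
    print(summary)
    # 410 stands in for a usage figure taken from a catalog such as Rucio.
    print(is_consistent(summary["used"], 410))
```

In practice the JSON would be fetched from the URL each site publishes, and the catalog figure would come from Rucio once that feedback exists; neither step is shown here.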

 

From Stu Fuess to Everyone: (9:08 AM)

 

Here's a perfSonar link I have...  maybe something can be learned of capabilities here: http://psonar7.fnal.gov/toolkit/

 

 

    • 08:30 08:40
      Open Tickets. GGUS/SNOW/GitHub 10m

      GitHub task tracker

      https://github.com/DUNE/dist-comp/issues/

      Fermi and CERN ServiceNow

      DUNE GGUS tickets: http://cern.ch/go/p6s8

      Speakers: Dr Andrew McNab (University of Manchester), Dr Michael Kirby (FNAL), Dr Steven Timm (Fermilab)
    • 08:40 08:55
      Time slot for big presentation if necessary 15m
    • 08:55 09:00
      Updates from running np04/np02 5m
    • 09:00 09:05
      Production Operation Items 5m
      Speaker: Dr Kenneth Herner (Fermilab)
    • 09:05 09:10
      Data Management Operation Items 5m
      Speakers: Dr Steven Timm (Fermilab), Wenlong Yuan
    • 09:10 09:20
      Computing Sites Roundtable 10m

      Any sites having issues to raise with DUNE operations can do it at this time.

    • 09:20 09:25
      Information Systems Operations (ETF, CRIC) 5m
      Speakers: Andrew McNab, Raja Nandakumar
    • 09:25 09:30
      Any Other Business / Announcements 5m