DUNE Global Computing Operations

US/Central
https://fnal.zoom.us/j/636941598

Andrew McNab, Heidi Schellman (Oregon State), Kenneth Herner (Fermilab), Michael Kirby (FNAL), Steven Timm (Fermilab), Stuart Fuess (Fermilab), Peter Clarke (University of Edinburgh)
Description

Weekly operations meeting for sites doing DUNE computing

https://indico.fnal.gov/event/47580/

 

GGUS tickets

 

BERN: enable DUNE to run

Manchester: permissions on the storage element

Sheffield: believed to be an auth problem

149395: BNL job submission ticket

149771: Liverpool ETF failure

 

OSG tickets

NIKHEF: works, but the second CE isn’t working

 

Fermilab tickets

One user is close to the ban hammer: 10K jobs failed three times

Jasingh

 

CERN third-party copy issue is fixed

 

——

 

NP02 update: cooling will be back in mid-February

Will start switching the storage machines back on Wednesday

Will be doing upgrades on the storage service

In the end, they will finish before the test with the cold box

 

Processing: planning to do some reprocessing of NP02 data

Taking into account updates to the use of calorimetric information between views

On a time scale of 2-3 weeks

 

 

Move to reprocessing of all NP02 data

 

Some simulation needs to be done for near vertical drift

 

Can use CCIN2P3 space to store reco data of NP02

And the results of the vertical drift simulation

500TB available on DUNE_FR_CCIN2P3_XROOTD

 

Contact Denis for setup of NP02/EHN1 EOS

 

Two testing activities in 2021: cold box in summer

In fall, voltage testing in the current cryostat

—————

 

NP04

 

——

 

Production update

 

ic.ac.uk worker nodes: glidein went crazy yesterday

Major power outage at Imperial

Tata Institute is working now

A few data transfers from Fermilab to CBPF were very slow or stalled for hours

Things are OK now. Helio saw no messages over the weekend.

No update regarding Fabio and Campinas

Almost all 6 GeV MC submitted and running

 

Almost all NERSC files done for the moment

Next need to run the merging of the anatuples

 

Some problems—need to get SFA policy in for michelnemoving files

Have already merged the files into reasonably sized chunks

Where are the big merged files?

Right now in the same directory, since they are tagged by run number

 

Very end of phase 2.

 

———

 

Data management operations items

Cutting over all operations to the OKD-based Rucio server

No off-site.

Will test once this is done by wiping everything for BNL and sending it again.

 

Robert: does the FTS3 server have to have an off-site exemption?

Eventually, yes

 

Rucio test account and scope getting enabled everywhere

——

 

Site roundtable

QMUL network outage, otherwise nothing to report

 

Brazil—nothing from Brazil sites

 

BNL—gave ticket to Paul

 

 

Jonathan: who to talk to about getting near detector test beam to the main computing center?

Start with Kirby. Numbers should feed into the Fermi Compute Resources Steering Group meeting, which may be as soon as the end of this month.

    • 08:30 08:40
      Open Tickets: GGUS/SNOW/GitHub 10m

      GitHub task tracker

      https://github.com/DUNE/dist-comp/labels/Operations

      Fermi and CERN ServiceNow

      DUNE GGUS tickets: http://cern.ch/go/p6s8

      Speakers: Dr Andrew McNab (University of Manchester), Dr Michael Kirby (FNAL), Dr Steven Timm (Fermilab)
    • 08:40 08:55
      Reserved for big presentation 15m
    • 08:55 09:00
      Updates from running np04/np02 5m
    • 09:00 09:05
      Production Operation Items 5m
      Speaker: Dr Kenneth Herner (Fermilab)
    • 09:05 09:10
      Data Management Operation Items 5m
      Speakers: Dr Steven Timm (Fermilab), Wenlong Yuan
    • 09:10 09:20
      Computing Sites Roundtable 10m

      Any sites having issues to raise with DUNE operations can do it at this time.

    • 09:20 09:25
      Information Systems Operations (ETF, CRIC) 5m
      Speakers: Andrew McNab, Raja Nandakumar
    • 09:25 09:30
      Any Other Business / Announcements 5m