DUNE Data Management Meeting

US/Central
203 (Feynman Computing Center)


Description

This is the weekly DUNE Data Management Meeting. We discuss the Data Management Project for the upcoming DUNE experiment, and also data management operations issues. Attendees are people who are actually working on data management plus higher DUNE computing management as required.

    • 09:00 09:20
      Standing Item: Rucio Token Development Status 20m
      Speakers: James Perry, Steven Timm (Fermilab)

      James--Rucio token status

       

      All of the token compliance scripts are working: the two from the Rucio development people plus the FTS one.

      Got the user auth flow working, but it requires a change to Rucio to remove one parameter that CILogon doesn't support. James submitted a PR to make it configurable whether that parameter is sent or not.

      CILogon doesn't support the audience parameter for this particular user auth flow. We should go to the CILogon maintainers now and see if they can add it.

      Pull request: https://github.com/rucio/rucio/pull/7289
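      A minimal sketch of what "configurable whether the audience parameter is sent" could look like when building an authorization URL. This is not the code from the PR: the function name, the send_audience flag, and the example endpoint are all hypothetical and only illustrate the idea.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_auth_url(endpoint, client_id, redirect_uri, scope,
                   audience=None, send_audience=True):
    """Build an OAuth2 authorization-code URL (illustrative only).

    send_audience is a hypothetical config switch: when False, the
    audience parameter is left out for issuers that reject it.
    """
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
    }
    # Only add the audience when the switch is on and a value was given.
    if send_audience and audience:
        params["audience"] = audience
    return f"{endpoint}?{urlencode(params)}"


# Placeholder endpoint and client values, not a real deployment.
url = build_auth_url(
    "https://idp.example.org/authorize",
    client_id="example-client",
    redirect_uri="https://rucio.example.org/callback",
    scope="openid profile",
    audience="rucio",
    send_audience=False,
)
print(url)
```

      With send_audience=False the query string simply omits the audience key, so the same code path can serve issuers that do and do not accept it.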

      Next thing is to set up a rucio test instance that is using tokens from CILogon for FTS transfers and user authentication. 

      Plan was to use the Docker Compose setup that has all the services (including FTS) running in containers, with all the RSEs containerized and run locally.

      The admin flow didn't require any changes to Rucio in the end. It did require some changes to the CILogon code, which are now running on the test instance; likewise with FTS3.

      Rucio meeting this Thursday, Jan 9, 3 PM CET / 8 AM US Central

      Any updates on download or upload?

      How does the Justin workflow of reading files intersect with the "download flow"?

      At the moment the user jobscript is given the X.509 proxy with the no-roles VOMS attribute, which gives you read-only access.

      The token is not given to the user at the moment; it is obtained by the wrapper job. After the wrapper job has run, Justin gives the job a token for that user to do the upload (and/or copy to scratch). That mechanism can be updated without a lot of work.

      For reads within the job scripts, we need some kind of read-only, low-priority token. Would like to avoid giving the user their own token. Use the heartbeat mechanism? At the moment the heartbeat sends a message back that the job is still alive.

      If more than one token is needed in Justin, we would have to change Justin auth to use the device workflow instead of the login workflow. Complicated but not impossible.

      Vault (or HTCondor talking to Vault) is a possibility but adds a lot of complication; prefer to avoid if possible.

      We want whatever we can get on timelines.

      We need to know whatever we can get on Rucio-as-a-token-issuer.

      Need a plan to either (1) tell CILogon how much longer they have to stay up or (2) figure out how we could operate without CILogon personal certs.

       

       

    • 09:20 09:25
      Standing item: Proposed upgrades / configuration changes 5m
      Speakers: Brandon White (Fermilab), Marc Mengel (Fermilab), Steven Timm (Fermilab)

      Rucio v35_6 was deployed just before the holidays

      Brandon has been changing the deployment to use an init container to get the certificates. Now there is a persistent volume with the CA certificates installed, mounted on all containers.

      Wants to make updates to the Helm charts; have to understand the useDeprecatedImplicitSecrets option and the implications of turning it off.

       

      Metacat--no major updates coming

      DeclaD--Steve working on 2.3.3 install at CERN

       

       

       

    • 09:25 09:30
      Standing Item: Datasets pending DM action 5m
      Speaker: Steven Timm (Fermilab)

      One issue: it is not understood why Jake's merges are making files in the hd-protodune namespace rather than the hd-protodune-det-reco namespace.

      Will get to the bottom of this eventually.

    • 09:30 09:35
      Plans for DUNE data challenge 2025 5m
      Speakers: Doug Benjamin (Brookhaven National Laboratory (US)), Steven Timm (Fermilab)

      Won't be able to do DC25 in February due to DOE review

       

      Major goals: 

      Make sure we can keep up with beam processing 

      Make sure we know when beam is coming and get it done before.

      Elisabetta said she would discuss keep-up processing for np02 at Jan 9 prod meeting.

       

    • 09:35 09:45
      Metacat policy--checksums and virtual files 10m
      Speaker: Marc Mengel (Fermilab)

      Marc suggested that we add a virtual checksum for virtual files, of the form

      virtual: '00000001'

      Idea is that if we make the file virtual, they don't have to checksum it, but we can make checksum required everywhere.
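      A rough sketch of how such a policy check could behave, assuming checksums arrive as a dict on the file record. The function name and the choice of adler32 as the "real" checksum type are assumptions for illustration; this is not Metacat code.

```python
# Placeholder entry proposed in the notes for virtual files.
VIRTUAL_CHECKSUM = {"virtual": "00000001"}

HEX_DIGITS = set("0123456789abcdef")


def checksums_ok(checksums):
    """Return True if a file record satisfies a 'checksum required' rule.

    Hypothetical policy: every file must carry a checksums dict, which is
    either the virtual placeholder or contains an 8-hex-digit adler32.
    """
    if not checksums:
        return False  # checksum is required everywhere
    if checksums.get("virtual") == VIRTUAL_CHECKSUM["virtual"]:
        return True   # virtual file: placeholder stands in for a real checksum
    value = str(checksums.get("adler32", "")).lower()
    return len(value) == 8 and set(value) <= HEX_DIGITS


print(checksums_ok({"virtual": "00000001"}))
print(checksums_ok({"adler32": "1a2b3c4d"}))
print(checksums_ok({}))
```

      The point of the placeholder is that the "required" rule never needs a special case for missing checksums; virtual files pass through the same check as everything else.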

      Would size matter? Probably not.

      Steve--do I want to change retention_class to be virtual as well?

      The data tier root-tuple-virtual can be used for small histogram files that get merged.

       

       

       

    • 09:45 10:00
      Clearing data off of CERN to make room for vd-protodune 15m
      Speaker: Steven Timm (Fermilab)