Weekly CCE-IOS tele-conference

US/Central
Peter van Gemmeren (ANL), Rob Ross (ANL)
Description
BlueJeans Link: https://bluejeans.com/102100194

Attendees: Paolo Calafiura, Salman Habib, Rob Ross, Peter van Gemmeren, Shane Snyder, Chris Jones, Doug Benjamin, Jakob Blomer, John Wu, Liz Sexton-Kennedy, Matthieu Dorier, Rob Latham, Saba Sehrish, Suren Byna

 

Management News: Some slides need to be ready in a couple of weeks; RobR and PeterVG will follow up.

 

Shane Snyder presenting Darshan

Slide 4:

Darshan: lightweight I/O characterization tool. Deployed at most of the DOE sites, often "on" by default.

Modular, can be extended.

Slide 5:

Works via link-time or runtime instrumentation depending on the build.

The focus is on MPI programs; non-MPI support is revisited later in the talk.

Darshan writes its data at the end of the job (in MPI_Finalize), collapsing it into a single compressed log file for the whole job.

Some simple analysis tools for digging into this data.

Q: Darshan aggregates data from multiple processes; do you still show the individual process behavior?

A: Some aggregation of data, but still some information on individual behavior.

Slide 6:

Modular setup. Core library coordinates.

Instrumentation modules target specific libraries or use cases (e.g., HDF, POSIX)

Self-describing format.

Slide 8:

Cori -- Cray XC40

Enabled by default.

Integrated into Cray software module system.

module list shows version available, etc.

Slide 9:

Just compile and run -- Darshan is integrated.

The location of the Darshan logs is described on the slide.

Slide 10:

Recently moved to dynamic linking as default. Not in latest releases/deployments.

This doesn't change how things are used, but instrumentation via LD_PRELOAD may be needed.

Slide 12:

Getting text output from the log, as tuples.

A rank of -1 indicates an aggregated record.
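As an illustration of the rank convention above, here is a minimal Python sketch that reads tuple-style text output and picks out the aggregated records. The column order (module, rank, record id, counter, value, file name, ...) is an assumption about the text format, and the sample lines, record ids, and paths are made up for illustration; check the header of your own log output before relying on it.

```python
from typing import NamedTuple, Optional


class CounterRecord(NamedTuple):
    module: str
    rank: int
    record_id: str
    counter: str
    value: float
    file_name: str


def parse_line(line: str) -> Optional[CounterRecord]:
    """Parse one whitespace-separated tuple line; skip comments and blanks."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    fields = line.split()
    if len(fields) < 6:
        return None
    # Assumed column order; file names containing whitespace would break this.
    module, rank, record_id, counter, value, file_name = fields[:6]
    return CounterRecord(module, int(rank), record_id, counter,
                         float(value), file_name)


def is_aggregated(rec: CounterRecord) -> bool:
    """A rank of -1 marks a record aggregated across processes."""
    return rec.rank == -1


# Hypothetical sample text, not taken from a real log.
SAMPLE = """\
# <module> <rank> <record id> <counter> <value> <file name> <mount pt> <fs type>
POSIX -1 1234567890 POSIX_OPENS 16 /scratch/out.dat /scratch lustre
POSIX 3 9876543210 POSIX_BYTES_WRITTEN 4096 /scratch/rank3.dat /scratch lustre
"""

records = [r for r in map(parse_line, SAMPLE.splitlines()) if r is not None]
aggregated = [r for r in records if is_aggregated(r)]
per_rank = [r for r in records if not is_aggregated(r)]

for rec in aggregated:
    print("aggregated:", rec.counter, rec.value, rec.file_name)
for rec in per_rank:
    print("rank", rec.rank, ":", rec.counter, rec.value, rec.file_name)
```

Keeping the aggregated and per-rank records in separate lists makes it harder to double-count a shared file when summing counters.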

Slide 13:

Darshan job summary tool -- generates a PDF summarizing some key statistics.

Might have to load texlive to get it to work.

Slide 15:

Performance _estimates_ -- not entirely accurate due to what is/isn't captured. Take with a grain of salt.

Slide 16:

"other" is time outside of the things Darshan observes -- typically the majority of this time is "compute" (but it could be waiting on anything outside of Darshan's purview).
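The "other" category can be thought of as a remainder. The sketch below assumes the observed time splits into read, write, and metadata components (an assumption about the chart's categories) and uses made-up numbers; it shows the arithmetic only, not how Darshan itself computes the figure.

```python
def io_time_breakdown(runtime_s, read_s, write_s, meta_s):
    """Return each category as a percentage of total runtime.

    "other" is whatever part of the runtime was not observed as I/O.
    It is clamped at zero, so if the observed times exceed the runtime
    the percentages will add up to more than 100.
    """
    if runtime_s <= 0:
        raise ValueError("runtime_s must be positive")
    observed = read_s + write_s + meta_s
    parts = {
        "read": read_s,
        "write": write_s,
        "metadata": meta_s,
        "other": max(runtime_s - observed, 0.0),
    }
    return {name: 100.0 * t / runtime_s for name, t in parts.items()}


# Hypothetical per-process averages, in seconds.
breakdown = io_time_breakdown(runtime_s=100.0, read_s=5.0, write_s=10.0, meta_s=5.0)
for name, pct in breakdown.items():
    print(f"{name:>8}: {pct:5.1f}%")
```

A large "other" share only says the time was spent somewhere Darshan did not look; it does not by itself prove the job was compute-bound.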

Slide 18:

Timeline of I/O operations, reads on top, writes on the bottom.

Can't literally see the individual operations (in a default Darshan capture) because we aren't tracing. But we can bound things by open file times, etc. This can be enough to get useful insights.
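To illustrate the bounding idea, the sketch below takes one (open, close) time window per file and merges overlapping windows into periods during which I/O may have occurred. The timestamps are hypothetical and the function is only an illustration of the reasoning, not part of Darshan.

```python
def merge_windows(windows):
    """Merge overlapping (start, end) windows into disjoint intervals.

    Each input window bounds when a file was open, so I/O to that file
    can only have happened inside it. The merged result is an upper
    bound on when the job could have been doing I/O at all.
    """
    merged = []
    for start, end in sorted(windows):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical open/close times in seconds since job start.
file_windows = [(10.0, 12.0), (0.5, 2.0), (1.5, 3.0)]
active = merge_windows(file_windows)
print(active)  # [(0.5, 3.0), (10.0, 12.0)]
```

The gaps between merged intervals are the useful part: no file was open there, so no instrumented I/O can have taken place.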

Slide 20:

There _is_ a fine-grained tracing capability if you want to enable it; it works for both POSIX and MPI-IO at this time.

darshan-dxt-parser can be used to look at this.

There is also some OST (Lustre object storage target) information that can help you debug situations where a particular OST is problematic.

Slide 21:

This is a simple read/write timeline across ranks. This is a subfiling example: some groups of processes are writing, but not all.

Slide 23:

New features, now integrated and possibly useful here: non-MPI instrumentation.

Significant refactoring was needed to enable this (i.e., to make MPI optional), including a new way to initialize and to catch the end of the job.

Slide 24:

Right now one has to build Darshan specifically for non-MPI capabilities.

At run time, one has to set an environment variable. This allows us to avoid capturing all sorts of executables that you run -- keeps the noise down.
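A minimal sketch of the opt-in described above: build an environment for one child process that preloads the Darshan library and sets the non-MPI switch. The minutes do not name the variable; DARSHAN_ENABLE_NONMPI is assumed here from the Darshan documentation and should be checked against the deployed version. The library path and the helper name darshan_env are hypothetical.

```python
import os


def darshan_env(lib_path, base_env=None, enable_nonmpi=True):
    """Return a copy of the environment with Darshan preloading enabled.

    lib_path is the path to the Darshan shared library (site-specific).
    Only processes started with the returned environment are
    instrumented, which keeps unrelated executables out of the logs.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # Put Darshan first but keep anything that was already preloaded.
    env["LD_PRELOAD"] = f"{lib_path}:{existing}" if existing else lib_path
    if enable_nonmpi:
        # Assumed variable name; verify against your Darshan version.
        env["DARSHAN_ENABLE_NONMPI"] = "1"
    return env


# Hypothetical library path and a small base environment for illustration.
env = darshan_env("/opt/darshan/lib/libdarshan.so", base_env={"PATH": "/usr/bin"})
print(env["LD_PRELOAD"])

# To launch an instrumented program (not run here):
#   import subprocess
#   subprocess.run(["./my_app"], env=env, check=True)
```

Setting the variables per child process, rather than exporting them in the shell, is one way to instrument only the payload and not every helper script around it.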

Q: DougB: When we launch on an HPC system, ATLAS runs a Python pilot that spawns additional work and also monitors things. It is launched inside a container. Wondering where to put the Darshan intercepts.

A: Shane: Good question! We need to work together on this particular bit.

A: Calls to fork() can cause issues, something to look out for.

Q: Do we have a Spark-specific way to follow the sub-processes?

A: Separate the capture from the analysis.

 

Discussion:

Q: Will Darshan work in a case where the executables are stored on a read-only file system? Or were compiled with some other (older) compiler?

A: Think so. Need to investigate.

TODO: Plan a hack session to investigate. Shane and Doug leading.

Doug and Torre hopefully presenting next week. Suren on HDF5 the following week.

    • 11:00 - 11:05
      Management News (5m)
      Speakers: Paolo Calafiura (LBNL), Dr Salman Habib (Argonne National Laboratory)
    • 11:05 - 11:10
      Introduction (5m)
      Speakers: Dr Peter van Gemmeren (ANL), Rob Ross (ANL)
    • 11:10 - 11:35
      Introduction to Darshan (25m)
      Speaker: Shane Snyder (Argonne National Laboratory)
    • 11:35 - 12:00
      Discuss projects (25m)
      Discussion of topics and getting volunteers to present in future meetings
      Speaker: All