Joint EGEE and OSG Workshop on Data Handling in Production Grids

Name: Joint EGEE and OSG Workshop on Data Handling in Production Grids
Start: 2007-06-25T09:00:00-05:00
End: 2007-06-25T19:00:00-05:00
Location: Monterey Bay

Monday 25 Jun 2007, 09:00 → 19:00 US/Central

Description

<HTML>

The workshop will be held June 25, 2007 in conjunction with HPDC 2007. Please consult the HPDC website for logistics details.

With the establishment of large-scale multidisciplinary production Grid infrastructures such as EGEE, OSG, and NAREGI, dependable and secure handling of all forms of data (from user and operational data to accounting and auditing data), both across organizations within one Grid and across Grids, is the cornerstone of the success of such production infrastructures.

This workshop is the second in the series on topics in production Grids that was initiated at HPDC-15 with the workshop on Management of Rights in Production Grids.

This workshop will bring together practitioners and researchers working on all aspects of distributed data handling to discuss the capabilities of existing technologies, identify areas where new functionality is needed, explore how the latest research results can be integrated into the software stack of production Grids, and offer guidance to ongoing standardization efforts.

Topics include, but are not restricted to:

Handling of user data (replication, data transfer, metadata, cataloging, data protection, ...).
Allocation of data storage resources.
Long term data curation on Grids.
Management of operational data.
Management of auditing and accounting data.

</HTML>

- 09:00 → 09:40
  Introduction and Motivation
  - 09:00
    
    Data Management on EGEE 20m
    
    This talk will review the requirements on data handling coming from the diverse user communities of EGEE, including Astronomy, High Energy Physics, and Life Sciences, and what tools are offered on the EGEE infrastructure to fulfill these requirements. We will discuss the current state and point out areas where future work is needed.
    
    Speaker: Erwin Laure (CERN/EGEE)
    
    Slides
  - 09:20
    
    Communicating via files – an OSG perspective 20m
    
    Files offer distributed applications a convenient asynchronous communication channel with seemingly infinite buffering capacity. As a result, more and more distributed applications employ files to interface producers and consumers of information. Logging and accounting information, as well as temporary results of multi-stage parallel computations, are routinely written into files with the expectation that “someone” will eventually consume the information and free the space. We will review the challenges the OSG is facing in supporting this growing trend and our plans to address them.
    
    Speaker: Miron Livny (Univ. of Wisconsin, Madison)
    
    Slides
- 09:40 → 10:10
  The Industry View
  - 09:40
    
    Trends in Mainstream Storage and Data Management 30m
    
    This brief presentation focuses on emerging technologies in the storage and data areas. Included are the SMI-S and XAM standards under development in SNIA, complementary work being done in the DMTF, and the NFS v4.0 and v4.1 work, including pNFS, being done in the IETF.
    
    Speaker: Alan Yoder (NetApp)
    
    Slides
- 10:10 → 10:40
  
  Coffee 30m
- 10:40 → 11:40
  Distributed File Systems
  - 10:40
    
    NFSv4 and Petascale Data Management 30m
    
    Anticipating terascale and petascale HPC demands, NFSv4 architects are designing pNFS, a standard extension that supports parallel access to cluster file systems, object stores, and massive SANs. CITI's GridNFS project integrates NFSv4 into the ecology of Grid middleware: Globus GSI support, name space construction and management, fine-grained access control with foreign user support, and high performance secure file system access for jobs scheduled in an indeterminate future. In this talk, I will describe NFSv4 protocol features that support petascale data management along with implementation experiences.
    
    Speaker: Andy Adamson (Univ. of Michigan)
    
    Slides
  - 11:10
    
    Experiences with MC-GPFS in DEISA 30m
    
    DEISA is a European cooperation of HPC centers. Although the sites are based on different hardware, including Power-based IBM AIX systems, PPC-based Linux systems, and even an SGI Altix, they share a common global file system, MC-GPFS. The deployment started with 1 Gbit/s network connections and an old version of GPFS that had some restrictions; now most sites are connected at 10 Gbit/s and run a version of MC-GPFS that has removed many of the former design problems related to WAN connections and firewall rules. The talk will cover the technical aspects of the installation and configuration of MC-GPFS in the Europe-wide context. Furthermore, the advantages and practical use of this global file system setup are shown from a user's view as well as from a more administrative one.
    
    Speaker: Andreas Schott (MPG)
    
    Slides
- 11:40 → 12:10
  Data Storage I
  - 11:40
    
    SRM Interface Specification and Interoperability Testing 30m
    
    Storage Resource Managers (SRM) are middleware components whose function is to provide dynamic space allocation and file management on shared storage components on the data grid, as well as interfaces to the underlying storage resources. SRM is based on a common specification that emerged over years through an international collaboration. As this specification is being adapted to different storage systems, a consistent interface behavior to the Grid becomes a challenge, since sites have their own diverse storage infrastructures. Compatibility and interoperability testing for multiple SRM implementations will be discussed.
    
    Speaker: Alex Sim (LBL)
    
    Slides
- 12:10 → 13:15
  
  Lunch 1h 5m
- 13:15 → 14:55
  Data Storage II
  - 13:15
    
    SRM 2.2 interface to dCache 20m
    
    The dCache team has recently finished the implementation of all SRM v2.2 elements required by the LHC experiments. The new functionality includes space reservation, more advanced data transfer, and new namespace and permissions management functions. Implementing these features required an update of the dCache architecture and an evolution of the services and core components of the dCache Storage System. The new SRM concepts of AccessLatency and RetentionPolicy led to the definition of new dCache file attributes and new dCache pool code that implements these abstractions. The implementation of SRM Space Reservation led to new functionality in the Pool Manager and to the development of the new Space Manager component of dCache, which is responsible for accounting, reservation, and distribution of the storage space in dCache. The new dCache abstractions are LinkGroups, which allow the total dCache space to be partitioned according to the types of storage services provided and according to the Virtual Organizations that are allowed to use this space. SRM's "Bring Online" function required redevelopment of the Pin Manager service, which is responsible for staging files from the back-end tape storage system and keeping these files on disk for the duration of the Online state. SRM permission management functions led to the development of Access Control List support in the new dCache namespace service, Chimera. I will discuss these new features and services in dCache, provide the motivation for particular architectural decisions, and describe their benefits to the Grid Storage Community.
    
    Speaker: Timur Perelmutov (FNAL)
    
    Slides
  - 13:35
    
    dCache, preparing for LHC data taking 20m
    
    Within the next 6 to 9 months, we expect the Large Hadron Collider at CERN to finally go online. From the current distribution of the dCache technology, we assume that dCache will be the storage element holding the largest share of the data produced by the participating LHC experiments worldwide. Consequently, the dCache team is in its final phase of adjusting dCache capabilities to meet the requirements of this big challenge. This presentation will touch on the conceptual improvements of dCache between the currently deployed systems and the version commonly known as dCache 1.8. On the protocol level, we will report on the status of the SRM 2.2 deployment and testing, the progress of the introduction of version 2 of the gsiFtp protocol, and the status of Chimera and the NFS 4.1 implementation efforts. Some details will be discussed as they may be of interest to our customers, notably the consistent use of checksums for inter-dCache transfers, the redesign of the storage pool software to cope with extremely large pools and disk partitions, and the improvements to the authorization module, gPlazma. Finally, we will provide information on the overall dCache project structure and on ongoing developments.
    
    Speaker: Patrick Fuhrmann (DESY)
    
    Slides
  - 13:55
    
    DPM - A lightweight secure disk pool manager 30m
    
    Speaker: Sophie Lemaitre (CERN)
    
    Slides
  - 14:25
    
    Scalla Update: Opportunities for New Tier 2 Models 30m
    
    This talk will focus on recent additions to the Scalla xrootd/olbd software suite and how some of these additions make xrootd an interesting choice for Wide Area Network Data Management approaches to Grid-based computing at the sub-Tier 1 level and how the xrootd protocol is well-suited in addressing some vexing grid-related issues.
    
    Speaker: Andrew Hanushevsky (Stanford Univ.)
    
    Slides
- 15:00 → 15:30
  
  Coffee 30m
- 15:30 → 17:30
  Data Scheduling
  - 15:30
    
    An Overview of iRODS - a Rule Oriented Data management System 30m
    
    iRODS is a project for building the next-generation data management system for the cyberinfrastructure. Our experience with our SRB software and feedback from its users have shown that there is a need for a flexible way to customize a data grid system such as the SRB. In iRODS, this is accomplished through the use of rules and micro-services. This talk gives an overview of the iRODS architecture and the use of rules and micro-services in iRODS.
    
    Speaker: Mike Wan (SDSC)
    
    Slides
  - 16:00
    
    FTS - The gLite File Transfer Service 30m
    
    We review the gLite File Transfer Service software from the point of view of our experience in running the distributed WLCG transfer service. We focus on what is required for the stable and sustainable operations of a reliable file transfer service.
    
    Speaker: Sophie Lemaitre (CERN)
    
    Slides
  - 16:30
    
    TeraGrid Data Transfer 30m
    
    The NSF TeraGrid project, initiated in 2001, currently links high-performance computing and data resources at centers located at nine U.S. universities and national laboratories (IU, NCAR, NCSA, ORNL, PSC, Purdue, SDSC, TACC, UC/ANL) with a dedicated network infrastructure. Data management and transfer methods used among TeraGrid systems have evolved over time, as available technologies have improved and the needs of users in the national science community have driven development for more efficient solutions. Data transfer performance has improved with better strategies in the deployment and use of grid data transfer services such as Globus GridFTP and HPN-scp. TeraGrid sites participate actively in the development of emerging network filesystem technologies (e.g., GPFS, Lustre), data streaming methods and tools (e.g., PDIO), queued parallel data transfer (e.g., DMOVER), and data collections management solutions (e.g., SRB). With large, data-intensive scientific instrument deployments and petascale HPC systems visible on the horizon, TeraGrid will continue to drive data management and transfer solutions to meet the needs of the national science community.
    
    Speaker: Derek Simmel (PSC)
    
    Slides
  - 17:00
    
    Moving 100TB a day across EGEE and OSG - A CMS Perspective 30m
    
    During 12 of the last 30 days, the CMS experiment has moved more than 80TB a day across the roughly 40 sites on EGEE and OSG. Large volume data movement has clearly become a routine operation. We will reflect upon the successes as well as remaining challenges for large scale data movement for a community of a couple thousand scientists. Special emphasis will be put on technological as well as sociological obstacles that still make large scale data movement a challenge, and the corresponding risks to the CMS computing model.
    
    Speaker: Frank Wuerthwein (UCSD)
    
    Slides
- 17:30 → 19:00
  
  Round Table