Joint EGEE and OSG Workshop on Data Handling in Production Grids

Monday, June 25, 2007 (US/Central)
at Monterey Bay
Description

The workshop will be held June 25, 2007 in conjunction with HPDC 2007. Please consult the HPDC website for logistics details.

With the establishment of large-scale multidisciplinary production Grid infrastructures such as EGEE, OSG, or NAREGI, dependable and secure handling of all forms of data, from user and operational data to accounting and auditing data, across organizations within one Grid and across Grids is the cornerstone of the success of such production infrastructures.

This workshop is the second of the series on topics in production Grids that was initiated at HPDC-15 with the workshop on Management of Rights in Production Grids.

This workshop will bring together practitioners and researchers working on all aspects of distributed data handling to discuss the capabilities of existing technologies, identify areas where new functionality is needed, explore how the latest research results can be integrated into the software stack of production Grids, and offer guidance to ongoing standardization efforts.

Topics include, but are not restricted to:
  • Handling of user data (replication, data transfer, meta-data, cataloging, data protection, ...).
  • Allocation of data storage resources.
  • Long term data curation on Grids.
  • Management of operational data.
  • Management of auditing and accounting data.

  • Monday, June 25, 2007
    • 09:00 - 09:40 Introduction and Motivation
      • 09:00 Data Management on EGEE 20'
        This talk will review the requirements on data handling coming from the diverse user communities of EGEE, including Astronomy, High Energy Physics, and Life Sciences, and what tools are offered on the EGEE infrastructure to fulfill these requirements. We will discuss the current state and point out areas where future work is needed.
        Speaker: Erwin Laure (CERN/EGEE)
        Material: Slides powerpoint file
      • 09:20 Communicating via files – an OSG perspective 20'
        Files offer distributed applications a convenient asynchronous communication channel with seemingly infinite buffering capacity. As a result, more and more distributed applications employ files to interface producers and consumers of information. Logging and accounting information as well as temporary results of multi-stage parallel computations are routinely written into files with the expectation that “someone” will eventually consume the information and free the space. We will review the challenges the OSG is facing in supporting this growing trend and our plans to address them.
        Speaker: Miron Livny (Univ. of Wisconsin, Madison)
        Material: Slides powerpoint file
    • 09:40 - 10:10 The Industry View
      • 09:40 Trends in Mainstream Storage and Data Management 30'
        This brief presentation focuses on emerging technologies
        in the storage and data areas.  Included are the SMI-S
        and XAM standards under development in SNIA, complementary
        work being done in the DMTF, and the NFS v4.0 and v4.1 work, 
        including pNFS, being done in the IETF.
        Speaker: Alan Yoder (NetApp)
        Material: Slides powerpoint file
    • 10:10 - 10:40 Coffee
    • 10:40 - 11:40 Distributed File Systems
      • 10:40 NFSv4 and Petascale Data Management 30'
        Anticipating terascale and petascale HPC demands, NFSv4 architects are designing pNFS, a standard extension that supports parallel access to cluster file systems, object stores, and massive SANs.
        
        CITI's GridNFS project integrates NFSv4 into the ecology of Grid middleware: Globus GSI support, name space construction and management, fine-grained access control with foreign user support, and high performance secure file system access for jobs scheduled in an indeterminate future. In this talk, I will describe NFSv4 protocol features that support petascale data management along with implementation experiences.
        Speaker: Andy Adamson (Univ. of Michigan)
        Material: Slides pdf file
      • 11:10 Experiences with MC-GPFS in DEISA 30'
        DEISA is a European cooperation of HPC centers. Although the sites are
        based on different hardware, including Power-based IBM AIX systems,
        PPC-based Linux systems and even an SGI Altix, they share a common
        global file system, MC-GPFS. The sites started with 1 Gbit/s network
        connections and an old version of GPFS, which had some restrictions.
        Most sites are now connected at 10 Gbit/s and run a version of
        MC-GPFS that has removed many of the former design problems
        related to WAN network connections and firewall rules.
        
        The talk will cover the technical aspects of the installation and configuration
        of MC-GPFS in the Europe-wide context. Furthermore, the advantages
        and practical use of this global file system setup are shown from a user's
        point of view as well as from a more administrative one.
        Speaker: Andreas Schott (MPG)
        Material: Slides powerpoint file
    • 11:40 - 12:10 Data Storage I
      • 11:40 SRM Interface Specification and Interoperability Testing 30'
        Storage Resource Managers (SRM) are middleware components whose function is
        to provide dynamic space allocation and file management on shared storage
        components on the data grid, as well as interfaces to the underlying storage
        resources. SRM is based on a common specification that emerged over the years
        through an international collaboration. As this specification is being
        adapted to different storage systems, consistent interface behavior toward the
        Grid becomes a challenge, since sites have their own diverse storage
        infrastructures. Compatibility and interoperability testing for multiple
        SRM implementations will be discussed.
        Speaker: Alex Sim (LBL)
        Material: Slides powerpoint file
    • 12:10 - 13:15 Lunch
    • 13:15 - 14:55 Data Storage II
      • 13:15 SRM 2.2 interface to dCache 20'
        The dCache team has recently finished the implementation of all SRM v2.2 elements required by the LHC experiments. The new functionality includes space reservation, more advanced data transfer, and new namespace and permissions management functions. Implementation of these features required an update of the dCache architecture and evolution of the services and core components of the dCache Storage System. The new SRM concepts of AccessLatency and RetentionPolicy led to the definition of new dCache file attributes and new dCache pool code that implements these abstractions. Implementation of SRM Space Reservation led to new functionality in the Pool Manager and the development of the new Space Manager component of dCache, responsible for accounting, reservation and distribution of the storage space in dCache. The new dCache abstractions are LinkGroups, which allow the total dCache space to be partitioned according to the types of storage services provided and the Virtual Organizations allowed to use this space. SRM's "Bring Online" function required redevelopment of the Pin Manager service, responsible for staging files from the back-end tape storage system and keeping these files on disk for the duration of the Online state. SRM permission management functions led to the development of Access Control List support in the new dCache namespace service, Chimera. I will discuss these new features and services in dCache, provide motivation for particular architectural decisions and describe their benefits to the Grid Storage Community.
        Speaker: Timur Perelmutov (FNAL)
        Material: Slides pdf file
      • 13:35 dCache, preparing for LHC data taking 20'
        Within the next 6 - 9 months, we are expecting the Large Hadron 
        Collider at CERN to finally go online. From the current distribution 
        of the dCache technology, we assume that dCache will be the storage 
        element holding the largest share of data produced by the involved 
        LHC experiments worldwide. Consequently, the dCache team is in its
        final phase of adjusting dCache capabilities to meet the
        requirements of this big challenge. This presentation will
        touch on the conceptual improvements of dCache between the currently 
        deployed systems and the version commonly known as dCache 1.8. On 
        the protocol level we will report on the status of the SRM 2.2 
        deployment and testing, the progress of the gsiFtp protocol version-2 
        introduction and the status of Chimera and the NFS 4.1 implementation 
        efforts. Some details will be discussed as they may be of interest to
        our customers. These include the consistent use of checksums for
        inter-dCache transfers, the redesign of the storage pool software to
        cope with extremely large pools or disk partitions, and the improvements
        to the authorization module, gPlazma. Finally we will provide information
        on the overall dCache project structure and on ongoing developments.
        Speaker: Patrick Fuhrmann (DESY)
        Material: Slides pdf file
      • 13:55 DPM - A lightweight secure disk pool manager 30'
        Speaker: Sophie Lemaitre (CERN)
        Material: Slides powerpoint file
      • 14:25 Scalla Update: Opportunities for New Tier 2 Models 30'
        This talk will focus on recent additions to the Scalla xrootd/olbd software suite. It will show how some of these additions make xrootd an interesting choice for Wide Area Network data management approaches to Grid-based computing at the sub-Tier 1 level, and how the xrootd protocol is well suited to addressing some vexing grid-related issues.
        Speaker: Andrew Hanushevsky (Stanford Univ.)
        Material: Slides powerpoint file
    • 15:00 - 15:30 Coffee
    • 15:30 - 17:30 Data Scheduling
      • 15:30 An Overview of iRODS - a Rule Oriented Data Management System 30'
        iRODS is a project for building the next-generation data management
        system for the cyberinfrastructure. Our experience with our SRB
        software and feedback from users show a need for a flexible way
        to customize a data grid system such as the SRB. In iRODS, this is
        accomplished through the use of rules and micro-services. This talk gives an overview
        of the iRODS architecture and the use of rules and micro-services in iRODS.
        Speaker: Mike Wan (SDSC)
        Material: Slides powerpoint file
      • 16:00 FTS - The gLite File Transfer System 30'
        We review the gLite File Transfer Service software from the point of view of our experience in running the distributed WLCG transfer service. We focus on what is required for the stable and sustainable operations of a reliable file transfer service.
        Speaker: Sophie Lemaitre (CERN)
        Material: Slides powerpoint file
      • 16:30 TeraGrid Data Transfer 30'
        The NSF TeraGrid project, initiated in 2001, currently links high-performance computing and data resources at centers located at nine U.S. universities and national laboratories (IU, NCAR, NCSA, ORNL, PSC, Purdue, SDSC, TACC, UC/ANL) with a dedicated network infrastructure. Data management and transfer methods used among TeraGrid systems have evolved over time, as available technologies have improved and the needs of users in the national science community have driven development for more efficient solutions. Data transfer performance has improved with better strategies in the deployment and use of grid data transfer services such as Globus GridFTP and HPN-scp. TeraGrid sites participate actively in the development of emerging network filesystem technologies (e.g., GPFS, Lustre), data streaming methods and tools (e.g., PDIO), queued parallel data transfer (e.g., DMOVER), and data collections management solutions (e.g., SRB). With large, data-intensive scientific instrument deployments and petascale HPC systems visible on the horizon, TeraGrid will continue to drive data management and transfer solutions to meet the needs of the national science community.
        Speaker: Derek Simmel (PSC)
        Material: Slides powerpoint file
      • 17:00 Moving 100TB a day across EGEE and OSG - A CMS Perspective 30'
        During 12 of the last 30 days, the CMS experiment has moved more than 80TB a day
        across the roughly 40 sites on EGEE and OSG. Large volume data movement has
        clearly become a routine operation.
        
        We will reflect upon the successes as well as remaining challenges for large scale data movement for a community of
        a couple thousand scientists. Special emphasis will be put on technological as well as sociological obstacles that
        still make large scale data movement a challenge, and the corresponding risks to the CMS computing model.
        Speaker: Frank Wuerthwein (UCSD)
        Material: Slides pdf file
    • 17:30 - 19:00 Round Table