Joint EGEE and OSG Workshop on Data Handling in Production Grids
at Monterey Bay
09:00 - 09:40
Introduction and Motivation
Data Management on EGEE
This talk will review the data-handling requirements of EGEE's diverse user communities, including Astronomy, High Energy Physics, and Life Sciences, and the tools offered on the EGEE infrastructure to fulfill these requirements. We will discuss the current state and point out areas where future work is needed.
Speaker: Erwin Laure (CERN/EGEE) Material: Slides
Communicating via files – an OSG perspective
Files offer distributed applications a convenient asynchronous communication channel with seemingly infinite buffering capacity. As a result, more and more distributed applications employ files to interface producers and consumers of information. Logging and accounting information as well as temporary results of multi-stage parallel computations are routinely written into files with the expectation that “someone” will eventually consume the information and free the space. We will review the challenges the OSG is facing in supporting this growing trend and our plans to address them.
Speaker: Miron Livny (Univ. of Wisconsin, Madison) Material: Slides
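The producer/consumer-via-files pattern described above can be sketched in a few lines. This is an illustrative toy, not OSG code; the spool directory, file names, and the write-then-rename convention are assumptions chosen to show the "someone eventually consumes and frees the space" contract.

```python
import os
import tempfile

SPOOL = tempfile.mkdtemp()  # hypothetical spool directory shared by producer and consumer

def produce(name, data):
    # Write to a temporary file first, then rename: rename is atomic on POSIX,
    # so the consumer never sees a partially written file.
    tmp = os.path.join(SPOOL, name + ".tmp")
    with open(tmp, "w") as f:
        f.write(data)
    os.rename(tmp, os.path.join(SPOOL, name))

def consume():
    # Pick up every completed file, return its contents, and free the space --
    # the "someone eventually consumes the information" contract.
    results = {}
    for name in sorted(os.listdir(SPOOL)):
        if name.endswith(".tmp"):
            continue  # still being written
        path = os.path.join(SPOOL, name)
        with open(path) as f:
            results[name] = f.read()
        os.remove(path)  # reclaim the buffer space
    return results

produce("job-001.log", "stage 1 done\n")
produce("job-002.log", "stage 2 done\n")
print(consume())
```

The challenge the talk alludes to is precisely that nothing in this scheme enforces the contract: if no consumer ever runs, the "infinite" buffer fills up.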
09:40 - 10:10
The Industry View
Trends in Mainstream Storage and Data Management
This brief presentation focuses on emerging technologies in the storage and data areas. Included are the SMI-S and XAM standards under development in SNIA, complementary work being done in the DMTF, and the NFS v4.0 and v4.1 work, including pNFS, being done in the IETF.
Speaker: Alan Yoder (NetApp) Material: Slides
10:10 - 10:40 Coffee
10:40 - 11:40
Distributed File Systems
NFSv4 and Petascale Data Management
Anticipating terascale and petascale HPC demands, NFSv4 architects are designing pNFS, a standard extension that supports parallel access to cluster file systems, object stores, and massive SANs. CITI's GridNFS project integrates NFSv4 into the ecology of Grid middleware: Globus GSI support, name space construction and management, fine-grained access control with foreign user support, and high performance secure file system access for jobs scheduled in an indeterminate future. In this talk, I will describe NFSv4 protocol features that support petascale data management along with implementation experiences.
Speaker: Andy Adamson (Univ. of Michigan) Material: Slides
Experiences with MC-GPFS in DEISA
DEISA is a European cooperation of HPC centers. Although the sites are based on different hardware, including Power-based IBM AIX systems, PPC-based Linux systems, and even an SGI Altix, a common global file system, MC-GPFS, is shared between them. The project started with 1 Gbit/s network connections and an old version of GPFS that had some restrictions; now most sites are connected at 10 Gbit/s and run a version of MC-GPFS that has removed many of the former design problems related to WAN network connections and firewall rules. The talk will cover the technical aspects of the installation and configuration of MC-GPFS in the Europe-wide context. Furthermore, the advantages and practical use of this global file system setup are shown from a user's view as well as from a more administrative one.
Speaker: Andreas Schott (MPG) Material: Slides
11:40 - 12:10
Data Storage I
SRM Interface Specification and Interoperability Testing
Storage Resource Managers (SRM) are middleware components whose function is to provide dynamic space allocation and file management on shared storage components on the data grid, as well as interfaces to the underlying storage resources. SRM is based on a common specification that emerged over years through an international collaboration. As this specification is being adapted to different storage systems, a consistent interface behavior to the Grid becomes a challenge, since sites have their own diverse storage infrastructures. Compatibility and interoperability testing for multiple SRM implementations will be discussed.
Speaker: Alex Sim (LBL) Material: Slides
12:10 - 13:15 Lunch
13:15 - 14:55
Data Storage II
SRM 2.2 interface to dCache
The dCache team has recently finished the implementation of all SRM v2.2 elements required by the LHC experiments. The new functionality includes space reservation, more advanced data transfer, and new namespace and permissions management functions. Implementation of these features required an update of the dCache architecture and an evolution of the services and core components of the dCache storage system. The new SRM concepts of AccessLatency and RetentionPolicy led to the definition of new dCache file attributes and new dCache pool code that implements these abstractions. Implementation of SRM space reservation led to new functionality in the Pool Manager and to the development of the new Space Manager component of dCache, responsible for accounting, reservation, and distribution of the storage space in dCache. A further new dCache abstraction is the LinkGroup, which allows partitioning the total dCache space according to the types of storage services provided and according to the Virtual Organizations that are allowed to use this space. SRM's "Bring Online" function required a redevelopment of the Pin Manager service, responsible for staging files from the back-end tape storage system and keeping these files on disk for the duration of the Online state. The SRM permission management functions led to the development of Access Control List support in the new dCache namespace service, Chimera. I will discuss these new features and services in dCache, provide the motivation for particular architectural decisions, and describe their benefits to the Grid storage community.
Speaker: Timur Perelmutov (FNAL) Material: Slides
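The space-reservation concepts in the abstract (LinkGroups partitioned by storage class and VO) can be sketched as a toy model: a reservation succeeds in a link group only if the group serves the requesting VO, offers the requested RetentionPolicy/AccessLatency combination, and has the space. The concept names follow the abstract, but the group names, VO names, and matching logic are invented for illustration and are not dCache code.

```python
from dataclasses import dataclass

# Toy model of SRM 2.2 space reservation over LinkGroups.
# RetentionPolicy values: "REPLICA" / "CUSTODIAL"; AccessLatency: "ONLINE" / "NEARLINE".

@dataclass
class LinkGroup:
    name: str
    vos: set          # Virtual Organizations allowed to reserve here
    retention: str    # RetentionPolicy offered by this group
    latency: str      # AccessLatency offered by this group
    free_bytes: int

    def reserve(self, vo, retention, latency, size):
        # A reservation succeeds only if this group serves the VO,
        # offers the requested storage class, and has enough free space.
        if (vo in self.vos and retention == self.retention
                and latency == self.latency and size <= self.free_bytes):
            self.free_bytes -= size
            return True
        return False

groups = [
    LinkGroup("cms-disk", {"cms"}, "REPLICA", "ONLINE", 10 * 2**40),
    LinkGroup("cms-tape", {"cms"}, "CUSTODIAL", "NEARLINE", 100 * 2**40),
]

def reserve_space(vo, retention, latency, size):
    # Space Manager role: find the first link group that can honor the request.
    for g in groups:
        if g.reserve(vo, retention, latency, size):
            return g.name
    return None

print(reserve_space("cms", "REPLICA", "ONLINE", 2**40))    # -> cms-disk
print(reserve_space("atlas", "REPLICA", "ONLINE", 2**40))  # -> None
```

The second call fails because no link group admits the "atlas" VO, which is the partitioning-by-VO behavior the abstract describes.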
dCache, preparing for LHC data taking
Within the next 6 - 9 months, we expect the Large Hadron Collider at CERN to finally go online. From the current distribution of the dCache technology, we assume that dCache will be the storage element holding the largest share of the data produced by the LHC experiments worldwide. Consequently, the dCache team is in the final phase of adjusting dCache's capabilities to the needs of this big challenge. This presentation will touch on the conceptual improvements of dCache between the currently deployed systems and the version commonly known as dCache 1.8. On the protocol level, we will report on the status of SRM 2.2 deployment and testing, the progress of the introduction of version 2 of the gsiFTP protocol, and the status of Chimera and the NFS 4.1 implementation efforts. We will also discuss some details that may be of interest to our customers: the consistent use of checksums for inter-dCache transfers, the redesign of the storage pool software to cope with extremely large pools and disk partitions, and the improvements to the authorization module, gPlazma. Finally, we will provide information on the overall dCache project structure and on ongoing developments.
Speaker: Patrick Fuhrmann (DESY) Material: Slides
DPM - A lightweight secure disk pool manager
Speaker: Sophie Lemaitre (CERN) Material: Slides
Scalla Update: Opportunities for New Tier 2 Models
This talk will focus on recent additions to the Scalla xrootd/olbd software suite, on how some of these additions make xrootd an interesting choice for Wide Area Network data management approaches to Grid-based computing at the sub-Tier 1 level, and on how the xrootd protocol is well suited to addressing some vexing Grid-related issues.
Speaker: Andrew Hanushevsky (Stanford Univ.) Material: Slides
15:00 - 15:30 Coffee
15:30 - 17:30
An Overview of iRODS - a Rule-Oriented Data Management System
iRODS is a project to build the next-generation data management system for the cyberinfrastructure. Based on our experience with our SRB software and feedback from its users, there is a need for a flexible way to customize a data grid system such as the SRB. In iRODS, this is accomplished through the use of rules and micro-services. This talk gives an overview of the iRODS architecture and the use of rules and micro-services in iRODS.
Speaker: Mike Wan (SDSC) Material: Slides
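The rule/micro-service idea can be illustrated with a minimal sketch: a "rule" pairs a condition with a chain of micro-services (small reusable actions) that fire when the condition holds. All names below are invented for illustration, and real iRODS rules are written in its own rule language, not Python; this only models the architecture the abstract describes.

```python
# Toy rule engine: policies are data (condition + micro-service chain),
# so site behavior can be customized without changing the core system.

def ms_checksum(obj):
    # Stand-in micro-service: record a (fake) checksum on the data object.
    obj["checksum"] = hex(sum(obj["data"]))
    return obj

def ms_replicate(obj):
    # Stand-in micro-service: replicate the object to a tape resource.
    obj.setdefault("replicas", []).append("tape-resource")
    return obj

RULES = [
    # condition on the data object           -> micro-service chain
    (lambda o: o["collection"] == "/raw",    [ms_checksum, ms_replicate]),
    (lambda o: True,                         [ms_checksum]),  # default policy
]

def apply_rules(obj):
    # Fire the first matching rule, running its micro-services in order.
    for cond, chain in RULES:
        if cond(obj):
            for ms in chain:
                obj = ms(obj)
            return obj

obj = apply_rules({"collection": "/raw", "data": b"event data"})
print(obj["replicas"])  # -> ['tape-resource']
```

Changing site policy then means editing the rule table, not the engine, which is the flexibility argument made above.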
FTS - The gLite File Transfer System
We review the gLite File Transfer Service software from the point of view of our experience in running the distributed WLCG transfer service. We focus on what is required for the stable and sustainable operations of a reliable file transfer service.
Speaker: Sophie Lemaitre (CERN) Material: Slides
TeraGrid Data Transfer
The NSF TeraGrid project, initiated in 2001, currently links high-performance computing and data resources at centers located at nine U.S. universities and national laboratories (IU, NCAR, NCSA, ORNL, PSC, Purdue, SDSC, TACC, UC/ANL) with a dedicated network infrastructure. Data management and transfer methods used among TeraGrid systems have evolved over time, as available technologies have improved and the needs of users in the national science community have driven development for more efficient solutions. Data transfer performance has improved with better strategies in the deployment and use of grid data transfer services such as Globus GridFTP and HPN-scp. TeraGrid sites participate actively in the development of emerging network filesystem technologies (e.g., GPFS, Lustre), data streaming methods and tools (e.g., PDIO), queued parallel data transfer (e.g., DMOVER), and data collections management solutions (e.g., SRB). With large, data-intensive scientific instrument deployments and petascale HPC systems visible on the horizon, TeraGrid will continue to drive data management and transfer solutions to meet the needs of the national science community.
Speaker: Derek Simmel (PSC) Material: Slides
Moving 100TB a day across EGEE and OSG - A CMS Perspective
During 12 of the last 30 days, the CMS experiment has moved more than 80TB a day across the roughly 40 sites on EGEE and OSG. Large volume data movement has clearly become a routine operation. We will reflect upon the successes as well as remaining challenges for large scale data movement for a community of a couple thousand scientists. Special emphasis will be put on technological as well as sociological obstacles that still make large scale data movement a challenge, and the corresponding risks to the CMS computing model.
Speaker: Frank Wuerthwein (UCSD) Material: Slides
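For a sense of scale, 80 TB per day corresponds to a sustained aggregate rate of roughly 7.4 Gbit/s across the participating sites. A minimal sketch of the arithmetic, assuming decimal terabytes (the abstract does not specify TB vs. TiB):

```python
# Back-of-the-envelope rate implied by 80 TB moved per day.
TB = 1e12                 # decimal terabyte, an assumption
seconds_per_day = 86400
gbps = 80 * TB * 8 / seconds_per_day / 1e9
print(f"about {gbps:.1f} Gbit/s sustained")  # about 7.4 Gbit/s
```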
17:30 - 19:00