Performance analysis of a file catalog for the LHC computing grid

HPDC-14. Proceedings. 14th IEEE International Symposium on High Performance Distributed Computing, 2005. Pub Date: 2005-07-24. DOI: 10.1109/HPDC.2005.1520941

J. Baud, J. Casey, S. Lemaitre, C. Nicholson

Citations: 37

Abstract

The Large Hadron Collider (LHC) at CERN, the European Organization for Nuclear Research, will produce unprecedented volumes of data when it starts operation in 2007. To provide for its computational needs, the LHC Computing Grid (LCG) is being deployed as a worldwide computational grid service, providing the middleware upon which the physics analysis for the LHC will be carried out. In 2003, versions of this middleware were deployed that were based on the middleware produced by the European Data Grid (EDG) project. In 2004 the LCG-2 release, which consisted of the EDG middleware with some minor modifications, was deployed for use by the LHC experiments. A series of data challenges run by these experiments was the first real production use of LCG. During the data challenges, many problems were exposed that had not shown up in more limited tests. The deployment, service and development teams worked closely with the experiments to understand these issues. Some of the problems were solved during the data challenges, but others exposed fundamental problems with the middleware as deployed in LCG-2. One of these fundamental problems was the performance under real load of the catalog component provided by EDG, the replica location service. To solve these problems a new component was designed, the LCG File Catalog (LFC). The LFC moves away from the replica location service model used in previous LCG releases towards a hierarchical model that more closely resembles a UNIX file system. It also adds missing functionality that was requested by the experiments. This paper presents the architecture and implementation of the LFC and evaluates it in a series of performance tests, with up to forty million entries and one hundred requesting threads from multiple clients. The results show good scalability up to the limits of these tests, and compare favourably with other grid catalog implementations.
