R. Schlichting, A. Chien, C. Kesselman, K. Marzullo, J. Plank, S. Shrivastava
Dependability and the Grid: Issues and Challenges
Proceedings. International Conference on Dependable Systems and Networks, p. 263, 2002-06-23
DOI: 10.1109/DSN.2002.1028907 (https://doi.org/10.1109/DSN.2002.1028907)
Abstract
For over a decade, researchers involved with scientific computing have been investigating technologies that allow advanced scientific applications to exploit resources associated with machines connected by wide-area networks across large geographical distances. Originally referred to as metacomputing or heterogeneous computing, Grid computing is currently the most common term used to describe this type of distributed computing model. Generally speaking, Grid computing emphasizes large-scale resource sharing (not only computational cycles, but also software and data) across administrative domains in a flexible, secure, and coordinated fashion. A number of software platforms have been developed that address all or subsets of the challenges associated with Grid computing, including Condor, the Entropia platform, the Globus toolkit, Legion, LSF, Ninf, and Sun’s Grid Engine. While the Grid was originally designed to support scientific applications, there has been significant interest recently in extending the model to support the needs of enterprise computing, including those based on Web services. For example, both IBM and Sun have made the Grid part of their enterprise computing strategies, while the recent Global Grid Forum GGF-4 (http://www.gridforum.org/) included a number of topics related to generalizing the Grid in this way. Part of this effort includes defining an Open Grid Services Architecture (OGSA) that can be used to integrate services within and across enterprises. As might be expected given the differences between scientific and enterprise applications, many technical issues must be addressed to accomplish this goal.
This panel will focus on one particular challenge associated with Grid computing: ensuring dependable operation of Grid computations. Dependability in this context encompasses a broad collection of possible attributes, including availability, reliability, security, and timely execution. Among the possible topics for discussion are the different dependability requirements of current versus envisioned application scenarios, technical barriers to achieving dependability in both contexts, and architectural issues related to providing appropriate support in software platforms such as OGSA. The overall goal is to bring together the perspectives of individuals working in different communities to identify issues and challenges that remain to be solved to make dependable Grid computing a reality.