Dependability and the Grid: Issues and Challenges

R. Schlichting, A. Chien, C. Kesselman, K. Marzullo, J. Plank, S. Shrivastava
Proceedings. International Conference on Dependable Systems and Networks, p. 263. Published 2002-06-23. DOI: 10.1109/DSN.2002.1028907 (https://doi.org/10.1109/DSN.2002.1028907)

Abstract

For over a decade, researchers in scientific computing have been investigating technologies that allow advanced scientific applications to exploit resources on machines connected by wide-area networks across large geographical distances. Originally referred to as metacomputing or heterogeneous computing, this type of distributed computing model is now most commonly called Grid computing. Generally speaking, Grid computing emphasizes large-scale resource sharing (not only computational cycles, but also software and data) across administrative domains in a flexible, secure, and coordinated fashion. A number of software platforms have been developed that address all or some of the challenges associated with Grid computing, including Condor, the Entropia platform, the Globus toolkit, Legion, LSF, Ninf, and Sun’s Grid Engine. While the Grid was originally designed to support scientific applications, there has been significant recent interest in extending the model to support the needs of enterprise computing, including those based on Web services. For example, both IBM and Sun have made the Grid part of their enterprise computing strategies, while the recent Global Grid Forum GGF-4 (http://www.gridforum.org/) included a number of topics related to generalizing the Grid in this way. Part of this effort includes defining an Open Grid Services Architecture (OGSA) that can be used to integrate services within and across enterprises. As might be expected given the differences between scientific and enterprise applications, a number of technical issues must be addressed to accomplish this goal.

This panel will focus on one particular challenge associated with Grid computing: ensuring dependable operation of Grid computations. Dependability in this context encompasses a broad collection of possible attributes, including availability, reliability, security, and timely execution. Among the possible topics for discussion are the different dependability requirements of current versus envisioned application scenarios, technical barriers to achieving dependability in both contexts, and architectural issues related to providing appropriate support in software platforms such as OGSA. The overall goal is to bring together the perspectives of individuals working in different communities to identify the issues and challenges that remain to be solved to make dependable Grid computing a reality.