World wide failures

EW 7, Pub Date: 1996-09-09, DOI: 10.1145/504450.504473

W. Vogels


Citations: 40

Abstract


The one issue that unites almost all approaches to distributed computing is the need to know whether certain components in the system have failed or are otherwise unavailable. When designing and building systems that need to function at a global scale, failure management needs to be considered a fundamental building block. This paper describes the development of a system-independent failure management service, which allows systems and applications to incorporate accurate detection of failed processes, nodes and networks, without the need for making compromises in their particular design.
