Cluster-based scalable network services

Proceedings of the Sixteenth ACM Symposium on Operating Systems Principles. Pub Date: 1997-10-01. DOI: 10.1145/268998.266662

A. Fox, S. Gribble, Y. Chawathe, E. Brewer, P. Gauthier

Citations: 639

Abstract

We identify three fundamental requirements for scalable network services: incremental scalability and overflow growth provisioning, 24x7 availability through fault masking, and cost-effectiveness. We argue that clusters of commodity workstations interconnected by a high-speed SAN are exceptionally well-suited to meeting these challenges for Internet-server workloads, provided the software infrastructure for managing partial failures and administering a large cluster does not have to be reinvented for each new service. To this end, we propose a general, layered architecture for building cluster-based scalable network services that encapsulates the above requirements for reuse, and a service-programming model based on composable workers that perform transformation, aggregation, caching, and customization (TACC) of Internet content. For both performance and implementation simplicity, the architecture and TACC programming model exploit BASE, a weaker-than-ACID data semantics that results from trading consistency for availability and relying on soft state for robustness in failure management. Our architecture can be used as an off-the-shelf infrastructural platform for creating new network services, allowing authors to focus on the content of the service (by composing TACC building blocks) rather than its implementation. We discuss two real implementations of services based on this architecture: TranSend, a Web distillation proxy deployed to the UC Berkeley dialup IP population, and HotBot, the commercial implementation of the Inktomi search engine. We present detailed measurements of TranSend's performance based on substantial client traces, as well as anecdotal evidence from the TranSend and HotBot experience, to support the claims made for the architecture.
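
The composable-worker idea in the abstract can be illustrated with a minimal Python sketch. Nothing here is taken from the paper's implementation: the `Worker` type, `compose`, `strip_blank_lines`, `truncate`, `aggregate`, and `SoftStateCache` are hypothetical names chosen for illustration, and the fallback in `get_or_compute` is only one assumed reading of "trading consistency for availability".

```python
from typing import Callable, Dict, List

# Hypothetical model: a worker maps content to content.
Worker = Callable[[str], str]


def compose(*workers: Worker) -> Worker:
    """Chain workers left to right, like stages of a pipeline."""
    def pipeline(content: str) -> str:
        for worker in workers:
            content = worker(content)
        return content
    return pipeline


def strip_blank_lines(content: str) -> str:
    """Transformation: a toy stand-in for lossy distillation."""
    return "\n".join(line for line in content.splitlines() if line.strip())


def truncate(limit: int) -> Worker:
    """Customization: `limit` plays the role of a per-user preference."""
    def worker(content: str) -> str:
        return content[:limit]
    return worker


def aggregate(sources: List[str]) -> str:
    """Aggregation: merge content from several sources into one document."""
    return "\n".join(sources)


class SoftStateCache:
    """Caching: entries are soft state, safe to lose and regenerate."""

    def __init__(self) -> None:
        self._entries: Dict[str, str] = {}

    def get_or_compute(self, key: str, original: str, worker: Worker) -> str:
        if key in self._entries:
            return self._entries[key]
        try:
            result = worker(original)
        except Exception:
            # Assumed BASE-style fallback: serve the unprocessed original
            # rather than fail the request when a worker crashes.
            return original
        self._entries[key] = result
        return result


if __name__ == "__main__":
    pipeline = compose(strip_blank_lines, truncate(11))
    cache = SoftStateCache()
    page = aggregate(["hello", "", "world wide web"])
    print(cache.get_or_compute("page", page, pipeline))
```

In this sketch a new service is described by which workers are composed, while caching and failure handling live in shared code, which mirrors the abstract's claim that authors can focus on the content of a service rather than its implementation.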
