Life with Ed: a case study of a Linux BIOS/BProc cluster
Sung-Eun Choi, E. A. Hendriks, R. Minnich, M. Sottile, Aaron Marks
Proceedings 16th Annual International Symposium on High Performance Computing Systems and Applications, 2002-06-16
DOI: https://doi.org/10.1109/HPCSA.2002.1019132
Citations: 7
Abstract
In this paper, we describe our experiences with Ed, our 127-node/161-processor Alpha cluster testbed. Ed is unique for two reasons. First, we have replaced the standard BIOS on the cluster nodes with the Linux BIOS, which loads Linux directly from non-volatile memory (Flash RAM). Second, the operating system provides a single-system image of the entire cluster, much like a traditional supercomputer. We discuss the advantages of such a cluster, including the time to boot (101 seconds for 100 nodes), to upgrade (the same as the time to boot), and to start processes (2.4 seconds for 15,000 processes). Additionally, we have discovered that certain predictions about the nature of terascale clusters, such as the need for hierarchical structure, are false. Finally, we argue that to achieve true scalability, terascale clusters must be built the way Ed is.