{"title":"弹性虚拟集群","authors":"Michael V. Le, I. Hsu, Y. Tamir","doi":"10.1109/PRDC.2011.33","DOIUrl":null,"url":null,"abstract":"Clusters of computers can provide, in aggregate, reliable services despite the failure of individual computers. System-level virtualization is widely used to consolidate the workload of multiple physical systems as multiple virtual machines (VMs) on a single physical computer. A single physical computer thus forms a \\fIvirtual cluster\\fP of VMs. A key difficulty with virtualization is that the failure of the virtualization infrastructure (VI) often leads to the failure of multiple VMs. This is likely to overload \"cluster computing\" resiliency mechanisms, typically designed to tolerate the failure of only a single node at a time. By supporting recovery from failure of key VI components, we have enhanced the resiliency of a VI (Xen), thus enabling the use of existing \"cluster computing\" techniques to provide resilient virtual clusters. In the overwhelming majority of cases, these enhancements allow recovery from errors in the VI to be accomplished without the failure of more than a single VM. The resulting resiliency of the virtual cluster is demonstrated by running two existing \"cluster computing\" systems while subjecting the VI to injected faults.","PeriodicalId":254760,"journal":{"name":"2011 IEEE 17th Pacific Rim International Symposium on Dependable Computing","volume":"18 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"Resilient Virtual Clusters\",\"authors\":\"Michael V. Le, I. Hsu, Y. Tamir\",\"doi\":\"10.1109/PRDC.2011.33\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Clusters of computers can provide, in aggregate, reliable services despite the failure of individual computers. System-level virtualization is widely used to consolidate the workload of multiple physical systems as multiple virtual machines (VMs) on a single physical computer. A single physical computer thus forms a \\\\fIvirtual cluster\\\\fP of VMs. A key difficulty with virtualization is that the failure of the virtualization infrastructure (VI) often leads to the failure of multiple VMs. This is likely to overload \\\"cluster computing\\\" resiliency mechanisms, typically designed to tolerate the failure of only a single node at a time. By supporting recovery from failure of key VI components, we have enhanced the resiliency of a VI (Xen), thus enabling the use of existing \\\"cluster computing\\\" techniques to provide resilient virtual clusters. In the overwhelming majority of cases, these enhancements allow recovery from errors in the VI to be accomplished without the failure of more than a single VM. The resulting resiliency of the virtual cluster is demonstrated by running two existing \\\"cluster computing\\\" systems while subjecting the VI to injected faults.\",\"PeriodicalId\":254760,\"journal\":{\"name\":\"2011 IEEE 17th Pacific Rim International Symposium on Dependable Computing\",\"volume\":\"18 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-12-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2011 IEEE 17th Pacific Rim International Symposium on Dependable Computing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/PRDC.2011.33\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE 17th Pacific Rim International Symposium on Dependable Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PRDC.2011.33","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Clusters of computers can provide, in aggregate, reliable services despite the failure of individual computers. System-level virtualization is widely used to consolidate the workload of multiple physical systems as multiple virtual machines (VMs) on a single physical computer. A single physical computer thus forms a \fIvirtual cluster\fP of VMs. A key difficulty with virtualization is that the failure of the virtualization infrastructure (VI) often leads to the failure of multiple VMs. This is likely to overload "cluster computing" resiliency mechanisms, typically designed to tolerate the failure of only a single node at a time. By supporting recovery from failure of key VI components, we have enhanced the resiliency of a VI (Xen), thus enabling the use of existing "cluster computing" techniques to provide resilient virtual clusters. In the overwhelming majority of cases, these enhancements allow recovery from errors in the VI to be accomplished without the failure of more than a single VM. The resulting resiliency of the virtual cluster is demonstrated by running two existing "cluster computing" systems while subjecting the VI to injected faults.