Boyu Li, Bin Wu, Meng Shen, Hao Peng, Weisheng Li, Hong Zhang, Jie Gan, Zhihong Tian, Guangquan Xu
Digital Communications and Networks, vol. 11, no. 2, pp. 514-523. Published 2025-04-01. DOI: 10.1016/j.dcan.2024.06.007. Article page: https://www.sciencedirect.com/science/article/pii/S2352864824000749
Algorithms for online fault tolerance server consolidation
We study a novel replication mechanism that ensures service continuity against multiple simultaneous server failures. In this mechanism, each item represents a computing task and is replicated onto ξ + 1 servers for some integer ξ ≥ 1, with workloads specified by the amount of required resources. If one or more servers fail, the affected workloads can be redirected to other servers that host replicas of the same item, so that the service is not interrupted by the failure of up to ξ servers. Any feasible assignment algorithm must therefore reserve some capacity in each server to accommodate the workload redirected from potentially failed servers without overloading, and determining the optimal way to reserve this capacity becomes a key issue. Unlike existing algorithms, which assume that no two servers share replicas of more than one item, we first formulate capacity reservation for the general case. Due to the combinatorial nature of this problem, finding the optimal solution is difficult. To this end, we propose a Generalized and Simple Calculating Reserved Capacity (GSCRC) algorithm, whose time complexity depends only on the number of items packed in the server. In conjunction with GSCRC, we propose a robust replica packing algorithm with capacity optimization (RobustPack), which aims to minimize the number of servers hosting replicas while tolerating multiple server failures. Through theoretical analysis and experimental evaluations, we show that the RobustPack algorithm can achieve better performance.
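To make the capacity-reservation problem concrete, the following is a minimal brute-force sketch in Python. It is not GSCRC or RobustPack, whose details the abstract does not give. It assumes a simple redirection model that is our own illustration: each item's workload is split evenly across its ξ + 1 replicas, and when replicas are lost the workload is re-split evenly among the surviving replicas. The function names, the `placement` and `workload` dictionaries, and the capacity value are all hypothetical.

```python
from itertools import combinations


def base_load(server, placement, workload):
    """Load on `server` when no server has failed (assumed even split per item)."""
    return sum(workload[item] / len(hosts)
               for item, hosts in placement.items() if server in hosts)


def worst_case_extra_load(server, placement, workload, xi):
    """Largest extra load redirected to `server` when up to xi OTHER servers fail.

    Enumerates every failure set, so the cost grows combinatorially with the
    number of servers; this is an illustration only, not an efficient method.
    """
    others = sorted({s for hosts in placement.values() for s in hosts} - {server})
    worst = 0.0
    for k in range(1, min(xi, len(others)) + 1):
        for failed in combinations(others, k):
            failed = set(failed)
            extra = 0.0
            for item, hosts in placement.items():
                if server not in hosts:
                    continue
                lost = len(hosts & failed)
                if lost == 0:
                    continue
                survivors = len(hosts) - lost  # always >= 1: `server` survives
                extra += workload[item] / survivors - workload[item] / len(hosts)
            worst = max(worst, extra)
    return worst


def is_feasible(server, placement, workload, xi, capacity):
    """True if base load plus the worst-case redirected load fits in `capacity`."""
    reserve = worst_case_extra_load(server, placement, workload, xi)
    return base_load(server, placement, workload) + reserve <= capacity


# Example with xi = 1: servers s1 and s2 share replicas of two items (a and c),
# the general case that the abstract says earlier algorithms exclude.
placement = {"a": {"s1", "s2"}, "b": {"s1", "s3"}, "c": {"s1", "s2"}}
workload = {"a": 4, "b": 2, "c": 2}

print(base_load("s1", placement, workload))                    # 4.0
print(worst_case_extra_load("s1", placement, workload, xi=1))  # 3.0
print(is_feasible("s1", placement, workload, xi=1, capacity=10))  # True
```

In this example the worst case for s1 is the failure of s2, because both shared items redirect their load at once. Reserving for each item independently would miss this coupling, which is why shared replicas make the reservation problem combinatorial.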
About the journal:
Digital Communications and Networks is a journal that focuses on communication systems and networks. It publishes original articles and authoritative reviews, all of which undergo rigorous peer review. All articles are fully Open Access and can be accessed on ScienceDirect. The journal is indexed by databases such as the Science Citation Index Expanded (SCIE) and Scopus.
In addition to regular articles, the journal may also consider exceptional conference papers that have been significantly expanded, and it periodically releases special issues that focus on specific aspects of the field.