A fault-tolerant system-on-programmable-chip based on domain-partition and blind reconfiguration
L. Shang, Mi Zhou, Yu Hu
2010 NASA/ESA Conference on Adaptive Hardware and Systems, 2010-06-15
DOI: 10.1109/AHS.2010.5546245 (https://doi.org/10.1109/AHS.2010.5546245)
Field-programmable gate arrays (FPGAs) are widely used to build Systems-on-Programmable-Chips (SOPCs) because they offer abundant reconfigurable, heterogeneous resources for implementing a variety of intellectual property cores. However, as device feature sizes shrink and die areas grow, FPGAs are increasingly vulnerable to errors induced by electromigration and radiation, which makes building reliable SOPCs challenging. This paper presents an SOPC implementing a smart 1553B bus node to investigate these challenges and to illustrate a feasible approach to building a complex system with high reliability and low recovery latency on a commercial FPGA. First, a general reliability model, the Domain-Partition (DP) model, is introduced to describe SOPCs that contain multiple alternative configurations providing fault-recovery capability. The assignment of alternative configurations that maximizes reliability is then determined from a first-order optimal solution under the DP framework. Finally, the blind reconfiguration technique is used to reduce recovery latency. Monte Carlo simulation experiments are carried out to evaluate reliability and latency. The results show that higher reliability is attainable with less overhead than the generic triple-modular redundancy (TMR) method.
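
To make the triple-modular redundancy baseline concrete, the following is a minimal Monte Carlo sketch of how the reliability of a generic TMR system could be estimated. It is not the paper's DP model or its experimental setup: the exponential lifetime distribution, the perfect voter, the independence of module failures, and the names `simulate_tmr_reliability`, `analytic_tmr_reliability`, `failure_rate`, and `mission_time` are all assumptions introduced here for illustration.

```python
import math
import random


def simulate_tmr_reliability(failure_rate, mission_time, trials=100_000, seed=0):
    """Estimate the probability that a TMR system survives until mission_time.

    Assumptions (illustrative, not taken from the paper): three identical
    modules with independent exponential lifetimes and a voter that never fails.
    """
    rng = random.Random(seed)
    survived = 0
    for _ in range(trials):
        # Draw an independent exponential lifetime for each of the three modules.
        lifetimes = [rng.expovariate(failure_rate) for _ in range(3)]
        # Majority voting masks a single failure, so the system survives
        # when at least two modules outlive the mission.
        alive = sum(t > mission_time for t in lifetimes)
        if alive >= 2:
            survived += 1
    return survived / trials


def analytic_tmr_reliability(failure_rate, mission_time):
    """Closed-form TMR reliability 3R^2 - 2R^3 under the same assumptions."""
    r = math.exp(-failure_rate * mission_time)  # reliability of one module
    return 3 * r**2 - 2 * r**3


if __name__ == "__main__":
    lam, t = 1e-4, 1000.0  # hypothetical failure rate (per hour) and mission time (hours)
    print("Monte Carlo estimate:", simulate_tmr_reliability(lam, t))
    print("Closed form:         ", analytic_tmr_reliability(lam, t))
```

Comparing the sampled estimate against the closed form is a quick sanity check on the simulator before it is applied to a model that has no closed-form solution.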