Balancing system availability and lifetime with dynamic hidden Markov models
Jacopo Panerati, S. Abdi, G. Beltrame
2014 NASA/ESA Conference on Adaptive Hardware and Systems (AHS), 2014-07-14
DOI: 10.1109/AHS.2014.6880183 (https://doi.org/10.1109/AHS.2014.6880183)
Electronic components in space applications are subject to high levels of ionizing and particle radiation. The former shortens their lifetime (especially at high levels of utilization), while the latter can cause transient errors. Transient errors can be detected and corrected using memory scrubbing, but scrubbing introduces an overhead that reduces both the availability and the lifetime of the system. In this work, we present a mechanism based on dynamic hidden Markov models (D-HMMs) that balances the availability and lifetime of a multi-resource system by estimating the occurrence of permanent faults amid transient faults, and by dynamically migrating the computation to excess resources when a failure occurs. The dynamic nature of the model makes it adaptable to different mission profiles and fault rates. Results show that our model is able to lead systems to their desired lifetime while keeping availability within 2% of its ideal value, and that it outperforms static rule-based and traditional hidden Markov model (HMM) approaches.
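To make the idea of estimating permanent faults amid transient faults concrete, the following is a minimal sketch of forward filtering on a two-state HMM. It is not the D-HMM described above: the parameters are fixed rather than dynamic, and every name and number in it (`forward_step`, `first_migration_step`, the two hidden states, the fault and error probabilities, the 0.9 threshold) is an assumption chosen purely for illustration. Each observation is taken to be the outcome of one scrubbing pass (1 if an error was found, 0 otherwise).

```python
def forward_step(belief, observation, p_fail, p_err_healthy, p_err_faulty):
    """One forward-filtering step; returns the updated P(permanently faulty).

    All probabilities are assumed to lie strictly between 0 and 1.
    """
    # Predict: a healthy resource may fail permanently with probability p_fail;
    # a permanently faulty resource stays faulty (absorbing state).
    prior_faulty = belief + (1.0 - belief) * p_fail
    prior_healthy = 1.0 - prior_faulty

    # Update: weight each hidden state by the likelihood of the scrub outcome.
    if observation:
        like_healthy, like_faulty = p_err_healthy, p_err_faulty
    else:
        like_healthy, like_faulty = 1.0 - p_err_healthy, 1.0 - p_err_faulty

    weighted_faulty = prior_faulty * like_faulty
    weighted_healthy = prior_healthy * like_healthy
    return weighted_faulty / (weighted_faulty + weighted_healthy)


def first_migration_step(observations, threshold=0.9, p_fail=0.001,
                         p_err_healthy=0.05, p_err_faulty=0.8):
    """Return (index, belief) at the first step where P(faulty) > threshold.

    Returns (None, final_belief) if the threshold is never exceeded.
    The default parameter values are illustrative assumptions only.
    """
    belief = 0.0  # assume the resource starts out healthy
    for step, observation in enumerate(observations):
        belief = forward_step(belief, observation, p_fail,
                              p_err_healthy, p_err_faulty)
        if belief > threshold:
            return step, belief
    return None, belief


if __name__ == "__main__":
    # Isolated errors are consistent with transient faults.
    print(first_migration_step([1, 0, 0, 1, 0, 0]))
    # A sustained run of errors points to a permanent fault.
    print(first_migration_step([1, 1, 1, 1, 1]))
```

With these assumed parameters, isolated scrubbing errors leave the belief in a permanent fault low, whereas a sustained run of errors pushes it past the threshold, at which point a controller could migrate the computation to another resource. A dynamic model would additionally adjust the probabilities over time, which this sketch does not attempt.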