Saving Time in a Program Robustness Evaluation

Joao Gramacho, Dolores Rexachs, E. Luque
Published in: 2013 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications
Publication date: 2013-07-16
DOI: 10.1109/TRUSTCOM.2013.237 (https://doi.org/10.1109/TRUSTCOM.2013.237)

Abstract

The risk of having a program execution corrupted by transient faults is growing as computer processors use more transistors, become denser, and operate at lower voltages. This risk is multiplied in High Performance Computing, where hundreds or thousands of processors work together to solve a single problem. To evaluate how program executions behave in the presence of transient faults, we have proposed the concept of robustness against transient faults. This concept can be used to determine which parts of a program are most significant with respect to the risk of misbehavior caused by transient faults, so that they can be studied further and improved. The robustness concept can also be used as a metric to compare different approaches applied to a program to make it less likely to produce corrupted results. In this work we explain why and how it is possible to simplify part of a program's robustness analysis by taking into account the repetition of sequences of instructions. The simplified analysis obtains exactly the same result as a full program robustness evaluation (performed exhaustively and without estimations). By simplifying the analysis we were able to reduce our previously published robustness analysis time by a factor of up to 192, and we were also able to evaluate larger programs in feasible time (something unimaginable when using executions in a fault-injection-capable environment).
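The idea of exploiting repeated instruction sequences can be illustrated with a toy sketch. This is not the authors' analysis. It assumes a hypothetical model in which a program trace is a list of (instruction block, starting register value) pairs on a single 8-bit register, a transient fault is one bit flip injected before one instruction, and robustness is the fraction of injected faults that leave the block's output unchanged. All names (`run`, `masked_faults`, `robustness`, `loop_body`, `epilogue`) are invented for the illustration. Under those assumptions, caching the per-block result lets each distinct block be analyzed once while giving the same overall figure as the exhaustive pass.

```python
from functools import lru_cache

WIDTH = 8                    # toy register width (an assumption of this sketch)
MASK = (1 << WIDTH) - 1


def run(block, value, fault=None):
    """Execute a block on one 8-bit register.

    `fault` is an optional (step, bit) pair: that bit is flipped
    just before the instruction at index `step` executes.
    """
    for step, (op, arg) in enumerate(block):
        if fault is not None and fault[0] == step:
            value ^= 1 << fault[1]
        if op == "and":
            value &= arg
        elif op == "or":
            value |= arg
        elif op == "xor":
            value ^= arg
        elif op == "shl":
            value = (value << arg) & MASK
        else:
            raise ValueError(f"unknown op: {op}")
    return value


def masked_faults(block, start):
    """Exhaustively inject every single-bit fault; return (masked, total)."""
    golden = run(block, start)
    total = len(block) * WIDTH
    masked = sum(
        run(block, start, (step, bit)) == golden
        for step in range(len(block))
        for bit in range(WIDTH)
    )
    return masked, total


# Same analysis, but each distinct (block, start) pair is evaluated only once.
cached_masked_faults = lru_cache(maxsize=None)(masked_faults)


def robustness(trace, analyze):
    """Fraction of injected faults, over the whole trace, that are masked."""
    masked = total = 0
    for block, start in trace:
        m, t = analyze(block, start)
        masked += m
        total += t
    return masked / total


# Hypothetical trace: a loop body repeated 1000 times, then an epilogue.
loop_body = (("and", 0x0F), ("xor", 0x3C), ("shl", 1))
epilogue = (("or", 0xF0),)
trace = [(loop_body, 0x5A)] * 1000 + [(epilogue, 0x5A)]

full = robustness(trace, masked_faults)         # analyzes all 1001 blocks
fast = robustness(trace, cached_masked_faults)  # analyzes 2 distinct blocks
print(full == fast, cached_masked_faults.cache_info())
```

In this sketch the cache key includes the starting register value, so a cached result is reused only when both the instruction sequence and its input state repeat. That restriction is what keeps the shortcut exact rather than an estimate in this toy model.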