Harvesting Wasted Clock Cycles for Efficient Online Testing
Eslam Yassien, Yongjia Xu, Hui Jiang, Thach Nguyen, Jennifer Dworak, T. Manikas, Kundan Nepal
2023 IEEE European Test Symposium (ETS), published 2023-05-22
DOI: 10.1109/ETS56758.2023.10173955 (https://doi.org/10.1109/ETS56758.2023.10173955)
Citations: 0
Abstract
Mission-critical systems often require some testing while the system is running. In many cases, this means temporarily taking parts of the system offline to apply the tests. However, hazards that occur during regular processor execution already require stall cycles to maintain program correctness, and these stall cycles generally perform no other function. In this paper, we focus on testing the ALU during those stall cycles to identify new errors or defects that arise during program execution due to aging and increased temperature, which may slow down the circuitry or cause permanent defects. We investigate the time to detection of a fault (both stuck-at and transition) that may have caused silent data corruption. In addition, we identify the relationship between the programs running and the list of functional faults, and how this relationship impacts the test set length. Finally, we discuss the area and performance impacts of a physical implementation of the approach.
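
The idea of applying ALU tests only during stall cycles, and measuring how long a fault takes to be detected, can be illustrated with a toy software model. This is a minimal sketch under stated assumptions, not the authors' implementation: the 8-bit adder, the output-bit stuck-at fault model, the one-pattern-per-stall schedule, and the names `golden_add`, `make_stuck_at_alu`, and `cycles_to_detection` are all hypothetical. The abstract also covers transition faults, which this sketch does not model.

```python
from typing import Callable, List, Optional, Tuple

MASK = 0xFF  # assumption: a hypothetical 8-bit ALU that only adds


def golden_add(a: int, b: int) -> int:
    """Fault-free reference adder (stands in for stored expected responses)."""
    return (a + b) & MASK


def make_stuck_at_alu(bit: int, value: int) -> Callable[[int, int], int]:
    """Return an adder whose output bit `bit` is stuck at `value` (0 or 1).

    Assumption: the fault sits on an output bit; a real fault list would
    also cover internal nodes.
    """
    def faulty_add(a: int, b: int) -> int:
        result = (a + b) & MASK
        if value:
            return result | (1 << bit)
        return result & ~(1 << bit) & MASK
    return faulty_add


def cycles_to_detection(
    stall_trace: List[bool],
    patterns: List[Tuple[int, int]],
    alu: Callable[[int, int], int],
) -> Optional[int]:
    """Apply one test pattern per stall cycle, cycling through `patterns`.

    stall_trace[i] is True when cycle i is a stall. Returns the index of the
    first cycle in which the ALU response differs from the reference, or
    None if the fault is never detected within the trace.
    """
    next_pattern = 0
    for cycle, is_stall in enumerate(stall_trace):
        if not is_stall:
            continue  # the ALU is busy with the running program
        a, b = patterns[next_pattern % len(patterns)]
        next_pattern += 1
        if alu(a, b) != golden_add(a, b):
            return cycle
    return None


# Hypothetical stall trace and test set, for illustration only.
trace = [False, True, False, False, True, False, True]
tests = [(0x00, 0x00), (0xFF, 0x00), (0x0F, 0x01)]
latency_sa0 = cycles_to_detection(trace, tests, make_stuck_at_alu(3, 0))
latency_sa1 = cycles_to_detection(trace, tests, make_stuck_at_alu(3, 1))
```

In this model the detection latency depends both on how often the running program stalls and on the order of the test patterns, which is the kind of program-dependent behavior the abstract refers to.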