{"title":"Balancing performance and fault detection for GPGPU workloads","authors":"J. Backer, R. Karri","doi":"10.1109/ICCD.2012.6378702","DOIUrl":null,"url":null,"abstract":"GPUs are increasingly being used for processing highly parallel scientific and high performance workloads. Such applications require correctness and accuracy of the computation. GPUs lack adequate support for detecting hardware faults that may lead to computation errors. We present a tunable fault detection scheme that allows one to balance GPU performance and fault checking by configuring the amount of resources to allocate for detection and the frequency of checking for faults.","PeriodicalId":313428,"journal":{"name":"2012 IEEE 30th International Conference on Computer Design (ICCD)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2012-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 IEEE 30th International Conference on Computer Design (ICCD)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCD.2012.6378702","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3
Abstract
GPUs are increasingly being used for processing highly parallel scientific and high performance workloads. Such applications require correctness and accuracy of the computation. GPUs lack adequate support for detecting hardware faults that may lead to computation errors. We present a tunable fault detection scheme that allows one to balance GPU performance and fault checking by configuring the amount of resources to allocate for detection and the frequency of checking for faults.