An Empirical Study on Numerical Bugs in Deep Learning Programs
Gang Wang, Zan Wang, Junjie Chen, Xiang Chen, Ming Yan
Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering (ASE 2022), October 10, 2022
DOI: 10.1145/3551349.3559561
Citations: 2
Abstract
The task of a deep learning (DL) program is to train a model with high precision and apply it to different scenarios. A DL program often involves massive numerical calculations, so the robustness and stability of these calculations are critical to the quality of DL programs. Indeed, numerical bugs are common in DL programs, producing NaN (Not-a-Number) and INF (Infinity) values. A numerical bug may render a DL model inaccurate, making the DL application unusable. In this work, we conduct the first empirical study on numerical bugs in DL programs by analyzing programs implemented on top of two popular DL libraries (i.e., TensorFlow and PyTorch). Specifically, we collect a dataset of 400 numerical bugs in DL programs. Then, we classify these numerical bugs into nine categories based on their root causes and summarize two findings. Finally, we provide the implications of our study for detecting numerical bugs in DL programs.
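To make the failure mode concrete, below is a minimal sketch (not taken from the paper) of the kind of numerical bug the study targets, assuming PyTorch: an unguarded log of a softmax output that underflows to zero produces INF in the loss, and the resulting gradients become NaN.

```python
import torch
import torch.nn.functional as F

# Extreme logits, e.g., produced by an unstable or diverging model.
logits = torch.tensor([[100.0, -100.0]], requires_grad=True)

# Buggy pattern: softmax followed by an unguarded log.
probs = torch.softmax(logits, dim=1)   # second probability underflows to 0.0 in float32
loss = -torch.log(probs[0, 1])         # log(0) -> -inf, so the loss becomes inf
loss.backward()
print(loss, logits.grad)               # loss is inf; gradients are all nan

# Numerically stable alternative: compute the loss directly from the logits.
stable_loss = F.cross_entropy(logits.detach(), torch.tensor([1]))
print(stable_loss)                     # tensor(200.) -- finite
```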