EAGLE: Creating Equivalent Graphs to Test Deep Learning Libraries

Jiannan Wang, Thibaud Lutellier, Shangshu Qian, H. Pham, Lin Tan
{"title":"EAGLE: Creating Equivalent Graphs to Test Deep Learning Libraries","authors":"Jiannan Wang, Thibaud Lutellier, Shangshu Qian, H. Pham, Lin Tan","doi":"10.1145/3510003.3510165","DOIUrl":null,"url":null,"abstract":"Testing deep learning (DL) software is crucial and challenging. Recent approaches use differential testing to cross-check pairs of implementations of the same functionality across different libraries. Such approaches require two DL libraries implementing the same functionality, which is often unavailable. In addition, they rely on a high-level library, Keras, that implements missing functionality in all supported DL libraries, which is prohibitively expensive and thus no longer maintained. To address this issue, we propose EAGLE, a new technique that uses differential testing in a different dimension, by using equivalent graphs to test a single DL implementation (e.g., a single DL library). Equivalent graphs use different Application Programming Interfaces (APIs), data types, or optimizations to achieve the same functionality. The rationale is that two equivalent graphs executed on a single DL implementation should produce identical output given the same input. Specifically, we design 16 new DL equivalence rules and propose a technique, EAGLE, that (1) uses these equivalence rules to build concrete pairs of equivalent graphs and (2) cross-checks the output of these equivalent graphs to detect inconsistency bugs in a DL library. Our evaluation on two widely-used DL libraries, i.e., Tensor Flow and PyTorch, shows that EAGLE detects 25 bugs (18 in Tensor Flow and 7 in PyTorch), including 13 previously unknown bugs.","PeriodicalId":202896,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE)","volume":"95 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3510003.3510165","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 14

Abstract

Testing deep learning (DL) software is crucial and challenging. Recent approaches use differential testing to cross-check pairs of implementations of the same functionality across different libraries. Such approaches require two DL libraries implementing the same functionality, which is often unavailable. In addition, they rely on a high-level library, Keras, that implements missing functionality in all supported DL libraries, which is prohibitively expensive and thus no longer maintained. To address this issue, we propose EAGLE, a new technique that uses differential testing in a different dimension, by using equivalent graphs to test a single DL implementation (e.g., a single DL library). Equivalent graphs use different Application Programming Interfaces (APIs), data types, or optimizations to achieve the same functionality. The rationale is that two equivalent graphs executed on a single DL implementation should produce identical output given the same input. Specifically, we design 16 new DL equivalence rules and propose a technique, EAGLE, that (1) uses these equivalence rules to build concrete pairs of equivalent graphs and (2) cross-checks the output of these equivalent graphs to detect inconsistency bugs in a DL library. Our evaluation on two widely-used DL libraries, i.e., Tensor Flow and PyTorch, shows that EAGLE detects 25 bugs (18 in Tensor Flow and 7 in PyTorch), including 13 previously unknown bugs.
EAGLE:创建等效图来测试深度学习库
测试深度学习(DL)软件是至关重要且具有挑战性的。最近的方法使用差异测试来交叉检查跨不同库的相同功能的实现对。这种方法需要两个实现相同功能的DL库,而这通常是不可用的。此外,它们依赖于一个高级库Keras,它实现了所有受支持的DL库中缺失的功能,这些功能非常昂贵,因此不再维护。为了解决这个问题,我们提出了EAGLE,这是一种在不同维度上使用差分测试的新技术,通过使用等效图来测试单个DL实现(例如,单个DL库)。等价图使用不同的应用程序编程接口、数据类型或优化来实现相同的功能。其基本原理是,在单个DL实现上执行的两个等效图应该在给定相同输入的情况下产生相同的输出。具体来说,我们设计了16个新的DL等价规则,并提出了一种技术EAGLE,该技术(1)使用这些等价规则构建具体的等价图对,(2)交叉检查这些等价图的输出以检测DL库中的不一致错误。我们对两个广泛使用的DL库,即Tensor Flow和PyTorch进行了评估,结果显示EAGLE检测到了25个bug (Tensor Flow中有18个,PyTorch中有7个),其中包括13个以前未知的bug。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信