DeepEvolution: A Search-Based Testing Approach for Deep Neural Networks

2019 IEEE International Conference on Software Maintenance and Evolution (ICSME), Pub Date: 2019-09-01, DOI: 10.1109/ICSME.2019.00078

Houssem Ben Braiek, Foutse Khomh


Citations: 22

Abstract

The increasing inclusion of Deep Learning (DL) models in safety-critical systems such as autonomous vehicles has led to the development of multiple model-based DL testing techniques. One common denominator of these testing techniques is the automated generation of test cases, e.g., new inputs transformed from the original training data with the aim of optimizing some test adequacy criteria. So far, the effectiveness of these approaches has been hindered by their reliance on random fuzzing or transformations that do not always produce test cases with good diversity. To overcome these limitations, we propose DeepEvolution, a novel search-based approach for testing DL models that relies on metaheuristics to ensure maximum diversity in generated test cases. We assessed the effectiveness of DeepEvolution in testing computer-vision DL models and found that it significantly increases the neuronal coverage of generated test cases. Moreover, using DeepEvolution, we could successfully find several corner-case behaviors. Finally, DeepEvolution outperformed TensorFuzz (a coverage-guided fuzzing tool developed at Google Brain) in detecting latent defects introduced during the quantization of the models. These results suggest that search-based approaches can help build effective testing tools for DL systems.
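
The general idea of searching over input transformations to raise neuron coverage can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy single-layer ReLU "model", the two transformation parameters (contrast and brightness), the activation threshold, and the simple evolutionary loop, since the abstract does not say which metaheuristics DeepEvolution uses. All function names are hypothetical.

```python
import random

def relu_layer(x, weights, biases):
    """Activations of a toy fully connected ReLU layer (stand-in for a DNN)."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def transform(image, params):
    """Apply a contrast scale and a brightness shift, clamped to [0, 1]."""
    contrast, brightness = params
    return [min(1.0, max(0.0, contrast * p + brightness)) for p in image]

def covered_neurons(image, weights, biases, threshold=0.5):
    """Indices of neurons whose activation exceeds the (assumed) threshold."""
    acts = relu_layer(image, weights, biases)
    return {i for i, a in enumerate(acts) if a > threshold}

def search(image, weights, biases, generations=30, pop_size=12, seed=0):
    """Evolve transformation parameters that activate previously uncovered neurons."""
    rng = random.Random(seed)
    baseline = covered_neurons(image, weights, biases)

    def fitness(params):
        # Count neurons activated by the transformed input but not by the original.
        mutated = covered_neurons(transform(image, params), weights, biases)
        return len(mutated - baseline)

    # Parameter bounds are a crude proxy for keeping the transformed input realistic.
    population = [(rng.uniform(0.5, 1.5), rng.uniform(-0.3, 0.3))
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Keep the fitter half, then add one Gaussian-mutated child per parent.
        parents = sorted(population, key=fitness, reverse=True)[: pop_size // 2]
        children = [
            (min(1.5, max(0.5, c + rng.gauss(0, 0.1))),
             min(0.3, max(-0.3, b + rng.gauss(0, 0.05))))
            for c, b in parents
        ]
        population = parents + children
        best = max(population + [best], key=fitness)
    return best, fitness(best), baseline

# Toy data: 20 neurons over a 16-pixel "image" with values in [0, 1).
_rng = random.Random(42)
WEIGHTS = [[_rng.uniform(-1, 1) for _ in range(16)] for _ in range(20)]
BIASES = [_rng.uniform(-0.5, 0.5) for _ in range(20)]
IMAGE = [_rng.random() for _ in range(16)]

best_params, new_neurons, baseline = search(IMAGE, WEIGHTS, BIASES)
print(f"best (contrast, brightness) = {best_params}, newly covered neurons = {new_neurons}")
```

The fitness function rewards only neurons that the original input did not activate, which is one simple way to push the search toward behaviors the seed input does not exercise. A real setup would measure coverage on an actual trained model and would need a stronger check that transformed images remain semantically valid.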
