An Empirical Study on Data Distribution-Aware Test Selection for Deep Learning Enhancement

ACM Transactions on Software Engineering and Methodology (TOSEM). Pub Date: 2022-04-19. DOI: 10.1145/3511598

Huangping Qiang, Yuejun Guo, Maxime Cordy, Xiaofei Xie, Qiang Hu, L. Ma, Mike Papadakis


Citations: 24

Abstract


An Empirical Study on Data Distribution-Aware Test Selection for Deep Learning Enhancement

Like traditional software, which evolves constantly, deep neural networks must evolve as test data grows rapidly so that they can be continuously enhanced (e.g., adapted to a distribution shift in a new deployment environment). However, manually labeling all of the collected test data is labor intensive. Test selection addresses this problem by strategically choosing a small subset to label. After retraining with the selected subset, deep neural networks can achieve competitive accuracy. Unfortunately, existing selection metrics have three main limitations: (1) they use different retraining processes, (2) they ignore data distribution shifts, and (3) they are insufficiently evaluated. To fill this gap, we first conduct a systematic empirical study to reveal the impact of the retraining process and data distribution on model enhancement. Then, based on our findings, we propose DAT, a novel distribution-aware test selection metric. Experimental results reveal that retraining with both the training and selected data outperforms retraining with only the selected data. No selection metric performs best across the various data distributions. By contrast, DAT effectively alleviates the impact of distribution shifts and outperforms the compared metrics by up to five times and by 30.09% in accuracy improvement for model enhancement on simulated and in-the-wild distribution shift scenarios, respectively.
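The select-then-retrain workflow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not DAT itself: the abstract does not say how DAT scores inputs, so prediction entropy is used here as a stand-in uncertainty metric, and the function names (`entropy`, `select_for_labeling`, `build_retraining_set`) are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (natural log) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_labeling(pred_probs: Sequence[Sequence[float]], budget: int) -> List[int]:
    """Return the indices of the `budget` most uncertain unlabeled inputs.

    Assumption: entropy ranking is an illustrative stand-in, not the DAT metric.
    """
    ranked = sorted(
        range(len(pred_probs)),
        key=lambda i: entropy(pred_probs[i]),
        reverse=True,  # highest entropy (most uncertain) first
    )
    return ranked[:budget]


def build_retraining_set(
    train_set: Sequence[Tuple[object, int]],
    candidates: Sequence[object],
    selected: Sequence[int],
    new_labels: Dict[int, int],
) -> List[Tuple[object, int]]:
    """Combine the original training data with the newly labeled selection.

    The abstract reports that retraining on both sets outperforms
    retraining on the selected data alone.
    """
    newly_labeled = [(candidates[i], new_labels[i]) for i in selected]
    return list(train_set) + newly_labeled
```

With four unlabeled inputs and a labeling budget of two, `select_for_labeling` picks the two inputs whose predicted distributions are closest to uniform; only those two then need manual labels before being merged with the original training set.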


Source journal

ACM Transactions on Software Engineering and Methodology (TOSEM)

Self-citation rate

0.00%

Number of publications