Testing Machine Learning and Deep Learning Systems: Achievements and Challenges

IF 2.9 | Zone 4 (Multidisciplinary Journals) | Q2 MULTIDISCIPLINARY SCIENCES

Arabian Journal for Science and Engineering Pub Date : 2025-06-19 DOI:10.1007/s13369-025-10276-w

Salma Albelali, Moataz Ahmed

Volume 50, Issue 15, pp. 11433-11484. Article page: https://link.springer.com/article/10.1007/s13369-025-10276-w

Citations: 0

Abstract

Rapid advancements in artificial intelligence have driven the integration of learning algorithms, namely machine learning (ML) and deep learning (DL) models, across various industries, posing new challenges for testing these complex systems. Rigorous testing of ML/DL-based systems (MLSs) is especially critical in high-stakes domains like autonomous driving, healthcare diagnostics, and financial forecasting, where system reliability is paramount. Unlike traditional software, MLS quality relies not only on model architecture and development processes but also significantly on the quality of the training data. This study offers a comprehensive review of MLS testing methodologies, with a focus on the emerging role of Data-Box testing, alongside established Black-Box and White-Box techniques. Data-Box testing assesses training data quality to ensure it meets criteria such as sufficiency and adequacy, bridging Black-Box and White-Box methods to enhance system reliability. The study further addresses the increasing use of mutation testing (MT) in DL, exploring MT techniques and mutation operators to ensure adequate coverage. By synthesizing recent advances, we propose an integrated MLS testing framework that encapsulates these critical aspects, offering insights and highlighting areas for future research to refine MLS testing practices.
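The mutation-testing idea mentioned in the abstract can be illustrated with a minimal sketch. This is not taken from the paper: the toy linear classifier, the `gaussian_fuzz` operator, and the `mutation_score` helper are all hypothetical names chosen for illustration. The sketch assumes a mutant counts as "killed" when at least one test input receives a different prediction than it does from the original model.

```python
import random

def predict(weights, bias, x):
    """Toy linear binary classifier: class 1 if w.x + b > 0, else class 0."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0

def gaussian_fuzz(weights, sigma, rng):
    """Hypothetical mutation operator: add Gaussian noise to every weight."""
    return [w + rng.gauss(0.0, sigma) for w in weights]

def mutation_score(weights, bias, test_inputs, n_mutants, sigma, seed=0):
    """Fraction of mutants whose predictions differ from the original
    model on at least one test input (such a mutant counts as killed)."""
    rng = random.Random(seed)
    baseline = [predict(weights, bias, x) for x in test_inputs]
    killed = 0
    for _ in range(n_mutants):
        mutant = gaussian_fuzz(weights, sigma, rng)
        outputs = [predict(mutant, bias, x) for x in test_inputs]
        if outputs != baseline:
            killed += 1
    return killed / n_mutants

# Assumed toy model and test sets (illustrative values only).
weights, bias = [1.0, -1.0], 0.0
far_from_boundary = [(5.0, 0.0), (0.0, 5.0)]
with_boundary_cases = far_from_boundary + [(1.0, 0.9), (0.9, 1.0)]

print(mutation_score(weights, bias, far_from_boundary, 200, 0.2))
print(mutation_score(weights, bias, with_boundary_cases, 200, 0.2))
```

Because the same seed generates the same mutants for both calls, the test set that adds inputs near the decision boundary can only kill as many mutants or more, which is the sense in which a mutation score is used to compare how thorough two test sets are.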

Source journal

Arabian Journal for Science and Engineering (MULTIDISCIPLINARY SCIENCES)

CiteScore: 5.70
Self-citation rate: 3.40%
Articles published: 993

About the journal: King Fahd University of Petroleum & Minerals (KFUPM) partnered with Springer to publish the Arabian Journal for Science and Engineering (AJSE). AJSE, which has been published by KFUPM since 1975, is a recognized national, regional, and international journal that provides a great opportunity for the dissemination of research advances from the Kingdom of Saudi Arabia, the MENA region, and the world.