Testing Machine Learning and Deep Learning Systems: Achievements and Challenges

IF 2.9 · Zone 4 (multidisciplinary journals) · JCR Q2 · MULTIDISCIPLINARY SCIENCES
Salma Albelali, Moataz Ahmed
Arabian Journal for Science and Engineering, 50(15): 11433-11484. Journal article, published 2025-06-19.
DOI: 10.1007/s13369-025-10276-w
https://link.springer.com/article/10.1007/s13369-025-10276-w
Citations: 0

Abstract

Rapid advancements in artificial intelligence have driven the integration of learning algorithms, namely machine learning (ML) and deep learning (DL) models, across various industries, posing new challenges for testing these complex systems. Rigorous testing of ML/DL-based systems (MLSs) is especially critical in high-stakes domains like autonomous driving, healthcare diagnostics, and financial forecasting, where system reliability is paramount. Unlike traditional software, MLS quality relies not only on model architecture and development processes but also significantly on the quality of the training data. This study offers a comprehensive review of MLS testing methodologies, with a focus on the emerging role of Data-Box testing, alongside established Black-Box and White-Box techniques. Data-Box testing assesses training data quality to ensure it meets criteria such as sufficiency and adequacy, bridging Black-Box and White-Box methods to enhance system reliability. The study further addresses the increasing use of mutation testing (MT) in DL, exploring MT techniques and mutation operators to ensure adequate coverage. By synthesizing recent advances, we propose an integrated MLS testing framework that encapsulates these critical aspects, offering insights and highlighting areas for future research to refine MLS testing practices.

Source journal: Arabian Journal for Science and Engineering (MULTIDISCIPLINARY SCIENCES)
CiteScore: 5.70
Self-citation rate: 3.40%
Articles published: 993
About the journal: King Fahd University of Petroleum & Minerals (KFUPM) partnered with Springer to publish the Arabian Journal for Science and Engineering (AJSE). AJSE, which has been published by KFUPM since 1975, is a recognized national, regional and international journal that provides a great opportunity for the dissemination of research advances from the Kingdom of Saudi Arabia, MENA and the world.