Reproducibility in machine-learning-based research: Overview, barriers, and drivers

IF 3.2 4区计算机科学 Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Ai Magazine Pub Date : 2025-04-14 DOI:10.1002/aaai.70002

Harald Semmelrock, Tony Ross-Hellauer, Simone Kopeinik, Dieter Theiler, Armin Haberl, Stefan Thalmann, Dominik Kowald

{"title":"Reproducibility in machine-learning-based research: Overview, barriers, and drivers","authors":"Harald Semmelrock, Tony Ross-Hellauer, Simone Kopeinik, Dieter Theiler, Armin Haberl, Stefan Thalmann, Dominik Kowald","doi":"10.1002/aaai.70002","DOIUrl":null,"url":null,"abstract":"<p>Many research fields are currently reckoning with issues of poor levels of reproducibility. Some label it a “crisis,” and research employing or building machine learning (ML) models is no exception. Issues including lack of transparency, data or code, poor adherence to standards, and the sensitivity of ML training conditions mean that many papers are not even reproducible in principle. Where they are, though, reproducibility experiments have found worryingly low degrees of similarity with original results. Despite previous appeals from ML researchers on this topic and various initiatives from conference reproducibility tracks to the ACM's new Emerging Interest Group on Reproducibility and Replicability, we contend that the general community continues to take this issue too lightly. Poor reproducibility threatens trust in and integrity of research results. Therefore, in this article, we lay out a new perspective on the key barriers and drivers (both procedural and technical) to increased reproducibility at various levels (methods, code, data, and experiments). We then map the drivers to the barriers to give concrete advice for strategies for researchers to mitigate reproducibility issues in their own work, to lay out key areas where further research is needed in specific areas, and to further ignite discussion on the threat presented by these urgent issues.</p>","PeriodicalId":7854,"journal":{"name":"Ai Magazine","volume":"46 2","pages":""},"PeriodicalIF":3.2000,"publicationDate":"2025-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/aaai.70002","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Ai Magazine","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/aaai.70002","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

Many research fields are currently reckoning with issues of poor levels of reproducibility. Some label it a “crisis,” and research employing or building machine learning (ML) models is no exception. Issues including lack of transparency, data or code, poor adherence to standards, and the sensitivity of ML training conditions mean that many papers are not even reproducible in principle. Where they are, though, reproducibility experiments have found worryingly low degrees of similarity with original results. Despite previous appeals from ML researchers on this topic and various initiatives from conference reproducibility tracks to the ACM's new Emerging Interest Group on Reproducibility and Replicability, we contend that the general community continues to take this issue too lightly. Poor reproducibility threatens trust in and integrity of research results. Therefore, in this article, we lay out a new perspective on the key barriers and drivers (both procedural and technical) to increased reproducibility at various levels (methods, code, data, and experiments). We then map the drivers to the barriers to give concrete advice for strategies for researchers to mitigate reproducibility issues in their own work, to lay out key areas where further research is needed in specific areas, and to further ignite discussion on the threat presented by these urgent issues.

Abstract Image

查看原文本刊更多论文

基于机器学习的研究的可重复性：概述、障碍和驱动因素

许多研究领域目前都在考虑可重复性差的问题。一些人将其称为“危机”，使用或构建机器学习（ML）模型的研究也不例外。缺乏透明度、数据或代码、对标准的依从性差以及ML训练条件的敏感性等问题意味着许多论文在原则上甚至是不可复制的。然而，可重复性实验发现，与原始结果的相似度低得令人担忧。尽管之前机器学习研究人员对这个主题提出了呼吁，并且从会议可重复性跟踪到ACM新成立的可重复性和可复制性兴趣小组，我们认为一般社区仍然对这个问题过于轻视。低可重复性威胁到对研究结果的信任和完整性。因此，在本文中，我们将从一个新的角度来看待在不同层次（方法、代码、数据和实验）上提高再现性的关键障碍和驱动因素（程序和技术）。然后，我们将驱动因素映射到障碍，为研究人员提供具体的策略建议，以减轻他们自己工作中的可重复性问题，列出在特定领域需要进一步研究的关键领域，并进一步引发对这些紧迫问题所带来的威胁的讨论。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Ai Magazine 工程技术-计算机：人工智能

CiteScore

3.90

自引率

11.10%

发文量

审稿时长

>12 weeks

期刊介绍： AI Magazine publishes original articles that are reasonably self-contained and aimed at a broad spectrum of the AI community. Technical content should be kept to a minimum. In general, the magazine does not publish articles that have been published elsewhere in whole or in part. The magazine welcomes the contribution of articles on the theory and practice of AI as well as general survey articles, tutorial articles on timely topics, conference or symposia or workshop reports, and timely columns on topics of interest to AI scientists.