Enabling the Verification of Computational Results: An Empirical Evaluation of Computational Reproducibility

V. Stodden, M. Krafczyk, A. Bhaskar
Published in: Proceedings of the First International Workshop on Practical Reproducible Evaluation of Computer Systems, 2018-06-11
DOI: 10.1145/3214239.3214242 (https://doi.org/10.1145/3214239.3214242)
Citation count: 20

Abstract

The ability to independently regenerate published computational claims is widely recognized as a key component of scientific reproducibility. In this article we take a narrow interpretation of this goal and attempt to regenerate published claims from author-supplied information, including data, code, inputs, and other provided specifications, on a computational system different from the one used by the original authors. We are motivated by Claerbout and Donoho's exhortation to provide complete information so that published claims can be reproduced. We chose the Elsevier journal Journal of Computational Physics, whose stated author guidelines encourage the availability of the computational digital artifacts that support scholarly findings. In an IRB-approved study at the University of Illinois at Urbana-Champaign (IRB #17329), we gathered artifacts from a sample of authors who published in this journal in 2016 and 2017. We then used the criteria generated at the 2012 ICERM workshop "Reproducibility in Computational and Experimental Mathematics" to evaluate the sufficiency of the information provided in the publications and the ease with which the digital artifacts afforded computational reproducibility. For the articles for which we obtained computational artifacts, we could not easily regenerate the findings for 67% of them, and we were unable to easily regenerate all the findings for any of the articles. We then evaluated the artifacts we did obtain (55 of 306 articles) and found that the main barriers to computational reproducibility are inadequate documentation of code, data, and workflow information (70.9%), missing code function and setting information, and missing licensing information (75%). We recommend improvements based on these findings, including the deposit of supporting digital artifacts for reproducibility as a condition of publication, and verification of computational findings via re-execution of the code when possible.
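
The figures quoted in the abstract can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the two counts (55 and 306) come from the abstract, while the helper names `percent` and `consistent_counts` are hypothetical, and treating the 55 articles as the denominator of the reported percentages is an assumption that the abstract does not state explicitly.

```python
# Counts taken from the abstract.
ARTICLES_SAMPLED = 306
ARTICLES_WITH_ARTIFACTS = 55


def percent(numerator, denominator, digits=1):
    """Return numerator/denominator as a percentage rounded to `digits` places."""
    return round(100 * numerator / denominator, digits)


def consistent_counts(reported_percent, denominator, digits):
    """List the integer counts k for which k/denominator rounds to reported_percent.

    Hypothetical helper: it only shows which whole-article counts would be
    compatible with a reported percentage under an assumed denominator.
    """
    return [
        k
        for k in range(denominator + 1)
        if round(100 * k / denominator, digits) == reported_percent
    ]


# Share of sampled articles for which artifacts were obtained.
artifact_rate = percent(ARTICLES_WITH_ARTIFACTS, ARTICLES_SAMPLED)
print(artifact_rate)  # 18.0

# Assumption: the 70.9% documentation figure is a share of the 55 articles.
doc_counts = consistent_counts(70.9, ARTICLES_WITH_ARTIFACTS, digits=1)
print(doc_counts)  # [39]
```

Under that assumed denominator, artifacts were obtained for roughly 18% of the sampled articles, and the 70.9% documentation figure would correspond to 39 of the 55 articles. The full paper, not this sketch, is the authority on the actual denominators.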