A Study in Reproducibility: The Congruent Matching Cells Algorithm and cmcR Package

R J. Pub Date: 2023-02-10. DOI: 10.32614/rj-2023-014
Joe Zemmels, Susan Vanderplas, H. Hofmann
Pages: 79-102. Full text: https://doi.org/10.32614/rj-2023-014

Abstract

Scientific research advances through our ability to reuse the methods, procedures, and materials of previous studies and to build on them. As the need for computationally intensive methods to analyze large amounts of data grows, the criteria for achieving reproducibility, specifically computational reproducibility, have become more sophisticated. In general, prose descriptions of algorithms are not detailed or precise enough to ensure that a method can be reproduced completely. Results may be sensitive to conditions that written descriptions commonly leave unspecified, such as implicit parameter settings or the programming language used. To achieve true computational reproducibility, it is necessary to provide all intermediate data and code used to produce published results. In this paper, we consider the Congruent Matching Cells (CMC) methods, a class of algorithms developed for firearm evidence identification on cartridge cases. To date, these algorithms have been published as textual descriptions only. We introduce the first open-source implementation of the CMC methods in the R package cmcR. We have structured the cmcR package as a set of sequential, modularized functions intended to ease parameter experimentation. We use cmcR and a novel variance ratio statistic to explore the CMC methodology and to demonstrate how to fill in the gaps when an algorithm's description is computationally ambiguous.
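The point about implicit parameter settings can be illustrated with a minimal Python sketch. This is a toy pipeline, not the CMC method and not the cmcR API: the names `Params`, `split_into_cells`, `correlation`, and `count_matching_cells`, the tiny 4x4 "scans", and the threshold values are all assumptions made for illustration. It only shows how a prose description such as "split the scan into cells and count the cells that correlate well" leaves the cell grid and threshold open, and how writing each step as a small function with explicit parameters makes those choices visible.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Params:
    """Hypothetical settings that a prose description might leave implicit."""
    cells_per_side: int = 2       # grid used to split a square scan into cells
    corr_threshold: float = 0.5   # minimum correlation for a cell to "match"


def split_into_cells(scan, cells_per_side):
    """Step 1: split a square 2D list into a grid of flattened cells."""
    size = len(scan) // cells_per_side
    return [
        [scan[i][j]
         for i in range(r * size, (r + 1) * size)
         for j in range(c * size, (c + 1) * size)]
        for r in range(cells_per_side)
        for c in range(cells_per_side)
    ]


def correlation(a, b):
    """Step 2: Pearson correlation of two equally sized cells (0.0 if constant)."""
    mean_a, mean_b = mean(a), mean(b)
    sd_a, sd_b = pstdev(a), pstdev(b)
    if sd_a == 0 or sd_b == 0:
        return 0.0
    cov = mean([(x - mean_a) * (y - mean_b) for x, y in zip(a, b)])
    return cov / (sd_a * sd_b)


def count_matching_cells(scan_a, scan_b, params):
    """Step 3: count aligned cell pairs whose correlation meets the threshold."""
    cells_a = split_into_cells(scan_a, params.cells_per_side)
    cells_b = split_into_cells(scan_b, params.cells_per_side)
    return sum(
        correlation(a, b) >= params.corr_threshold
        for a, b in zip(cells_a, cells_b)
    )


# Made-up data: scan_b equals scan_a except that its top-left 2x2 block is reversed.
scan_a = [[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6], [4, 5, 6, 7]]
scan_b = [[3, 2, 3, 4], [2, 1, 4, 5], [3, 4, 5, 6], [4, 5, 6, 7]]

for params in (Params(2, 0.5), Params(1, 0.5), Params(1, 0.95)):
    print(params, count_matching_cells(scan_a, scan_b, params))
```

On this made-up data, the same two scans give three matching cells out of four on a 2x2 grid, one out of one when the whole scan is treated as a single cell with a threshold of 0.5, and none when that threshold is raised to 0.95. None of these settings would be recoverable from the one-sentence description alone.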