INCREASING THE REPRODUCIBILITY OF SCIENTIFIC RESEARCH WORKS: A CASE STUDY USING THE ENVIRONMENT CODE-FIRST FRAMEWORK

Daniel Adorno Gomes, P. Mestre, Carlos Serôdio
International Journal of Professional Business Review. Published 2024-05-14. DOI: https://doi.org/10.26668/businessreview/2024.v9i5.4662

Abstract

Objective: This paper presents a case study of how Environment Code-First (ECF), a recently proposed reproducibility framework based on the Infrastructure-as-Code approach, can improve the implementation and reproduction of computing environments by reducing complexity and manual intervention.

Methodology: The study compares the manual implementation of a pipeline with the automated method proposed by the ECF framework, reporting measured metrics for time consumption, effort, manual intervention, and platform agnosticism. It details the steps needed to implement the computational environment of a bioinformatics pipeline named MetaWorks from the perspective of the scientist who owns the research work. It also presents the steps taken to recreate the environment from the point of view of someone who wants to reproduce the published results of a research work.

Findings and Conclusion: The results demonstrate considerable benefits in adopting the ECF framework, particularly in maintaining the same application behavior across different machines. This empirical evidence underscores the importance of reducing manual intervention, because it ensures that the environment can be recreated consistently as many times as needed, especially by researchers other than the original authors.

Originality/Value: Verifying published findings in bioinformatics through independent validation is challenging, particularly when differences in software and hardware must be accounted for to recreate computational environments. Reproducing a computational environment that closely mimics the original is intricate and demands a significant investment of time. This study helps educate and assist researchers in enhancing the reproducibility of their work by creating self-contained computational environments that are highly reproducible, isolated, portable, and platform-agnostic.