{"title":"DataDeps.jl: Repeatable Data Setup for Reproducible Data Science","authors":"Lyndon White, R. Togneri, Wei Liu, Bennamoun","doi":"10.5334/jors.244","DOIUrl":null,"url":null,"abstract":"We present DataDeps.jl: a julia package for the reproducible handling of static datasets to enhance the repeatability of scripts used in the data and computational sciences. It is used to automate the data setup part of running software which accompanies a paper to replicate a result. This step is commonly done manually, which expends time and allows for confusion. This functionality is also useful for other packages which require data to function (e.g. a trained machine learning based model). DataDeps.jl simplifies extending research software by automatically managing the dependencies and makes it easier to run another author’s code, thus enhancing the reproducibility of data science research.","PeriodicalId":37323,"journal":{"name":"Journal of Open Research Software","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2019-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Open Research Software","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5334/jors.244","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"Social Sciences","Score":null,"Total":0}
引用次数: 3
Abstract
We present DataDeps.jl: a julia package for the reproducible handling of static datasets to enhance the repeatability of scripts used in the data and computational sciences. It is used to automate the data setup part of running software which accompanies a paper to replicate a result. This step is commonly done manually, which expends time and allows for confusion. This functionality is also useful for other packages which require data to function (e.g. a trained machine learning based model). DataDeps.jl simplifies extending research software by automatically managing the dependencies and makes it easier to run another author’s code, thus enhancing the reproducibility of data science research.