Improving the Reproducibility of LaTeX Documents by Enriching Figures with Embedded Scripts and Data
The introduction of open access data policies by research councils, the enforcement of best practices, and the deployment of persistent online repositories have enabled datasets that support results in scientific papers to become more widely accessible. Unfortunately, despite this advancement in the curation/publishing workflow, the data-driven figures within a paper often remain difficult to reproduce. Plotting or analysis scripts rarely accompany the manuscript or any associated software release; and even if they do, it may be unclear exactly which version was used. Furthermore, the precise commands and parameters used to execute the scripts are often not included in a README file or in the paper itself. This paper introduces a new open source digital curation tool, Pynea, for improving the reproducibility of LaTeX documents. Each figure within a document is enriched by automatically embedding the plotting script and data files required to generate it, such that it can be regenerated by readers of the paper in the future. The command used to execute the plotting script is also added to the figure’s metadata, along with details of the specific version of the script used (if the script is tracked with the Git version control system). If the document is recompiled after a figure has changed, or after its plotting script or data files have been modified, the figure is regenerated so that the author can be confident that the latest version of the figure and its dependencies are included.
Received 06 April 2019 | Revision received 30 June 2019 | Accepted 12 August 2019
Correspondence should be addressed to Dr Christian T. Jacobs, Defence Science and Technology Laboratory (Dstl), Porton Down, Salisbury, Wiltshire, SP4 0JQ, United Kingdom, Email: cjacobs@dstl.gov.uk
The International Journal of Digital Curation is an international journal committed to scholarly excellence and dedicated to the advancement of digital curation across a wide range of sectors. The IJDC is published by the University of Edinburgh on behalf of the Digital Curation Centre. ISSN: 1746-8256. URL: http://www.ijdc.net/
Copyright rests with the authors. This work is released under a Creative Commons Attribution 4.0 International Licence. For details please see http://creativecommons.org/licenses/by/4.0/
International Journal of Digital Curation 2020, Vol. 14, Iss. 1, 292–302. DOI: 10.2218/ijdc.v14i1.656
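The regeneration behaviour described in the abstract (rebuild a figure when its plotting script or data files have changed) can be illustrated with a minimal Python sketch. This is not Pynea's actual implementation or API: the function names `file_digest`, `build_manifest` and `needs_regeneration`, the JSON manifest layout, and the use of SHA-256 content hashes are all assumptions made for illustration. The sketch covers changes to the script and data only, and it leaves out both the embedding step and the Git version lookup.

```python
import hashlib
import json
from pathlib import Path
from typing import Sequence


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_manifest(command: str, script: Path, data_files: Sequence[Path]) -> dict:
    """Collect what a reader would need to regenerate a figure.

    The layout of this dictionary is hypothetical, chosen for the example.
    """
    return {
        "command": command,
        "script": {"name": script.name, "sha256": file_digest(script)},
        "data": [{"name": p.name, "sha256": file_digest(p)} for p in data_files],
    }


def needs_regeneration(figure: Path, manifest_path: Path, current: dict) -> bool:
    """Return True if the figure is missing, has no recorded manifest,
    or was recorded with dependencies that differ from the current ones."""
    if not figure.exists() or not manifest_path.exists():
        return True
    recorded = json.loads(manifest_path.read_text())
    return recorded != current
```

In this sketch, an author's build step would call `build_manifest` before each compilation, rerun the recorded command whenever `needs_regeneration` returns `True`, and then write the new manifest next to the figure. Hashing file contents rather than comparing timestamps is a design choice of the example; it avoids spurious rebuilds when a file is touched but not changed.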