Perspectives on tracking data reuse across biodata resources

Impact factor: 2.4 (JCR Q2, Mathematical & Computational Biology)
Karen Ross, Frederic B Bastian, Matt Buys, Charles E Cook, Peter D'Eustachio, Melissa Harrison, H. Hermjakob, Donghui Li, Phillip Lord, Darren A Natale, Bjoern Peters, Paul W. Sternberg, Andrew I Su, Matthew Thakur, Paul D Thomas, Alex Bateman
Bioinformatics Advances. Journal Article. Published 2024-04-25. DOI: 10.1093/bioadv/vbae057 (https://doi.org/10.1093/bioadv/vbae057)

Abstract

Data reuse is a common and vital practice in molecular biology and enables the knowledge gathered over recent decades to drive discovery and innovation in the life sciences. Much of this knowledge has been collated into molecular biology databases, such as UniProtKB, and these resources derive enormous value from sharing data among themselves. However, quantifying and documenting this kind of data reuse remains a challenge.

The paper reports on a one-day virtual workshop hosted by the UniProt Consortium in March 2023, attended by representatives from biodata resources, experts in data management, and NIH program managers. Workshop discussions focused on strategies for tracking data reuse, best practices for reusing data, and the challenges associated with data reuse and tracking. Surveys and discussions showed that data reuse is widespread, but critical information for reproducibility is sometimes lacking. Challenges include costs of tracking data reuse, tensions between tracking data and open sharing, restrictive licenses, and difficulties in tracking commercial data use. Recommendations that emerged from the discussion include: development of standardized formats for documenting data reuse, education about the obstacles posed by restrictive licenses, and continued recognition by funding agencies that data management is a critical activity that requires dedicated resources.

Supplementary data are available at Bioinformatics Advances online.