A property graph schema for automated metadata capture, reproducibility and knowledge discovery in high-throughput bioprocess development†

Impact factor: 6.2; JCR quartile: Q1 (Chemistry, Multidisciplinary)
Federico M. Mione, Martin F. Luna, Lucas Kaspersetz, Peter Neubauer, Ernesto C. Martinez and M. Nicolas Cruz Bournazou
{"title":"A property graph schema for automated metadata capture, reproducibility and knowledge discovery in high-throughput bioprocess development†","authors":"Federico M. Mione, Martin F. Luna, Lucas Kaspersetz, Peter Neubauer, Ernesto C. Martinez and M. Nicolas Cruz Bournazou","doi":"10.1039/D5DD00070J","DOIUrl":null,"url":null,"abstract":"<p >Recent advances in autonomous experimentation and self-driving laboratories have drastically increased the complexity of orchestrating robotic experiments and of recording the different computational processes involved including all related metadata. Addressing this challenge requires a flexible and scalable information storage system that prioritizes the relationships between data and metadata, surpassing the limitations of traditional relational databases. To foster knowledge discovery in high-throughput bioprocess development, the computational control of the experimentation must be fully automated, with the capability to efficiently collect and manage experimental data and their integration into a knowledge base. This work proposes the adoption of graph databases integrated with a semantic structure to enable knowledge transfer between humans and machines. To this end, a property graph schema (PG-schema) has been specifically designed for high-throughput experiments in robotic platforms, focused mainly on the automation of the computational workflow used to ensure the reproducibility, reusability, and credibility of learned bioprocess models. 
A prototype implementation of the PG-schema and its integration with the workflow management system using simulated experiments is presented to highlight the advantages of the proposed approach in the generation of FAIR data.</p>","PeriodicalId":72816,"journal":{"name":"Digital discovery","volume":" 9","pages":" 2401-2422"},"PeriodicalIF":6.2000,"publicationDate":"2025-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://pubs.rsc.org/en/content/articlepdf/2025/dd/d5dd00070j?page=search","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Digital discovery","FirstCategoryId":"1085","ListUrlMain":"https://pubs.rsc.org/en/content/articlelanding/2025/dd/d5dd00070j","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"CHEMISTRY, MULTIDISCIPLINARY","Score":null,"Total":0}

Abstract

Recent advances in autonomous experimentation and self-driving laboratories have drastically increased the complexity of orchestrating robotic experiments and of recording the computational processes involved, including all related metadata. Addressing this challenge requires a flexible and scalable information storage system that prioritizes the relationships between data and metadata, surpassing the limitations of traditional relational databases. To foster knowledge discovery in high-throughput bioprocess development, the computational control of experimentation must be fully automated, with the capability to efficiently collect and manage experimental data and integrate them into a knowledge base. This work proposes the adoption of graph databases integrated with a semantic structure to enable knowledge transfer between humans and machines. To this end, a property graph schema (PG-schema) has been specifically designed for high-throughput experiments on robotic platforms, focused mainly on automating the computational workflow used to ensure the reproducibility, reusability, and credibility of learned bioprocess models. A prototype implementation of the PG-schema and its integration with a workflow management system are presented using simulated experiments, highlighting the advantages of the proposed approach for generating FAIR data.
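
To make the idea of a property graph concrete, the following is a minimal sketch in plain Python, not the PG-schema proposed in the paper. The node labels (`Experiment`, `WorkflowRun`, `Dataset`, `Model`), the relationship types, and all property values are hypothetical and chosen only for illustration. It shows how storing relationships as first-class records lets a provenance question ("what did this model depend on?") be answered by traversal.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A labeled node carrying key-value properties."""
    node_id: str
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A directed, typed relationship between two nodes."""
    source: str
    rel_type: str
    target: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Tiny in-memory property graph (illustrative only)."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node_id, label, **properties):
        self.nodes[node_id] = Node(node_id, label, properties)

    def add_edge(self, source, rel_type, target, **properties):
        # Refuse dangling relationships so every edge links known nodes.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before linking them")
        self.edges.append(Edge(source, rel_type, target, properties))

    def lineage(self, node_id):
        """Return ids of all nodes reachable by following outgoing edges."""
        seen, stack = [], [node_id]
        while stack:
            current = stack.pop()
            for edge in self.edges:
                if edge.source == current and edge.target not in seen:
                    seen.append(edge.target)
                    stack.append(edge.target)
        return seen


# Hypothetical labels, relationship types, and property values.
graph = PropertyGraph()
graph.add_node("exp1", "Experiment", platform="simulated", replicates=24)
graph.add_node("run1", "WorkflowRun", status="finished")
graph.add_node("data1", "Dataset", n_samples=96)
graph.add_node("model1", "Model", kind="growth model")
graph.add_edge("run1", "EXECUTED", "exp1")
graph.add_edge("data1", "GENERATED_BY", "run1")
graph.add_edge("model1", "TRAINED_ON", "data1")

# Trace the provenance of the model back to the experiment.
print(graph.lineage("model1"))  # ['data1', 'run1', 'exp1']
```

In a production setting the same structure would normally live in a graph database and be queried with a graph query language; the in-memory version here only illustrates the data model.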

