Robert W. Epps, Amanda A. Volk, Robert R. White, Robert Tirawat, Rosemary C. Bramante, Joseph J. Berry
arXiv - PHYS - Physics and Society, published 2024-09-05. DOI: arxiv-2409.05899 (https://doi.org/arxiv-2409.05899)
Universal Workflow Language and Software Enables Geometric Learning and FAIR Scientific Protocol Reporting
The modern technological landscape has trended towards increased precision
and greater digitization of information. However, the methods used to record
and communicate scientific procedures have remained largely unchanged over the
last century. Written text as the primary means for communicating scientific
protocols poses notable limitations in human and machine information transfer.
In this work, we present the Universal Workflow Language (UWL) and the
open-source Universal Workflow Language interface (UWLi). UWL is a graph-based
data architecture that can capture arbitrary scientific procedures through
workflow representation of protocol steps and embedded procedure metadata. It
is machine readable, discipline agnostic, and compatible with FAIR reporting
standards. UWLi is an accompanying software package for building UWL files and
converting them into tabular and plain-text representations in a
controlled, detailed, and multilingual format. UWL transcription of protocols
from three high-impact publications revealed substantial deficiencies in the
detail of the reported procedures, identifying seventeen procedural ambiguities
and thirty missing parameters for every one hundred words in the published
procedures. In addition to preventing and identifying procedural omissions, UWL
files were found to be compatible with geometric learning techniques for
representing scientific protocols. In a surrogate function designed to
represent an arbitrary multi-step experimental process, graph transformer
networks were able to predict outcomes in approximately 6,000 fewer experiments
than equivalent linear models. Implementing UWL and UWLi in the
scientific reporting process will result in higher reproducibility among both
experimentalists and machines, thus providing an avenue to more effective
modeling and control of complex systems.
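
To make the idea of a graph-based protocol representation concrete, the following is a minimal Python sketch. It is not the UWL schema or the UWLi API, neither of which is specified in this abstract. The `Step` and `Workflow` classes, the `REQUIRED` parameter table, and the example protocol are all hypothetical names and data chosen for illustration. The sketch stores steps as graph nodes with embedded parameters, renders the graph as table rows and as plain text, and flags parameters that a step omits.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One protocol step (a graph node) with its embedded parameters."""
    step_id: str
    action: str
    parameters: dict = field(default_factory=dict)


@dataclass
class Workflow:
    """Hypothetical protocol graph: steps are nodes, edges give the order."""
    steps: dict = field(default_factory=dict)  # step_id -> Step
    edges: list = field(default_factory=list)  # (source_id, target_id) pairs

    def add_step(self, step, after=()):
        """Add a node and one incoming edge per predecessor id in `after`."""
        self.steps[step.step_id] = step
        for source_id in after:
            self.edges.append((source_id, step.step_id))

    def ordered_steps(self):
        """Return the steps in a topological order (Kahn's algorithm)."""
        indegree = {step_id: 0 for step_id in self.steps}
        for _, target in self.edges:
            indegree[target] += 1
        ready = [step_id for step_id, degree in indegree.items() if degree == 0]
        order = []
        while ready:
            current = ready.pop(0)
            order.append(self.steps[current])
            for source, target in self.edges:
                if source == current:
                    indegree[target] -= 1
                    if indegree[target] == 0:
                        ready.append(target)
        if len(order) != len(self.steps):
            raise ValueError("workflow graph contains a cycle")
        return order

    def to_table(self):
        """Tabular view: one (step_id, action, parameter, value) row per
        parameter. Steps without parameters contribute no rows."""
        return [
            (step.step_id, step.action, name, value)
            for step in self.ordered_steps()
            for name, value in step.parameters.items()
        ]

    def to_text(self):
        """Plain-text view: one numbered line per step."""
        lines = []
        for number, step in enumerate(self.ordered_steps(), start=1):
            details = ", ".join(f"{k} = {v}" for k, v in step.parameters.items())
            suffix = f" ({details})" if details else ""
            lines.append(f"{number}. {step.action}{suffix}")
        return "\n".join(lines)

    def missing_parameters(self, required):
        """List (step_id, parameter) pairs that `required` expects for a
        step's action but that the step does not record."""
        return [
            (step.step_id, name)
            for step in self.ordered_steps()
            for name in required.get(step.action, [])
            if name not in step.parameters
        ]


# Hypothetical example protocol and required-parameter table.
REQUIRED = {"stir": ["duration_min", "speed_rpm"], "filter": ["pore_size_um"]}

protocol = Workflow()
protocol.add_step(Step("s1", "dissolve", {"solvent": "water", "volume_mL": 10}))
protocol.add_step(Step("s2", "stir", {"duration_min": 30}), after=["s1"])
protocol.add_step(Step("s3", "filter"), after=["s2"])

print(protocol.to_text())
print(protocol.missing_parameters(REQUIRED))
```

Under these assumptions, checking each step against a table of expected parameters is one simple way a structured representation could surface omissions that free text leaves implicit. The node and edge lists are also the kind of input a graph learning model would consume, although the encoding used by the authors is not described here.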