U. Haus, Timothy Dykes, Aniello Esposito, Clément Foyer, Adrian Tate
Universal Data Junction: A Transport Layer for Data Driven Workflows
Proceedings of the Platform for Advanced Scientific Computing Conference, 2023-06-26
DOI: 10.1145/3592979.3593423 (https://doi.org/10.1145/3592979.3593423)
Citations: 0
Abstract
A novel transport library for the efficient coupling of applications through their data dependencies is presented. The design is driven by three goals: to require minimal changes to existing scientific applications, to let an application declare the data objects that are meaningful to other applications for reading and writing, and to perform the transport transparently, including automatic redistribution of parallel data structures. Together these permit seamless coupling of applications in workflows. The actual transport is selected at run time and can exploit a variety of data exchange methods, including MPI, DataSpaces, Ceph RADOS, Cray DataWarp, and a POSIX file system. For the case of MPI transport, the library is used to implement the first stage of a co-working visualization pipeline for CP2K, and results show a significant advantage compared to a filesystem-based approach.
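To make "automatic redistribution of parallel data structures" concrete, the following is a minimal sketch of the bookkeeping such a redistribution implies for the simplest case: a one-dimensional array that is block-distributed over a number of writer ranks and must arrive block-distributed over a different number of reader ranks. It is not the library's API. The function names `block_ranges` and `redistribution_plan`, and the choice of an even contiguous block decomposition, are assumptions made purely for illustration; the abstract does not specify which decompositions the library supports.

```python
# Illustrative sketch only: hypothetical helpers, not part of the library
# described in the abstract. Assumes a 1-D array split into contiguous,
# near-equal blocks on both the writer side and the reader side.

def block_ranges(n, parts):
    """Split indices 0..n-1 into `parts` contiguous half-open ranges.

    The first `n % parts` ranges receive one extra element.
    """
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def redistribution_plan(n, writers, readers):
    """Return the transfers needed to move the array between decompositions.

    Each entry is (writer_rank, reader_rank, start, stop), meaning the
    writer must send global indices start..stop-1 to the reader.
    """
    reader_blocks = block_ranges(n, readers)
    plan = []
    for w, (w_start, w_stop) in enumerate(block_ranges(n, writers)):
        for r, (r_start, r_stop) in enumerate(reader_blocks):
            # The overlap of two half-open ranges is itself a half-open
            # range, and is empty when lo >= hi.
            lo, hi = max(w_start, r_start), min(w_stop, r_stop)
            if lo < hi:
                plan.append((w, r, lo, hi))
    return plan


if __name__ == "__main__":
    # Ten elements written by 2 ranks and read by 3 ranks.
    for transfer in redistribution_plan(10, writers=2, readers=3):
        print(transfer)
```

Each tuple in the plan corresponds to one message in a point-to-point transport such as MPI, or to one sub-range read in a file-based transport. That the same plan applies to either backend is one reason a transport can plausibly be chosen at run time without changing the application.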