Yan Mendes, Regina M. M. Braga, Victor Ströele, Daniela Alarcão De Oliveira
Title: Polyflow
Venue: Proceedings of the XV Brazilian Symposium on Information Systems
Published: 2019-05-20
DOI: 10.1145/3330204.3330259 (https://doi.org/10.1145/3330204.3330259)
Citation count: 2
Abstract
Over the last decade, the (big) data-driven science paradigm has become a widespread reality. However, this approach has limitations, such as performance that depends on the quality of the data and a lack of reproducibility of results. To enable reproducibility, many tools, such as Workflow Management Systems, were developed to formalize process pipelines and capture execution traces. However, interoperating the data generated by these solutions became a problem, since most systems adopted proprietary data models. To support interoperability across heterogeneous provenance data, we propose a Service-Oriented Architecture with a polystore storage design in which provenance is conceptually represented using the ProvONE model. A wrapper layer is responsible for transforming data described in heterogeneous formats into ProvONE-compliant representations. Moreover, we propose a query layer that provides location and access transparency to users. Furthermore, we conduct two feasibility studies showcasing real use-case scenarios. First, we illustrate how two research groups can compare their processes and results. Second, we show how our architecture can be used as a queryable provenance repository. We show Polyflow's viability for both scenarios using the Goal-Question-Metric methodology. Finally, we show our solution's usability and extensibility appeal by comparing it to similar approaches.
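The wrapper and query layers described in the abstract can be illustrated with a minimal Python sketch. This is not Polyflow's implementation: the class names (`Execution`, `DictStoreWrapper`, `QueryLayer`), the record fields, and the in-memory "stores" are all hypothetical, and the common record is only loosely named after ProvONE's Program and Execution concepts. The sketch assumes each store exposes its native records as dictionaries with store-specific keys.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List


@dataclass(frozen=True)
class Execution:
    """Simplified common record (hypothetical; not the full ProvONE model)."""
    program: str     # name of the step that ran
    started_by: str  # user associated with the run
    source: str      # store the record came from


class DictStoreWrapper:
    """Hypothetical wrapper: maps one store's native rows to the common shape."""

    def __init__(self, name: str, rows: List[Dict[str, str]],
                 program_key: str, user_key: str) -> None:
        self._name = name
        self._rows = rows
        self._program_key = program_key
        self._user_key = user_key

    def fetch(self) -> Iterator[Execution]:
        # Translate store-specific keys into the shared field names.
        for row in self._rows:
            yield Execution(
                program=row[self._program_key],
                started_by=row[self._user_key],
                source=self._name,
            )


class QueryLayer:
    """Hypothetical mediator: callers query one interface and never
    address an individual store (location and access transparency)."""

    def __init__(self, wrappers: Iterable[DictStoreWrapper]) -> None:
        self._wrappers = list(wrappers)

    def executions(
        self, predicate: Callable[[Execution], bool] = lambda e: True
    ) -> List[Execution]:
        # Fan out to every wrapper, then filter on the common model.
        return [e for w in self._wrappers for e in w.fetch() if predicate(e)]


# Two research groups whose stores use different native schemas.
group_a = DictStoreWrapper(
    "group_a", [{"task": "align", "user": "alice"}], "task", "user")
group_b = DictStoreWrapper(
    "group_b",
    [{"step_name": "align", "owner": "bob"},
     {"step_name": "plot", "owner": "bob"}],
    "step_name", "owner")

layer = QueryLayer([group_a, group_b])
aligned = layer.executions(lambda e: e.program == "align")
for record in aligned:
    print(record.source, record.started_by)
```

Keeping the schema mapping inside each wrapper means that, in this sketch, adding a new store only requires a new wrapper; the query layer and its callers stay unchanged.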