Dante D. Sánchez-Gallegos, D. Di Luccio, J. L. González-Compeán, R. Montella
{"title":"A Microservice-Based Building Block Approach for Scientific Workflow Engines: Processing Large Data Volumes with DagOnStar","authors":"Dante D. Sánchez-Gallegos, D. Di Luccio, J. L. González-Compeán, R. Montella","doi":"10.1109/SITIS.2019.00066","DOIUrl":null,"url":null,"abstract":"The impact of machine learning algorithms on everyday life is overwhelming until the novel concept of datacracy as a new social paradigm. In the field of computational environmental science and, in particular, of applications of large data science proof of concept on the natural resources management this kind of approaches could make the difference between species surviving to potential extinction and compromised ecological niches. In this scenario, the use of high throughput workflow engines, enabling the management of complex data flows in production is rock solid, as demonstrated by the rise of recent tools as Parsl and DagOnStar. Nevertheless, the availability of dedicated computational resources, although mitigated by the use of cloud computing technologies, could be a remarkable limitation. In this paper, we present a novel and improved version of DagOnStar, enabling the execution of lightweight but recurring computational tasks on the microservice architecture. We present our preliminary results motivating our choices supported by some evaluations and a real-world use case.","PeriodicalId":301876,"journal":{"name":"2019 15th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 15th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SITIS.2019.00066","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8
Abstract
The impact of machine learning algorithms on everyday life is overwhelming until the novel concept of datacracy as a new social paradigm. In the field of computational environmental science and, in particular, of applications of large data science proof of concept on the natural resources management this kind of approaches could make the difference between species surviving to potential extinction and compromised ecological niches. In this scenario, the use of high throughput workflow engines, enabling the management of complex data flows in production is rock solid, as demonstrated by the rise of recent tools as Parsl and DagOnStar. Nevertheless, the availability of dedicated computational resources, although mitigated by the use of cloud computing technologies, could be a remarkable limitation. In this paper, we present a novel and improved version of DagOnStar, enabling the execution of lightweight but recurring computational tasks on the microservice architecture. We present our preliminary results motivating our choices supported by some evaluations and a real-world use case.