CNDAS-WF: Cloud Native Data Analysis System Based On Workflow Engine
Authors: Xinshi Zhou, Yuxuan Wu
Published in: Proceedings of the 2023 6th International Conference on Software Engineering and Information Management
Publication date: 2023-01-31
DOI: 10.1145/3584871.3584891 (https://doi.org/10.1145/3584871.3584891)
Citations: 0
Abstract
With the development of modern big data technology, the volume of data in daily life is expanding rapidly and the relationships within that data are becoming more complex. At the same time, the demands that data analysis places on different resources continue to surge. Efficiently handling a large number of data analysis tasks with complex dependencies has therefore become a challenge. In this paper, we design and implement a cloud native data analysis system based on a workflow engine. The system uses the workflow engine, built on cloud native technology, to orchestrate interdependent data analysis tasks that are deployed in containers. The flexibility of the container cloud makes the data analysis procedure effective and efficient. In addition, we design a workflow engine and an operation and maintenance subsystem for anomaly detection across the overall system platform. Finally, we verify the effectiveness and efficiency of the system using scientific workflow data. The cloud native data analysis system based on a workflow engine has passed all tests and has been applied in small and medium-sized enterprises.
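The abstract does not say how the workflow engine orders dependent tasks. As an illustration only, the following minimal sketch shows one common way to arrange tasks with dependencies: a topological sort that groups tasks into batches whose prerequisites are already complete. The function name `plan_batches`, the task names, and the use of Python's standard `graphlib` module are assumptions made for this example, not details taken from the paper.

```python
from graphlib import TopologicalSorter


def plan_batches(dependencies):
    """Group tasks into ordered batches.

    `dependencies` maps each task name to the set of tasks it depends on.
    Every task in a batch has all of its prerequisites in earlier batches,
    so the tasks within one batch could be launched in parallel
    (for example, as separate containers).
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished to unlock successors
    return batches


# Hypothetical workflow: task name -> tasks it depends on.
workflow = {
    "clean": {"ingest"},
    "features": {"clean"},
    "stats": {"clean"},
    "report": {"features", "stats"},
}

batches = plan_batches(workflow)
for step, batch in enumerate(batches, start=1):
    print(f"step {step}: {', '.join(batch)}")
```

In this sketch, "features" and "stats" land in the same batch because both depend only on "clean". A real engine would also have to handle task failures, retries, and resource limits, which are omitted here.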