Zhao Ru-tao, Wang Jing, Chen Gao-jian, Li Qian-wen, Yuan Yun-jing
A Machine Learning Pipeline Generation Approach for Data Analysis
2020 IEEE 6th International Conference on Computer and Communications (ICCC), published 2020-12-11
DOI: 10.1109/ICCC51575.2020.9345123
Citations: 0
Abstract
Data analysis demands a high level of expertise from domain workers, and AutoML aims to automate the decisions involved. However, automatically generating high-performance machine learning pipelines within an acceptable time remains a difficult problem. This paper presents DFSR (Data Feature and Service Association), an approach that automatically generates machine learning pipelines by exploiting data features and service associations. Experimental results show that the generated pipelines reach a performance level comparable to that of current AutoML tools, while generation time is reduced to the order of minutes.