A Machine Learning Pipeline Generation Approach for Data Analysis

2020 IEEE 6th International Conference on Computer and Communications (ICCC), Pub Date: 2020-12-11, DOI: 10.1109/ICCC51575.2020.9345123

Zhao Ru-tao, Wang Jing, Chen Gao-jian, Li Qian-wen, Yuan Yun-jing


Citations: 0

Abstract



Data analysis demands a high level of expertise from domain workers, and AutoML aims to make the decisions involved automatically. Generating high-performance machine learning pipelines automatically within an acceptable time, however, remains a difficult problem. This paper presents DFSR (Data Feature and Service Association), an approach that generates machine learning pipelines automatically by using data features and service associations. The experimental results show that the performance of the generated pipelines reaches the satisfactory level of current AutoML tools, and that time consumption is reduced to the order of minutes.
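The abstract gives no implementation details, so the following is only a toy sketch of the general idea of choosing pipeline steps from data features and then choosing the next step from association scores between services. It is not the authors' DFSR algorithm. Every name here (`extract_features`, `FEATURE_TO_SERVICE`, `SERVICE_ASSOCIATIONS`, `generate_pipeline`), every service label, and every score is a hypothetical chosen for illustration.

```python
# Toy illustration only: NOT the DFSR method from the paper.
# All service names, rules, and scores below are assumptions.

def extract_features(rows):
    """Compute two simple data features from a list of dict records."""
    columns = list(rows[0].keys())
    return {
        # True if any cell in any row is missing (None)
        "has_missing": any(row.get(c) is None for row in rows for c in columns),
        # True if any cell holds a string, treated here as categorical
        "has_categorical": any(isinstance(row.get(c), str) for row in rows for c in columns),
    }

# Hypothetical rules: a data feature calls for a preprocessing service.
FEATURE_TO_SERVICE = [
    ("has_missing", "impute_missing"),
    ("has_categorical", "encode_categorical"),
]

# Hypothetical association scores between a preprocessing service
# and the model service that follows it.
SERVICE_ASSOCIATIONS = {
    "impute_missing": {"decision_tree": 0.6, "logistic_regression": 0.4},
    "encode_categorical": {"decision_tree": 0.7, "logistic_regression": 0.5},
    "scale_numeric": {"decision_tree": 0.3, "logistic_regression": 0.8},
}

def generate_pipeline(rows):
    """Return an ordered list of service names for the given data."""
    features = extract_features(rows)
    # Add one preprocessing step per data feature that is present.
    steps = [service for feature, service in FEATURE_TO_SERVICE if features[feature]]
    if not steps:
        # Fallback for clean numeric data (an arbitrary choice for this sketch).
        steps = ["scale_numeric"]
    # Pick the model most strongly associated with the last preprocessing step.
    candidates = SERVICE_ASSOCIATIONS[steps[-1]]
    model = max(candidates, key=candidates.get)
    return steps + [model]

if __name__ == "__main__":
    data = [{"age": 30, "city": "Beijing"}, {"age": None, "city": "Xi'an"}]
    print(generate_pipeline(data))
```

A lookup of this kind avoids searching the full space of pipelines, which is one plausible reason a feature-driven approach could run in minutes; the paper's actual mechanism may differ.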
