{"title":"机器学习管道可视化的自动生成","authors":"Lei Liu, Wei-Peng Chen, M. Bahrami, M. Prasad","doi":"10.1145/3551349.3559504","DOIUrl":null,"url":null,"abstract":"Visualization is very important for machine learning (ML) pipelines because it can show explorations of the data to inspire data scientists and show explanations of the pipeline to improve understandability. In this paper, we present a novel approach that automatically generates visualizations for ML pipelines by learning visualizations from highly-upvoted Kaggle pipelines. The solution extracts both code and dataset features from these high-quality human-written pipelines and corresponding training datasets, learns the mapping rules from code and dataset features to visualizations using association rule mining (ARM), and finally uses the learned rules to predict visualizations for unseen ML pipelines. The evaluation results show that the proposed solution is feasible and effective to generate visualizations for ML pipelines.","PeriodicalId":197939,"journal":{"name":"Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering","volume":"161 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Automatic Generation of Visualizations for Machine Learning Pipelines\",\"authors\":\"Lei Liu, Wei-Peng Chen, M. Bahrami, M. Prasad\",\"doi\":\"10.1145/3551349.3559504\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Visualization is very important for machine learning (ML) pipelines because it can show explorations of the data to inspire data scientists and show explanations of the pipeline to improve understandability. In this paper, we present a novel approach that automatically generates visualizations for ML pipelines by learning visualizations from highly-upvoted Kaggle pipelines. The solution extracts both code and dataset features from these high-quality human-written pipelines and corresponding training datasets, learns the mapping rules from code and dataset features to visualizations using association rule mining (ARM), and finally uses the learned rules to predict visualizations for unseen ML pipelines. The evaluation results show that the proposed solution is feasible and effective to generate visualizations for ML pipelines.\",\"PeriodicalId\":197939,\"journal\":{\"name\":\"Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering\",\"volume\":\"161 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-10-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3551349.3559504\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3551349.3559504","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Automatic Generation of Visualizations for Machine Learning Pipelines
Visualization is very important for machine learning (ML) pipelines because it can show explorations of the data to inspire data scientists and show explanations of the pipeline to improve understandability. In this paper, we present a novel approach that automatically generates visualizations for ML pipelines by learning visualizations from highly-upvoted Kaggle pipelines. The solution extracts both code and dataset features from these high-quality human-written pipelines and corresponding training datasets, learns the mapping rules from code and dataset features to visualizations using association rule mining (ARM), and finally uses the learned rules to predict visualizations for unseen ML pipelines. The evaluation results show that the proposed solution is feasible and effective to generate visualizations for ML pipelines.