Service Oriented KDD: A Framework for Grid Data Mining Workflows
Authors: M. Lackovic, D. Talia, Paolo Trunfio
Venue: 2008 IEEE International Conference on Data Mining Workshops
Published: 2008-12-15
DOI: 10.1109/ICDMW.2008.28 (https://doi.org/10.1109/ICDMW.2008.28)
Weka4WS is an extension of the Weka toolkit that supports remote execution of data mining tasks as grid services. A first version of Weka4WS, which supported concurrent execution of multiple data mining tasks on remote grid nodes, was presented in previous work. In this paper we present a new version that also supports the composition and execution of data mining workflows on a grid. This new version extends the KnowledgeFlow component of Weka by allowing the data mining tasks of a workflow to run in parallel on different machines, thereby reducing the execution time. Besides the performance improvement, designing data mining applications as workflows makes it possible to define typical patterns and reuse them in different contexts. We describe the architecture of the system, the functionality of the Weka4WS KnowledgeFlow, and some usage examples together with their performance.
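
The idea of running independent workflow tasks in parallel can be illustrated with a small, generic sketch. It is not the Weka4WS or KnowledgeFlow API: local threads stand in for remote grid nodes, and `run_workflow`, the task names, and the toy "mining" functions are all hypothetical, chosen only to show how tasks whose inputs are ready can be dispatched concurrently.

```python
from concurrent.futures import ThreadPoolExecutor


def run_workflow(tasks, deps, max_workers=4):
    """Run a workflow given as a dependency graph.

    tasks: dict mapping task name -> callable taking a dict of upstream results
    deps:  dict mapping task name -> list of task names it depends on
    Tasks whose dependencies are all satisfied are submitted together,
    so independent branches execute concurrently.
    """
    results = {}
    remaining = dict(deps)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while remaining:
            # Every task whose inputs are already available can start now.
            ready = [name for name, needed in remaining.items()
                     if all(d in results for d in needed)]
            if not ready:
                raise ValueError("cycle or missing dependency in workflow")
            futures = {
                name: pool.submit(tasks[name],
                                  {d: results[d] for d in remaining[name]})
                for name in ready
            }
            # Wait for this batch before looking for newly unblocked tasks.
            for name, future in futures.items():
                results[name] = future.result()
                del remaining[name]
    return results


# Hypothetical workflow: one loader feeds three independent "mining" tasks,
# whose outputs are merged by a final report step.
def load_dataset(_inputs):
    return [3, 1, 4, 1, 5, 9, 2, 6]


def make_miner(label, fn):
    def miner(inputs):
        return (label, fn(inputs["load"]))
    return miner


tasks = {
    "load": load_dataset,
    "min": make_miner("min", min),
    "max": make_miner("max", max),
    "sum": make_miner("sum", sum),
    "report": lambda inputs: dict(inputs[n] for n in ("min", "max", "sum")),
}
deps = {
    "load": [],
    "min": ["load"],
    "max": ["load"],
    "sum": ["load"],
    "report": ["min", "max", "sum"],
}

results = run_workflow(tasks, deps)
print(results["report"])
```

In this sketch the three middle tasks are submitted in the same batch because they depend only on `load`. In a grid setting each submission would correspond to invoking a remote service rather than starting a local thread.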