Zhao Er-dun, Qi Yong-qiang, Xiang Xing-Xing, Chen Yi
2012 Eighth International Conference on Computational Intelligence and Security. Published 2012-11-17. DOI: 10.1109/CIS.2012.40 (https://doi.org/10.1109/CIS.2012.40)
A Data Placement Strategy Based on Genetic Algorithm for Scientific Workflows
Data placement is an important problem for scientific workflows: datasets must be assigned to a small number of data centers, subject to each center's storage capacity and to the dependencies among datasets, so that data movement is kept low. Data placement has been proved NP-hard, and several methods have been proposed in the literature, including K-means clustering. K-means clustering reduces the number of data movements well, but it can concentrate the datasets in a few data centers, so that the loads of the data centers deviate greatly from one another. This paper proposes a data placement strategy based on a heuristic genetic algorithm that reduces data movements among the data centers while balancing their loads. Simulation results show that the proposed algorithm effectively reduces data movements and balances the load across data centers.
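To make the trade-off in the abstract concrete, the following is a minimal sketch of a plain genetic algorithm for data placement. It is not the paper's algorithm, whose encoding, heuristic operators, and fitness function are not given in this record. Everything here is an assumption chosen for illustration: the chromosome (one data-center index per dataset), the movement count (each task runs at the center that already holds most of its datasets and fetches the rest), the load-balance term (standard deviation of storage utilization), and the names `movements`, `loads`, `cost`, `evolve`, `balance_weight`, and `overflow_penalty`.

```python
import random
from statistics import pstdev


def movements(placement, tasks):
    """Assumed movement model: each task runs at the center holding most of
    its datasets and must fetch every other dataset it uses."""
    total = 0
    for used in tasks:  # each task is a non-empty set of dataset indices
        counts = {}
        for d in used:
            counts[placement[d]] = counts.get(placement[d], 0) + 1
        total += len(used) - max(counts.values())
    return total


def loads(placement, sizes, capacities):
    """Storage utilization (used / capacity) of every data center."""
    used = [0.0] * len(capacities)
    for d, center in enumerate(placement):
        used[center] += sizes[d]
    return [u / cap for u, cap in zip(used, capacities)]


def cost(placement, tasks, sizes, capacities,
         balance_weight=5.0, overflow_penalty=100.0):
    """Hypothetical fitness (lower is better): movements, plus a weighted
    load-imbalance term, plus a penalty for exceeding storage capacity."""
    util = loads(placement, sizes, capacities)
    overflow = sum(max(0.0, u - 1.0) for u in util)
    return (movements(placement, tasks)
            + balance_weight * pstdev(util)
            + overflow_penalty * overflow)


def evolve(tasks, sizes, capacities, pop_size=40, generations=100,
           mutation_rate=0.05, seed=0):
    """Plain GA: elitism, tournament selection, one-point crossover,
    per-gene mutation. Assumes at least two datasets."""
    rng = random.Random(seed)
    n, k = len(sizes), len(capacities)

    def score(p):
        return cost(p, tasks, sizes, capacities)

    pop = [[rng.randrange(k) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score)
        nxt = [pop[0][:], pop[1][:]]  # keep the two best placements unchanged
        while len(nxt) < pop_size:
            a = min(rng.sample(pop, 3), key=score)  # tournament of size 3
            b = min(rng.sample(pop, 3), key=score)
            cut = rng.randrange(1, n)               # one-point crossover
            child = a[:cut] + b[cut:]
            child = [rng.randrange(k) if rng.random() < mutation_rate else g
                     for g in child]
            nxt.append(child)
        pop = nxt
    best = min(pop, key=score)
    return best, score(best)


# Toy instance (made-up numbers): 6 datasets, 3 data centers, 5 tasks.
sizes = [4, 2, 3, 1, 5, 2]
capacities = [10, 10, 10]
tasks = [{0, 1}, {1, 2}, {3, 4}, {4, 5}, {0, 5}]
best, best_cost = evolve(tasks, sizes, capacities)
print(best, round(best_cost, 3))
```

Setting `balance_weight` to 0 leaves only the movement and capacity terms, which tends to reward packing related datasets into as few centers as capacity allows; raising it trades some extra movements for more even utilization. The weights above are arbitrary and would need tuning for any real workload.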