DIAO: A Scheme of Cooperative Data Distribution for Grid-Enabled Data Analysis Applications
Tong Guo, S. Jarvis
2008 The Third International Multi-Conference on Computing in the Global Information Technology (iccgi 2008), 2008-07-27
DOI: 10.1109/ICCGI.2008.26 (https://doi.org/10.1109/ICCGI.2008.26)
Abstract
A three-tier data distribution framework is proposed for grid-enabled data analysis applications. This framework is based on existing resource reservation services: the analytical tasks which this framework serves, and their input data, are assigned by an existing performance-aware scheduling system to computational hosts termed 'calculators'. A so-called 'guider' organizes the data delivery from the source server to the 'coordinators', and every coordinator schedules the data propagation from itself to all its calculators. This scheme, which we call DIAO, is peer-to-peer (P2P), in that after downloading a data item, a host may behave as its server. Theoretical modeling reveals that the duration of data distribution depends on the effective utilization of the bottleneck resource in the overlay, and we develop three heuristics to minimize this duration, subject to the available resources. The ability of DIAO to exploit limited resources is demonstrated through simulation.
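The peer-to-peer property described in the abstract (a host that has downloaded a data item may then serve it) can be illustrated with a toy round-based model. The sketch below is not the DIAO scheme or any of its three heuristics; it assumes, purely for illustration, that every host holding the item can upload it to exactly one other host per round, and the function name `rounds_to_distribute` is hypothetical.

```python
def rounds_to_distribute(num_receivers: int, p2p: bool) -> int:
    """Count rounds until every receiver holds the data item.

    Toy assumption (not from the paper): each host that holds the item
    can upload it to exactly one other host per round.
    """
    holders = 1              # only the source server holds the item at first
    remaining = num_receivers
    rounds = 0
    while remaining > 0:
        # With P2P, every current holder acts as a server; without it,
        # only the source uploads.
        servers = holders if p2p else 1
        served = min(servers, remaining)
        holders += served
        remaining -= served
        rounds += 1
    return rounds


if __name__ == "__main__":
    for n in (7, 100):
        print(n, rounds_to_distribute(n, p2p=True), rounds_to_distribute(n, p2p=False))
```

Under this assumption the number of holders can double each round, so the P2P case needs a number of rounds that grows logarithmically with the number of receivers, while the source-only case grows linearly. Real overlays are limited by heterogeneous bandwidth, which is where the bottleneck-utilization argument in the abstract applies.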