2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing: Latest Papers

Non-cooperative Scheduling Considered Harmful in Collaborative Volunteer Computing Environments
Bruno Donassolo, Arnaud Legrand, Cláudio Geyer
DOI: https://doi.org/10.1109/CCGrid.2011.34
Published: 2011-05-23
Abstract: Advances in inter-networking technology and computing components have enabled Volunteer Computing (VC) systems that allow volunteers to donate their computers' idle CPU cycles to a given project. BOINC is the most popular VC infrastructure today, with over 580,000 hosts that deliver over 2,300 TeraFLOP per day. BOINC projects usually have hundreds of thousands of independent tasks and are interested in overall throughput. Each project has its own server, which is responsible for distributing work units to clients, recovering results, and validating them. The BOINC scheduling algorithms are complex and have been used for many years now. Their efficiency and fairness have been assessed in the context of throughput-oriented projects. Yet, recently, burst projects, with fewer tasks and interested in response time, have emerged. Many works have proposed new scheduling algorithms to optimize individual response time, but their use may be problematic in the presence of other projects. In this article we show that the commonly used BOINC scheduling algorithms are unable to enforce fairness and project isolation. Burst projects may dramatically impact the performance of all other projects (burst or non-burst). To study such interactions, we perform a detailed, multi-player and multi-objective game theoretic study. Our analysis and experiments provide a good understanding of the impact of the different scheduling parameters and show that non-cooperative optimization may result in inefficient and unfair sharing of the resources.
Citations: 16
Resource and Revenue Sharing with Coalition Formation of Cloud Providers: Game Theoretic Approach
D. Niyato, A. Vasilakos, K. Zhu
DOI: https://doi.org/10.1109/CCGrid.2011.30
Published: 2011-05-23
Abstract: In cloud computing, multiple cloud providers can cooperate to establish a resource pool to support internal users and to offer services to public cloud users. In this paper, we study the cooperative behavior of multiple cloud providers. A hierarchical cooperative game model is presented. First, given a group (i.e., coalition) of cloud providers, the resource and revenue sharing of a resource pool is presented. To obtain the solution, we develop a stochastic linear programming game model which takes the uncertainty of internal users from each provider into account. We show that the solution of the stochastic linear programming game is the core of cooperation. Second, we analyze the stability of the coalition formation among cloud providers based on a coalitional game. The dynamic model of coalition formation is used to obtain stable coalitional structures. The resource and revenue sharing and the coalition formation of cloud providers are intertwined, and the proposed hierarchical cooperative game model can be used to obtain the solution. An extensive performance evaluation is performed to investigate the decision making of cloud providers when cooperation can lead to a higher profit.
Citations: 157
A Robust Communication Framework for Parallel Execution on Volunteer PC Grids
Eshwar Rohit, Hien Nguyen, N. Kanna, J. Subhlok, E. Gabriel, Qian Wang, M. Cheung, David P. Anderson
DOI: https://doi.org/10.1109/CCGrid.2011.72
Published: 2011-05-23
Abstract: Volunteer PC grids represent massive computation capacity at a low cost, but are challenging to employ for parallel computing because of variable and unpredictable performance and availability. A communicating parallel program must employ explicit redundancy, or implicit redundancy with uncoordinated checkpoint-restart, to make continuous forward progress in such an unreliable environment. A communication model based on one-sided Put/Get calls to an abstract global shared space is a good match, as processes can execute their communication operations independently and asynchronously. However, no existing system is designed for redundant communicating processes. The key problem is that a single logical operation that impacts the global program state may be executed by different instances of the same process at different times, leading to semantic inconsistency. This paper presents the design, execution model, implementation, and usage of Volpex, a communication layer for robust execution on volunteer PC grids. The research leads to a practical way to employ idle PCs for latency-tolerant parallel computing applications.
Citations: 8
Inferring Network Topologies in Infrastructure as a Service Cloud
Dominic Battré, Natalia Frejnik, Siddhant Goel, O. Kao, Daniel Warneke
DOI: https://doi.org/10.1109/CCGrid.2011.79
Published: 2011-05-23
Abstract: Infrastructure as a Service (IaaS) clouds are gaining increasing popularity as a platform for distributed computations. The virtualization layers of those clouds offer new possibilities for rapid resource provisioning, but also hide aspects of the underlying IT infrastructure which have often been exploited in classic cluster environments. One of those hidden aspects is the network topology, i.e., the way the rented virtual machines are physically interconnected inside the cloud. We propose an approach to infer the network topology connecting a set of virtual machines in IaaS clouds and exploit it for data-intensive distributed applications. Our inference approach relies on delay-based end-to-end measurements and can be combined with traditional IP-level topology information, if available. We evaluate the inference accuracy using the popular hypervisors KVM and Xen and highlight possible performance gains for distributed applications.
Citations: 3
Utilizing "Opaque" Resources for Revenue Enhancement on Clouds and Grids
Jose Orlando Melendez, S. Majumdar
DOI: https://doi.org/10.1109/CCGrid.2011.58
Published: 2011-05-23
Abstract: Clouds and grids are distributed resource infrastructures that are subjected to both On Demand (OD) and Advance Reservation (AR) requests. This paper focuses on the revenue that can be earned on cloud and grid systems using different matchmaking strategies that map requests to resources. Existing research on matchmaking uses a priori knowledge of the local scheduling policy used at the resources. However, in these dynamic and heterogeneous systems, having detailed a priori knowledge about all the resources is not always feasible. Thus, traditional matchmaking strategies cannot use the resources for which the local scheduling policies are unknown. This paper discusses how a novel matchmaking strategy that does not use knowledge of the local scheduling policies used at the resources may be deployed in order to utilize these unused resources and improve the revenue earned by the service-providing enterprise. Based on analytic bounds and simulation, insights gained into system behaviour and revenue earned are presented.
Citations: 1
A Scalable Method for Signalling Dynamic Reconfiguration Events with OpenSM
Wei Lin Guay, Sven-Arne Reinemo
DOI: https://doi.org/10.1109/CCGrid.2011.48
Published: 2011-05-23
Abstract: Rerouting around faulty components, on-the-fly policy changes, and migration of jobs all require reconfiguration of data structures in the Queue Pairs residing in the hosts on an InfiniBand cluster. In addition to a proper implementation at the host, the subnet manager needs to implement a scalable method for signalling reconfiguration events to the hosts. In this paper we propose and evaluate three different implementations for signalling dynamic reconfiguration events with OpenSM. Through our evaluation we demonstrate a scalable solution for signalling host-side reconfiguration events in an InfiniBand network, based on an example where dynamic network reconfiguration combined with a topology-agnostic routing function is used to avoid malfunctioning components. Through measurements on our test cluster and an analytical study, we show that our best proposal reduces reconfiguration latency by more than 90% and in certain situations eliminates it completely. Furthermore, the processing overhead in the subnet manager is shown to be minimal.
Citations: 3
Autonomic SLA-Driven Provisioning for Cloud Applications
N. Bonvin, Thanasis G. Papaioannou, K. Aberer
DOI: https://doi.org/10.1109/CCGrid.2011.24
Published: 2011-05-23
Abstract: Significant achievements have been made for automated allocation of cloud resources. However, the performance of applications may be poor in peak load periods, unless their cloud resources are dynamically adjusted. Moreover, although cloud resources dedicated to different applications are virtually isolated, performance fluctuations do occur because of resource sharing and software or hardware failures (e.g., unstable virtual machines or power outages). In this paper, we propose a decentralized economic approach for dynamically adapting the cloud resources of various applications, so as to statistically meet their SLA performance and availability goals in the presence of varying loads or failures. According to our approach, the dynamic economic fitness of a Web service determines whether it is replicated, migrated to another server, or deleted. The economic fitness of a Web service depends on its individual performance constraints, its load, and the utilization of the resources where it resides. Cascading performance objectives are dynamically calculated for individual tasks in the application workflow according to the user requirements. By fully implementing our framework, we experimentally proved that our adaptive approach statistically meets the performance objectives under peak load periods or failures, as opposed to static resource settings.
Citations: 98
On the Scheduling of Checkpoints in Desktop Grids
M. Bouguerra, Derrick Kondo, D. Trystram
DOI: https://doi.org/10.1109/CCGrid.2011.63
Published: 2011-05-23
Abstract: Frequent resource failures are a major challenge for the rapid completion of batch jobs. Checkpointing and migration is one approach to accelerate job completion while avoiding deadlock. We study the problem of scheduling checkpoints of sequential jobs in the context of Desktop Grids, consisting of volunteered distributed resources. We craft a checkpoint scheduling algorithm that is provably optimal for discrete time when failures obey any general probability distribution. We show, using simulations with parameters based on real-world systems, that this optimal strategy scales and outperforms other strategies significantly in terms of checkpointing costs and batch completion times.
Citations: 35
Self-Healing Distributed Scheduling Platform
M. Frîncu, Norha M. Villegas, D. Petcu, H. Müller, Romain Rouvoy
DOI: https://doi.org/10.1109/CCGrid.2011.23
Published: 2011-05-23
Abstract: Distributed systems require effective mechanisms to manage the reliable provisioning of computational resources from different and distributed providers. Moreover, the dynamic environment that affects the behaviour of such systems and the complexity of these dynamics demand autonomous capabilities to ensure the behaviour of distributed scheduling platforms and to achieve business and user objectives. In this paper we propose a self-adaptive distributed scheduling platform composed of multiple agents implemented as intelligent feedback control loops to support policy-based scheduling and expose self-healing capabilities. Our platform leverages distributed scheduling processes by (i) allowing each provider to maintain its own internal scheduling process, and (ii) implementing self-healing capabilities based on agent module recovery. Simulated tests are performed to determine the optimal number of agents to be used in the negotiation phase without affecting the scheduling cost function. Test results on a real-life platform are presented to evaluate recovery times and optimize platform parameters.
Citations: 18
Performance under Failures of MapReduce Applications
Hui Jin, Kan Qiao, Xian-He Sun, Ying Li
DOI: https://doi.org/10.1109/CCGrid.2011.84
Published: 2011-05-23
Abstract: The MapReduce programming paradigm has gained popularity in recent years due to its support for easy programming, data distribution, and fault tolerance. Failure is an unwanted but inevitable fact that all large-scale parallel computing systems have to face. MapReduce introduces a novel data replication and task re-execution strategy for fault tolerance. This study aims to provide a better understanding of such fault tolerance mechanisms. In particular, we build a stochastic performance model to quantify the impact of failures on MapReduce applications and to investigate its effectiveness under different computing environments. Simulations have also been carried out to verify the accuracy of the proposed model. Our results show that data replication is an effective approach even when the failure rate is high, and that the task migration mechanism of MapReduce works well in balancing the reliability difference among individual nodes. This work provides a theoretical foundation for optimizing large-scale MapReduce applications, especially when fault tolerance is the concern.
Citations: 32