Proceedings of the Sixth ACM Symposium on Cloud Computing: latest articles (page 3)

Microsoft Azure SQL Database telemetry

Proceedings of the Sixth ACM Symposium on Cloud Computing Pub Date: 2015-08-27 DOI: 10.1145/2806777.2806845

Willis Lang, Frank Bertsch, D. DeWitt, Nigel Ellis

Citations: 19

Domino: understanding wide-area, asynchronous event causality in web applications

Proceedings of the Sixth ACM Symposium on Cloud Computing Pub Date: 2015-08-27 DOI: 10.1145/2806777.2806940

Ding Li, James W. Mickens, Suman Nath, Lenin Ravindranath

Citations: 9

dJay: enabling high-density multi-tenancy for cloud gaming servers with dynamic cost-benefit GPU load balancing

Proceedings of the Sixth ACM Symposium on Cloud Computing Pub Date: 2015-08-27 DOI: 10.1145/2806777.2806942

Sergey Grizan, David Chu, A. Wolman, Roger Wattenhofer

Abstract: In cloud gaming, servers perform remote rendering on behalf of thin clients. Such a server must deliver sufficient frame rate (at least 30fps) to each of its clients. At the same time, each client desires an immersive experience, and therefore the server should also provide the best graphics quality possible to each client. Statically provisioning time slices of the server GPU for each client suffers from severe underutilization because clients can come and go, and scenes that the clients need rendered can vary greatly in terms of GPU resource usage over time. In this work, we present dJay, a utility-maximizing cloud gaming server that dynamically tunes client GPU rendering workloads in order to 1) ensure all clients get satisfactory frame rate, and 2) provide the best possible graphics quality across clients. To accomplish this, we develop three main components. First, we build an online profiler that collects key cost and benefit data, and distills the data into a reusable regression model. Second, we build an online utility optimizer that uses the regression model to tune GPU workloads for better graphics quality. The optimizer solves the Multiple Choice Knapsack problem. We demonstrate dJay on two high quality commercial games, Doom 3 and Fable 3. Our results show that when compared to a static configuration, we can respond much better to peaks and troughs, achieving up to four times the multi-tenant density on a single server while offering clients the best possible graphics quality.
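The dJay abstract states that its optimizer solves the Multiple Choice Knapsack problem. As a rough illustration only, the sketch below shows the textbook dynamic program for that problem: each client offers several rendering settings, exactly one setting is chosen per client, and the total GPU cost must fit a shared budget. The function name `solve_mckp`, the integer cost units, and the benefit scores are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (GPU cost in integer time units, benefit score).
Option = Tuple[int, float]


def solve_mckp(clients: List[List[Option]],
               budget: int) -> Optional[Tuple[float, List[int]]]:
    """Choose exactly one option per client, maximizing total benefit.

    Returns (best_benefit, chosen_option_index_per_client), or None when
    no combination of options fits within the budget.
    """
    neg_inf = float("-inf")
    # best[b] = best total benefit for the clients processed so far
    # with total cost at most b. With zero clients the benefit is 0.
    best = [0.0] * (budget + 1)
    choices: List[List[int]] = []
    for options in clients:
        nxt = [neg_inf] * (budget + 1)
        pick = [-1] * (budget + 1)
        for b in range(budget + 1):
            for k, (cost, benefit) in enumerate(options):
                if cost <= b and best[b - cost] != neg_inf:
                    value = best[b - cost] + benefit
                    if value > nxt[b]:
                        nxt[b] = value
                        pick[b] = k
        best = nxt
        choices.append(pick)
    if best[budget] == neg_inf:
        return None
    # Walk backwards through the recorded picks to recover the assignment.
    chosen: List[int] = []
    b = budget
    for options, pick in zip(reversed(clients), reversed(choices)):
        k = pick[b]
        chosen.append(k)
        b -= options[k][0]
    chosen.reverse()
    return best[budget], chosen


# Two clients, each with a low-quality and a high-quality setting.
example_clients = [[(10, 1.0), (20, 3.0)], [(10, 2.0), (25, 5.0)]]
example_result = solve_mckp(example_clients, budget=33)
```

With a budget of 33 units, only the first client can be given its high-quality setting, so the sketch selects option 1 for client 0 and option 0 for client 1. The running time is proportional to the budget times the total number of options, which is why costs are assumed to be small integers here.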

Citations: 13

Zorro: zero-cost reactive failure recovery in distributed graph processing

Proceedings of the Sixth ACM Symposium on Cloud Computing Pub Date: 2015-08-27 DOI: 10.1145/2806777.2806934

Mayank Pundir, Luke M. Leslie, Indranil Gupta, R. Campbell

Abstract: Distributed graph processing systems largely rely on proactive techniques for failure recovery. Unfortunately, these approaches (such as checkpointing) entail a significant overhead. In this paper, we argue that distributed graph processing systems should instead use a reactive approach to failure recovery. The reactive approach trades off completeness of the result (generating a slightly inaccurate result) while reducing the overhead during failure-free execution to zero. We build a system called Zorro that embodies this reactive approach, and integrate Zorro into two graph processing systems: PowerGraph and LFGraph. When a failure occurs, Zorro opportunistically exploits vertex replication inherent in today's graph processing systems to quickly rebuild the state of failed servers. Experiments using real-world graphs demonstrate that Zorro is able to recover over 99% of the graph state when 6-12% of the servers fail, and between 87-95% when half the cluster fails. Furthermore, using various graph processing algorithms, Zorro incurs little to no accuracy loss in all experimental failure scenarios, and achieves a worst-case accuracy of 97%.
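The idea of rebuilding failed servers from vertex replicas held elsewhere can be illustrated with a toy model. The sketch below is not Zorro's protocol: the data layout (a per-server map of master vertices and a per-server map of replica copies) and the function name `recover_state` are assumptions made purely for illustration.

```python
from typing import Dict, Set, Tuple

# Hypothetical layout: server name -> {vertex id -> vertex value}.
VertexStore = Dict[str, Dict[int, float]]


def recover_state(masters: VertexStore,
                  replicas: VertexStore,
                  failed: Set[str]) -> Tuple[Dict[int, float], Set[int]]:
    """Rebuild vertices mastered on failed servers from surviving replicas.

    Returns (recovered, lost): values found on a surviving server, and
    the ids of vertices for which no surviving replica exists.
    """
    recovered: Dict[int, float] = {}
    lost: Set[int] = set()
    for server in failed:
        for vertex in masters[server]:
            for other, copies in replicas.items():
                if other not in failed and vertex in copies:
                    recovered[vertex] = copies[vertex]
                    break
            else:
                # No surviving server holds a copy of this vertex.
                lost.add(vertex)
    return recovered, lost


example_masters = {"s1": {1: 0.5, 2: 0.25}, "s2": {3: 0.75}}
example_replicas = {"s1": {3: 0.75}, "s2": {1: 0.5}}
example_recovered, example_lost = recover_state(
    example_masters, example_replicas, failed={"s1"})
```

In this toy cluster, server `s1` fails; vertex 1 is recovered from the replica on `s2`, while vertex 2 has no replica and is lost. The fraction of recovered vertices corresponds loosely to the "graph state recovered" figures quoted in the abstract.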

Citations: 30

Response time service level agreements for cloud-hosted web applications

Proceedings of the Sixth ACM Symposium on Cloud Computing Pub Date: 2015-08-27 DOI: 10.1145/2806777.2806842

Hiranya Jayathilaka, C. Krintz, R. Wolski

Abstract: Cloud computing is a successful model for hosting web-facing applications that are accessed by their users as services. While clouds currently offer Service Level Agreements (SLAs) containing guarantees of availability, they do not make performance guarantees for deployed applications. In this work we present Cerebro, a system for establishing statistical guarantees of application response time in cloud settings. Cerebro combines off-line static analysis of application control structure with on-line cloud performance monitoring and statistical forecasting to predict bounds on the response time of web-facing application programming interfaces (APIs). Because Cerebro does not require application instrumentation or per-application cloud benchmarking, it does not impose any runtime overhead, and is suitable for use at cloud scales. Also, because the bounds are statistical, they are appropriate for use as the basis for SLAs between cloud-hosted applications and their users. We investigate the correctness of Cerebro predictions, the tightness of their bounds, and the duration over which the bounds persist in both Google App Engine and AppScale (public and private cloud platforms respectively). We also detail the effectiveness of our SLA prediction methodology compared to other performance bound estimation methods based on simple statistical analysis.
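To make the notion of a statistical bound on response time concrete, the sketch below computes a standard nonparametric upper confidence bound on a quantile from measured samples, using binomial order statistics. The abstract does not specify Cerebro's forecasting method, so this is a swapped-in textbook technique; it also assumes independent, identically distributed samples, which real response-time series may not satisfy. The function name `quantile_upper_bound` is hypothetical.

```python
import math
from typing import Optional, Sequence


def quantile_upper_bound(samples: Sequence[float],
                         q: float,
                         confidence: float) -> Optional[float]:
    """Return a value that exceeds the q-th quantile with the given confidence.

    Uses the fact that the k-th smallest of n i.i.d. samples is at least
    the q-th quantile with probability >= P(Binomial(n, q) <= k - 1).
    Returns None when there are too few samples for the requested
    confidence level.
    """
    n = len(samples)
    ordered = sorted(samples)
    cdf = 0.0
    for k in range(1, n + 1):
        # Accumulate P(Binomial(n, q) == k - 1).
        cdf += math.comb(n, k - 1) * q ** (k - 1) * (1 - q) ** (n - k + 1)
        if cdf >= confidence:
            return ordered[k - 1]
    return None


# Ten hypothetical response times in milliseconds.
example_bound = quantile_upper_bound(list(range(1, 11)), q=0.5, confidence=0.95)
```

With ten samples, the ninth smallest value is the first order statistic that bounds the median with at least 95% confidence. Asking for a 95th-percentile bound at the same confidence from only ten samples returns `None`, since at least 59 samples are needed in that case.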

Citations: 31

Potassium: penetration testing as a service

Proceedings of the Sixth ACM Symposium on Cloud Computing Pub Date: 2015-08-27 DOI: 10.1145/2806777.2806935

Richard Li, Dallin Abendroth, Xing Lin, Yuankai Guo, H. Baek, E. Eide, R. Ricci, J. Merwe

Abstract: Penetration testing, the process of probing a deployed system for security vulnerabilities, involves a fundamental tension. If one tests a production system, there is a real danger of collateral damage; this is particularly true for systems hosted in the cloud due to the presence of other tenants. If one tests against a separate system brought up to model the live one, the dynamic state of the production system is not captured, and the value of the test is reduced. This paper presents Potassium, which provides penetration testing as a service (PTaaS) and resolves this tension for system owners, penetration testers, and cloud providers. Potassium uses techniques originally developed for live migration of virtual machines to clone them instead, capturing their full disk, memory, and network state. Potassium isolates the cloned system from the rest of the cloud, providing confidence that side effects of the penetration test will not harm other tenants. The penetration tester effectively owns the cloned system, allowing testing to be more thorough, efficient, and automatable. Experiments with our Potassium prototype show that PTaaS can detect real-world vulnerabilities while having minimal impact on cloud-based production systems.

Citations: 17

DSwitch: a dual mode direct and network attached disk

Proceedings of the Sixth ACM Symposium on Cloud Computing Pub Date: 2015-08-27 DOI: 10.1145/2806777.2806850

Quanlu Zhang, Yafei Dai, Lintao Zhang

Abstract: Putting computers into low power mode (e.g., suspend-to-RAM) could potentially save a significant amount of power when the computers are not in use. Unfortunately, this is often infeasible in practice because data stored on the computers (i.e., directly attached disks, DAS) might need to be accessed by others. Separating storage from computation by attaching storage on the network (e.g., NAS and SAN) could potentially solve this problem, at the cost of lower performance, more network congestion, increased peak power consumption, and higher equipment cost. Though DAS does not suffer from these problems, it is not flexible for power saving. In this paper, we present DSwitch, an architecture that, depending on the workload, allows a disk to be attached either directly or through the network. We design flexible workload migration based on DSwitch, and show that a wide variety of applications in both data center and home/office settings can be well supported. The experiments demonstrate that our prototype DSwitch achieves power savings of 91.9% to 97.5% when a disk is in low power network attached mode, while incurring no performance degradation and minimal power overhead when it is in high performance directly attached mode.

Citations: 0

Towards a comprehensive performance model of virtual machine live migration

Proceedings of the Sixth ACM Symposium on Cloud Computing Pub Date: 2015-08-27 DOI: 10.1145/2806777.2806838

Senthil Nathan, U. Bellur, Purushottam Kulkarni

Citations: 54

Scheduling jobs across geo-distributed datacenters

Proceedings of the Sixth ACM Symposium on Cloud Computing Pub Date: 2015-08-27 DOI: 10.1145/2806777.2806780

Chien-Chun Hung, L. Golubchik, Minlan Yu

Citations: 132

Energy proportionality and workload consolidation for latency-critical applications

Proceedings of the Sixth ACM Symposium on Cloud Computing Pub Date: 2015-08-27 DOI: 10.1145/2806777.2806848

G. Prekas, Mia Primorac, A. Belay, C. Kozyrakis, Edouard Bugnion

Abstract: Energy proportionality and workload consolidation are important objectives towards increasing efficiency in large-scale datacenters. Our work focuses on achieving these goals in the presence of applications with μs-scale tail latency requirements. Such applications represent a growing subset of datacenter workloads and are typically deployed on dedicated servers, which is the simplest way to ensure low tail latency across all loads. Unfortunately, it also leads to low energy efficiency and low resource utilization during the frequent periods of medium or low load. We present the OS mechanisms and dynamic control needed to adjust core allocation and voltage/frequency settings based on the measured delays for latency-critical workloads. This allows for energy proportionality and frees the maximum amount of resources per server for other background applications, while respecting service-level objectives. Monitoring hardware queue depths allows us to detect increases in queuing latencies. Carefully coordinated adjustments to the NIC's packet redirection table enable us to reassign flow groups between the threads of a latency-critical application in milliseconds without dropping or reordering packets. We compare the efficiency of our solution to the Pareto-optimal frontier of 224 distinct static configurations. Dynamic resource control saves 44%-54% of processor energy, which corresponds to 85%-93% of the Pareto-optimal upper bound. Dynamic resource control also allows background jobs to run at 32%-46% of their standalone throughput, which corresponds to 82%-92% of the Pareto bound.
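The abstract compares dynamic control against the Pareto-optimal frontier of static configurations. As a minimal sketch of what such a frontier is, the code below filters a list of configurations down to those not dominated by another configuration that draws no more power while sustaining at least as much load. The tuple layout, the configuration names, and the numbers are hypothetical and are not measurements from the paper.

```python
from typing import List, Tuple

# Hypothetical record: (configuration name, power in watts, max load sustained).
Config = Tuple[str, float, float]


def pareto_frontier(configs: List[Config]) -> List[Config]:
    """Keep configurations that no other configuration dominates.

    A configuration is dominated when another one uses no more power and
    sustains at least as much load, with at least one strict improvement.
    """
    # Sort by power ascending; among equal power, highest load first.
    ordered = sorted(configs, key=lambda c: (c[1], -c[2]))
    frontier: List[Config] = []
    best_load = float("-inf")
    for cfg in ordered:
        # A configuration survives only if it beats every cheaper one on load.
        if cfg[2] > best_load:
            frontier.append(cfg)
            best_load = cfg[2]
    return frontier


example_configs = [
    ("a", 50.0, 100.0),
    ("b", 60.0, 90.0),   # dominated by "a": more power, less load
    ("c", 70.0, 200.0),
    ("d", 70.0, 150.0),  # dominated by "c": same power, less load
    ("e", 40.0, 50.0),
]
example_frontier = [name for name, _, _ in pareto_frontier(example_configs)]
```

Here the frontier consists of configurations `e`, `a`, and `c`. A dynamic controller can then be scored by how close its energy use comes to the best frontier point that meets the load at each moment.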

Citations: 71