ACM SIGOPS Oper. Syst. Rev.: Latest Articles (Page 10)

Dependability issues in cloud computing: extended papers from the 1st international workshop on dependability issues in cloud computing -- DISCCO

ACM SIGOPS Oper. Syst. Rev. Pub Date : 2013-07-23 DOI: 10.1145/2506164.2506169

M. Correia, N. Mittal

Citations: 0

Multi-core systems modeling for formal verification of parallel algorithms

ACM SIGOPS Oper. Syst. Rev. Pub Date : 2013-07-23 DOI: 10.1145/2506164.2506174

M. Desnoyers, P. McKenney, M. Dagenais

Citations: 14

Our troubles with Linux Kernel upgrades and why you should care

ACM SIGOPS Oper. Syst. Rev. Pub Date : 2013-07-23 DOI: 10.1145/2506164.2506175

Ashif S. Harji, P. Buhr, Tim Brecht

Abstract: Linux and other open-source Unix variants (and their distributors) provide researchers with full-fledged operating systems that are widely used. However, due to their complexity and rapid development, care should be exercised when using these operating systems for performance experiments, especially in systems research. In particular, the size and continual evolution of the Linux code base make it difficult to understand and, as a result, difficult to decipher and explain the reasons for performance improvements. In addition, the rapid kernel development cycle means that experimental results can be viewed as out of date, or meaningless, very quickly. We demonstrate that this viewpoint is incorrect because kernel changes can introduce, and have introduced, both bugs and performance degradations.

This paper describes some of our experiences using Linux and FreeBSD as platforms for conducting performance evaluations and some performance regressions we have found. Our results show that these performance regressions can be serious (e.g., repeating identical experiments produces large variability in results) and long-lived despite having a large negative effect on performance (one problem was present for more than 3 years). Based on these experiences, we argue that it is sometimes reasonable to use an older kernel version, that experimental results need careful analysis to explain why a performance effect occurs, and that publishing papers validating prior research is essential.

Pages: 66-72

Citations: 9

Boosting energy efficiency with mirrored data block replication policy and energy scheduler

ACM SIGOPS Oper. Syst. Rev. Pub Date : 2013-07-23 DOI: 10.1145/2506164.2506171

Sara Arbab Yazd, S. Venkatesan, N. Mittal

Citations: 16

Dagstuhl seminar report: security and dependability for federated cloud platforms, 2012

ACM SIGOPS Oper. Syst. Rev. Pub Date : 2013-07-23 DOI: 10.1145/2506164.2506166

A. Shraer, R. Kapitza

Abstract: The Security and Dependability for Federated Cloud Platforms seminar [3] was held in Schloss Dagstuhl, July 8-13, 2012. Schloss Dagstuhl, also known as the Leibniz-Zentrum für Informatik, is a renovated castle located in the scenic countryside of Saarland, Germany. Dagstuhl offers a unique concept: 30-45 participants, all of whom receive invitations from Dagstuhl on behalf of the organizers, stay in the castle during the seminar (typically 3-5 days), enjoying all that the castle has to offer. Among other things, this includes an impressive library, a music room full of musical instruments, an excellent restaurant, and a wine cellar where a variety of cheese, wine and local beer is available daily for the late-evening social meetings. The organizers of our seminar, Matthias Schunter, Marc Shapiro, Paulo Verissimo and Michael Waidner, targeted a four-day event and gathered a mixed group of senior, established and promising young researchers from all over the world. The program of the seminar was not set in advance, but most participants provided an abstract [3] and gave short talks on recent or ongoing work. The main purpose of these talks was generating discussion and collaboration among the participants. During some ...

Pages: 4-5

Citations: 0

Banking on decoupling: budget-driven sustainability for HPC applications on auction-based clouds

ACM SIGOPS Oper. Syst. Rev. Pub Date : 2013-07-23 DOI: 10.1145/2506164.2506172

Moussa Taifi

Abstract: Cloud providers are auctioning their excess capacity using dynamically priced virtual instances. These spot instances provide significant savings compared to on-demand or fixed-price instances. The users willing to use these resources are asked to provide a maximum bid price per hour, and the cloud provider runs the instances as long as the market price is below the user's bid price. By using such resources, the users are exposed explicitly to failures, and need to adapt their applications to provide some level of fault tolerance. In this paper, we expose the effect of bidding in the case of virtual HPC clusters composed of spot instances. We describe the interesting effect of uniform versus non-uniform bidding in terms of both the failure rate and the failure model. We propose an initial attempt to deal with the problem of predicting the runtime of a parallel application under various bidding strategies and various system parameters. We describe the relationship between bidding strategies and programming models, and we build a preliminary optimization model that uses real price traces from Amazon Web Services as inputs, as well as instrumented values related to the processing and network capacities of cluster instances on the EC2 services. Our results show preliminary insights into the relationship between non-uniform bidding and application scaling strategies.

Pages: 41-50

Citations: 6
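
The bidding rule stated in the abstract above (an instance runs as long as the market price is below the user's bid) can be illustrated with a minimal Python sketch. This is not the paper's optimization model: the price trace, the bid values, and the function names `count_failures` and `nodes_alive` are all hypothetical, chosen only to show how uniform and non-uniform bids lead to different failure patterns.

```python
def count_failures(prices, bid):
    """Count out-of-bid terminations for one instance.

    A termination is an hour in which the instance was running
    (price < bid) followed by an hour in which it was not.
    """
    running = [price < bid for price in prices]
    return sum(1 for prev, cur in zip(running, running[1:]) if prev and not cur)


def nodes_alive(prices, bids):
    """Number of cluster nodes running in each hour, one bid per node."""
    return [sum(price < bid for bid in bids) for price in prices]


# Hypothetical hourly market prices (not a real AWS trace).
prices = [0.10, 0.12, 0.30, 0.11, 0.50, 0.09]

# Uniform bidding: every node uses the same bid, so all nodes fail together.
uniform_bids = [0.25, 0.25, 0.25, 0.25]

# Non-uniform bidding: nodes drop out at different price levels.
nonuniform_bids = [0.15, 0.25, 0.35, 0.60]

print(nodes_alive(prices, uniform_bids))     # [4, 4, 0, 4, 0, 4]
print(nodes_alive(prices, nonuniform_bids))  # [4, 4, 2, 4, 1, 4]
print(count_failures(prices, 0.25))          # 2
```

With these made-up numbers, uniform bidding loses the whole cluster at each price spike, while non-uniform bidding degrades the cluster partially, which is one way to read the abstract's distinction between failure rate and failure model.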

Bridging the gap between applications and networks in data centers

ACM SIGOPS Oper. Syst. Rev. Pub Date : 2013-01-29 DOI: 10.1145/2433140.2433143

Paolo Costa

Abstract: Modern data centers host tens (if not hundreds) of thousands of servers and are used by companies such as Amazon, Google, and Microsoft to provide online services to millions of individuals distributed across the Internet. They use commodity hardware and their network infrastructure adopts principles evolved from enterprise and Internet networking. Applications use UDP datagrams or TCP sockets as the primary interface to other applications running inside the data center. This effectively isolates the network from the end-systems, which then have little control over how the network handles packets. Likewise, the network has limited visibility into the application logic. An application injects a packet with a destination address and the network just delivers the packet. Network and applications effectively treat each other as black boxes. This strict separation between applications and networks (also referred to as dumb network) is a direct outcome of the so-called end-to-end argument [49] and has arguably been one of the main reasons why the Internet has been capable of evolving from a small research project to planetary scale, supporting a multitude of different hardware and network technologies as well as a slew of very diverse applications, and using networks owned by competing ISPs. Despite being so instrumental in the success of the Internet, this black-box design is also one of the root causes of inefficiencies in large-scale data centers.

Given the little control and visibility over network resources, applications need to use low-level hacks, e.g., to extract network properties (e.g., using traceroute and IP addresses to infer the network topology) and to prioritize traffic (e.g., increasing the number of TCP flows used by an application to increase its bandwidth share). Further, a simple functionality like multicast or anycast routing is not available and developers must resort to application-level overlays. This, however, leads to inefficiencies as typically multiple logical links are mapped to the same physical link, significantly reducing application throughput. Even with perfect knowledge of the underlying topology, there is still the constraint that servers ...

Pages: 3-8

Citations: 14
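
The "low-level hack" mentioned in the abstract above, opening more TCP flows to gain bandwidth share, can be made concrete with a small arithmetic sketch. It assumes an idealized bottleneck link that is divided equally among flows, which real TCP only approximates; the function name `bandwidth_shares` and the application names are hypothetical.

```python
from fractions import Fraction


def bandwidth_shares(flows_per_app, link_capacity):
    """Split a bottleneck link equally among flows (idealized assumption).

    Each application's share is proportional to the number of flows it opens.
    """
    total_flows = sum(flows_per_app.values())
    return {
        app: link_capacity * Fraction(n, total_flows)
        for app, n in flows_per_app.items()
    }


# Two applications with one flow each split a 10 Gbit/s link evenly.
fair = bandwidth_shares({"app_a": 1, "app_b": 1}, 10)

# If app_a opens four flows, it takes four fifths of the link.
greedy = bandwidth_shares({"app_a": 4, "app_b": 1}, 10)

print(fair["app_a"], fair["app_b"])      # 5 5
print(greedy["app_a"], greedy["app_b"])  # 8 2
```

Under this assumption the network has no notion of applications, only of flows, so the share an application receives depends on how many flows it opens rather than on what it needs.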

Toward accurate and practical network tomography

ACM SIGOPS Oper. Syst. Rev. Pub Date : 2013-01-29 DOI: 10.1145/2433140.2433146

Denisa Ghita, K. Argyraki, Patrick Thiran

Abstract: Troubleshooting large networks is hard; when an end-user complains that she has “network problems,” there is typically a large number of possible causes. For example, the end-user’s own machine may be damaged, misconfigured, or compromised, a network element that handles her traffic may be congested or malfunctioning, or the destination she is trying to reach may be filtering her traffic. To diagnose such problems, a network operator normally has to probe the network’s elements to collect relevant statistics, like packet loss or bandwidth utilization. The challenge, though, is that the network operator often does not have direct access to all the suspected network elements, hence cannot probe them; e.g., the operator of an edge network does not have access to the equipment of her Internet service provider (ISP). Network tomography is an elegant approach to network troubleshooting: just as medical tomography observes an organ from different vantage points and combines the observations to get knowledge of the organ’s internals (without dissecting it), so does network tomography observe the characteristics of different end-to-end network paths and combine the observations to infer the characteristics of individual network links (without probing them). This approach is applicable in scenarios where one needs to monitor the behavior and performance of a network without having direct access to its elements. For instance, the operators of edge networks could use network tomography to monitor the behavior and performance of their ISPs; an ISP operator could use it to monitor the behavior and performance of its peers. However, there are reasons to be skeptical about the usefulness of network tomography in practice.

Even though it was invented more than 10 years ago and is still a topic of active research, it has not seen any real deployment. We believe the reason is that existing tomography algorithms make certain simplifying assumptions that do not always hold in a real network, which means that the algorithms’ results may be inaccurate. Most importantly, there is no way to determine the extent of this inaccuracy. In other words, today there is no way for a network operator who employs tomography for network troubleshooting to compute the certainty of its diagnosis.

Pages: 22-26

Citations: 15
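
The idea described in the abstract above, inferring per-link state from end-to-end path observations, can be sketched with a simplified Boolean variant. This is not the authors' algorithm. It assumes that a path is good if and only if all of its links are good, which is exactly the kind of simplifying assumption the abstract warns may not hold in a real network. The topology, the observations, and the function name `boolean_tomography` are hypothetical.

```python
def boolean_tomography(paths, path_is_good):
    """Infer link states from end-to-end observations (simplified sketch).

    Assumption: a path is good if and only if every link on it is good.
    Returns the links known to be good and, for each bad path, the links
    that remain suspects after removing the known-good ones.
    """
    good_links = set()
    for name, links in paths.items():
        if path_is_good[name]:
            good_links.update(links)

    suspects = {
        name: set(links) - good_links
        for name, links in paths.items()
        if not path_is_good[name]
    }
    return good_links, suspects


# Hypothetical topology: each path is the list of links it traverses.
paths = {
    "p1": ["a", "b"],
    "p2": ["a", "c"],
    "p3": ["d", "c"],
    "p4": ["d", "b"],
}
# Hypothetical end-to-end measurements: True means the path looked healthy.
observed = {"p1": True, "p2": False, "p3": False, "p4": True}

good, suspects = boolean_tomography(paths, observed)
print(sorted(good))  # ['a', 'b', 'd']
print(suspects)      # {'p2': {'c'}, 'p3': {'c'}}
```

Here link `c` is blamed without ever being probed directly. Note that the sketch returns a diagnosis but no measure of how certain it is, which mirrors the limitation the abstract points out.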

A framework to compute statistics of system parameters from very large trace files

ACM SIGOPS Oper. Syst. Rev. Pub Date : 2013-01-29 DOI: 10.1145/2433140.2433151

Naser Ezzati-Jivan, M. Dagenais

Citations: 17

Workshop report on LADIS 2012

ACM SIGOPS Oper. Syst. Rev. Pub Date : 2013-01-29 DOI: 10.1145/2433140.2433142

D. Malkhi, R. V. Renesse

Abstract: The 6th Workshop on Large-Scale Distributed Systems and Middleware was held July 18 and 19 on the island of Madeira, Portugal, co-located with the ACM Symposium on Principles of Distributed Computing (PODC). LADIS brings together researchers and professionals to discuss new trends and techniques in distributed systems and middleware that surface in large-scale data centers, cloud computing, web services, and other important systems. This year, all LADIS contributions were by invitation only and underwent one round of reviews for quality assurance and to provide constructive feedback to the authors. Each paper received five reviews. As is tradition for LADIS, we also invited keynote speakers from academia and industry. The keynote speakers were invited to provide abstracts. As in previous years, we invited the authors of four of the abstracts to provide full papers for a special ACM SIGOPS Operating Systems Review issue. These abstracts were selected based on rankings provided by the reviewers. The selected papers received three more detailed reviews and you see before you the revisions that resulted. Below, we provide a short report of the workshop itself. Scott Shenker (UC Berkeley and ICSI) started the workshop with a keynote presentation on Software Defined Networking (SDN), which was held before a joint audience of LADIS and PODC participants. Scott described the current lack of natural abstractions in the network control plane and how SDN tries to address this shortcoming. The concept is to provide modularity and standardization to network control to simplify management and encourage experimentation. OpenFlow is a well-known instantiation of SDN. The keynote was followed by two SDN-related presentations on cloud networking.

Paolo Costa of Imperial College London argued that the traditional separation between applications and networks has to be revisited for modern datacenters. He described his CamCube project that has developed a programmable torus-shaped network for a datacenter, and is now proposing a research agenda called Network-as-a-Service. The full paper is included in this issue. Theo ...

Pages: 1-2

Citations: 1