Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems: Latest Articles

Protecting Data on Smartphones and Tablets from Memory Attacks

Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems Pub Date: 2015-03-14 DOI: 10.1145/2694344.2694380

Patrick Colp, Jiawen Zhang, James Gleeson, Sahil Suneja, E. D. Lara, Himanshu Raj, S. Saroiu, A. Wolman

Citations: 109

Architectural Support for Cyber-Physical Systems

Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems Pub Date: 2015-03-14 DOI: 10.1145/2786763.2694375

Edward A. Lee

Abstract: Cyber-physical systems are integrations of computation, communication networks, and physical dynamics. Although time plays a central role in the physical world, all widely used software abstractions lack temporal semantics. The notion of correct execution of a program written in every widely used programming language today does not depend on the temporal behavior of the program. But temporal behavior matters in almost all systems, and most particularly in cyber-physical systems. In this talk, I will argue that time can and must become part of the semantics of programs for a large class of applications. To illustrate that this is both practical and useful, we will describe a recent effort at Berkeley in the design and implementation of timing-centric software systems. Specifically, I will describe PRET machines, which redefine the instruction-set architecture (ISA) of a microprocessor to embrace temporal semantics. Such machines can be used in high-confidence and safety-critical systems, in energy-constrained systems, in mixed-criticality systems, and as a Real-Time Unit (RTU) that cooperates with a general-purpose processor to provide real-time services, in a manner similar to how a GPU provides graphics services.

Citations: 8

Session details: Session 3B: Warehouse Scale Computing II

Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems Pub Date: 2015-03-14 DOI: 10.1145/3251031

Yunji Chen

Citations: 0

Improving Agility and Elasticity in Bare-metal Clouds

Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems Pub Date: 2015-03-14 DOI: 10.1145/2694344.2694349

Yushi Omote, Takahiro Shinagawa, Kazuhiko Kato

Abstract: Bare-metal clouds are an emerging infrastructure-as-a-service (IaaS) that leases physical machines (bare-metal instances) rather than virtual machines, allowing resource-intensive applications to have exclusive access to physical hardware. Unfortunately, bare-metal instances require time-consuming or OS-specific tasks for deployment due to the lack of virtualization layers, thereby sacrificing several beneficial features of traditional IaaS clouds such as agility, elasticity, and OS transparency. We present BMcast, an OS deployment system with a special-purpose de-virtualizable virtual machine monitor (VMM) that supports quick and OS-transparent startup of bare-metal instances. BMcast performs streaming OS deployment while allowing direct access to physical hardware from the guest OS, and then disappears after completing the deployment. Quick startup of instances improves agility and elasticity significantly, and OS transparency greatly simplifies management tasks for cloud customers. Experimental results have confirmed that BMcast initiated a bare-metal instance 8.6 times faster than image copying, and database performance on BMcast during streaming OS deployment was comparable to that on a state-of-the-art VMM without performing deployment. BMcast incurred zero overhead after de-virtualization.

Citations: 30

Reduced Hardware NOrec: A Safe and Scalable Hybrid Transactional Memory

Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems Pub Date: 2015-03-14 DOI: 10.1145/2694344.2694393

A. Matveev, N. Shavit

Citations: 50

More is Less, Less is More: Molecular-Scale Photonic NoC Power Topologies

Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems Pub Date: 2015-03-14 DOI: 10.1145/2694344.2694377

Jun Pang, C. Dwyer, A. Lebeck

Abstract: Molecular-scale Network-on-Chip (mNoC) crossbars use quantum dot LEDs as an on-chip light source, and chromophores to provide optical signal filtering for receivers. An mNoC reduces power consumption or enables scaling to larger crossbars for a reduced energy budget compared to current nanophotonic NoC crossbars. Since communication latency is reduced by using a high-radix crossbar, minimizing power consumption becomes a primary design target. Conventional Single Writer Multiple Reader (SWMR) photonic crossbar designs broadcast all packets, and incur the commensurate required power, even if only two nodes are communicating. This paper introduces power topologies, enabled by unique capabilities of mNoC technology, to reduce overall interconnect power consumption. A power topology corresponds to the logical connectivity provided by a given power mode. Broadcast is one power mode and it consumes the maximum power. Additional power modes consume less power but allow a source to communicate with only a statically defined, potentially non-contiguous, subset of nodes. Overall interconnect power is reduced if the more frequently communicating nodes use modes that consume less power, while less frequently communicating nodes use modes that consume more power. We also investigate thread mapping techniques to fully exploit power topologies. We explore various mNoC power topologies with one, two and four power modes for a radix-256 SWMR mNoC crossbar. Our results show that the combination of power topologies and intelligent thread mapping can reduce total mNoC power by up to 51% on average for a set of 12 SPLASH benchmarks. Furthermore, performance is 10% better than conventional resonator-based photonic NoCs and energy is reduced by 72%.
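The power-mode argument in the abstract above (frequent destinations should sit in cheap modes, rare ones in expensive modes) can be illustrated with a little arithmetic. The following is only a sketch: the function name, the traffic fractions, and the per-mode power costs are hypothetical values chosen for illustration and are not taken from the paper.

```python
def expected_power(traffic, mode_of, mode_power):
    """Average per-packet transmit power for one source node.

    traffic:    dict destination -> fraction of packets sent there (sums to 1)
    mode_of:    dict destination -> index of the power mode used to reach it
    mode_power: list of per-packet power cost for each mode (arbitrary units)
    """
    return sum(frac * mode_power[mode_of[dst]] for dst, frac in traffic.items())


# Hypothetical numbers, for illustration only (not from the paper).
traffic = {"A": 0.6, "B": 0.3, "C": 0.1}
mode_power = [1.0, 4.0]  # mode 0: low power, limited reach; mode 1: broadcast

# Every packet is broadcast, as in a conventional SWMR crossbar.
broadcast_only = expected_power(traffic, {"A": 1, "B": 1, "C": 1}, mode_power)

# Frequent destinations A and B are reachable in the cheap mode.
good_mapping = expected_power(traffic, {"A": 0, "B": 0, "C": 1}, mode_power)

# Only the rare destination C is reachable in the cheap mode.
bad_mapping = expected_power(traffic, {"A": 1, "B": 1, "C": 0}, mode_power)
```

With these made-up inputs the broadcast-only cost is 4.0 units, the good mapping costs about 1.3, and the bad mapping about 3.7, which is why thread mapping matters for exploiting a given power topology.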

Citations: 8

Session details: Session 2A: Memory and Security I

Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems Pub Date: 2015-03-14 DOI: 10.1145/3251028

J. Criswell

Citations: 0

Sirius: An Open End-to-End Voice and Vision Personal Assistant and Its Implications for Future Warehouse Scale Computers

Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems Pub Date: 2015-03-14 DOI: 10.1145/2694344.2694347

Johann Hauswald, M. Laurenzano, Yunqi Zhang, Cheng Li, A. Rovinski, Arjun Khurana, R. Dreslinski, T. Mudge, V. Petrucci, Lingjia Tang, Jason Mars

Abstract: As user demand scales for intelligent personal assistants (IPAs) such as Apple's Siri, Google's Google Now, and Microsoft's Cortana, we are approaching the computational limits of current datacenter architectures. It is an open question how future server architectures should evolve to enable this emerging class of applications, and the lack of an open-source IPA workload is an obstacle in addressing this question. In this paper, we present the design of Sirius, an open end-to-end IPA web-service application that accepts queries in the form of voice and images, and responds with natural language. We then use this workload to investigate the implications of four points in the design space of future accelerator-based server architectures spanning traditional CPUs, GPUs, manycore throughput co-processors, and FPGAs. To investigate future server designs for Sirius, we decompose Sirius into a suite of 7 benchmarks (Sirius Suite) comprising the computationally intensive bottlenecks of Sirius. We port Sirius Suite to a spectrum of accelerator platforms and use the performance and power trade-offs across these platforms to perform a total cost of ownership (TCO) analysis of various server design points. In our study, we find that accelerators are critical for the future scalability of IPA services. Our results show that GPU- and FPGA-accelerated servers improve the query latency on average by 10x and 16x. For a given throughput, GPU- and FPGA-accelerated servers can reduce the TCO of datacenters by 2.6x and 1.4x, respectively.

Citations: 228

GPU Concurrency: Weak Behaviours and Programming Assumptions

Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems Pub Date: 2015-03-14 DOI: 10.1145/2694344.2694391

J. Alglave, Mark Batty, A. Donaldson, G. Gopalakrishnan, J. Ketema, Daniel Poetzl, Tyler Sorensen, John Wickerson

Abstract: Concurrency is pervasive and perplexing, particularly on graphics processing units (GPUs). Current specifications of languages and hardware are inconclusive; thus programmers often rely on folklore assumptions when writing software. To remedy this state of affairs, we conducted a large empirical study of the concurrent behaviour of deployed GPUs. Armed with litmus tests (i.e. short concurrent programs), we questioned the assumptions in programming guides and vendor documentation about the guarantees provided by hardware. We developed a tool to generate thousands of litmus tests and run them under stressful workloads. We observed a litany of previously elusive weak behaviours, and exposed folklore beliefs about GPU programming (often supported by official tutorials) as false. As a way forward, we propose a model of Nvidia GPU hardware, which correctly models every behaviour witnessed in our experiments. The model is a variant of SPARC Relaxed Memory Order (RMO), structured following the GPU concurrency hierarchy.
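To make the notion of a litmus test in the abstract above concrete, here is a minimal sketch of the classic message-passing (MP) test, simulated in plain Python. It does not run on a GPU and does not implement the paper's model or tool; the `allow_reordering` switch is a crude, assumed stand-in for a weak memory model that simply lets each thread's two accesses execute in either order.

```python
from itertools import permutations


def interleavings(a, b):
    """Yield every merge of lists a and b that preserves each list's order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(schedule):
    """Execute ('st', var, value) and ('ld', var, register) operations in order."""
    mem = {"x": 0, "y": 0}
    regs = {}
    for op, var, arg in schedule:
        if op == "st":
            mem[var] = arg
        else:
            regs[arg] = mem[var]
    return (regs["r1"], regs["r2"])


# Message-passing litmus test: x is the data, y is the flag.
T0 = [("st", "x", 1), ("st", "y", 1)]        # write data, then set flag
T1 = [("ld", "y", "r1"), ("ld", "x", "r2")]  # read flag, then read data


def outcomes(allow_reordering):
    """Return the set of reachable (r1, r2) results.

    allow_reordering=False: each thread runs in program order
    (sequential consistency). True: each thread's accesses may also
    run in swapped order, a simplification used here for illustration.
    """
    orders0 = list(permutations(T0)) if allow_reordering else [tuple(T0)]
    orders1 = list(permutations(T1)) if allow_reordering else [tuple(T1)]
    seen = set()
    for o0 in orders0:
        for o1 in orders1:
            for schedule in interleavings(list(o0), list(o1)):
                seen.add(run(schedule))
    return seen
```

The result `(1, 0)` (flag seen as set, data still stale) is the weak behaviour of interest: `outcomes(False)` never contains it, while `outcomes(True)` does. A real litmus test runs the same two threads many times on hardware and records whether that result is ever observed.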

Citations: 115

Architectural Support for Software-Defined Metadata Processing

Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems Pub Date: 2015-03-14 DOI: 10.1145/2694344.2694383

Udit Dhawan, Catalin Hritcu, Raphael Rubin, N. Vasilakis, S. Chiricescu, Jonathan M. Smith, T. Knight, B. Pierce, A. DeHon

Abstract: Optimized hardware for propagating and checking software-programmable metadata tags can achieve low runtime overhead. We generalize prior work on hardware tagging by considering a generic architecture that supports software-defined policies over metadata of arbitrary size and complexity; we introduce several novel microarchitectural optimizations that keep the overhead of this rich processing low. Our model thus achieves the efficiency of previous hardware-based approaches with the flexibility of the software-based ones. We demonstrate this by using it to enforce four diverse safety and security policies (spatial and temporal memory safety, taint tracking, control-flow integrity, and code and data separation) plus a composite policy that enforces all of them simultaneously. Experiments on SPEC CPU2006 benchmarks with a PUMP-enhanced RISC processor show modest impact on runtime (typically under 10%) and power ceiling (less than 10%), in return for some increase in energy usage (typically under 60%) and area for on-chip memory structures (110%).
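The idea of propagating and checking metadata tags under a software-defined policy, as described in the abstract above, can be sketched with a toy taint-tracking interpreter. This is an illustrative assumption-laden sketch, not the paper's architecture: the two-tag scheme, the opcode names, the `taint_policy` and `execute` functions, and the use of `lru_cache` to memoize rule results are all hypothetical choices made for this example.

```python
from functools import lru_cache

CLEAN, TAINTED = "clean", "tainted"


class PolicyViolation(Exception):
    """Raised when an instruction is not allowed by the policy."""


@lru_cache(maxsize=None)  # memoizing rule results is an illustrative choice
def taint_policy(opcode, tag_a, tag_b):
    """Software-defined rule: return the result tag, or raise on a violation."""
    if opcode == "jump" and tag_a == TAINTED:
        raise PolicyViolation("jump target is tainted")
    if TAINTED in (tag_a, tag_b):
        return TAINTED  # taint propagates from either operand
    return CLEAN


def execute(opcode, a, b=(0, CLEAN)):
    """Run one instruction on (value, tag) pairs, consulting the policy first."""
    (va, ta), (vb, tb) = a, b
    out_tag = taint_policy(opcode, ta, tb)
    if opcode == "add":
        return (va + vb, out_tag)
    if opcode == "jump":
        return (va, out_tag)  # the new program counter keeps its tag
    raise ValueError(f"unknown opcode: {opcode}")
```

For example, `execute("add", (1, TAINTED), (2, CLEAN))` yields a tainted result, while `execute("jump", (4096, TAINTED))` raises `PolicyViolation`. Swapping in a different `taint_policy` function changes the enforced policy without touching `execute`, which is the flexibility the abstract attributes to software-defined policies.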

Citations: 98