千变万化代码:为仓库规模的计算机实现近乎免费的在线代码转换

M. Laurenzano, Yunqi Zhang, Lingjia Tang, Jason Mars
{"title":"千变万化代码:为仓库规模的计算机实现近乎免费的在线代码转换","authors":"M. Laurenzano, Yunqi Zhang, Lingjia Tang, Jason Mars","doi":"10.1109/MICRO.2014.21","DOIUrl":null,"url":null,"abstract":"Rampant dynamism due to load fluctuations, co runner changes, and varying levels of interference poses a threat to application quality of service (QoS) and has limited our ability to allow co-locations in modern warehouse scale computers (WSCs). Instruction set features such as the non-temporal memory access hints found in modern ISAs (both ARM and x86) may be useful in mitigating these effects. However, despite the challenge of this dynamism and the availability of an instruction set mechanism that might help address the problem, a key capability missing in the system software stack in modern WSCs is the ability to dynamically transform (and re-transform) the executing application code to apply these instruction set features when necessary. In this work we introduce protean code, a novel approach for enacting arbitrary compiler transformations at runtime for native programs running on commodity hardware with negligible (<;1%) overhead. The fundamental insight behind the underlying mechanism of protean code is that, instead of maintaining full control throughout the program's execution as with traditional dynamic optimizers, protean code allows the original binary to execute continuously and diverts control flow only at a set of virtualized points, allowing rapid and seamless rerouting to the new code variants. In addition, the protean code compiler embeds IR with high-level semantic information into the program, empowering the dynamic compiler to perform rich analysis and transformations online with little overhead. Using a fully functional protean code compiler and runtime built on LLVM, we design PC3D, Protean Code for Cache Contention in Datacenters. PC3D dynamically employs non-temporal access hints to achieve utilization improvements of up to 2.8x (1.5x on average) higher than state-of-the-art contention mitigation runtime techniques at a QoS target of 98%.","PeriodicalId":6591,"journal":{"name":"2014 47th Annual IEEE/ACM International Symposium on Microarchitecture","volume":"21 1","pages":"558-570"},"PeriodicalIF":0.0000,"publicationDate":"2014-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"43","resultStr":"{\"title\":\"Protean Code: Achieving Near-Free Online Code Transformations for Warehouse Scale Computers\",\"authors\":\"M. Laurenzano, Yunqi Zhang, Lingjia Tang, Jason Mars\",\"doi\":\"10.1109/MICRO.2014.21\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Rampant dynamism due to load fluctuations, co runner changes, and varying levels of interference poses a threat to application quality of service (QoS) and has limited our ability to allow co-locations in modern warehouse scale computers (WSCs). Instruction set features such as the non-temporal memory access hints found in modern ISAs (both ARM and x86) may be useful in mitigating these effects. However, despite the challenge of this dynamism and the availability of an instruction set mechanism that might help address the problem, a key capability missing in the system software stack in modern WSCs is the ability to dynamically transform (and re-transform) the executing application code to apply these instruction set features when necessary. In this work we introduce protean code, a novel approach for enacting arbitrary compiler transformations at runtime for native programs running on commodity hardware with negligible (<;1%) overhead. The fundamental insight behind the underlying mechanism of protean code is that, instead of maintaining full control throughout the program's execution as with traditional dynamic optimizers, protean code allows the original binary to execute continuously and diverts control flow only at a set of virtualized points, allowing rapid and seamless rerouting to the new code variants. In addition, the protean code compiler embeds IR with high-level semantic information into the program, empowering the dynamic compiler to perform rich analysis and transformations online with little overhead. Using a fully functional protean code compiler and runtime built on LLVM, we design PC3D, Protean Code for Cache Contention in Datacenters. PC3D dynamically employs non-temporal access hints to achieve utilization improvements of up to 2.8x (1.5x on average) higher than state-of-the-art contention mitigation runtime techniques at a QoS target of 98%.\",\"PeriodicalId\":6591,\"journal\":{\"name\":\"2014 47th Annual IEEE/ACM International Symposium on Microarchitecture\",\"volume\":\"21 1\",\"pages\":\"558-570\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-12-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"43\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 47th Annual IEEE/ACM International Symposium on Microarchitecture\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/MICRO.2014.21\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 47th Annual IEEE/ACM International Symposium on Microarchitecture","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MICRO.2014.21","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 43

摘要

由于负载波动、co runner变化和不同程度的干扰而产生的猖獗的动态性对应用程序服务质量(QoS)构成威胁,并限制了我们在现代仓库规模计算机(wsc)中允许共址的能力。指令集特性,如现代isa (ARM和x86)中的非临时内存访问提示,可能有助于减轻这些影响。然而,尽管存在这种动态性的挑战和可能有助于解决问题的指令集机制的可用性,但现代wsc中系统软件堆栈中缺少的一个关键功能是动态转换(和重新转换)正在执行的应用程序代码以在必要时应用这些指令集特性的能力。在这项工作中,我们介绍了protean代码,这是一种在运行时为运行在商用硬件上的本机程序执行任意编译器转换的新方法,开销可以忽略不计(<;1%)。protean代码的基本机制背后的基本见解是,与传统动态优化器在整个程序执行过程中保持完全控制不同,protean代码允许原始二进制文件连续执行,并仅在一组虚拟点上转移控制流,从而允许快速无缝地重新路由到新的代码变体。此外,protean代码编译器将具有高级语义信息的IR嵌入到程序中,使动态编译器能够以很小的开销在线执行丰富的分析和转换。利用一个功能完备的protean代码编译器和基于LLVM的运行时,我们设计了PC3D,数据中心缓存争用的protean代码。PC3D动态地采用非时态访问提示,在98%的QoS目标下,实现比最先进的争用缓解运行时技术高出2.8倍(平均1.5倍)的利用率改进。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Protean Code: Achieving Near-Free Online Code Transformations for Warehouse Scale Computers
Rampant dynamism due to load fluctuations, co runner changes, and varying levels of interference poses a threat to application quality of service (QoS) and has limited our ability to allow co-locations in modern warehouse scale computers (WSCs). Instruction set features such as the non-temporal memory access hints found in modern ISAs (both ARM and x86) may be useful in mitigating these effects. However, despite the challenge of this dynamism and the availability of an instruction set mechanism that might help address the problem, a key capability missing in the system software stack in modern WSCs is the ability to dynamically transform (and re-transform) the executing application code to apply these instruction set features when necessary. In this work we introduce protean code, a novel approach for enacting arbitrary compiler transformations at runtime for native programs running on commodity hardware with negligible (<;1%) overhead. The fundamental insight behind the underlying mechanism of protean code is that, instead of maintaining full control throughout the program's execution as with traditional dynamic optimizers, protean code allows the original binary to execute continuously and diverts control flow only at a set of virtualized points, allowing rapid and seamless rerouting to the new code variants. In addition, the protean code compiler embeds IR with high-level semantic information into the program, empowering the dynamic compiler to perform rich analysis and transformations online with little overhead. Using a fully functional protean code compiler and runtime built on LLVM, we design PC3D, Protean Code for Cache Contention in Datacenters. PC3D dynamically employs non-temporal access hints to achieve utilization improvements of up to 2.8x (1.5x on average) higher than state-of-the-art contention mitigation runtime techniques at a QoS target of 98%.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信