The Race-Timing Prototype

2021 18th International Conference on Privacy, Security and Trust (PST). Pub Date: 2021-12-13. DOI: 10.1109/PST52912.2021.9647804

Andrés Rainiero Hernández Coronado, Wonjun Lee, Wei-Ming Lin

Abstract

The disclosure of transient execution attacks, such as Meltdown and Spectre, has once again highlighted the threat of cache side-channel configurations. It has now been demonstrated that the cache hierarchy of modern processors functions as a micro-architectural medium through which transient execution may leak private data, while also serving as an attack surface that adversaries can use to spy directly on victim programs. To hamper the effectiveness of cache side-channel configurations, hardware manufacturers, most notably AMD, have opted to reduce the resolution of the processor's cycle counter, preventing adversaries from mounting side-channels in the higher levels of the cache hierarchy, especially in the L1D cache. This partially works because classic cache side-channel techniques, namely Prime+Probe and Flush+Reload, rely heavily on the processor's cycle counter to detect changes in the cache state, which is normally done by tracking cache hits and misses. Yet we present The Race-Timing Prototype to prove that determined adversaries no longer need to rely on the resolution of the processor's cycle counter to effectively distinguish cache hits from misses, and that a cache side-channel can still be configured using only controlled memory race conditions. Because our implementation is hardware agnostic and does not rely on any proprietary instruction set extension, we demonstrate that our prototype works on processors from both Intel and AMD. Additionally, we propose a new semantic of Out-of-Order Detachment that further allows our Race-Timing measurements to closely match the accuracy of hardware Performance Monitoring Counters when distinguishing fast L1D and L2 cache accesses in Intel processors. Ultimately, this work demonstrates that an adversary can exploit a cache side-channel with high-precision timing, without utilizing cycle counters at all.
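The abstract does not describe the race mechanism in detail, so the following is only a toy illustration of the general idea that an ordering between two racing operations can stand in for a clock. It is a deterministic Python simulation, not the authors' implementation: the step counts (HIT_STEPS, MISS_STEPS, REFERENCE_STEPS) and the function names run_race and classify are assumptions made up for this sketch, and no real cache or hardware counter is involved.

```python
# Toy, fully deterministic model of race-based timing.
# A probed access races a reference path of known length; each writes its
# tag to a shared cell when it finishes, and the surviving tag reveals
# which one finished last. No clock or cycle counter is read anywhere.
# All constants and names are illustrative assumptions, not measured values.

HIT_STEPS = 4          # assumed cost of a cache hit, in abstract steps
MISS_STEPS = 200       # assumed cost of a cache miss, in abstract steps
REFERENCE_STEPS = 50   # assumed reference length, chosen between the two


def run_race(probe_steps: int, reference_steps: int = REFERENCE_STEPS) -> str:
    """Advance both racers one step per tick and return the last writer."""
    shared_cell = None
    remaining = {"probe": probe_steps, "reference": reference_steps}
    while remaining:
        for name in list(remaining):
            remaining[name] -= 1
            if remaining[name] <= 0:
                shared_cell = name  # racy write: a later finisher overwrites
                del remaining[name]
    return shared_cell


def classify(cached: bool) -> str:
    """Infer hit or miss purely from the outcome of the race."""
    probe_steps = HIT_STEPS if cached else MISS_STEPS
    last_writer = run_race(probe_steps)
    # If the reference finished last, the probe was the faster racer: a hit.
    return "hit" if last_writer == "reference" else "miss"


if __name__ == "__main__":
    print(classify(cached=True))
    print(classify(cached=False))
```

In this model the reference length plays the role that a timing threshold plays in a cycle-counter measurement: it only has to fall between the hit and miss costs for the race outcome to separate the two cases.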
