Space sensitive cache dumping for post-silicon validation

2013 Design, Automation & Test in Europe Conference & Exhibition (DATE). Pub Date: 2013-03-18. DOI: 10.7873/DATE.2013.113

Sandeep Chandran, S. Sarangi, P. Panda

Pages: 497-502

Citations: 0

Abstract

During post-silicon validation, the internal state of a complex modern processor often needs to be dumped frequently. Since the last-level cache (taken to be the L2 cache in this paper) holds most of that state, the L2 cache dominates both the volume of data dumped and the transfer time. The limited bandwidth for transferring data off-chip, coupled with the large size of the L2 cache, stalls the processor for long durations while the cache contents are dumped off-chip. To alleviate this, we propose to transfer only those cache lines that were updated since the previous dump. Since maintaining a bit-vector with a separate bit to track the status of each individual cache line is expensive, we propose two methods: (i) a bit-vector in which each bit tracks multiple cache lines, and (ii) an Interval Table, which stores only the starting and ending addresses of contiguous runs of updated cache lines. Both methods require significantly less space than a one-bit-per-line bit-vector, and both allow the designer to choose the amount of space to allocate for this design-for-debug (DFD) feature. The cost of reducing storage space is that some non-updated cache lines are dumped too; we attempt to minimize this overhead. Further, the size of the Interval Table is independent of the cache size, which makes it ideal for large caches. Through experimentation, we also determine the break-even point below which a t-lines/bit bit-vector is beneficial compared to an Interval Table.
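
The two tracking structures named in the abstract can be illustrated with a small software model. The following Python sketch is not the authors' hardware design. The class names, the method names, and especially the policy used when the Interval Table overflows (merging the two neighbouring intervals separated by the smallest gap) are assumptions made for illustration, since the abstract does not specify them. The sketch only shows why both structures trade storage for extra, non-updated lines in the dump.

```python
class CoarseBitVector:
    """t-lines/bit bit-vector: one bit covers t consecutive cache lines."""

    def __init__(self, num_lines, t):
        self.num_lines = num_lines
        self.t = t
        # Storage grows with the cache: ceil(num_lines / t) bits.
        self.bits = [False] * ((num_lines + t - 1) // t)

    def mark_updated(self, line):
        self.bits[line // self.t] = True

    def lines_to_dump(self):
        # Every line under a set bit is dumped, updated or not.
        out = []
        for i, bit in enumerate(self.bits):
            if bit:
                start = i * self.t
                out.extend(range(start, min(start + self.t, self.num_lines)))
        return out


class IntervalTable:
    """Fixed-size table of inclusive [start, end] runs of updated lines."""

    def __init__(self, max_entries):
        # Storage depends only on max_entries, not on the cache size.
        self.max_entries = max_entries
        self.intervals = []  # kept sorted, disjoint, and non-adjacent

    def mark_updated(self, line):
        merged = [line, line]
        kept = []
        for start, end in self.intervals:
            if end + 1 < merged[0] or start - 1 > merged[1]:
                kept.append([start, end])  # neither overlapping nor adjacent
            else:
                merged = [min(start, merged[0]), max(end, merged[1])]
        kept.append(merged)
        kept.sort()
        self.intervals = kept
        if len(self.intervals) > self.max_entries:
            self._merge_closest_pair()

    def _merge_closest_pair(self):
        # ASSUMED overflow policy (not taken from the paper): fuse the two
        # neighbouring intervals with the smallest gap, so that the fewest
        # non-updated lines are pulled into the dump.
        gaps = [self.intervals[i + 1][0] - self.intervals[i][1]
                for i in range(len(self.intervals) - 1)]
        i = gaps.index(min(gaps))
        self.intervals[i][1] = self.intervals[i + 1][1]
        del self.intervals[i + 1]

    def lines_to_dump(self):
        return [l for start, end in self.intervals
                for l in range(start, end + 1)]
```

As a usage example under these assumptions, suppose lines 3, 4, 5, 40, 41 and 100 of a 128-line cache are updated. A `CoarseBitVector(128, 8)` uses 16 bits and dumps 24 lines. An `IntervalTable(3)` holds the three runs exactly and dumps only the 6 updated lines. An `IntervalTable(2)` must fuse the first two runs into [3, 41] and dumps 40 lines. This mirrors the trade-off stated above: less tracking storage means more non-updated lines in the dump.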

