在硬件事务性内存中使用可重构L1数据缓存进行有效的版本管理

2011 International Conference on Parallel Architectures and Compilation Techniques Pub Date : 2011-10-10 DOI:10.1109/PACT.2011.67

Adrià Armejach, A. Seyedi, Rubén Titos-Gil, I. Hur, Adri´n Cristal, O. Unsal, M. Valero

{"title":"在硬件事务性内存中使用可重构L1数据缓存进行有效的版本管理","authors":"Adrià Armejach, A. Seyedi, Rubén Titos-Gil, I. Hur, Adri´n Cristal, O. Unsal, M. Valero","doi":"10.1109/PACT.2011.67","DOIUrl":null,"url":null,"abstract":"Transactional Memory (TM) potentially simplifies parallel programming by providing atomicity and isolation for executed transactions. One of the key mechanisms to provide such properties is version management, which defines where and how transactional updates (new values) are stored. Version management can be implemented either eagerly or lazily. In Hardware Transactional Memory (HTM) implementations, eager version management puts new values in-place and old values are kept in a software log, while lazy version management stores new values in hardware buffers keeping old values in-place. Current HTM implementations, for both eager and lazy version management schemes, suffer from performance penalties due to the inability to handle two versions of the same logical data efficiently. In this paper, we introduce a reconfigurable L1 data cache architecture that has two execution modes: a 64KB general purpose mode and a 32KB TM mode which is able to manage two versions of the same logical data. The latter allows to handle old and new transactional values within the cache simultaneously when executing transactional workloads. We explain in detail the architectural design and internals of this Reconfigurable Data Cache (RDC), as well as the supported operations that allow to efficiently solve existing version management problems. We describe how the RDC can support both eager and lazy HTM systems, and we present two RDC-HTM designs. Our evaluation shows that the Eager-RDC-HTM and Lazy-RDC-HTM systems achieve 1.36x and 1.18x speedup, respectively, over state-of-the-art proposals. We also evaluate the area and energy effects of our proposal, and we find that RDC designs are 1.92x and 1.38x more energy-delay efficient compared to baseline HTM systems, with less than 0.3% area impact on modern processors.","PeriodicalId":106423,"journal":{"name":"2011 International Conference on Parallel Architectures and Compilation Techniques","volume":"78 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"16","resultStr":"{\"title\":\"Using a Reconfigurable L1 Data Cache for Efficient Version Management in Hardware Transactional Memory\",\"authors\":\"Adrià Armejach, A. Seyedi, Rubén Titos-Gil, I. Hur, Adri´n Cristal, O. Unsal, M. Valero\",\"doi\":\"10.1109/PACT.2011.67\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Transactional Memory (TM) potentially simplifies parallel programming by providing atomicity and isolation for executed transactions. One of the key mechanisms to provide such properties is version management, which defines where and how transactional updates (new values) are stored. Version management can be implemented either eagerly or lazily. In Hardware Transactional Memory (HTM) implementations, eager version management puts new values in-place and old values are kept in a software log, while lazy version management stores new values in hardware buffers keeping old values in-place. Current HTM implementations, for both eager and lazy version management schemes, suffer from performance penalties due to the inability to handle two versions of the same logical data efficiently. In this paper, we introduce a reconfigurable L1 data cache architecture that has two execution modes: a 64KB general purpose mode and a 32KB TM mode which is able to manage two versions of the same logical data. The latter allows to handle old and new transactional values within the cache simultaneously when executing transactional workloads. We explain in detail the architectural design and internals of this Reconfigurable Data Cache (RDC), as well as the supported operations that allow to efficiently solve existing version management problems. We describe how the RDC can support both eager and lazy HTM systems, and we present two RDC-HTM designs. Our evaluation shows that the Eager-RDC-HTM and Lazy-RDC-HTM systems achieve 1.36x and 1.18x speedup, respectively, over state-of-the-art proposals. We also evaluate the area and energy effects of our proposal, and we find that RDC designs are 1.92x and 1.38x more energy-delay efficient compared to baseline HTM systems, with less than 0.3% area impact on modern processors.\",\"PeriodicalId\":106423,\"journal\":{\"name\":\"2011 International Conference on Parallel Architectures and Compilation Techniques\",\"volume\":\"78 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-10-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"16\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2011 International Conference on Parallel Architectures and Compilation Techniques\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/PACT.2011.67\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 International Conference on Parallel Architectures and Compilation Techniques","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PACT.2011.67","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 16

摘要

事务性内存(TM)通过为执行的事务提供原子性和隔离性，可能简化并行编程。提供这些属性的关键机制之一是版本管理，它定义了存储事务更新(新值)的位置和方式。版本管理的实现可以是主动的，也可以是惰性的。在硬件事务性内存(Hardware Transactional Memory, HTM)实现中，即时版本管理将新值保存在适当位置，旧值保存在软件日志中，而延迟版本管理将新值存储在硬件缓冲区中，并保留旧值。由于无法有效地处理相同逻辑数据的两个版本，当前的HTM实现(无论是急于版本管理方案还是惰性版本管理方案)都会遭受性能损失。在本文中，我们介绍了一种可重构的L1数据缓存架构，该架构具有两种执行模式:64KB通用模式和32KB TM模式，后者能够管理相同逻辑数据的两个版本。后者允许在执行事务性工作负载时同时处理缓存中的新旧事务性值。我们详细解释了这个可重构数据缓存(RDC)的架构设计和内部结构，以及支持的操作，这些操作允许有效地解决现有的版本管理问题。我们描述了RDC如何支持渴望和惰性HTM系统，并给出了两种RDC-HTM设计。我们的评估表明，与最先进的方案相比，Eager-RDC-HTM和Lazy-RDC-HTM系统分别实现了1.36倍和1.18倍的加速。我们还评估了我们的建议的面积和能源效应，我们发现RDC设计比基准HTM系统的能源延迟效率高1.92倍和1.38倍，对现代处理器的面积影响小于0.3%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Using a Reconfigurable L1 Data Cache for Efficient Version Management in Hardware Transactional Memory

Transactional Memory (TM) potentially simplifies parallel programming by providing atomicity and isolation for executed transactions. One of the key mechanisms to provide such properties is version management, which defines where and how transactional updates (new values) are stored. Version management can be implemented either eagerly or lazily. In Hardware Transactional Memory (HTM) implementations, eager version management puts new values in-place and old values are kept in a software log, while lazy version management stores new values in hardware buffers keeping old values in-place. Current HTM implementations, for both eager and lazy version management schemes, suffer from performance penalties due to the inability to handle two versions of the same logical data efficiently. In this paper, we introduce a reconfigurable L1 data cache architecture that has two execution modes: a 64KB general purpose mode and a 32KB TM mode which is able to manage two versions of the same logical data. The latter allows to handle old and new transactional values within the cache simultaneously when executing transactional workloads. We explain in detail the architectural design and internals of this Reconfigurable Data Cache (RDC), as well as the supported operations that allow to efficiently solve existing version management problems. We describe how the RDC can support both eager and lazy HTM systems, and we present two RDC-HTM designs. Our evaluation shows that the Eager-RDC-HTM and Lazy-RDC-HTM systems achieve 1.36x and 1.18x speedup, respectively, over state-of-the-art proposals. We also evaluate the area and energy effects of our proposal, and we find that RDC designs are 1.92x and 1.38x more energy-delay efficient compared to baseline HTM systems, with less than 0.3% area impact on modern processors.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2011 International Conference on Parallel Architectures and Compilation Techniques

自引率

0.00%

发文量