Transactions in the Era of Non Volatile Memory and Heterogeneous Memory Architectures

Companion of the 2018 ACM/SPEC International Conference on Performance Engineering. Pub Date: 2021-04-19. DOI: 10.1145/3447545.3451904

P. Romano



Abstract

Transactions are a simple yet powerful abstraction that aims to shield programmers from the complexity of ensuring correct and efficient synchronization of concurrent code. Originally introduced in the domain of database systems, transactions have recently garnered significant interest in the broader domain of concurrent programming, via the Transactional Memory (TM) paradigm. Nowadays, hardware support for TM is provided in commodity CPUs (e.g., by Intel and IBM) and, at the software level, TM has been integrated into mainstream programming languages, such as C/C++ and Java. In this talk I will present the novel challenges and research opportunities that arise in the area of TM due to the emergence of two recent hardware trends, namely Non-Volatile Memory (NVM) and heterogeneous computing architectures. On the NVM front, I will focus on the problem of how to execute transactions over NVM using unmodified commodity hardware TM (HTM) implementations. The reliance of commodity HTM implementations on CPU caches raises a crucial problem when applications access data stored in NVM from within an HTM transaction. Since CPU caches are volatile in today's systems, HTM implementations do not guarantee that the effects of a hardware transaction are atomically transferred to persistent memory (PM) when the transaction commits, even though such effects are immediately visible to subsequent transactions. I will give an overview of some recent approaches to this problem and present experimental results highlighting several bottlenecks that hinder the scalability of existing solutions. Next, I will show how these limitations can be addressed by presenting SPHT. SPHT introduces a novel commit logic that considerably mitigates the scalability bottlenecks of previous alternatives, providing speedups of up to 2.6x on STAMP and 2.2x on TPC-C at 64 threads. Moreover, SPHT introduces a novel approach to log replay that employs cross-transaction log linking and a NUMA-aware parallel background replayer. In large persistent heaps, the proposed approach achieves gains of 2.8x. On the heterogeneous computing front, I will present the abstraction of Heterogeneous Transactional Memory (HeTM). HeTM provides programmers with the illusion of a single memory region, shared among the CPUs and the (discrete) GPU(s) of a heterogeneous system, with support for atomic transactions. Besides introducing the abstract semantics and programming model of HeTM, I will present the design and evaluation of a concrete implementation of the proposed abstraction, which we named Speculative HeTM (SHeTM). SHeTM uses a novel design that leverages speculative techniques to hide the large communication latency between CPUs and discrete GPUs and to minimize inter-device synchronization overhead.
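
The durability gap described above can be illustrated with a toy model. The sketch below is not SPHT's commit logic or any real HTM interface; `ToyMachine`, `htm_commit`, `durable_commit`, and the in-memory "redo log" are hypothetical names chosen for illustration, and a Python dict stands in for the CPU cache and for persistent memory. It only shows why a committed hardware transaction can be visible yet lost after a crash, and how a generic redo log persisted after commit closes that gap.

```python
class ToyMachine:
    """Toy model (an assumption, not real hardware): a volatile cache
    sitting in front of persistent memory (PM)."""

    def __init__(self):
        self.pm = {}      # contents survive a crash
        self.cache = {}   # contents are lost on a crash
        self.pm_log = []  # hypothetical redo log stored in PM

    def read(self, addr):
        # Reads hit the cache first, then fall back to PM (default 0).
        if addr in self.cache:
            return self.cache[addr]
        return self.pm.get(addr, 0)

    def htm_commit(self, writes):
        # Plain hardware commit: writes become visible atomically,
        # but only in the volatile cache.
        self.cache.update(writes)

    def durable_commit(self, writes):
        # Generic software extension: commit in cache, then persist a
        # redo-log record of the same writes outside the transaction.
        self.htm_commit(writes)
        self.pm_log.append(dict(writes))

    def crash(self):
        # Power failure: everything that lived only in cache is gone.
        self.cache.clear()

    def recover(self):
        # Replay the persisted log, in commit order, into PM.
        for record in self.pm_log:
            self.pm.update(record)
        self.pm_log.clear()


def demo():
    m = ToyMachine()

    m.htm_commit({"x": 1})
    visible_before_crash = m.read("x")  # visible to later transactions
    m.crash()
    after_plain_commit = m.read("x")    # update never reached PM

    m.durable_commit({"y": 2})
    m.crash()
    m.recover()
    after_logged_commit = m.read("y")   # restored by log replay

    return visible_before_crash, after_plain_commit, after_logged_commit
```

In this model the log append happens outside the hardware transaction and the log is replayed at recovery time; these are the two places where the abstract locates the scalability bottlenecks that SPHT targets.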
