Rethinking main memory OLTP recovery

2014 IEEE 30th International Conference on Data Engineering. Pub Date: 2014-05-19. DOI: 10.1109/ICDE.2014.6816685

Nirmesh Malviya, Ariel Weisberg, S. Madden, M. Stonebraker

Citations: 135

Abstract

Fine-grained, record-oriented write-ahead logging, as exemplified by systems like ARIES, has been the gold standard for relational database recovery. In this paper, we show that in modern high-throughput transaction processing systems, this is no longer the optimal way to recover a database system. In particular, as transaction throughputs get higher, ARIES-style logging starts to represent a non-trivial fraction of the overall transaction execution time. We propose a lighter-weight, coarse-grained command logging technique which only records the transactions that were executed on the database. It then does recovery by starting from a transactionally consistent checkpoint and replaying the commands in the log as if they were new transactions. By avoiding the overhead of fine-grained logging of before and after images (both CPU complexity as well as substantial associated I/O), command logging can yield significantly higher throughput at run-time. Recovery times for command logging are higher compared to an ARIES-style physiological logging approach, but with the advent of high-availability techniques that can mask the outage of a recovering node, recovery speeds have become secondary in importance to run-time performance for most applications. We evaluated our approach on an implementation of TPC-C in a main memory database system (VoltDB), and found that command logging can offer 1.5x higher throughput than a main-memory optimized implementation of ARIES-style physiological logging.
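
The command logging idea described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's or VoltDB's implementation: the class `CommandLoggedStore`, the procedures `deposit` and `transfer`, and the use of an in-memory list in place of a durable log file are all assumptions made for illustration. The sketch shows the two properties the abstract names: each log record holds only the command (procedure name and parameters) rather than before and after images, and recovery restores a transactionally consistent checkpoint and then replays the logged commands in order. Replay is only correct here because the procedures are deterministic.

```python
import copy
import json

# Hypothetical stored procedures. They must be deterministic so that
# replaying a command reproduces exactly the same state change.
def deposit(db, account, amount):
    db[account] = db.get(account, 0) + amount

def transfer(db, src, dst, amount):
    # Moves funds only if the source balance is sufficient.
    if db.get(src, 0) >= amount:
        db[src] -= amount
        db[dst] = db.get(dst, 0) + amount

PROCEDURES = {"deposit": deposit, "transfer": transfer}

class CommandLoggedStore:
    """Toy in-memory store with command logging (illustrative only)."""

    def __init__(self):
        self.db = {}
        self.log = []               # stands in for a durable append-only log
        self.checkpoint = ({}, 0)   # (snapshot, number of log records covered)

    def execute(self, name, *args):
        # Log the command itself, not the records it modifies.
        self.log.append(json.dumps([name, list(args)]))
        PROCEDURES[name](self.db, *args)

    def take_checkpoint(self):
        # Snapshot taken between transactions, so it is consistent.
        self.checkpoint = (copy.deepcopy(self.db), len(self.log))

    def recover(self):
        # Restore the checkpoint, then replay later commands in log order.
        snapshot, covered = self.checkpoint
        self.db = copy.deepcopy(snapshot)
        for record in self.log[covered:]:
            name, args = json.loads(record)
            PROCEDURES[name](self.db, *args)

store = CommandLoggedStore()
store.execute("deposit", "a", 100)
store.take_checkpoint()
store.execute("transfer", "a", "b", 30)
state_before_crash = dict(store.db)

store.db = {}      # simulate losing main memory in a crash
store.recover()
```

Each log record in this sketch is a short JSON array regardless of how many rows the procedure touches, which reflects the run-time saving the abstract describes; the cost is that recovery must re-execute every transaction after the checkpoint instead of applying stored images.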
