An analysis of database workload performance on simultaneous multithreaded processors

Proceedings. 25th Annual International Symposium on Computer Architecture (Cat. No.98CB36235). Pub Date: 1998-04-16. DOI: 10.1145/279358.279367

J. Lo, L. Barroso, S. Eggers, K. Gharachorloo, H. Levy, S. Parekh


Citations: 254

Abstract

Simultaneous multithreading (SMT) is an architectural technique in which the processor issues multiple instructions from multiple threads each cycle. While SMT has been shown to be effective on scientific workloads, its performance on database systems is still an open question. In particular, database systems have poor cache performance, and the addition of multithreading has the potential to exacerbate cache conflicts. This paper examines database performance on SMT processors using traces of the Oracle database management system. Our research makes three contributions. First, it characterizes the memory-system behavior of database systems running on-line transaction processing and decision support system workloads. Our data show that while DBMS workloads have large memory footprints, there is substantial data reuse in a small, cacheable "critical" working set. Second, we show that the additional data cache conflicts caused by simultaneous-multithreaded instruction scheduling can be nearly eliminated by the proper choice of software-directed policies for virtual-to-physical page mapping and per-process address offsetting. Our results demonstrate that with the best policy choices, D-cache miss rates on an 8-context SMT are roughly equivalent to those on a single-threaded superscalar. Multithreading also leads to better interthread instruction cache sharing, reducing I-cache miss rates by up to 35%. Third, we show that SMT's latency tolerance is highly effective for database applications. For example, using a memory-intensive OLTP workload, an 8-context SMT processor achieves a 3-fold increase in instruction throughput over a single-threaded superscalar with similar resources.
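The abstract's second contribution, reducing interthread data cache conflicts through per-process address offsetting, can be illustrated with a toy model. The sketch below is not the paper's simulator or its actual policy: the cache geometry, the 1 MiB spacing between contexts, the 4 KB hot region, and all function names are assumptions chosen only to show why identically aligned hot regions collide in the same cache sets and how a distinct offset per context spreads them out.

```python
# Toy model of set conflicts among hardware contexts.
# All parameters are illustrative assumptions, not values from the paper.
LINE_SIZE = 64                 # bytes per cache line (assumed)
CACHE_SIZE = 128 * 1024        # total cache bytes (assumed)
ASSOCIATIVITY = 2              # ways per set (assumed)
NUM_SETS = CACHE_SIZE // (LINE_SIZE * ASSOCIATIVITY)  # 1024 sets


def cache_set(addr):
    """Return the set index that a given byte address maps to."""
    return (addr // LINE_SIZE) % NUM_SETS


def sets_touched(base, offset, n_lines):
    """Sets covered by a contiguous hot region of n_lines cache lines."""
    start = base + offset
    return {cache_set(start + i * LINE_SIZE) for i in range(n_lines)}


def max_contexts_per_set(bases, offsets, n_lines):
    """Largest number of contexts whose hot regions share a single set."""
    counts = {}
    for base, offset in zip(bases, offsets):
        for s in sets_touched(base, offset, n_lines):
            counts[s] = counts.get(s, 0) + 1
    return max(counts.values())


CONTEXTS = 8
HOT_LINES = 64                               # 4 KB hot region per context
bases = [i * 2**20 for i in range(CONTEXTS)]  # identically aligned regions

# No offsetting: every context's hot region lands on the same sets.
no_offset = [0] * CONTEXTS
# Hypothetical per-context offset: shift each region by its own size.
with_offset = [i * HOT_LINES * LINE_SIZE for i in range(CONTEXTS)]

worst_plain = max_contexts_per_set(bases, no_offset, HOT_LINES)
worst_offset = max_contexts_per_set(bases, with_offset, HOT_LINES)
print(worst_plain, worst_offset)
```

In this model, when the number of contexts competing for a set exceeds the associativity, their hot lines evict one another. Once each context's region is shifted by its own offset, no two contexts in this model share a set. Real workloads have far less regular access patterns, so this only conveys the intuition behind the policy, not its measured effect.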

