Distilling the Real Cost of Production Garbage Collectors

2022 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). Pub Date: 2021-12-15. DOI: 10.1109/ISPASS55109.2022.00005

Zixian Cai


Citations: 9

Abstract

Despite the long history of garbage collection (GC) and its prevalence in modern programming languages, there is surprisingly little clarity about its true cost. Without understanding their cost, crucial tradeoffs made by garbage collectors (GCs) go unnoticed. This can lead to misguided design constraints and evaluation criteria used by GC researchers and users, hindering the development of high-performance, low-cost GCs. In this paper, we develop a methodology that allows us to empirically estimate the cost of GC for any given set of metrics. This fundamental quantification has eluded the research community, even when using modern, well-established methodologies. By distilling out the explicitly identifiable GC cost, we estimate the intrinsic application execution cost using different GCs. The minimum distilled cost forms a baseline. Subtracting this baseline from the total execution costs, we can then place an empirical lower bound on the absolute costs of different GCs. Using this methodology, we study five production GCs in OpenJDK 17, a high-performance Java runtime. We measure the cost of these collectors, and expose their respective key performance tradeoffs. We find that with a modestly sized heap, production GCs incur substantial overheads across a diverse suite of modern benchmarks, spending at least 7-82% more wall-clock time and 6-92% more CPU cycles relative to the baseline cost. We show that these costs can be masked by concurrency and generous provisioning of memory/compute. In addition, we find that newer low-pause GCs are significantly more expensive than older GCs, and, surprisingly, sometimes deliver worse application latency than stop-the-world GCs. Our findings reaffirm that GC is by no means a solved problem and that a low-cost, low-latency GC remains elusive. We recommend adopting the distillation methodology together with a wider range of cost metrics for future GC evaluations. This will not only help the community more comprehensively understand the performance characteristics of different GCs, but also reveal opportunities for future GC optimizations.
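The distillation arithmetic described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' tooling: the collector labels (`gc_a`, `gc_b`, `gc_c`), the `Measurement` type, the function names, and all numbers are hypothetical, and the overhead is expressed relative to the baseline as the abstract's percentages suggest. The result is a lower bound presumably because any GC cost that cannot be explicitly identified stays folded into the baseline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One benchmark run under one collector (hypothetical structure)."""
    total: float            # total execution cost, e.g. seconds or CPU cycles
    identifiable_gc: float  # portion explicitly attributable to GC


def distilled_cost(m: Measurement) -> float:
    """Total cost with the explicitly identifiable GC cost removed."""
    return m.total - m.identifiable_gc


def lower_bound_overheads(measurements: dict[str, Measurement]) -> dict[str, float]:
    """Overhead of each collector relative to the minimum distilled cost."""
    baseline = min(distilled_cost(m) for m in measurements.values())
    return {
        name: (m.total - baseline) / baseline
        for name, m in measurements.items()
    }


# Made-up numbers for illustration only; they are not results from the paper.
runs = {
    "gc_a": Measurement(total=12.0, identifiable_gc=2.0),  # distilled: 10.0
    "gc_b": Measurement(total=11.0, identifiable_gc=0.5),  # distilled: 10.5
    "gc_c": Measurement(total=15.0, identifiable_gc=3.0),  # distilled: 12.0
}

overheads = lower_bound_overheads(runs)
for name, overhead in overheads.items():
    print(f"{name}: at least {overhead:.0%} over baseline")
```

In this toy data the baseline comes from `gc_a`, whose distilled cost is lowest, yet `gc_a` itself still shows a nonzero overhead because its total cost includes the GC work that was distilled out.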
