Reducing pause times with clustered collection

Proceedings of the 2015 International Symposium on Memory Management. Pub Date: 2015-06-14. DOI: 10.1145/2754169.2754184

Cody Cutler, R. Morris

Citations: 4

Abstract

Each full garbage collection in a program with millions of objects can pause the program for multiple seconds. Much of this work is typically repeated, as the collector re-traces parts of the object graph that have not changed since the last collection. Clustered Collection reduces full collection pause times by eliminating much of this repeated work. Clustered Collection identifies clusters: regions of the object graph that are reachable from a single "head" object, so that reachability of the head implies reachability of the whole cluster. As long as it is not written, a cluster need not be re-traced by successive full collections. The main design challenge is coping with program writes to clusters while ensuring safe, complete, and fast collections. In some cases program writes require clusters to be dissolved, but in most cases Clustered Collection can handle writes without having to re-trace the affected cluster. Clustered Collection chooses clusters likely to suffer few writes and to yield high savings from re-trace avoidance. Clustered Collection is implemented as modifications to the Racket collector. Measurements of the code and data from the Hacker News web site (which suffers from significant garbage collection pauses) and a Twitter-like application show that Clustered Collection decreases full collection pause times by a factor of three and six respectively. This improvement is possible because both applications have gigabytes of live data, modify only a small fraction of it, and usually write in ways that do not result in cluster dissolution. Identifying clusters takes more time than a full collection, but happens much less frequently than full collection.
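
The abstract gives no implementation details, so the following is only a toy Python sketch of the central idea: a mark phase that, on reaching the head of an unwritten cluster, marks every member without re-scanning its references. All names here (`Obj`, `Cluster`, `form_cluster`, `write_refs`, `mark`) are hypothetical and are not taken from the paper or from the Racket collector. The sketch also makes two simplifying assumptions: a cluster contains everything reachable from its head, and any write to a member dissolves the cluster. The abstract states that the real system handles most writes without dissolution or re-tracing, which this sketch does not attempt.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False keeps identity-based hashing so objects can go in sets
class Obj:
    name: str
    refs: list[Obj] = field(default_factory=list)
    cluster: Cluster | None = None


@dataclass(eq=False)
class Cluster:
    head: Obj
    members: list[Obj]  # every object reachable from head when the cluster was formed
    intact: bool = True


def form_cluster(head: Obj) -> Cluster:
    """Group everything reachable from head (assumed not yet in any cluster)."""
    members, seen, stack = [], set(), [head]
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        members.append(obj)
        stack.extend(obj.refs)
    cluster = Cluster(head, members)
    for m in members:
        m.cluster = cluster
    return cluster


def write_refs(obj: Obj, new_refs: list[Obj]) -> None:
    """Write barrier (simplified assumption): any write dissolves the cluster."""
    if obj.cluster is not None:
        obj.cluster.intact = False
        for m in obj.cluster.members:
            m.cluster = None
    obj.refs = new_refs


def mark(roots: list[Obj]) -> tuple[set[Obj], int]:
    """Return the reachable set and how many objects had their refs scanned."""
    marked, stack, scanned = set(), list(roots), 0
    while stack:
        obj = stack.pop()
        if obj in marked:
            continue
        c = obj.cluster
        if c is not None and c.intact and obj is c.head:
            marked.update(c.members)  # head reachable => whole cluster reachable
            continue
        marked.add(obj)
        scanned += 1
        stack.extend(obj.refs)
    return marked, scanned


if __name__ == "__main__":
    root, head, a, b = Obj("root"), Obj("head"), Obj("a"), Obj("b")
    head.refs, a.refs, root.refs = [a, b], [b], [head]
    form_cluster(head)
    marked, scanned = mark([root])
    print(len(marked), scanned)  # 4 1: only root was scanned
    write_refs(a, [])            # a write inside the cluster dissolves it
    marked, scanned = mark([root])
    print(len(marked), scanned)  # 4 4: every object is traced again
```

In this toy model the saving shows up as the `scanned` count: while the cluster is intact, a full mark scans only the objects outside it. Dissolving on every write keeps the sketch safe at the cost of the savings that the abstract attributes to handling writes more selectively.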
