GSC: Greedy shard caching algorithm for improved I/O efficiency in GraphChi
Authors: Dagang Li, Zehua Zheng
Published in: 2017 IEEE 25th International Conference on Network Protocols (ICNP), pp. 1-6
Publication date: 2017-10-01
DOI: 10.1109/ICNP.2017.8117588 (https://doi.org/10.1109/ICNP.2017.8117588)
Citation count: 0
Abstract
Disk-based large-scale graph computation on a single machine has attracted much attention, with GraphChi among the most widely accepted solutions. However, we find that GraphChi's performance becomes I/O-constrained when memory is moderately abundant, and beyond some point adding more memory no longer improves performance. In this work, we propose GSC, a greedy caching algorithm that lets GraphChi make better use of memory. It alleviates the I/O constraint by caching GraphChi shards that have already been loaded into memory and delaying their write-backs. Experimental results show that, by minimizing unnecessary I/O, GSC can be up to 4x faster during computation than standard GraphChi under memory constraints, and achieves about a 3x performance gain when sufficient memory is available.
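The idea of caching loaded shards and delaying their write-backs can be illustrated with a toy I/O counter. This is only a sketch under stated assumptions, not the GSC algorithm as published: the class `ShardCache`, the function `run`, the byte budget, and the admission rule (keep a shard if it still fits in the remaining budget, otherwise bypass the cache) are all hypothetical, and the model ignores how GraphChi actually processes intervals and shards.

```python
class ShardCache:
    """Toy write-back cache for shards that counts simulated disk I/Os.

    Hypothetical model: the admission policy is an assumption made for
    illustration, not the policy described in the paper.
    """

    def __init__(self, budget_bytes, shard_sizes):
        self.budget = budget_bytes
        self.sizes = dict(shard_sizes)  # shard id -> size in bytes
        self.cached = {}                # shard id -> dirty flag
        self.used = 0
        self.reads = 0
        self.writes = 0

    def load(self, shard_id):
        """Return True on a cache hit; otherwise count one disk read."""
        if shard_id in self.cached:
            return True
        self.reads += 1
        # Greedy admission: keep the shard only if it fits in what is left.
        if self.used + self.sizes[shard_id] <= self.budget:
            self.cached[shard_id] = False
            self.used += self.sizes[shard_id]
        return False

    def store(self, shard_id):
        """Record a modification; cached shards defer their write-back."""
        if shard_id in self.cached:
            self.cached[shard_id] = True
        else:
            self.writes += 1  # uncached shard is written back immediately

    def flush(self):
        """Write back every dirty cached shard exactly once."""
        for shard_id, dirty in self.cached.items():
            if dirty:
                self.writes += 1
                self.cached[shard_id] = False


def run(budget_bytes, shard_sizes, iterations):
    """Process every shard once per iteration; return (reads, writes)."""
    cache = ShardCache(budget_bytes, shard_sizes)
    for _ in range(iterations):
        for shard_id in shard_sizes:
            cache.load(shard_id)
            cache.store(shard_id)
    cache.flush()
    return cache.reads, cache.writes


if __name__ == "__main__":
    sizes = {0: 10, 1: 10, 2: 10, 3: 10}
    for budget in (0, 20, 40):
        print(budget, run(budget, sizes, iterations=5))
```

In this toy model, four equal-sized shards processed for five iterations cost 20 reads and 20 writes with no cache, and 4 reads and 4 writes when all four shards fit in the budget. These counts describe the sketch only and say nothing about the speedups reported in the abstract.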