Cache-Friendly Data Layout for Massive Graph
Yuxiang Shan, Zhan Shi, D. Feng, Ouyang Mengyun, F. Wang
2018 IEEE International Conference on Networking, Architecture and Storage (NAS), October 2018
DOI: 10.1109/NAS.2018.8515737
Storage hierarchies are widely used to economically mitigate the vast performance gap between different storage components, and caches play an important role in improving the efficiency of memory access. However, the in-memory data organization of traditional graph computing frameworks is not well optimized for the various caches, especially the CPU cache, since classical caches are ineffective for the irregular access patterns of graph applications. This work presents a cache-friendly graph data layout strategy to improve the efficiency of graph processing. By considering both the cache-line parameters and the pattern of access to adjacency lists, we sort the edges to generate a sequential layout, and use a BFS (Breadth-First Search) traversal to reorder the vertices for improved locality, thereby benefiting the CPU cache without altering the code of graph processing toolkits. The efficiency improvement of the Sort layout ranges from 11% to 78.76% on BGL and SNAP with CC (Connected Components [1]). The BFS layout benefits three classical algorithms on GraphChi: CC, TC (Triangle Counting), and PageRank [2], with improvements of 15%–20%.
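The core idea in the abstract — relabeling vertices in BFS visit order so that neighbors receive nearby IDs, then sorting the edge list into a sequential layout — can be sketched roughly as follows. This is an illustrative sketch only, not the paper's implementation: the function name, the undirected-graph assumption, and the handling of unreachable vertices are all choices made here for clarity.

```python
from collections import deque, defaultdict

def bfs_reorder(num_vertices, edges, root=0):
    """Relabel vertices in BFS visit order so that topological
    neighbors tend to get nearby IDs (better cache locality),
    then sort the relabeled edges into a sequential layout."""
    # Build an undirected adjacency list (an assumption for this sketch).
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Standard BFS: assign new IDs in the order vertices are discovered.
    new_id = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in new_id:
                new_id[v] = len(new_id)
                queue.append(v)

    # Vertices unreachable from the root get fresh IDs at the end.
    for v in range(num_vertices):
        if v not in new_id:
            new_id[v] = len(new_id)

    # Relabel and sort edges so each vertex's adjacency list is
    # stored contiguously, matching the sequential access pattern.
    relabeled = sorted((new_id[u], new_id[v]) for u, v in edges)
    return new_id, relabeled
```

On a path-like graph such as `[(0, 3), (3, 1), (1, 2)]`, the reordering maps 3 to ID 1 (it is discovered first from the root), so edges that were far apart in ID space become adjacent after relabeling.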