Yun Gao, W. Zhou, Jizhong Han, Dan Meng, Zhang Zhang, Zhiyong Xu
Proceedings of the 12th ACM International Conference on Computing Frontiers, 2015-05-06
DOI: 10.1145/2742854.2742884 (https://doi.org/10.1145/2742854.2742884)
An evaluation and analysis of graph processing frameworks on five key issues
With applications continuously emerging in fields such as social media analysis, mining massive graphs has drawn increasing attention from industry and academia. To aid the development of distributed graph algorithms, various programming frameworks have been proposed. To better understand their performance differences under specific scenarios, we analyzed and compared seven representative frameworks along five design aspects: distribution policy, on-disk data organization, programming model, synchronization policy, and message model. Our experiments reveal some interesting phenomena. For example, we observed that the vertex-cut method outperforms the edge-cut method on neighbor-based algorithms but leads to inefficiency on non-neighbor-based algorithms. Furthermore, we observed that asynchronous updates can reduce the total workload by 20% to 30%, yet the processing time may still double because of fine-grained lock conflicts. Overall, we analyzed the pros and cons of each option for the five key issues. We believe our findings will help end users choose a suitable framework and help designers improve current ones.
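
To make the two distribution policies named in the abstract concrete, the following is a minimal sketch and not the setup used in the paper. It assumes a toy star graph, modulo placement of vertices for edge-cut, and round-robin placement of edges for vertex-cut; the function names `count_cut_edges` and `replication_factor` are hypothetical. Edge-cut pays for every edge that crosses partitions, while vertex-cut pays for every extra replica of a vertex.

```python
from collections import defaultdict


def count_cut_edges(edges, num_parts):
    """Edge-cut: every vertex lives on exactly one partition.

    An edge is "cut" when its endpoints land on different partitions,
    so traversing it needs a remote message.
    Placement rule (an assumption for this sketch): vertex id modulo num_parts.
    """
    def owner(vertex):
        return vertex % num_parts

    return sum(1 for u, v in edges if owner(u) != owner(v))


def replication_factor(edges, num_parts):
    """Vertex-cut: every edge lives on exactly one partition.

    A vertex is replicated on each partition that holds one of its edges.
    Returns the average number of replicas per vertex.
    Placement rule (an assumption for this sketch): round-robin over edges.
    """
    replicas = defaultdict(set)
    for index, (u, v) in enumerate(edges):
        part = index % num_parts
        replicas[u].add(part)
        replicas[v].add(part)
    total_replicas = sum(len(parts) for parts in replicas.values())
    return total_replicas / len(replicas)


# Toy input: a star graph with hub 0 and six leaves, split over two partitions.
star = [(0, leaf) for leaf in range(1, 7)]

if __name__ == "__main__":
    print("cut edges under edge-cut:", count_cut_edges(star, 2))
    print("replication factor under vertex-cut:", replication_factor(star, 2))
```

On a high-degree hub like this one, edge-cut sends traffic across many cut edges, whereas vertex-cut only replicates the hub itself. That is one intuition for why vertex-cut can suit neighbor-based algorithms, though the measured results in the paper depend on the real frameworks and workloads.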