Zhiyuan Lin, Minsuk Kahng, Kaeser Md Sabrin, Duen Horng Polo Chau, Ho Lee, U Kang
{"title":"MMap:通过内存映射在PC上的快速十亿尺度图形计算。","authors":"Zhiyuan Lin, Minsuk Kahng, Kaeser Md Sabrin, Duen Horng Polo Chau, Ho Lee, U Kang","doi":"10.1109/BigData.2014.7004226","DOIUrl":null,"url":null,"abstract":"<p><p>Graph computation approaches such as GraphChi and TurboGraph recently demonstrated that a single PC can perform efficient computation on billion-node graphs. To achieve high speed and scalability, they often need sophisticated data structures and memory management strategies. We propose a minimalist approach that forgoes such requirements, by leveraging the fundamental <i>memory mapping</i> (MMap) capability found on operating systems. We contribute: (1) a new insight that MMap is a viable technique for creating fast and scalable graph algorithms that surpasses some of the best techniques; (2) the design and implementation of popular graph algorithms for billion-scale graphs with little code, thanks to memory mapping; (3) extensive experiments on real graphs, including the 6.6 billion edge YahooWeb graph, and show that this new approach is significantly faster or comparable to the highly-optimized methods (e.g., 9.5× faster than GraphChi for computing PageRank on 1.47B edge Twitter graph). We believe our work provides a new direction in the design and development of scalable algorithms. Our packaged code is available at http://poloclub.gatech.edu/mmap/.</p>","PeriodicalId":74501,"journal":{"name":"Proceedings : ... IEEE International Conference on Big Data. IEEE International Conference on Big Data","volume":"2014 ","pages":"159-164"},"PeriodicalIF":0.0000,"publicationDate":"2014-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/BigData.2014.7004226","citationCount":"50","resultStr":"{\"title\":\"MMap: Fast Billion-Scale Graph Computation on a PC via Memory Mapping.\",\"authors\":\"Zhiyuan Lin, Minsuk Kahng, Kaeser Md Sabrin, Duen Horng Polo Chau, Ho Lee, U Kang\",\"doi\":\"10.1109/BigData.2014.7004226\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Graph computation approaches such as GraphChi and TurboGraph recently demonstrated that a single PC can perform efficient computation on billion-node graphs. To achieve high speed and scalability, they often need sophisticated data structures and memory management strategies. We propose a minimalist approach that forgoes such requirements, by leveraging the fundamental <i>memory mapping</i> (MMap) capability found on operating systems. We contribute: (1) a new insight that MMap is a viable technique for creating fast and scalable graph algorithms that surpasses some of the best techniques; (2) the design and implementation of popular graph algorithms for billion-scale graphs with little code, thanks to memory mapping; (3) extensive experiments on real graphs, including the 6.6 billion edge YahooWeb graph, and show that this new approach is significantly faster or comparable to the highly-optimized methods (e.g., 9.5× faster than GraphChi for computing PageRank on 1.47B edge Twitter graph). We believe our work provides a new direction in the design and development of scalable algorithms. Our packaged code is available at http://poloclub.gatech.edu/mmap/.</p>\",\"PeriodicalId\":74501,\"journal\":{\"name\":\"Proceedings : ... IEEE International Conference on Big Data. IEEE International Conference on Big Data\",\"volume\":\"2014 \",\"pages\":\"159-164\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1109/BigData.2014.7004226\",\"citationCount\":\"50\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings : ... IEEE International Conference on Big Data. IEEE International Conference on Big Data\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/BigData.2014.7004226\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings : ... IEEE International Conference on Big Data. IEEE International Conference on Big Data","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/BigData.2014.7004226","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
MMap: Fast Billion-Scale Graph Computation on a PC via Memory Mapping.
Graph computation approaches such as GraphChi and TurboGraph recently demonstrated that a single PC can perform efficient computation on billion-node graphs. To achieve high speed and scalability, they often need sophisticated data structures and memory management strategies. We propose a minimalist approach that forgoes such requirements, by leveraging the fundamental memory mapping (MMap) capability found on operating systems. We contribute: (1) a new insight that MMap is a viable technique for creating fast and scalable graph algorithms that surpasses some of the best techniques; (2) the design and implementation of popular graph algorithms for billion-scale graphs with little code, thanks to memory mapping; (3) extensive experiments on real graphs, including the 6.6 billion edge YahooWeb graph, and show that this new approach is significantly faster or comparable to the highly-optimized methods (e.g., 9.5× faster than GraphChi for computing PageRank on 1.47B edge Twitter graph). We believe our work provides a new direction in the design and development of scalable algorithms. Our packaged code is available at http://poloclub.gatech.edu/mmap/.