{"title":"PIM-tree: A Skew-resistant Index for Processing-in-Memory (Abstract)","authors":"H. Kang, Yiwei Zhao, G. Blelloch, Laxman Dhulipala, Yan Gu, Charles McGuffey, Phillip B. Gibbons","doi":"10.1145/3597635.3598029","DOIUrl":"https://doi.org/10.1145/3597635.3598029","url":null,"abstract":"Processing-in-memory (PIM) is an emerging technology to alleviate the high cost of data movement by pushing computation into/near memory modules. There is an inherent tension, however, between minimizing communication (data movement) and achieving load balance in PIM systems in the presence of workload skew. This work introduces PIM-tree, a PIM-based index that simultaneously achieves low communication, good load balance, and low space consumption. It achieves good theoretical bounds in the PIM Model and is efficient on a real-world PIM machine, outperforming prior PIM-based and state-of-the-art CPU-based indexes.","PeriodicalId":185981,"journal":{"name":"Proceedings of the 2023 ACM Workshop on Highlights of Parallel Computing","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123054316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Theoretically and Practically Efficient Parallel Nucleus Decomposition (Abstract)","authors":"Jessica Shi, Laxman Dhulipala, Julian Shun","doi":"10.1145/3597635.3598024","DOIUrl":"https://doi.org/10.1145/3597635.3598024","url":null,"abstract":"Discovering dense substructures in graphs is a fundamental topic in graph mining, and has been studied across many areas including computational biology, spam and fraud detection, and large-scale network analysis. Recently, Sariyuce et al. introduced the nucleus decomposition problem, which generalizes the influential notions of k-cores and k-trusses to k-(r,s) nuclei, and can better capture higher-order structures. Informally, a k-(r,s) nucleus is the maximal induced subgraph such that every r-clique in the subgraph is contained in at least k s-cliques. The goal of the (r, s) nucleus decomposition problem is to identify, for each r-clique in the graph, the largest k such that it is in a k-(r,s) nucleus. Solving the (r, s) nucleus decomposition problem is a significant computational challenge for several reasons. First, simply counting and enumerating s-cliques is a challenging task, even for modest s. Second, storing information for all r-cliques can require a large amount of space, even for relatively small graphs. Third, engineering fast and high-performance solutions to this problem requires taking advantage of parallelism due to the computationally intensive nature of listing cliques. There are two well-known parallel paradigms for approaching the (r, s) nucleus decomposition problem, a global peeling-based model and a local update model that iterates until convergence. The former is inherently challenging to parallelize due to sequential dependencies and necessary synchronization steps, which we address in this paper, and we demonstrate that the latter requires orders of magnitude more work to converge to the same solution and is thus less performant.","PeriodicalId":185981,"journal":{"name":"Proceedings of the 2023 ACM Workshop on Highlights of Parallel Computing","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125933749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Faster Parallel Exact Density Peaks Clustering (Abstract)","authors":"Yihao Huang, Shangdi Yu, Julian Shun","doi":"10.1145/3597635.3598021","DOIUrl":"https://doi.org/10.1145/3597635.3598021","url":null,"abstract":"Clustering multidimensional points is a fundamental data mining task, with applications in many fields, such as astronomy, neuroscience, bioinformatics, and computer vision. The goal of clustering algorithms is to group similar objects together. Density-based clustering is a clustering approach that defines clusters as dense regions of points. It has the advantage of being able to detect clusters of arbitrary shapes, rendering it useful in many applications. In this paper, we propose fast parallel algorithms for Density Peaks Clustering (DPC), a popular version of density-based clustering. Existing exact DPC algorithms suffer from low parallelism both in theory and in practice, which limits their application to large-scale data sets. Our most performant algorithm, which is based on priority search kd-trees, achieves O(log n log log n) span (parallel time complexity) for a data set of n points. Our algorithm is also work-efficient, achieving a work complexity matching the best existing sequential exact DPC algorithm. In addition, we present another DPC algorithm based on a Fenwick tree that makes fewer assumptions for its average-case complexity to hold. We provide optimized implementations of our algorithms and evaluate their performance via extensive experiments. On a 30-core machine with two-way hyperthreading, we find that our best algorithm achieves a 10.8-13169x speedup over the previous best parallel exact DPC algorithm. Compared to the state-of-the-art parallel approximate DPC algorithm, our best algorithm achieves a 1.5-4206x speedup, while being exact.","PeriodicalId":185981,"journal":{"name":"Proceedings of the 2023 ACM Workshop on Highlights of Parallel Computing","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130818900","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proceedings of the 2023 ACM Workshop on Highlights of Parallel Computing","authors":"","doi":"10.1145/3597635","DOIUrl":"https://doi.org/10.1145/3597635","url":null,"abstract":"","PeriodicalId":185981,"journal":{"name":"Proceedings of the 2023 ACM Workshop on Highlights of Parallel Computing","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126571745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}