A Deep Dive Into Understanding The Random Walk-Based Temporal Graph Learning

2021 IEEE International Symposium on Workload Characterization (IISWC). Pub Date: 2021-11-01. DOI: 10.5281/ZENODO.5555384

Nishil Talati, Di Jin, Haojie Ye, Ajay Brahmakshatriya, Ganesh S. Dasika, S. Amarasinghe, T. Mudge, Danai Koutra, R. Dreslinski


Citations: 7

Abstract

Machine learning on graph data has gained significant interest because of its applicability to domains ranging from product recommendations to drug discovery. While the algorithmic community is growing rapidly, the computer architecture community has so far focused on a subset of graph learning algorithms, including Graph Convolution Network (GCN) and a few others. In this paper, we study another, more scalable graph learning algorithm based on random walks, which operates on dynamic input graphs and has attracted less attention in the architecture community than GCN. We propose high-performance CPU and GPU implementations of two important graph learning tasks that cover a broad class of applications, link prediction and node classification, using random walks on continuous-time dynamic graphs. We show that the resulting workload exhibits distinct characteristics, measured in terms of irregularity, core and memory utilization, and cache hit rates, compared to graph traversals, deep learning, and GCN. We further conduct an in-depth performance analysis focused on both algorithm and hardware to guide future software optimization and architecture exploration. The algorithm-focused study presents a rich trade-off space between algorithmic performance and runtime complexity to identify optimization opportunities. We find an optimal hyperparameter setting that strikes a balance in this trade-off space. Using this setting, we also perform a detailed microarchitectural characterization to analyze the hardware behavior of these applications and uncover execution bottlenecks, which include high cache miss rates and dependency-related stalls. The outcome of our study includes recommendations for further performance optimization and open-source implementations for future investigation.
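The abstract does not spell out the walk procedure, so the following is only a minimal sketch of what a random walk on a continuous-time dynamic graph can look like, not the paper's CPU or GPU implementation. It assumes edges are given as hypothetical `(src, dst, timestamp)` tuples, that a walk may only follow edges with strictly increasing timestamps, and that the next edge is sampled uniformly among the valid ones. The function names `build_temporal_adjacency` and `temporal_random_walk` are illustrative, not taken from the paper.

```python
import bisect
import random
from collections import defaultdict


def build_temporal_adjacency(edges):
    """Group (src, dst, timestamp) edges by source, sorted by timestamp.

    Returns a dict mapping each source node to two parallel lists:
    sorted timestamps and the matching destination nodes.
    """
    grouped = defaultdict(list)
    for src, dst, t in edges:
        grouped[src].append((t, dst))
    adj = {}
    for src, pairs in grouped.items():
        pairs.sort(key=lambda p: p[0])  # sort by timestamp only
        adj[src] = ([t for t, _ in pairs], [d for _, d in pairs])
    return adj


def temporal_random_walk(adj, start, start_time, max_len, rng):
    """Sample one walk whose edge timestamps strictly increase.

    Assumption: uniform sampling among time-respecting out-edges.
    The walk stops early when no later edge leaves the current node.
    """
    walk = [start]
    node, now = start, start_time
    while len(walk) < max_len:
        times, dsts = adj.get(node, ([], []))
        # Index of the first edge with timestamp strictly greater than `now`.
        lo = bisect.bisect_right(times, now)
        if lo == len(times):
            break  # no time-respecting edge to follow
        i = rng.randrange(lo, len(times))
        now, node = times[i], dsts[i]
        walk.append(node)
    return walk


if __name__ == "__main__":
    edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 3.0), (1, 0, 0.5)]
    adj = build_temporal_adjacency(edges)
    print(temporal_random_walk(adj, 0, 0.0, 10, random.Random(0)))
```

In this toy graph the edge `(1, 0, 0.5)` is never taken after arriving at node 1 at time 1.0, because its timestamp lies in the past. The neighbor lookup is a binary search followed by a data-dependent jump to the sampled neighbor, which gives a rough intuition for the irregular memory access pattern the abstract refers to.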

