Hongtao Zhong, Yu Zhu, Longfei Luo, Taixin Li, Chen Wang, Yixin Xu, Tian Wang, Yao Yu, N. Vijaykrishnan, Yongpan Liu, Liang Shi, Huazhong Yang, Xueqing Li
2023 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), published 2023-06-20
DOI: 10.1109/ISVLSI59464.2023.10238622 (https://doi.org/10.1109/ISVLSI59464.2023.10238622)
Fe-GCN: A 3D FeFET Memory Based PIM Accelerator for Graph Convolutional Networks
Graph convolutional networks (GCNs) have emerged as a powerful model for many graph-related tasks. In conventional von Neumann architectures, massive data movement and irregular memory access during GCN computation severely degrade performance and computational efficiency. Processing-in-memory (PIM) is a promising approach to GCN acceleration because it reduces data movement. However, as GCN workloads grow, existing 2D PIM GCN accelerators struggle to store all the necessary data on chip because of their limited PIM memory capacity, which leads to unwanted external memory accesses and degrades performance and energy efficiency. This paper presents Fe-GCN, a 3D PIM GCN accelerator with high memory density based on ferroelectric field-effect transistor (FeFET) memory. To mitigate the increased latency of the 3D memory structure, several software-hardware co-optimizations are proposed. In addition, an edge merging technique is proposed to increase memory utilization for 3D GCN mapping and computing. Experimental results show that Fe-GCN achieves, on average, 2,647x, 58x, 18x, and 35x speedup and 26,708x, 1,246x, 25x, and 57x energy efficiency improvement over a CPU, a GPU, and state-of-the-art RRAM PIM-based and ASIC accelerators, respectively.
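To make the "irregular memory access" in GCN computation concrete, the following is a minimal sketch of a single GCN layer in pure Python. It assumes the widely used symmetric-normalized formulation with self-loops, ReLU(D^-1/2 (A + I) D^-1/2 X W); the abstract does not state which formulation Fe-GCN targets, and the function name `gcn_layer` and its argument layout are hypothetical, chosen only for illustration. It says nothing about how Fe-GCN maps these steps onto 3D FeFET memory.

```python
from math import sqrt

def gcn_layer(num_nodes, edges, features, weight):
    """One GCN layer, ReLU(D^-1/2 (A + I) D^-1/2 X W), as an illustrative sketch.

    edges    -- list of undirected (u, v) pairs
    features -- num_nodes x in_dim list of lists (X)
    weight   -- in_dim x out_dim list of lists (W)
    """
    # Build neighbour sets with a self-loop on every node (A + I).
    neighbors = [{v} for v in range(num_nodes)]
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    degree = [len(n) for n in neighbors]

    in_dim, out_dim = len(weight), len(weight[0])

    # Combination: dense X * W. Access is regular (row by row).
    xw = [[sum(features[v][k] * weight[k][j] for k in range(in_dim))
           for j in range(out_dim)]
          for v in range(num_nodes)]

    # Aggregation: each node gathers rows of XW from its neighbours.
    # Which rows are read depends on the graph, so access is irregular.
    out = []
    for v in range(num_nodes):
        row = [0.0] * out_dim
        for u in sorted(neighbors[v]):
            coeff = 1.0 / sqrt(degree[v] * degree[u])
            for j in range(out_dim):
                row[j] += coeff * xw[u][j]
        out.append([max(0.0, x) for x in row])  # ReLU
    return out

# Tiny example: nodes 0 and 1 are connected, node 2 is isolated.
result = gcn_layer(3, [(0, 1)], [[1.0], [3.0], [-5.0]], [[1.0]])
print(result)
```

In this sketch the aggregation loop touches a different, graph-dependent set of rows for every node, which is the access pattern that penalizes cache-based von Neumann machines and motivates keeping the data inside the memory arrays.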