XCelHD: An Efficient GPU-Powered Hyperdimensional Computing with Parallelized Training

Jaeyoung Kang, Behnam Khaleghi, Yeseong Kim, T. Simunic

2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC), 17 January 2022. DOI: 10.1109/ASP-DAC52403.2022.9712549
Hyperdimensional Computing (HDC) is an emerging lightweight machine learning alternative to deep learning. One of its key strengths is its amenability to hardware acceleration, as it offers massive parallelism. Prior work has focused primarily on FPGA and ASIC designs, which do not provide the seamless flexibility required for HDC applications. The few studies that have attempted GPU designs are inefficient, partly because HDC's bit-level operations are difficult to accelerate on GPUs. In addition, HDC training has exhibited low hardware utilization due to its sequential operations. In this paper, we present XCelHD, a high-performance GPU-powered framework for HDC. XCelHD uses a novel training method to maximize the training speed of the HDC model while fully utilizing the hardware. We propose memory optimization strategies specialized for GPU-based HDC that minimize access time to the different memory subsystems and eliminate redundant operations. We show that the proposed training method reduces the number of training epochs required to achieve comparable accuracy by four-fold. Our evaluation results on the NVIDIA Jetson TX2 show that XCelHD is up to 35× faster than the state-of-the-art TensorFlow-based HDC implementation.
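To make the "bit-level operations" concrete: binary HDC typically binds hypervectors with elementwise XOR and classifies by Hamming distance, operations that map poorly onto GPUs unless the bits are packed into machine words and compared with popcount instructions. The CUDA sketch below illustrates this packed-word style; it is a minimal illustration under assumptions of our own (the dimensionality, kernel names, and a fixed 256-thread block are all hypothetical), not XCelHD's actual implementation.

```cuda
// Minimal sketch of packed bit-level HDC primitives on a GPU.
// Illustrative only -- the dimension, names, and launch shape are
// assumptions, not XCelHD's implementation.
#include <cstdint>

constexpr int D     = 10000;          // hypervector dimensionality (assumed)
constexpr int WORDS = (D + 31) / 32;  // bits packed into 32-bit words

// Binding: elementwise XOR of two packed binary hypervectors.
__global__ void bind(const uint32_t* a, const uint32_t* b, uint32_t* out) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < WORDS) out[i] = a[i] ^ b[i];
}

// Similarity: Hamming distance between a query and one class hypervector.
// Launch with one 256-thread block per class; __popc counts differing bits.
__global__ void hamming(const uint32_t* query, const uint32_t* classes,
                        int* dist) {
    __shared__ int partial[256];
    const uint32_t* cls = classes + blockIdx.x * WORDS;
    int sum = 0;
    for (int w = threadIdx.x; w < WORDS; w += blockDim.x)
        sum += __popc(query[w] ^ cls[w]);
    partial[threadIdx.x] = sum;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {  // tree reduction
        if (threadIdx.x < s) partial[threadIdx.x] += partial[threadIdx.x + s];
        __syncthreads();
    }
    if (threadIdx.x == 0) dist[blockIdx.x] = partial[0];
}
```

Packing 32 hypervector bits per word lets a single popcount replace 32 scalar comparisons, which is one reason an efficient GPU design for HDC hinges on memory layout as much as on raw parallelism.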