XCelHD:具有并行训练的高效gpu驱动的超维计算

Jaeyoung Kang, Behnam Khaleghi, Yeseong Kim, T. Simunic
{"title":"XCelHD:具有并行训练的高效gpu驱动的超维计算","authors":"Jaeyoung Kang, Behnam Khaleghi, Yeseong Kim, T. Simunic","doi":"10.1109/ASP-DAC52403.2022.9712549","DOIUrl":null,"url":null,"abstract":"Hyperdimensional Computing (HDC) is an emerging lightweight machine learning method alternative to deep learning. One of its key strengths is the ability to accelerate it in hardware, as it offers massive parallelisms. Prior work primarily focused on FPGA and ASIC, which do not provide the seamless flexibility required for HDC applications. Few studies that attempted GPU designs are inefficient, partly due to the complexity of accelerating HDC on GPUs because of the bit-level operations of HDC. Besides, HDC training exhibited low hardware utilization due to sequential operations. In this paper, we present XCelHD, a high-performance GPU-powered framework for HDC. XCelHD uses a novel training method to maximize the training speed of the HDC model while fully utilizing hardware. We propose memory optimization strategies specialized for GPU-based HDC, minimizing the access time to different memory subsystems and redundant operations. We show that the proposed training method reduces the required number of training epochs by four-fold to achieve comparable accuracy. Our evaluation results on NVIDIA Jetson TX2 show that XCelHD is up to $35\\times$ faster than the state-of-the-art TensorFlow-based HDC implementation.","PeriodicalId":239260,"journal":{"name":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":"{\"title\":\"XCelHD: An Efficient GPU-Powered Hyperdimensional Computing with Parallelized Training\",\"authors\":\"Jaeyoung Kang, Behnam Khaleghi, Yeseong Kim, T. Simunic\",\"doi\":\"10.1109/ASP-DAC52403.2022.9712549\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Hyperdimensional Computing (HDC) is an emerging lightweight machine learning method alternative to deep learning. One of its key strengths is the ability to accelerate it in hardware, as it offers massive parallelisms. Prior work primarily focused on FPGA and ASIC, which do not provide the seamless flexibility required for HDC applications. Few studies that attempted GPU designs are inefficient, partly due to the complexity of accelerating HDC on GPUs because of the bit-level operations of HDC. Besides, HDC training exhibited low hardware utilization due to sequential operations. In this paper, we present XCelHD, a high-performance GPU-powered framework for HDC. XCelHD uses a novel training method to maximize the training speed of the HDC model while fully utilizing hardware. We propose memory optimization strategies specialized for GPU-based HDC, minimizing the access time to different memory subsystems and redundant operations. We show that the proposed training method reduces the required number of training epochs by four-fold to achieve comparable accuracy. Our evaluation results on NVIDIA Jetson TX2 show that XCelHD is up to $35\\\\times$ faster than the state-of-the-art TensorFlow-based HDC implementation.\",\"PeriodicalId\":239260,\"journal\":{\"name\":\"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)\",\"volume\":\"29 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-01-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"12\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ASP-DAC52403.2022.9712549\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASP-DAC52403.2022.9712549","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 12

摘要

HDC (Hyperdimensional Computing)是一种新兴的轻量级机器学习方法,可以替代深度学习。它的主要优势之一是能够在硬件上加速,因为它提供了大量的并行性。之前的工作主要集中在FPGA和ASIC上,它们不能提供HDC应用所需的无缝灵活性。很少有研究尝试GPU设计是低效的,部分原因是由于HDC的位级操作导致在GPU上加速HDC的复杂性。此外,HDC训练由于顺序操作,硬件利用率较低。在本文中,我们提出了XCelHD,一个高性能的gpu驱动的HDC框架。XCelHD采用了一种新颖的训练方法,在充分利用硬件的同时最大限度地提高了HDC模型的训练速度。我们提出了专门针对基于gpu的HDC的内存优化策略,最大限度地减少了对不同内存子系统的访问时间和冗余操作。我们表明,所提出的训练方法将所需的训练次数减少了四倍,以达到相当的精度。我们在NVIDIA Jetson TX2上的评估结果表明,XCelHD比最先进的基于tensorflow的HDC实现快35倍。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
XCelHD: An Efficient GPU-Powered Hyperdimensional Computing with Parallelized Training
Hyperdimensional Computing (HDC) is an emerging lightweight machine learning method alternative to deep learning. One of its key strengths is the ability to accelerate it in hardware, as it offers massive parallelisms. Prior work primarily focused on FPGA and ASIC, which do not provide the seamless flexibility required for HDC applications. Few studies that attempted GPU designs are inefficient, partly due to the complexity of accelerating HDC on GPUs because of the bit-level operations of HDC. Besides, HDC training exhibited low hardware utilization due to sequential operations. In this paper, we present XCelHD, a high-performance GPU-powered framework for HDC. XCelHD uses a novel training method to maximize the training speed of the HDC model while fully utilizing hardware. We propose memory optimization strategies specialized for GPU-based HDC, minimizing the access time to different memory subsystems and redundant operations. We show that the proposed training method reduces the required number of training epochs by four-fold to achieve comparable accuracy. Our evaluation results on NVIDIA Jetson TX2 show that XCelHD is up to $35\times$ faster than the state-of-the-art TensorFlow-based HDC implementation.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信