THRIFTY

Proceedings of the 39th International Conference on Computer-Aided Design Pub Date : 2020-11-02 DOI:10.1145/3400302.3415723

Saransh Gupta, Justin Morris, M. Imani, R. Ramkumar, Jeffrey Yu, Aniket Tiwari, Baris Aksanli, T. Rosing

{"title":"THRIFTY","authors":"Saransh Gupta, Justin Morris, M. Imani, R. Ramkumar, Jeffrey Yu, Aniket Tiwari, Baris Aksanli, T. Rosing","doi":"10.1145/3400302.3415723","DOIUrl":null,"url":null,"abstract":"Hyperdimensional computing (HDC) is a brain-inspired computing paradigm that works with high-dimensional vectors, hypervectors, instead of numbers. HDC replaces several complex learning computations with bitwise and simpler arithmetic operations, resulting in a faster and more energy-efficient learning algorithm. However, it comes at the cost of an increased amount of data to process due to mapping the data into high-dimensional space. While some datasets may nearly fit in the memory, the resulting hypervectors more often than not can't be stored in memory, resulting in long data transfers from storage. In this paper, we propose THRIFTY, an in-storage computing (ISC) solution that performs HDC encoding and training across the flash hierarchy. To hide the latency of training and enable efficient computation, we introduce the concept of batching in HDC. It allows us to split HDC training into sub-components and process them independently. We also present, for the first time, on-chip acceleration for HDC which uses simple low-power digital circuits to implement HDC encoding in Flash planes. This enables us to explore high internal parallelism provided by the flash hierarchy and encode multiple data points in parallel with negligible latency overhead. THRIFTY also implements a single top-level FPGA accelerator, which further processes the data obtained from the chips. We exploit the state-of-the-art INSIDER ISC infrastructure to implement the top-level accelerator and provide software support to THRIFTY. THRIFTY runs HDC training completely in storage while almost entirely hiding the latency of computation. Our evaluation over five popular classification datasets shows that THRIFTY is on average 1612× faster than a CPU-server and 14.4× faster than the state-of-the-art ISC solution, INSIDER for HDC encoding and training.","PeriodicalId":367868,"journal":{"name":"Proceedings of the 39th International Conference on Computer-Aided Design","volume":"70 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"16","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 39th International Conference on Computer-Aided Design","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3400302.3415723","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 16

Abstract

Hyperdimensional computing (HDC) is a brain-inspired computing paradigm that works with high-dimensional vectors, hypervectors, instead of numbers. HDC replaces several complex learning computations with bitwise and simpler arithmetic operations, resulting in a faster and more energy-efficient learning algorithm. However, it comes at the cost of an increased amount of data to process due to mapping the data into high-dimensional space. While some datasets may nearly fit in the memory, the resulting hypervectors more often than not can't be stored in memory, resulting in long data transfers from storage. In this paper, we propose THRIFTY, an in-storage computing (ISC) solution that performs HDC encoding and training across the flash hierarchy. To hide the latency of training and enable efficient computation, we introduce the concept of batching in HDC. It allows us to split HDC training into sub-components and process them independently. We also present, for the first time, on-chip acceleration for HDC which uses simple low-power digital circuits to implement HDC encoding in Flash planes. This enables us to explore high internal parallelism provided by the flash hierarchy and encode multiple data points in parallel with negligible latency overhead. THRIFTY also implements a single top-level FPGA accelerator, which further processes the data obtained from the chips. We exploit the state-of-the-art INSIDER ISC infrastructure to implement the top-level accelerator and provide software support to THRIFTY. THRIFTY runs HDC training completely in storage while almost entirely hiding the latency of computation. Our evaluation over five popular classification datasets shows that THRIFTY is on average 1612× faster than a CPU-server and 14.4× faster than the state-of-the-art ISC solution, INSIDER for HDC encoding and training.

查看原文本刊更多论文

节俭的

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Proceedings of the 39th International Conference on Computer-Aided Design

自引率

0.00%

发文量