{"title":"基于高级硬件描述语言的fpga上的高性能点阵回归","authors":"Nathan Zhang, Matthew Feldman, K. Olukotun","doi":"10.1109/ICFPT52863.2021.9609893","DOIUrl":null,"url":null,"abstract":"Lattice regression-based models are highly-constrainable and interpretable machine learning models used in applications such as query classification and path length prediction for maps. To improve their performance and better serve these models to millions of consumers, we accelerate them using field programmable gate arrays. We adopt a library-based approach using a high level hardware description language (HLHDL) to support the broad family of lattice models. HLHDLs improve productivity by providing both control abstraction such as looping, reductions, and memory hierarchies, as well as automatically handling low-level tasks such as retiming. However, these abstractions can lead to performance bottlenecks if not carefully used. We characterize these bottlenecks and implement a lattice regression library using a streaming tensor abstraction which avoids them. On a pair of models trained for network anomaly detection, we achieve a ${166\\,-\\,256\\times}$ speedup over CPUs even with large batch sizes.","PeriodicalId":376220,"journal":{"name":"2021 International Conference on Field-Programmable Technology (ICFPT)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"High performance lattice regression on FPGAs via a high level hardware description language\",\"authors\":\"Nathan Zhang, Matthew Feldman, K. Olukotun\",\"doi\":\"10.1109/ICFPT52863.2021.9609893\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Lattice regression-based models are highly-constrainable and interpretable machine learning models used in applications such as query classification and path length prediction for maps. 
To improve their performance and better serve these models to millions of consumers, we accelerate them using field programmable gate arrays. We adopt a library-based approach using a high level hardware description language (HLHDL) to support the broad family of lattice models. HLHDLs improve productivity by providing both control abstraction such as looping, reductions, and memory hierarchies, as well as automatically handling low-level tasks such as retiming. However, these abstractions can lead to performance bottlenecks if not carefully used. We characterize these bottlenecks and implement a lattice regression library using a streaming tensor abstraction which avoids them. On a pair of models trained for network anomaly detection, we achieve a ${166\\\\,-\\\\,256\\\\times}$ speedup over CPUs even with large batch sizes.\",\"PeriodicalId\":376220,\"journal\":{\"name\":\"2021 International Conference on Field-Programmable Technology (ICFPT)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-12-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 International Conference on Field-Programmable Technology (ICFPT)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICFPT52863.2021.9609893\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 International Conference on Field-Programmable Technology 
(ICFPT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICFPT52863.2021.9609893","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
High performance lattice regression on FPGAs via a high level hardware description language
Lattice regression-based models are highly constrainable and interpretable machine learning models used in applications such as query classification and path-length prediction for maps. To improve their performance and better serve these models to millions of consumers, we accelerate them using field-programmable gate arrays. We adopt a library-based approach using a high-level hardware description language (HLHDL) to support the broad family of lattice models. HLHDLs improve productivity by providing control abstractions such as looping, reductions, and memory hierarchies, and by automatically handling low-level tasks such as retiming. However, these abstractions can lead to performance bottlenecks if not used carefully. We characterize these bottlenecks and implement a lattice regression library using a streaming tensor abstraction that avoids them. On a pair of models trained for network anomaly detection, we achieve a 166–256× speedup over CPUs, even with large batch sizes.
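The abstract does not spell out the model internals, but lattice regression inference is conventionally multilinear interpolation over a grid of learned vertex values. A minimal illustrative sketch of that lookup (not the paper's library or its FPGA implementation; the function name and shapes are assumptions for exposition):

```python
import numpy as np

def lattice_interpolate(params, x):
    """Evaluate a lattice regression model at point x via multilinear interpolation.

    params: D-dimensional array of learned lattice vertex values, shape (s1, ..., sD).
    x: input point with D coordinates, each in [0, s_d - 1].
    """
    x = np.asarray(x, dtype=float)
    # Lower corner of the enclosing grid cell, clamped so lo + 1 stays in bounds.
    lo = np.minimum(np.floor(x).astype(int), np.array(params.shape) - 2)
    frac = x - lo  # fractional position inside the cell, per axis
    d = len(x)
    value = 0.0
    # Sum over the 2^D corners of the enclosing hypercube, weighting each
    # vertex by the product of per-axis interpolation weights.
    for corner in range(1 << d):
        idx = []
        weight = 1.0
        for axis in range(d):
            bit = (corner >> axis) & 1
            idx.append(lo[axis] + bit)
            weight *= frac[axis] if bit else (1.0 - frac[axis])
        value += weight * params[tuple(idx)]
    return value
```

The 2^D-corner sum is what makes the model interpretable (the output is a convex combination of nearby vertex values) and constrainable (e.g., monotonicity can be enforced as inequalities between adjacent vertices), and its regular, data-independent access pattern is what makes a streaming hardware implementation attractive.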