{"title":"基于高级硬件描述语言的fpga上的高性能点阵回归","authors":"Nathan Zhang, Matthew Feldman, K. Olukotun","doi":"10.1109/ICFPT52863.2021.9609893","DOIUrl":null,"url":null,"abstract":"Lattice regression-based models are highly-constrainable and interpretable machine learning models used in applications such as query classification and path length prediction for maps. To improve their performance and better serve these models to millions of consumers, we accelerate them using field programmable gate arrays. We adopt a library-based approach using a high level hardware description language (HLHDL) to support the broad family of lattice models. HLHDLs improve productivity by providing both control abstraction such as looping, reductions, and memory hierarchies, as well as automatically handling low-level tasks such as retiming. However, these abstractions can lead to performance bottlenecks if not carefully used. We characterize these bottlenecks and implement a lattice regression library using a streaming tensor abstraction which avoids them. On a pair of models trained for network anomaly detection, we achieve a ${166\\,-\\,256\\times}$ speedup over CPUs even with large batch sizes.","PeriodicalId":376220,"journal":{"name":"2021 International Conference on Field-Programmable Technology (ICFPT)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"High performance lattice regression on FPGAs via a high level hardware description language\",\"authors\":\"Nathan Zhang, Matthew Feldman, K. Olukotun\",\"doi\":\"10.1109/ICFPT52863.2021.9609893\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Lattice regression-based models are highly-constrainable and interpretable machine learning models used in applications such as query classification and path length prediction for maps. 
To improve their performance and better serve these models to millions of consumers, we accelerate them using field programmable gate arrays. We adopt a library-based approach using a high level hardware description language (HLHDL) to support the broad family of lattice models. HLHDLs improve productivity by providing both control abstraction such as looping, reductions, and memory hierarchies, as well as automatically handling low-level tasks such as retiming. However, these abstractions can lead to performance bottlenecks if not carefully used. We characterize these bottlenecks and implement a lattice regression library using a streaming tensor abstraction which avoids them. On a pair of models trained for network anomaly detection, we achieve a ${166\\\\,-\\\\,256\\\\times}$ speedup over CPUs even with large batch sizes.\",\"PeriodicalId\":376220,\"journal\":{\"name\":\"2021 International Conference on Field-Programmable Technology (ICFPT)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-12-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 International Conference on Field-Programmable Technology (ICFPT)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICFPT52863.2021.9609893\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 International Conference on Field-Programmable Technology 
(ICFPT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICFPT52863.2021.9609893","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
High performance lattice regression on FPGAs via a high level hardware description language
Lattice regression-based models are highly constrainable and interpretable machine learning models used in applications such as query classification and path-length prediction for maps. To improve their performance and better serve these models to millions of consumers, we accelerate them using field-programmable gate arrays. We adopt a library-based approach using a high-level hardware description language (HLHDL) to support the broad family of lattice models. HLHDLs improve productivity by providing control abstractions such as looping, reductions, and memory hierarchies, and by automatically handling low-level tasks such as retiming. However, these abstractions can lead to performance bottlenecks if not used carefully. We characterize these bottlenecks and implement a lattice regression library using a streaming tensor abstraction that avoids them. On a pair of models trained for network anomaly detection, we achieve a 166–256× speedup over CPUs, even with large batch sizes.
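The abstract does not spell out the model internals, but lattice regression inference is conventionally multilinear interpolation over a grid of learned vertex values. A minimal illustrative sketch of that lookup (not the paper's library or its FPGA implementation; the function name and shapes are assumptions for exposition):

```python
import numpy as np

def lattice_interpolate(params, x):
    """Evaluate a lattice regression model at point x via multilinear interpolation.

    params: D-dimensional array of learned lattice vertex values, shape (s1, ..., sD).
    x: input point with D coordinates, each in [0, s_d - 1].
    """
    x = np.asarray(x, dtype=float)
    # Lower corner of the enclosing grid cell, clamped so lo + 1 stays in bounds.
    lo = np.minimum(np.floor(x).astype(int), np.array(params.shape) - 2)
    frac = x - lo  # fractional position inside the cell, per axis
    d = len(x)
    value = 0.0
    # Sum over the 2^D corners of the enclosing hypercube, weighting each
    # vertex by the product of per-axis interpolation weights.
    for corner in range(1 << d):
        idx = []
        weight = 1.0
        for axis in range(d):
            bit = (corner >> axis) & 1
            idx.append(lo[axis] + bit)
            weight *= frac[axis] if bit else (1.0 - frac[axis])
        value += weight * params[tuple(idx)]
    return value
```

The 2^D-corner sum is what makes the model interpretable (the output is a convex combination of nearby vertex values) and constrainable (e.g., monotonicity can be enforced as inequalities between adjacent vertices), and its regular, data-independent access pattern is what makes a streaming hardware implementation attractive.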