Fast Multi-Objective Algorithmic Design Co-Exploration for FPGA-based Accelerators
Authors: Kumud Nepal, O. Ulusel, R. I. Bahar, S. Reda
Venue: 2012 IEEE 20th International Symposium on Field-Programmable Custom Computing Machines
Published: 2012-04-29
DOI: 10.1109/FCCM.2012.21 (https://doi.org/10.1109/FCCM.2012.21)
The reconfigurability of Field Programmable Gate Arrays (FPGAs) makes them an attractive platform for accelerating algorithms. Accelerating a particular algorithm is challenging because the large number of possible algorithmic and hardware design parameters leads to many accelerator variants, each with its own performance, area, power, and arithmetic accuracy characteristics. To identify the parameters that optimize the accelerator for given metrics, we propose techniques for fast design space exploration and non-linear multi-objective optimization (e.g., minimizing power under arithmetic inaccuracy bounds). Our methodology samples a small part of the design space and uses measurements from the sampled implementations to train mathematical models for the different metrics. To automate and improve the model generation process, we propose the use of L1-regularized least squares regression techniques. To demonstrate the effectiveness of our approach, we implement a high-throughput real-time accelerator for image deblurring. We demonstrate the accuracy of our modeling techniques (e.g., within 8% for power modeling) and their ability to identify the optimal accelerator designs with large speed-ups (340×) in comparison to brute-force enumeration.
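The sample-then-model idea can be illustrated with a minimal sketch of L1-regularized least squares, which minimizes 0.5 * ||y - Xw||^2 + lam * ||w||_1 so that unneeded model terms are driven to exactly zero. Everything specific here is an assumption made for illustration and is not taken from the paper: the two design parameters (`bits` for word length, `units` for parallel processing units), the polynomial basis, the synthetic power values, the penalty `lam`, and the choice of cyclic coordinate descent as the solver.

```python
import numpy as np


def soft_threshold(x, t):
    """Shrink x toward zero by t (proximal operator of t * |x|)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def lasso_coordinate_descent(X, y, lam, n_iters=200):
    """Minimize 0.5 * ||y - X w||^2 + lam * ||w||_1 by cyclic coordinate descent."""
    n_features = X.shape[1]
    w = np.zeros(n_features)
    col_sq = (X ** 2).sum(axis=0)
    residual = y - X @ w
    for _ in range(n_iters):
        for j in range(n_features):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual that excludes column j.
            rho = X[:, j] @ residual + col_sq[j] * w[j]
            w_new = soft_threshold(rho, lam) / col_sq[j]
            residual += X[:, j] * (w[j] - w_new)
            w[j] = w_new
    return w


def polynomial_features(bits, units):
    """Candidate basis terms; the L1 penalty decides which ones survive."""
    return np.column_stack([bits, units, bits * units, bits ** 2, units ** 2])


def fit_power_model(lam=0.5, seed=0):
    """Fit a power model from a small sample of a hypothetical design space."""
    rng = np.random.default_rng(seed)
    # Hypothetical design space: word length x number of parallel units.
    bits_grid, units_grid = np.meshgrid(np.arange(8, 33), np.arange(1, 17))
    bits = bits_grid.ravel().astype(float)
    units = units_grid.ravel().astype(float)
    # Synthetic stand-in for measured power; NOT data from the paper.
    power = 1.5 + 0.02 * bits * units + rng.normal(0.0, 0.05, bits.size)

    # "Implement and measure" only 40 of the 400 design points.
    sampled = rng.choice(bits.size, size=40, replace=False)
    F = polynomial_features(bits, units)
    mean, std = F[sampled].mean(axis=0), F[sampled].std(axis=0)
    Z = (F - mean) / std             # standardize with sampled statistics only
    offset = power[sampled].mean()   # intercept handled by centering the target
    w = lasso_coordinate_descent(Z[sampled], power[sampled] - offset, lam)

    # Evaluate the trained model on the design points that were never measured.
    predicted = Z @ w + offset
    unsampled = np.setdiff1d(np.arange(bits.size), sampled)
    rel_err = np.abs(predicted[unsampled] - power[unsampled]) / power[unsampled]
    return w, rel_err.mean()


if __name__ == "__main__":
    weights, mean_rel_err = fit_power_model()
    print("weights per basis term:", weights)
    print("mean relative error on unsampled designs:", mean_rel_err)
```

Once such a model is trained, evaluating it over every design point is cheap compared with building and measuring each variant, which is the general reason a model-based search can be much faster than brute-force enumeration. The error and sparsity this sketch reports depend entirely on the synthetic data and on `lam`, so they say nothing about the figures quoted in the abstract.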