Hsin-Jung Yang, Kermin Fleming, Michael Adler, J. Emer
2013 23rd International Conference on Field programmable Logic and Applications. Published 2013-10-24. DOI: 10.1109/FPL.2013.6645522 (https://doi.org/10.1109/FPL.2013.6645522)
Optimizing under abstraction: Using prefetching to improve FPGA performance
In an effort to speed the development of FPGA-based accelerators, recent research has focused on providing FPGA developers with memory and communications abstractions. Because abstraction decouples the function of these interfaces from their implementation, these new interfaces present an enormous opportunity for optimization. In this paper we examine stride prefetching as a means of improving the performance of an automatically synthesized, abstract memory hierarchy. We demonstrate, by applying our technique to several large benchmarks, that prefetching can improve preexisting application runtime by 15% on average, and up to 40%, without requiring program modification.
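The abstract names stride prefetching but does not describe a mechanism. As a rough illustration of the general idea only, the sketch below tracks the address delta per access stream and predicts the next address once the same non-zero delta repeats. It is a generic textbook-style software model, not the authors' hardware design; `StridePrefetcher`, `StrideEntry`, and `stream_id` are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class StrideEntry:
    """Per-stream state: last address seen and last observed stride."""
    last_addr: int
    stride: int = 0
    confident: bool = False


class StridePrefetcher:
    """Toy stride predictor keyed by a stream id (hypothetical model)."""

    def __init__(self):
        self.table = {}

    def access(self, stream_id, addr):
        """Record an access; return an address to prefetch, or None."""
        entry = self.table.get(stream_id)
        if entry is None:
            # First access on this stream: nothing to predict yet.
            self.table[stream_id] = StrideEntry(last_addr=addr)
            return None
        stride = addr - entry.last_addr
        # Predict only when the same non-zero stride is seen twice in a row.
        entry.confident = stride != 0 and stride == entry.stride
        entry.stride = stride
        entry.last_addr = addr
        return addr + stride if entry.confident else None


prefetcher = StridePrefetcher()
predictions = [prefetcher.access(0, a) for a in (0, 8, 16, 24, 100)]
print(predictions)
```

In this model the predictor only observes addresses, so the code issuing the accesses does not change. That matches the abstract's point that a prefetcher placed beneath the memory abstraction needs no program modification, although the real design may differ in every other respect.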