{"title":"ARACompiler: a prototyping flow and evaluation framework for accelerator-rich architectures","authors":"Yu-Ting Chen, J. Cong, Bingjun Xiao","doi":"10.1109/ISPASS.2015.7095795","DOIUrl":null,"url":null,"abstract":"Accelerator-rich architectures (ARAs) provide energy-efficient solutions for domain-specific computing in the age of dark silicon. However, due to the complex interaction between the general-purpose cores, accelerators, customized onchip interconnects, customized memory systems, and operating systems, it has been difficult to get detailed and accurate evaluations and analyses of ARAs on complex real-life benchmarks using the existing full-system simulators. In this paper we develop the ARACompiler, which is a highly automated design flow for prototyping ARAs and performing evaluation on FPGAs. An efficient system software stack is generated automatically to handle resource management and TLB misses.We further provide application programming interfaces (APIs) for users to develop their applications using accelerators. The flow can provide 2.9x to 42.6x evaluation time saving over the full-system simulations.","PeriodicalId":189378,"journal":{"name":"2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-03-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISPASS.2015.7095795","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6
Abstract
Accelerator-rich architectures (ARAs) provide energy-efficient solutions for domain-specific computing in the age of dark silicon. However, due to the complex interaction between the general-purpose cores, accelerators, customized onchip interconnects, customized memory systems, and operating systems, it has been difficult to get detailed and accurate evaluations and analyses of ARAs on complex real-life benchmarks using the existing full-system simulators. In this paper we develop the ARACompiler, which is a highly automated design flow for prototyping ARAs and performing evaluation on FPGAs. An efficient system software stack is generated automatically to handle resource management and TLB misses.We further provide application programming interfaces (APIs) for users to develop their applications using accelerators. The flow can provide 2.9x to 42.6x evaluation time saving over the full-system simulations.