Rui Hou, Lixin Zhang, Michael C. Huang, Kun Wang, H. Franke, Y. Ge, Xiaotao Chang
{"title":"Efficient data streaming with on-chip accelerators: Opportunities and challenges","authors":"Rui Hou, Lixin Zhang, Michael C. Huang, Kun Wang, H. Franke, Y. Ge, Xiaotao Chang","doi":"10.1109/HPCA.2011.5749739","DOIUrl":null,"url":null,"abstract":"The transistor density of microprocessors continues to increase as technology scales. Microprocessors designers have taken advantage of the increased transistors by integrating a significant number of cores onto a single die. However, a large number of cores are met with diminishing returns due to software and hardware scalability issues and hence designers have started integrating on-chip special-purpose logic units (i.e., accelerators) that were previously available as PCI-attached units. It is anticipated that more accelerators will be integrated on-chip due to the increasing abundance of transistors and the fact that not all logic can be powered at all times due to power budget limits. Thus, on-chip accelerator architectures deserve more attention from the research community. There is a wide spectrum of research opportunities for design and optimization of accelerators. This paper attempts to bring out some insights by studying the data access streams of on-chip accelerators that hopefully foster some future research in this area. Specifically, this paper uses a few simple case studies to show some of the common characteristics of the data streams introduced by on-chip accelerators, discusses challenges and opportunities in exploiting these characteristics to optimize the power and performance of accelerators, and then analyzes the effectiveness of some simple optimizing extensions proposed.","PeriodicalId":126976,"journal":{"name":"2011 IEEE 17th International Symposium on High Performance Computer Architecture","volume":"91 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"30","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE 17th International Symposium on High Performance Computer Architecture","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPCA.2011.5749739","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 30
Abstract
The transistor density of microprocessors continues to increase as technology scales. Microprocessors designers have taken advantage of the increased transistors by integrating a significant number of cores onto a single die. However, a large number of cores are met with diminishing returns due to software and hardware scalability issues and hence designers have started integrating on-chip special-purpose logic units (i.e., accelerators) that were previously available as PCI-attached units. It is anticipated that more accelerators will be integrated on-chip due to the increasing abundance of transistors and the fact that not all logic can be powered at all times due to power budget limits. Thus, on-chip accelerator architectures deserve more attention from the research community. There is a wide spectrum of research opportunities for design and optimization of accelerators. This paper attempts to bring out some insights by studying the data access streams of on-chip accelerators that hopefully foster some future research in this area. Specifically, this paper uses a few simple case studies to show some of the common characteristics of the data streams introduced by on-chip accelerators, discusses challenges and opportunities in exploiting these characteristics to optimize the power and performance of accelerators, and then analyzes the effectiveness of some simple optimizing extensions proposed.