2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS): Latest Papers, Page 4

Run-time adaption for highly-complex multi-core systems

2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS) Pub Date : 2013-09-29 DOI: 10.1109/CODES-ISSS.2013.6659000

J. Henkel, N. Vijaykrishnan, S. Parameswaran, J. Teich

Abstract: As embedded on-chip systems grow more and more complex and are about to be deployed in automotive and other demanding application areas (beyond the mainstream of consumer electronics), run-time adaptation is a prime design consideration for many reasons: i) reliability is a major concern when migrating to technology nodes of 32 nm and beyond; ii) efficiency, i.e. computational power per watt, is a challenge because computing models do not keep up with hardware-provided computing capabilities; iii) power densities increase rapidly as Dennard scaling fails, resulting in what is dubbed “Dark Silicon”; iv) highly complex embedded applications are hard to predict. All these scenarios (and others not listed here) make proactive and sophisticated run-time adaption techniques a prime design consideration for generations of multi-core architectures to come. The intent of this paper is to present problems and solutions from top research initiatives, seen from diverse angles, whose common denominator is the dire need for run-time adaption. The first part tackles the thermal problem, i.e. high power densities and their short- and long-term effects on reliability, and presents scalable techniques to cope with the related problems. The second section demonstrates the potential of steep-slope devices for thread scheduling on multi-cores. The third approach presents embedded pipelined architectures running complex multimedia applications, whereas the fourth section introduces the paradigm of invasive computing, i.e. a novel computing approach promising high efficiency through a highly adaptive hardware/software architecture. In summary, the paper presents snapshots of four highly adaptive solutions and platforms, from different angles, for the challenges of complex future multi-core systems.

Citations: 13

Bound-oriented parallel pruning approaches for efficient resource constrained scheduling of high-level synthesis

2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS) Pub Date : 2013-09-29 DOI: 10.1109/CODES-ISSS.2013.6659001

Mingsong Chen, Lei Zhou, G. Pu, Jifeng He

Citations: 4

On the automatic generation of GPU-oriented software applications from RTL IPs

2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS) Pub Date : 2013-09-29 DOI: 10.1109/CODES-ISSS.2013.6658999

N. Bombieri, F. Fummi, S. Vinco

Abstract: Graphics processing units (GPUs) have been explored as a new computing paradigm for accelerating computation-intensive applications. In particular, the combination of GPUs and CPUs has proved to be an effective solution for accelerating software execution, by mixing the few CPU cores optimized for serial processing with many smaller GPU cores designed for massively parallel computations. In addition, driven by the need for low power consumption as well as high performance, a recent trend is to combine GPUs and CPUs on a single die (e.g., AMD Fusion, Intel Sandy Bridge, NVIDIA Tegra). The good tradeoff between computing capability and power consumption makes integrated GPUs a promising alternative for accelerating a wide range of software applications for embedded systems. Nevertheless, algorithms must be redesigned to take advantage of these architectures, and such manual parallelization often proves unsatisfactory. This paper presents a methodology to automatically generate software applications for GPUs by reusing existing and preverified register-transfer level (RTL) intellectual properties (IPs). The methodology aims at exploiting the intrinsic parallelism of RTL IPs (such as process concurrency and pipeline micro-architecture) to generate a parallel software implementation of the functionality. The experimental results show that the performance obtained by running the RTL functionality as software applications on GPUs exceeds that provided by the RTL code mapped onto a hardware accelerator.

Citations: 5

System level synthesis of hardware for DSP applications using pre-characterized function implementations

2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS) Pub Date : 2013-09-29 DOI: 10.1109/CODES-ISSS.2013.6659003

Shuo Li, Nasim Farahini, A. Hemani, Kathrin Rosvall, I. Sander

Citations: 31

Learning the optimal operating point for many-core systems with extended range voltage/frequency scaling

2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS) Pub Date : 2013-09-01 DOI: 10.1109/CODES-ISSS.2013.6658995

Da-Cheng Juan, S. Garg, Jinpyo Park, Diana Marculescu

Abstract: Near-Threshold Computing (NTC) has emerged as a solution that promises to significantly increase the energy efficiency of next-generation multi-core systems. This paper evaluates and analyzes the behavior of dynamic voltage and frequency scaling (DVFS) control algorithms for multi-core systems operating under near-threshold, nominal, or turbo-mode conditions. We adapt the model selection technique from machine learning to learn the relationship between performance and power. The theoretical results show that the resulting models satisfy convexity properties essential to efficiently determining optimal voltage/frequency operating points for minimizing energy consumption under throughput constraints or maximizing throughput under a given power budget. Our experimental results show that, compared with DVFS in the conventional operating range, extended range DVFS control including turbo-mode and near-threshold operation achieves an additional (1) 13.28% average energy reduction under iso-performance conditions, and (2) 7.54% average throughput increase under iso-power conditions.
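
The abstract above names two selection problems: minimizing energy under a throughput constraint, and maximizing throughput under a power budget. The following is a minimal sketch of both over a small table of voltage/frequency points. Everything in it is an assumption for illustration: the `OperatingPoint` type, the `POINTS` table, and the toy power model (dynamic power proportional to V²f plus a leakage term) are hypothetical and are not taken from the paper. It also uses a brute-force scan, not the learned models or the convexity-based search the authors describe.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class OperatingPoint:
    """Hypothetical voltage/frequency pair (not from the paper)."""
    voltage_v: float
    freq_ghz: float


def throughput(p: OperatingPoint) -> float:
    # Assumption: throughput scales linearly with clock frequency.
    return p.freq_ghz


def power_w(p: OperatingPoint, c_eff: float = 1.0, leak: float = 0.1) -> float:
    # Assumption: toy model, dynamic power ~ C*V^2*f plus leakage ~ V.
    return c_eff * p.voltage_v ** 2 * p.freq_ghz + leak * p.voltage_v


def energy_per_work(p: OperatingPoint) -> float:
    # Energy spent per unit of work = power / throughput.
    return power_w(p) / throughput(p)


def min_energy_point(points: Sequence[OperatingPoint],
                     min_throughput: float) -> Optional[OperatingPoint]:
    """Lowest energy-per-work point that still meets the throughput floor."""
    feasible = [p for p in points if throughput(p) >= min_throughput]
    return min(feasible, key=energy_per_work) if feasible else None


def max_throughput_point(points: Sequence[OperatingPoint],
                         power_budget_w: float) -> Optional[OperatingPoint]:
    """Highest-throughput point whose power stays within the budget."""
    feasible = [p for p in points if power_w(p) <= power_budget_w]
    return max(feasible, key=throughput) if feasible else None


# Hypothetical table spanning near-threshold, nominal, and turbo-like points.
POINTS = [
    OperatingPoint(0.5, 0.4),   # near-threshold-like
    OperatingPoint(0.7, 1.0),
    OperatingPoint(0.9, 1.6),
    OperatingPoint(1.1, 2.2),   # nominal-like
    OperatingPoint(1.3, 2.6),   # turbo-like
]

if __name__ == "__main__":
    print(min_energy_point(POINTS, min_throughput=1.0))
    print(max_throughput_point(POINTS, power_budget_w=1.5))
```

In this toy model, widening the table toward lower voltages gives the energy-minimizing search more low-energy candidates when the throughput floor is loose, and adding higher-frequency points does the same for the throughput-maximizing search when the power budget is generous. That is the intuition behind an extended operating range, not a reproduction of the reported figures.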

Citations: 46

Online OLED dynamic voltage scaling for video streaming applications on mobile devices

2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS) Pub Date : 2013-07-01 DOI: 10.1145/2518148.2518156

Mengying Zhao, Yiran Chen, Xiang Chen, C. Xue

Citations: 20