Proceedings of the 15th IEEE/ACM Symposium on Embedded Systems for Real-Time Multimedia: Latest Articles

High performance network-on-chip simulation by interval-based timing predictions

Proceedings of the 15th IEEE/ACM Symposium on Embedded Systems for Real-Time Multimedia Pub Date : 2017-10-15 DOI: 10.1145/3139315.3139320

Sascha Roloff, Frank Hannig, J. Teich

Abstract: Current multi- and many-core computer architectures rely heavily on Network-on-Chip (NoC) communication to meet the increased bandwidth demands between processors and for reasons of scalability. Proper analysis of the concurrency, utilization, and workload distribution of parallel multimedia applications running on such NoC-based architectures requires high-speed simulation techniques. Apart from accurate timing simulation of compute resources, it is of utmost importance to accurately model the delays caused by packet-based network communication in order to reliably verify performance numbers, identify bottlenecks of the underlying architecture, or study workload distribution techniques and routing algorithms. In this paper, we present a novel simulation approach for NoCs that simulates such communication delays just as accurately as, but on average much faster than, flit-by-flit simulation. We propose novel algorithmic and analytical techniques that predict the transmission intervals dynamically based on the arrival of communication requests, actual congestion in the NoC, routing information, packet lengths, and other parameters. Based on these predictions, the simulation time may in many cases be advanced automatically, which greatly reduces the number of events the simulator has to process. The presented NoC simulation technique has been integrated into a system-level multi-core architecture simulator. Experiments running parallel real-world and multimedia applications on a simulated scalable NoC architecture show speedups of three orders of magnitude compared to cycle-accurate NoC simulators while preserving a timing accuracy of above 95%.

Citations: 2

Proceedings of the 15th IEEE/ACM Symposium on Embedded Systems for Real-Time Multimedia

Proceedings of the 15th IEEE/ACM Symposium on Embedded Systems for Real-Time Multimedia Pub Date : 2017-10-15 DOI: 10.1145/3139315

S. Stuijk, Akash Kumar

Citations: 0

ML-Gov: a machine learning enhanced integrated CPU-GPU DVFS governor for mobile gaming

Proceedings of the 15th IEEE/ACM Symposium on Embedded Systems for Real-Time Multimedia Pub Date : 2017-10-15 DOI: 10.1145/3139315.3139317

Jurn-Gyu Park, N. Dutt, Sung-Soo Lim

Abstract: Modern heterogeneous CPU-GPU mobile architectures that execute intensive mobile games and other graphics applications use software governors to achieve high performance with energy efficiency. For dynamic and diverse gaming workloads on heterogeneous platforms, existing governors typically rely on statistical or heuristic models that assume linear relationships for a small set of mobile games, resulting in high prediction errors. To overcome these limitations, we propose ML-Gov, a machine learning enhanced integrated CPU-GPU governor that builds tree-based piecewise linear models offline and deploys these models for online estimation in an integrated CPU-GPU Dynamic Voltage Frequency Scaling (DVFS) governor. Our experiments on a test set of 20 mobile games exhibiting diverse characteristics show that our governor improves energy-per-frame by over 10% on average, with a surprising but modest 3% improvement in Frames-per-Second (FPS) performance, compared to a typical state-of-the-art governor that employs simple linear regression models.

Citations: 17

Asynchronous one-sided communications and synchronizations for a clustered manycore processor

Proceedings of the 15th IEEE/ACM Symposium on Embedded Systems for Real-Time Multimedia Pub Date : 2017-10-15 DOI: 10.1145/3139315.3139318

Julien Hascoët, B. Dinechin, P. G. D. Massas, Minh Quan Ho

Citations: 14

Worst case delay analysis of shared resource access in partitioned multi-core systems

Proceedings of the 15th IEEE/ACM Symposium on Embedded Systems for Real-Time Multimedia Pub Date : 2017-10-15 DOI: 10.1145/3139315.3139322

Donghyun Kang, Junchul Choi, S. Ha

Citations: 3

Mobile heterogeneous computing: a software perspective

Proceedings of the 15th IEEE/ACM Symposium on Embedded Systems for Real-Time Multimedia Pub Date : 2017-10-15 DOI: 10.1145/3139315.3151619

T. Mitra

Abstract: Mobile heterogeneous computing, materialized in the form of multiprocessor systems-on-chip (MPSoCs) comprising various processing elements such as general-purpose cores with differing characteristics, GPUs, DSPs, non-programmable accelerators, and reconfigurable computing, is expected to dominate the current and future mobile platform landscape. The heterogeneity enables a computational kernel with specific requirements to be paired with the processing element(s) ideally suited to perform that computation, leading to substantially improved performance and energy efficiency. While heterogeneous computing is an attractive proposition in theory, considerable software support at all levels is essential to fully realize its promise. The system software needs to orchestrate the different on-chip compute resources in a synergistic manner with minimal engagement from application developers. The current state of the art is inadequate in the software dimension despite tremendous progress and success in designing heterogeneous MPSoCs for mobile devices. This talk will put the spotlight on the software perspective of mobile heterogeneous computing, especially in the context of popular emerging applications such as 3D gaming, multimedia processing, and analytics. The talk will introduce the technology trends driving the mobile heterogeneous computing revolution, provide an overview of compute elements that diverge in computational capability and performance, and present efforts at the compiler and run-time management layers to unleash its potential for high-performance, energy-efficient computing.

Citations: 0

Reliable mapping and partitioning of performance-constrained openCL applications on CPU-GPU MPSoCs

Proceedings of the 15th IEEE/ACM Symposium on Embedded Systems for Real-Time Multimedia Pub Date : 2017-10-15 DOI: 10.1145/3139315.3157088

E. Wächter, G. Merrett, B. Al-Hashimi, A. Singh

Citations: 7

Evaluating and mitigating degradation effects in multimedia circuits

Proceedings of the 15th IEEE/ACM Symposium on Embedded Systems for Real-Time Multimedia Pub Date : 2017-10-15 DOI: 10.1145/3139315.3143527

H. Amrouch, J. Henkel

Abstract: The nano-CMOS era continuously introduces reliability challenges with every new generation. Short-term and long-term degradation effects due to temperature and aging, respectively, can cause a considerable increase in the delay of a circuit and hence timing errors due to path violations. To overcome such degradations, designers inevitably need to employ wide timing guardbands, which manifest as reduced efficiency and performance. In fact, narrowing guardbands is one of the key optimization goals in current and upcoming technology nodes. In this work, we investigate whether designers really need to employ guardbands even in error-tolerant (e.g., multimedia) circuits. This investigation enables us to trade off guardbands against quality. In addition, we demonstrate how our proposed degradation-aware cell libraries, degradation-aware timing analysis, and degradation-aware logic synthesis are indispensable, not only to link the physical level with the system level (i.e., quantifying the final impact of degradation effects on the quality of processed images) but also to effectively increase the resiliency of circuits against degradations.

Citations: 1

Approximate data reuse-based processor: a case study on image compression

Proceedings of the 15th IEEE/ACM Symposium on Embedded Systems for Real-Time Multimedia Pub Date : 2017-10-15 DOI: 10.1145/3139315.3139316

Hisashi Osawa, Yuko Hara-Azumi

Abstract: In most embedded systems, designing accelerators for end applications under stringent design constraints has been a crucial issue. In this paper, we employ a new computation paradigm, "approximate computing," to resolve this issue. More specifically, our work reuses the results of recent computations that are expected to be similar enough to the current ones, which we call "approximate data reuse." This concept reduces computation by skipping instructions. We develop accelerator designs with this concept holistically across both hardware (architecture) and software (compilation) to achieve sufficient speedup and energy savings while mitigating the area overhead, at the cost of some error. This paper makes three main contributions: architectural extensions applicable to a variety of processors even under a stringent constraint on circuit area, parameterization of important features of our method so that the degree of approximate data reuse can be easily tuned for different applications, and exhaustive evaluations of combinations of key parameters through our case study. The case study was conducted quantitatively using a realistic application (image compression) to demonstrate the effectiveness of our method over conventional ones.
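The idea of reusing a recent result when the current inputs are "similar enough" can be illustrated in software. The following is a minimal sketch under stated assumptions, not the processor design described in the abstract: the class `ApproxReuseTable`, the `step` parameter, and the quantization-by-integer-division similarity test are all hypothetical choices made for illustration, and `luminance` is just a stand-in for a per-pixel computation.

```python
from typing import Callable, Dict, Tuple


class ApproxReuseTable:
    """Hypothetical reuse table: inputs in the same quantization bucket share one result."""

    def __init__(self, func: Callable[..., float], step: int) -> None:
        self.func = func      # the computation we would like to skip
        self.step = step      # bucket width; larger means more reuse and more error
        self.table: Dict[Tuple[int, ...], float] = {}
        self.computed = 0     # how often func actually ran
        self.reused = 0       # how often a stored result was returned instead

    def __call__(self, *args: int) -> float:
        # Quantize each input so that nearby values map to the same key.
        key = tuple(a // self.step for a in args)
        if key in self.table:
            self.reused += 1
            return self.table[key]   # approximate hit: the computation is skipped
        self.computed += 1
        result = self.func(*args)
        self.table[key] = result
        return result


def luminance(r: int, g: int, b: int) -> float:
    """Stand-in per-pixel computation using the BT.601 luma weights."""
    return 0.299 * r + 0.587 * g + 0.114 * b


approx_luma = ApproxReuseTable(luminance, step=8)
pixels = [(200, 120, 40), (201, 121, 41), (203, 125, 47), (90, 90, 90)]
values = [approx_luma(*p) for p in pixels]
print(approx_luma.computed, approx_luma.reused)
```

In this sketch the first three pixels fall into the same bucket, so only the first one is actually computed and the other two inherit its value with a small error. Raising `step` trades accuracy for more reuse, which loosely mirrors the tunable degree of approximate data reuse the abstract mentions.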

Citations: 2

Fluid wireless protocols: energy-efficient design and implementation

Proceedings of the 15th IEEE/ACM Symposium on Embedded Systems for Real-Time Multimedia Pub Date : 2017-10-15 DOI: 10.1145/3139315.3139321

Ganapati Bhat, S. Srinivas, Vamsi Chagari, Jaehyun Park, Thomas McGiffen, Hyunseok Lee, D. Bliss, C. Chakrabarti, Ümit Y. Ogras

Citations: 1