PARMA-DITAM '14: Latest Publications

Effective Platform-Level Exploration for Heterogeneous Multicores Exploiting Simulation-Induced Slacks
PARMA-DITAM '14 Pub Date: 2014-01-20 DOI: 10.1145/2556863.2556864
Efstathios Sotiriou-Xanthopoulos, S. Xydis, K. Siozios, G. Economakos, D. Soudris
Abstract: Heterogeneous Multi-Processor Systems-on-Chip (MPSoC) exhibit increased design complexity due to numerous architectural parameters and hardware/software partitioning schemes. Automated Design Space Exploration (DSE) becomes an essential design procedure for discovering optimized solutions in a reasonable time. High-quality DSE requires accurate evaluation of candidate solutions. To this end, High-Level Synthesis (HLS) can be used to characterize the design solutions. In this paper, we propose (a) a platform design methodology that exploits simulation-induced slacks, generated by avoiding simulation re-initializations, and uses the gained time for HLS, and (b) a DSE tool-flow that takes into account multiple HW/SW partitioning schemes and intelligently schedules system evaluations. Experimental results show that the proposed methodology achieves 17% simulation improvements together with 77% higher accuracy, in comparison to a typical exploration approach.
Citations: 8
On Expressing Strategies for Directive-Driven Multicore Programing Models
PARMA-DITAM '14 Pub Date: 2014-01-20 DOI: 10.1145/2556863.2556870
Ricardo Nobre, Pedro Pinto, Tiago Carvalho, João MP Cardoso, P. Diniz
Abstract: A common migration path for applications to high-performance multicore architectures relies on code annotations with concurrent semantics. Some annotations, however, are highly specific to the target architecture and thus non-portable. In this paper we describe a source-to-source code transformation system that allows programmers to specify transformations using an aspect-oriented domain-specific language, LARA. LARA allows programmers to specify strategies to search large code transformation design spaces while preserving the original source code. As the experimental results reveal, this approach leads to a substantial reduction in code maintenance costs, and promotes the portability of both programmers and performance.
Citations: 3
Extending a Run-time Resource Management framework to support OpenCL and Heterogeneous Systems
PARMA-DITAM '14 Pub Date: 2014-01-20 DOI: 10.1145/2556863.2556868
G. Massari, Chiara Caffarri, P. Bellasi, W. Fornaciari
Abstract: From mobile to High-Performance Computing (HPC) systems, performance and energy efficiency are becoming increasingly challenging requirements. In this regard, heterogeneous systems, composed of a general-purpose processor and one or more hardware accelerators, are emerging as affordable solutions. However, the effective exploitation of such platforms requires specific programming languages, such as OpenCL, and suitable run-time software layers. This work illustrates the extension of a run-time resource management (RTRM) framework to support the execution of OpenCL applications on systems featuring a multi-core CPU and multiple GPUs. Early results show how this solution leads to benefits both for the applications, in terms of performance, and for the system, in terms of resource utilization, i.e., load balancing and thermal leveling over the computing devices.
Citations: 9
A cycle-accurate synthesizable MIPS simulator in Simulink
PARMA-DITAM '14 Pub Date: 2014-01-20 DOI: 10.1145/2556863.2556867
Thomas Sideropoulos, N. Pitsianis
Abstract: We introduce a novel methodology for creating a synthesizable, cycle-accurate simulator of the MIPS32 processor with concise, high-level programming expressions using Simulink and other MATLAB tools. The simulator, named SimuMIPS, is capable of running binaries generated by the GNU gcc compiler and associated binutils. It can be easily configured, modified and extended, not only for academic instruction but also for inclusion in commercial SoC products. Synthesizable instantiations of SimuMIPS in Verilog and VHDL may be generated by Simulink HDL Coder for FPGA programming and system-on-chip prototyping. In addition, the SimuMIPS simulator can run on embedded processors, rapid prototyping boards, and off-the-shelf microprocessors via the C and C++ implementations generated by Embedded Coder.
Citations: 0
Exploiting Performance Counters for Energy Efficient Co-Scheduling of Mixed Workloads on Multi-Core Platforms
PARMA-DITAM '14 Pub Date: 2014-01-20 DOI: 10.1145/2556863.2556866
Simone Libutti, G. Massari, P. Bellasi, W. Fornaciari
Abstract: Mainstream multicore architectures allow the execution of mixed workloads, where multiple parallel applications run concurrently and compete for shared computational resources. As different applications exhibit different and time-varying resource needs, a suitable allocation policy is required to properly select and map resources at run-time to demanding applications. We demonstrate how a user-space run-time resource manager could be extended to easily take advantage of performance counters in order to optimize both workload execution time and energy consumption. Our approach, initially evaluated on a quad-core Intel machine considering a representative set of mixed workloads from a standard benchmark suite, attains a 49.9% mean energy-delay-product (EDP) speed-up over the standard Linux case, and a 13.4% EDP speed-up over our previous work.
Citations: 7
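The energy-delay product (EDP) reported in the preceding abstract is conventionally the energy consumed multiplied by the execution time. The listing does not say how the paper derives its "EDP speed-up" percentage, so the sketch below assumes one common reading, the relative reduction in EDP against a baseline. The function names and all measurement values are hypothetical and are not taken from the paper.

```python
def edp(energy_joules: float, delay_seconds: float) -> float:
    """Energy-delay product: energy consumed multiplied by execution time."""
    return energy_joules * delay_seconds


def edp_improvement(baseline_edp: float, new_edp: float) -> float:
    """Relative EDP reduction versus a baseline, as a fraction (0.25 == 25%).

    Assumption: this is only one plausible definition of an EDP "speed-up".
    """
    if baseline_edp <= 0:
        raise ValueError("baseline EDP must be positive")
    return (baseline_edp - new_edp) / baseline_edp


# Hypothetical measurements for illustration only.
baseline = edp(energy_joules=100.0, delay_seconds=10.0)  # unmanaged run
managed = edp(energy_joules=80.0, delay_seconds=7.5)     # resource-managed run
improvement = edp_improvement(baseline, managed)
```

Because EDP multiplies the two quantities, a scheduler that lowers both energy and run time is rewarded more than one that trades one for the other.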
An Interactive Tool based on Polly for Detection and Parallelization of Loops
PARMA-DITAM '14 Pub Date: 2014-01-20 DOI: 10.1145/2556863.2556869
D. Göhringer, Jan Tepelmann
Abstract: In many applications, such as signal and image processing, most computation time is spent within loops. Therefore, these loops are ideal candidates for performance increase when moving to parallel architectures, such as multi- or many-core systems. However, manual parallelization of existing applications is a complex and cumbersome task. To alleviate this, we introduce in this paper an interactive tool based on Polly, LLVM and the Linux perf tools. With the help of our tool, compute-intensive loops can be found and parallelized. Polly is a polyhedral optimizer for LLVM. In the polyhedral model, loops are described in an abstract mathematical way, and loop optimizations are mathematical transformations on this abstract description. Loops must meet specific requirements to be representable in the polyhedral model. If even one requirement is not satisfied, the loop cannot be optimized with Polly. Our tool can help here by showing the user all the problems that prevent an automatic optimization with Polly. Such an optimization is only worthwhile for compute-intensive loops. To find such loops, our tool uses the Linux perf tools for performance profiling. Evaluation results are presented for two applications: Tiff2rgba and a 2D cross-correlation image processing algorithm.
Citations: 10
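The polyhedral idea summarized in the preceding abstract, where a loop nest becomes a set of integer points and an optimization becomes a mapping on that set, can be illustrated with a toy sketch. This is not Polly's API or its internal representation: every name below is hypothetical, the iteration domain is enumerated explicitly rather than kept symbolic, and the legality check against data dependences that a real optimizer performs is omitted.

```python
from itertools import product


def iteration_domain(n: int, m: int):
    """All integer points (i, j) with 0 <= i < n and 0 <= j < m."""
    return list(product(range(n), range(m)))


def original_schedule(point):
    """Identity schedule: run iterations in source order (i outer, j inner)."""
    i, j = point
    return (i, j)


def interchanged_schedule(point):
    """Loop interchange as an affine map on the domain: (i, j) -> (j, i)."""
    i, j = point
    return (j, i)


def execution_order(domain, schedule):
    """Iterations sorted lexicographically by their schedule timestamp."""
    return sorted(domain, key=schedule)


domain = iteration_domain(2, 3)
before = execution_order(domain, original_schedule)
after = execution_order(domain, interchanged_schedule)
```

Both orders visit exactly the same iterations; only the sequence changes, which is what makes the transformation a pure re-mapping of the abstract description rather than a rewrite of the loop body.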
Fine-Grained Link Locking Within Power and Latency Transaction Level Modelling in Wormhole Switching Non-Preemptive Networks On Chip
PARMA-DITAM '14 Pub Date: 2014-01-20 DOI: 10.1145/2556863.2556865
J. Harbin, L. Indrusiak
Abstract: An increasingly time-consuming part of the design flow of on-chip multiprocessors is simulation of the network-on-chip (NoC) architecture. Cycle-accurate simulation of state-of-the-art network-on-chip interconnects can be prohibitively slow for realistic application examples. In this paper, we identify a time-predictable non-preemptive network-on-chip architecture and propose a TLM model with fine-grained locking of links. The model is tested via simulation of two benchmark application scenarios. Results demonstrate that the proposed algorithm can model the latency of the majority of flows very closely to the cycle-accurate model, while providing more than 97% accurate power consumption modelling even on the worst-case links. This is achieved while simulating nearly three orders of magnitude faster than a cycle-accurate model of the same interconnect.
Citations: 1