2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing: Latest Papers

Energy-Efficient Task Scheduling in Manycore Processors with Frequency Scaling Overhead
Patrick Eitschberger, J. Keller
{"title":"Energy-Efficient Task Scheduling in Manycore Processors with Frequency Scaling Overhead","authors":"Patrick Eitschberger, J. Keller","doi":"10.1109/PDP.2015.64","DOIUrl":"https://doi.org/10.1109/PDP.2015.64","url":null,"abstract":"We investigate deadline scheduling of independent tasks on parallel processors with discrete frequency levels, when the latency for frequency scaling cannot be neglected. This situation frequently occurs in applications, e.g. streaming applications with soft real-time requirements. We demonstrate that previous algorithms for energy-optimal static scheduling of independent tasks are non-optimal in this setting. We present a scheduling heuristic based on bin packing with a cost function that takes latency for frequency scaling into account. We evaluate our heuristic against previous approaches with benchmark task sets and achieve energy reductions between 3% and 13%. We further demonstrate that for a concrete embedded multicore processor, the power curves vary over the identical cores, so that the processor looks heterogeneous from a power perspective. We adapt our bin packing heuristic and demonstrate that for the benchmark task sets, further energy reductions up to 4% can be achieved.","PeriodicalId":285111,"journal":{"name":"2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing","volume":"63 9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116442662","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
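The abstract above describes a heuristic built on bin packing. As a rough illustration of that underlying idea only (not the authors' heuristic, whose cost function for frequency-scaling latency is not given in this listing), the sketch below treats each core as a bin whose capacity is the deadline at one fixed frequency and applies plain first-fit decreasing. The function name and parameters are hypothetical.

```python
def first_fit_decreasing(task_times, num_cores, deadline):
    """Assign task run times to cores so that no core exceeds the deadline.

    Returns a list of per-core task lists, or None if the heuristic finds no fit.
    Assumption: all cores run at one fixed frequency; frequency switching and
    its latency are deliberately left out of this sketch.
    """
    loads = [0.0] * num_cores
    assignment = [[] for _ in range(num_cores)]
    # Placing long tasks first tends to leave fewer unusable gaps.
    for t in sorted(task_times, reverse=True):
        for core in range(num_cores):
            if loads[core] + t <= deadline:
                loads[core] += t
                assignment[core].append(t)
                break
        else:
            return None  # no core has room for this task
    return assignment


if __name__ == "__main__":
    print(first_fit_decreasing([4, 3, 3, 2, 2], num_cores=2, deadline=7))
```

A scheduler that accounts for scaling overhead would additionally have to charge a time and energy penalty whenever a core changes frequency between tasks; that part is specific to the paper and is not reproduced here.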
Multicast On-chip Traffic Analysis Targeting Manycore NoC Design
S. Abadal, Albert Mestres, E. Alarcón, A. Cabellos-Aparicio, R. Martínez
{"title":"Multicast On-chip Traffic Analysis Targeting Manycore NoC Design","authors":"S. Abadal, Albert Mestres, E. Alarcón, A. Cabellos-Aparicio, R. Martínez","doi":"10.1109/PDP.2015.26","DOIUrl":"https://doi.org/10.1109/PDP.2015.26","url":null,"abstract":"The scalability of Network-on-Chip (NoC) designs has become a rising concern as we enter the many core era. Multicast support represents a particular yet relevant case within this context and has been the focus of different research efforts, mainly due to the poor performance of NoCs in the presence of this increasingly important type of traffic. However, most of the proposed schemes have been evaluated using synthetic traffic or within a full system, which is either unrealistic or costly. While traffic models would allow to better assess their performance, existing proposals do not distinguish between unicast and multicast flows and often are bound to a given number of cores. In this paper, a trace-based multicast traffic characterization is presented with the aim to provide guidelines for the modeling of multicast communications in many core settings. To this end, the scaling trends of aspects such as the multicast traffic intensity or the spatiotemporal injection distribution are analyzed. The novelty of this work resides both on its scalability-oriented approach and on the use of correlation metrics to evaluate potential prediction opportunities.","PeriodicalId":285111,"journal":{"name":"2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133035818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 20
Generalized Extraction of Real-Time Parameters for Homogeneous Synchronous Dataflow Graphs
H. Ali, B. Akesson, L. M. Pinho
{"title":"Generalized Extraction of Real-Time Parameters for Homogeneous Synchronous Dataflow Graphs","authors":"H. Ali, B. Akesson, L. M. Pinho","doi":"10.1109/PDP.2015.57","DOIUrl":"https://doi.org/10.1109/PDP.2015.57","url":null,"abstract":"Many embedded multi-core systems incorporate both dataflow applications with timing constraints and traditional real-time applications. Applying real-time scheduling techniques on such systems provides real-time guarantees that all running applications will execute safely without violating their deadlines. However, to apply traditional real-time scheduling techniques on such mixed systems, a unified model to represent both types of applications running on the system is required. Several earlier works have addressed this problem and solutions have been proposed that address acyclic graphs, implicit-deadline models or are able to extract timing parameters considering specific scheduling algorithms. In this paper, we present an algorithm for extracting real-time parameters (offsets, deadlines and periods) that are independent of the schedulability analysis, other applications running in the system, and the specific platform. The proposed algorithm: 1) enables applying traditional real-time schedulers and analysis techniques on cyclic or acyclic Homogeneous Synchronous Dataflow (HSDF) applications with periodic sources, 2) captures overlapping iterations, which is a main characteristic of the execution of dataflow applications, 3) provides a method to assign offsets and individual deadlines for HSDF actors, and 4) is compatible with widely used deadline assignment techniques, such as NORM and PURE. The paper proves the correctness of the proposed algorithm through formal proofs and examples.","PeriodicalId":285111,"journal":{"name":"2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116531509","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 23
Channel Interface: A Primitive Model for Memory Efficient Communication
T. Nanri, T. Soga, Yuichiro Ajima, Yoshiyuki Morie, H. Honda, Taizo Kobayashi, T. Takami, S. Sumimoto
{"title":"Channel Interface: A Primitive Model for Memory Efficient Communication","authors":"T. Nanri, T. Soga, Yuichiro Ajima, Yoshiyuki Morie, H. Honda, Taizo Kobayashi, T. Takami, S. Sumimoto","doi":"10.1109/PDP.2015.83","DOIUrl":"https://doi.org/10.1109/PDP.2015.83","url":null,"abstract":"Though the size of the system is getting larger towards exa-scale computation, the amount of available memory on computing nodes is expected to remain the same or to decrease. Therefore, memory efficiency is becoming an important issue for achieving scalability. This paper pointed out the problem of memory-inefficiency in the de-facto standard parallel programming model, Message Passing Interface (MPI). To solve this problem, the channel interface was introduced in the paper. This interface enables the programmers to appropriately allocate and de-allocate channels so that the program consumes just-enough amount of memory for communication. In addition to that, by limiting the message transfer supported by a channel as simple as possible, the memory consumption and the overhead for handling messages with this interface can be minimal. This paper showed a sample implementation of this interface. Then, the memory efficiency of the implementation is examined by the models of the memory consumption and the performance.","PeriodicalId":285111,"journal":{"name":"2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing","volume":"178 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116638431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Solutions for Processing K Nearest Neighbor Joins for Massive Data on MapReduce
Ge Song, Justine Rochas, F. Huet, F. Magoulès
{"title":"Solutions for Processing K Nearest Neighbor Joins for Massive Data on MapReduce","authors":"Ge Song, Justine Rochas, F. Huet, F. Magoulès","doi":"10.1109/PDP.2015.79","DOIUrl":"https://doi.org/10.1109/PDP.2015.79","url":null,"abstract":"Given a point p and a set of points S, the kNN operation finds the k closest points to p in S. It is a computational intensive task with a large range of applications such as knowledge discovery or data mining. However, as the volume and the dimension of data increase, only distributed approaches can perform such costly operation in a reasonable time. Recent works have focused on implementing efficient solutions using the MapReduce programming model because it is suitable for large scale data processing. Also, it can easily be executed in a distributed environment. Although these works provide different solutions to the same problem, each one has particular constraints and properties. There is no readily available comparison to help users choose the one most appropriate for their needs. This is the problem we address in this work. Firstly, we show that all kNN implementations go through a common workflow, which we use as a basis for classification. Secondly, we describe precisely the different techniques published so far. And lastly, we provide a set of objective criteria that can be used to make informed decisions.","PeriodicalId":285111,"journal":{"name":"2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing","volume":"55 88 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124776428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 29
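The kNN operation defined in the abstract above (given a point p and a set S, find the k points of S closest to p) can be sketched on a single machine as follows. This is a brute-force illustration assuming Euclidean distance; it is not one of the MapReduce techniques the paper surveys, and the names `knn` and `knn_join` are hypothetical.

```python
import heapq
import math


def knn(p, points, k):
    """Return the k points in `points` closest to `p`, nearest first."""
    # heapq.nsmallest tracks only k candidates while scanning the whole set.
    return heapq.nsmallest(k, points, key=lambda q: math.dist(p, q))


def knn_join(r, s, k):
    """Brute-force kNN join: map each point of r to its k nearest points in s."""
    return {tuple(p): knn(p, s, k) for p in r}


if __name__ == "__main__":
    s = [(1, 1), (5, 5), (0, 1), (3, 0)]
    print(knn((0, 0), s, 2))
    print(knn_join([(0, 0), (5, 4)], s, 1))
```

The join computes one distance for every pair of points drawn from the two sets, which is the cost that motivates the distributed approaches the paper compares.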
Simultaneous Optimisation of Task Mapping and Priority Assignment for Real-Time Embedded NoCs
M. Sayuti, L. Indrusiak
{"title":"Simultaneous Optimisation of Task Mapping and Priority Assignment for Real-Time Embedded NoCs","authors":"M. Sayuti, L. Indrusiak","doi":"10.1109/PDP.2015.84","DOIUrl":"https://doi.org/10.1109/PDP.2015.84","url":null,"abstract":"In a hard real-time embedded system based on a fixed priority pre-emptive Networks-On-Chip (NoC), the provision of guaranteed services may require pre-emption of some tasks and messages based on their priorities. In a worst case scenario, the interference imposed to low priority tasks can cause substantial computation and communication delays that can exceed their deadlines, leading to an unschedulable system. In a task mapping optimisation process, changing task mappings does not always produce a schedulable task mapping. In this paper, we propose an approach that simultaneously optimises task mapping and priority assignment, aiming to find a configuration that can completely satisfy the timing constraints of the system. Differing to the state-of-the-art, our approach takes into account the overall schedulability of the system by considering the worst-case end-to-end response time of all mapped tasks. As a result, we are able to increase the quality of task mappings at the same time improving the convergence of the optimisation algorithm, better than the previous approaches that solely focus on the task mapping optimisation to make the system schedulable.","PeriodicalId":285111,"journal":{"name":"2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125122297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Portable Framework for Real-Time Parallel Image Processing on High Performance Embedded Platforms
Clemens Eisserer
{"title":"Portable Framework for Real-Time Parallel Image Processing on High Performance Embedded Platforms","authors":"Clemens Eisserer","doi":"10.1109/PDP.2015.31","DOIUrl":"https://doi.org/10.1109/PDP.2015.31","url":null,"abstract":"The trend to efficient, however more complex, multicore designs has also reached the world of Digital Signal Processors (DSP), a field where typically low-level programming has been prevalent. To overcome the additional complexity of programming multi-core and multi-chip DSP systems, we present an object-oriented framework for task-based parallel programming on the highly power efficient Texas Instruments TMS320C6678 platform. Our framework incorporates hardware architectural details of this platform such as DMA units in a high-level manner, while maintaining portability - guiding the path for algorithmic designers from PCs to embedded DSP platforms. The whole framework has been designed and implemented with real-time requirements and low overhead in mind, which is crucial for the acceptance of higher-level solutions on embedded systems.","PeriodicalId":285111,"journal":{"name":"2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125184181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Concentration and Its Impact on Mesh and Torus-Based NoC Performance
S. Loucif
{"title":"Concentration and Its Impact on Mesh and Torus-Based NoC Performance","authors":"S. Loucif","doi":"10.1109/PDP.2015.35","DOIUrl":"https://doi.org/10.1109/PDP.2015.35","url":null,"abstract":"This paper investigates the effects of concentration on the performance of k-ary n-cubes. Simulation results indicate that only large ratios of packet length-to-average hop-count are in favor of concentrated mesh and torus. The Cmesh takes full advantage of its high channel bandwidth to outperform Ctorus. Moreover, non-local traffic suffers more from performance bottleneck than local traffic at routers. Providing dedicated input ports, one for each IP, at routers, reduces the average packet latency compared to a configuration with a single input port shared by all IP cores of the cluster.","PeriodicalId":285111,"journal":{"name":"2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115258974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
An Application of GPU Parallel Computing to Power Flow Calculation in HVDC Networks
Przemyslaw Blaskiewicz, M. Zawada, P. Balcerek, P. Dawidowski
{"title":"An Application of GPU Parallel Computing to Power Flow Calculation in HVDC Networks","authors":"Przemyslaw Blaskiewicz, M. Zawada, P. Balcerek, P. Dawidowski","doi":"10.1109/PDP.2015.110","DOIUrl":"https://doi.org/10.1109/PDP.2015.110","url":null,"abstract":"Numerical computation on GPU has become easily accessible and offers good computation power for relatively little cost. Recently an application of Newton-Raphson method for analyzing power flow in multi-terminal high-voltage direct current (HVDC) networks was proposed and shown to have good results on five terminal grids. Since this method involves costly matrix operation, especially the inverse, increasing the number of terminals in the grid yields prohibitively large execution times in sequential operation. To address this issue, we adjust the algorithm so that it benefits from parallel computation and test our approach on recent GPU from NVidia. We give experimental results for grids up to few thousand terminals and show that execution time is still acceptable for real applications. We also provide some benchmarks of the GPU computation compared with other platforms.","PeriodicalId":285111,"journal":{"name":"2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115362605","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
A Green Perspective on Structured Parallel Programming
M. Danelutto, M. Torquati, P. Kilpatrick
{"title":"A Green Perspective on Structured Parallel Programming","authors":"M. Danelutto, M. Torquati, P. Kilpatrick","doi":"10.1109/PDP.2015.116","DOIUrl":"https://doi.org/10.1109/PDP.2015.116","url":null,"abstract":"Structured parallel programming, and in particular programming models using the algorithmic skeleton or parallel design pattern concepts, are increasingly considered to be the only viable means of supporting effective development of scalable and efficient parallel programs. Structured parallel programming models have been assessed in a number of works in the context of performance. In this paper we consider how the use of structured parallel programming models allows knowledge of the parallel patterns present to be harnessed to address both performance and energy consumption. We consider different features of structured parallel programming that may be leveraged to impact the performance/energy trade-off and we discuss a preliminary set of experiments validating our claims.","PeriodicalId":285111,"journal":{"name":"2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124611520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6