2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP): Latest Papers, Page 6

Implementing the Open Community Runtime for Shared-Memory and Distributed-Memory Systems

2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP) Pub Date : 2016-04-04 DOI: 10.1109/PDP.2016.81

J. Dokulil, Martin Sandrieser, S. Benkner

Citations: 17

Towards a General Framework for Ensuring and Reusing Proofs of Termination Detection in Distributed Computing

2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP) Pub Date : 2016-04-04 DOI: 10.1109/PDP.2016.113

Maha Boussabbeh, M. Tounsi, A. Kacem, M. Mosbah

Abstract: Distributed algorithms are designed to run on interconnected autonomous computing entities to achieve a common task: each entity asynchronously executes the same code and interacts locally with its immediate neighbours. It is widely agreed that the lack of knowledge of the global state makes termination detection one of the most important and complex problems in distributed computing. By relying on refinement, we prove that an algorithm computing a spanning tree with Local Termination Detection (each entity is able to determine only its own termination condition) can be reused and adapted in order to compute the same algorithm with Global Termination Detection (at least one entity is aware that the entire computation is achieved in the network). The main idea relies upon specifying a combination of a well-known algorithm, namely SSP, and the spanning tree algorithm, following a top-down approach. This paper is a starting point towards a general framework for enhancing the termination detection property of distributed algorithms and reusing their proofs.

Citations: 4

A Time Synchronization Protocol for Modular Robots

2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP) Pub Date : 2016-04-04 DOI: 10.1109/PDP.2016.73

André Naz, Benoît Piranda, S. Goldstein, J. Bourgeois

Citations: 9

Estimation Models for NoSQL Database Consistency Characteristics

2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP) Pub Date : 2016-04-04 DOI: 10.1109/PDP.2016.23

A. Burdakov, Y. Grigorev, A. Ploutenko, Eugene Ttsviashchenko

Citations: 6

Evaluation of the Memory Communication Traffic in a Hierarchical Cache Model for Massively-Manycore Processors

2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP) Pub Date : 2016-04-04 DOI: 10.1109/PDP.2016.30

Sharifa Al Khanjari, W. Vanderbauwhede

Abstract: The scaling of semiconductor technologies is leading to processors with increasing numbers of cores. A key enabler in manycore systems is the use of Networks-on-Chip (NoC) as a global communication mechanism. The adoption of NoCs in manycore systems requires a shift in focus from computation to communication, as communication is fast becoming the dominant factor in processor performance. Many researchers have focused on direct communication between cores in the NoC; however, in a manycore processor the communication is actually between the cores and the memory hierarchy. In this work, we investigate the memory communication traffic of shared threads in a hierarchical cache architecture. We argue that the performance scalability for shared-memory applications in a hierarchical cache architecture for systems with thousands of processor cores depends on the distance between threads sharing memory in terms of the cache hierarchy (the "memory distance"). We present latency and throughput results comparing fat quadtree, concentrated mesh and mesh topologies as a function of the "memory distance" between the threads. Our results using the ITRS physical data for 2023 show that the model of thread placement and the distance at which threads are placed significantly affect the NoC performance, and that scale-invariant topologies perform better than flat topologies.
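
One way to make the notion of "memory distance" concrete is to count how many cache levels two threads must climb before they reach a cache they share. The sketch below assumes a quadtree-like hierarchy in which every cache serves four children and cores are numbered so that siblings are contiguous. The function name `memory_distance`, the `fanout` parameter and the numbering scheme are illustrative assumptions, not the paper's definition.

```python
def memory_distance(core_a, core_b, fanout=4):
    """Number of cache levels to climb until both cores share a cache.

    Assumption (hypothetical model): cores are leaves of a complete tree
    with `fanout` children per cache, numbered left to right, so the
    parent of node i at any level is i // fanout.
    """
    if core_a < 0 or core_b < 0:
        raise ValueError("core indices must be non-negative")
    levels = 0
    # Walk both cores up the hierarchy until they meet at a common cache.
    while core_a != core_b:
        core_a //= fanout
        core_b //= fanout
        levels += 1
    return levels


if __name__ == "__main__":
    # Threads on the same core, on sibling cores, and on distant cores.
    for a, b in [(0, 0), (0, 3), (0, 4), (0, 16)]:
        print(a, b, memory_distance(a, b))
```

Under this model, placing two sharing threads on sibling cores keeps their traffic inside the lowest shared cache, while distant placements push it further up the hierarchy.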

Citations: 1

GPU-Accelerated Texture Analysis Using Steerable Riesz Wavelets

2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP) Pub Date : 2016-04-04 DOI: 10.1109/PDP.2016.105

A. Vizitiu, L. Itu, Ranveer Joyseeree, A. Depeursinge, H. Müller, C. Suciu

Citations: 2

Reasoning about Fences and Relaxed Atomics

2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP) Pub Date : 2016-02-17 DOI: 10.1109/PDP.2016.103

Mengda He, Viktor Vafeiadis, S. Qin, J. Ferreira

Abstract: For efficiency reasons, weak (or relaxed) memory is now the norm on modern architectures. To cater for this trend, modern programming languages are adapting their memory models. The new C11 memory model [1] allows several levels of memory weakening, including non-atomics, relaxed atomics, release-acquire atomics, and sequentially consistent atomics. Under such weak memory models, multithreaded programs exhibit more behaviours, some of which would have been inconsistent under the traditional strong (i.e. sequentially consistent) memory model. This makes the task of reasoning about concurrent programs even more challenging. The GPS framework, recently developed by Turon et al. [22], has made a step forward towards tackling this challenge. By integrating ghost states, per-location protocols and separation logic, GPS can successfully verify programs with release-acquire atomics. In this paper, we present a program logic, an enhancement of the GPS framework, that can support the verification of a bigger class of C11 programs, that is, programs with release-acquire atomics, relaxed atomics and release-acquire fences. Key elements of our proposed logic include two new types of assertions, a more expressive resource model and a set of newly-designed verification rules.

Citations: 15

Energy Efficient Scheduling of Real Time Signal Processing Applications through Combined DVFS and DPM

2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP) Pub Date : 2016-02-17 DOI: 10.1109/PDP.2016.15

Erwan Nogues, M. Pelcat, D. Ménard, Alexandre Mercat

Abstract: This paper proposes a framework to design energy efficient signal processing systems. The energy efficiency is provided by combining Dynamic Voltage and Frequency Scaling (DVFS) and Dynamic Power Management (DPM). The framework is based on Synchronous Dataflow (SDF) modeling of signal processing applications. A transformation to a single rate form is performed to expose the application parallelism. An automated scheduling is then performed, minimizing the constraint of energy efficiency and providing DVFS and DPM decisions. This framework uses an architecture model including the number of available cores, the per-actor processing load and the energy per cycle, derived from time and power measurements of modelled applications. After introducing the proposed framework, the energy characterization of big.LITTLE SoC systems is described. A generic approach is presented to generate the energy model of a platform from power measurements as customized polynomials. Finally, the experimental results on a Samsung Exynos 5410 big.LITTLE processor show that the energy optimal execution is not obtained by Linux governors that can execute either as-fast-as-possible or as-slow-as-possible. Instead, the most energy efficient scheduling is obtained by adapting both DVFS and DPM to application needs.
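
The claim that neither the fastest nor the slowest setting is energy optimal can be illustrated with a toy model. The sketch below assumes active power is a cubic polynomial in frequency (a static term plus a dynamic term) and that DPM drops the core to a fixed idle power once the work for the period is done. All coefficients, the frequency list and the function names are made-up illustrations, not measurements or models from the paper.

```python
def period_energy(freq_ghz, work_gcycles, period_s, p_static_w, c_dyn, p_idle_w):
    """Energy in joules for one period: run at freq_ghz, then idle (DPM).

    Returns None when the work cannot finish within the period.
    Assumed power model (hypothetical): P_active = p_static_w + c_dyn * f^3.
    """
    busy_s = work_gcycles / freq_ghz
    if busy_s > period_s:
        return None  # deadline miss: this frequency is not feasible
    p_active_w = p_static_w + c_dyn * freq_ghz ** 3
    return p_active_w * busy_s + p_idle_w * (period_s - busy_s)


def best_frequency(freqs_ghz, **model):
    """Pick the feasible frequency with the lowest energy per period."""
    energies = {f: period_energy(f, **model) for f in freqs_ghz}
    feasible = {f: e for f, e in energies.items() if e is not None}
    return min(feasible, key=feasible.get)


# Made-up platform: 1 Gcycle of work every 2 s, four DVFS operating points.
MODEL = dict(work_gcycles=1.0, period_s=2.0,
             p_static_w=2.1, c_dyn=1.0, p_idle_w=0.1)
FREQS = [0.5, 1.0, 1.5, 2.0]
best = best_frequency(FREQS, **MODEL)

if __name__ == "__main__":
    for f in FREQS:
        print(f, period_energy(f, **MODEL))
    print("best:", best)
```

With these illustrative numbers, running as slowly as possible pays static power for the whole period, running as fast as possible pays a steep dynamic cost, and an intermediate frequency followed by idling uses the least energy.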

Citations: 9

Predicting Performance and Power Consumption of Parallel Applications

2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP) Pub Date : 2016-02-01 DOI: 10.1109/PDP.2016.41

D. D. Sensi

Abstract: Current architectures provide many control knobs for the reduction of power consumption of applications, like reducing the number of used cores or scaling down their frequency. However, choosing the right values for these knobs in order to satisfy requirements on performance and/or power consumption is a complex task, and trying all the possible combinations of these values is an infeasible solution since it would require too much time. For this reason, there is a need for techniques that allow an accurate estimation of the performance and power consumption of an application when a specific configuration of the control knob values is used. Usually, this is done by executing the application with different configurations and by using this information to predict its behaviour when the values of the knobs are changed. However, since this is a time-consuming process, we would like to execute the application in the fewest configurations possible. In this work, we consider as control knobs the number of cores used by the application and the frequency of these cores. We show that on most Parsec benchmark programs, by executing the application in 1% of the total possible configurations and by applying a multiple linear regression model, we are able to achieve an average accuracy of 96% in predicting its execution time and power consumption in all the other possible knob combinations.
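
The idea of fitting a multiple linear regression on a few sampled configurations and predicting the rest can be sketched as follows. The abstract does not state which predictors the model uses, so the features here (an intercept, 1/cores and 1/frequency) are an assumption, and the "measurements" are synthetic, noise-free values generated from a made-up formula rather than Parsec data. All function names are hypothetical.

```python
def fit_least_squares(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    n = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    b = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(a[k][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for k in range(col + 1, n):
            factor = a[k][col] / a[col][col]
            for j in range(col, n):
                a[k][j] -= factor * a[col][j]
            b[k] -= factor * b[col]
    # Back substitution.
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (b[i] - tail) / a[i][i]
    return w


def features(cores, freq_ghz):
    # Assumed predictors: intercept, inverse core count, inverse frequency.
    return [1.0, 1.0 / cores, 1.0 / freq_ghz]


def predict(weights, cores, freq_ghz):
    return sum(w * x for w, x in zip(weights, features(cores, freq_ghz)))


def synthetic_time(cores, freq_ghz):
    # Made-up ground truth standing in for real measurements.
    return 2.0 + 8.0 / cores + 4.0 / freq_ghz


# A handful of sampled (cores, GHz) configurations used for training.
SAMPLED = [(1, 1.0), (2, 2.0), (4, 1.0), (8, 2.0), (2, 1.5)]
weights = fit_least_squares([features(c, f) for c, f in SAMPLED],
                            [synthetic_time(c, f) for c, f in SAMPLED])
predicted = predict(weights, 4, 2.0)  # configuration not in SAMPLED

if __name__ == "__main__":
    print(weights, predicted)
```

Because the synthetic data is exactly linear in the chosen features, the fit recovers it almost perfectly. Real measurements are noisy, and prediction accuracy then depends on how well the chosen features match the application's behaviour.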

Citations: 34

Using Nested Graphs to Distribute Parallel and Distributed Multi-agent Systems

2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP) Pub Date : 2016-02-01 DOI: 10.1109/PDP.2016.91

A. Rousset, B. Herrmann, C. Lang, L. Philippe, Hadrien Bride

Citations: 2