Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing: Latest Papers

Incorporating memory layout in the modeling of message passing programs
F. Seinstra, D. Koelma
{"title":"Incorporating memory layout in the modeling of message passing programs","authors":"F. Seinstra, D. Koelma","doi":"10.1109/EMPDP.2002.994294","DOIUrl":"https://doi.org/10.1109/EMPDP.2002.994294","url":null,"abstract":"One of the most fundamental tasks of an automatic parallelization tool is to find an optimal domain decomposition for a given application. For regular domain problems (such as simple matrix manipulations) this task may seem trivial. However, communication costs in message passing programs often significantly depend on the memory layout of data blocks to be transmitted. As a consequence, straightforward domain decompositions may be non-optimal. In this paper we introduce a new point-to-point communication model (called P-3PC) that is specifically designed to overcome this problem. In comparison with related models (e.g., LogGP) P-3PC is similar in complexity, but more accurate in many situations. Although the model is aimed at MPI's standard point-to-point operations, it is applicable to similar message passing definitions as well. The effectiveness of the model is tested in a framework for automatic parallelization of imaging applications. Experiments are performed on two Beowulf-type systems, each having a different interconnection network, and a different MPI implementation. 
Results show that, where other models frequently fail, P-3PC correctly predicts the communication costs related to any type of domain decomposition.","PeriodicalId":126071,"journal":{"name":"Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128224436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Increasing the adaptivity of routing algorithms for k-ary n-cubes
Elvira Baydal, P. López, J. Duato
{"title":"Increasing the adaptivity of routing algorithms for k-ary n-cubes","authors":"Elvira Baydal, P. López, J. Duato","doi":"10.1109/EMPDP.2002.994333","DOIUrl":"https://doi.org/10.1109/EMPDP.2002.994333","url":null,"abstract":"In this paper, we show that routing algorithms may exploit not only the flexibility obtained by crossing network dimensions in any order but also that obtained in the same network dimension, thanks to the availability of bidirectional channels. We analyze the behavior of adaptive routing algorithms both for deadlock avoidance and recovery, exploiting this increased routing flexibility, and compare them with previous proposals in order to evaluate the contribution of the additional routing freedom on network performance. Simulation results show that this simple improvement in the routing algorithm allows one to achieve throughput improvements of up to 45% in networks with low radix, for a uniform distribution of message destinations.","PeriodicalId":126071,"journal":{"name":"Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing","volume":"123 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127049798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
On improving the performance of data partitioning oriented parallel irregular reductions
E. Gutiérrez, O. Plata, E. Zapata
{"title":"On improving the performance of data partitioning oriented parallel irregular reductions","authors":"E. Gutiérrez, O. Plata, E. Zapata","doi":"10.1109/EMPDP.2002.994330","DOIUrl":"https://doi.org/10.1109/EMPDP.2002.994330","url":null,"abstract":"Different parallelization techniques for reductions have been classified in this paper into two classes: LPO (loop partitioning-oriented techniques) and DPO (data partitioning-oriented techniques). We have analyzed both classes in terms of a set of performance properties: data locality, memory overhead, parallelism and workload balancing. We propose several techniques to increase the exploited parallelism and to introduce load balancing into a DPO method. Regarding parallelism, the solution is based on the partial expansion of the reduction array. For load balancing, the first technique is generic, as it can deal with any kind of load unbalance present in the problem domain. The second technique handles a special case of load unbalancing appearing when there are a large number of write operations on small regions of the reduction arrays. 
Efficient implementations of the proposed optimizing solutions for the DWA-LIP (data write affinity-loop index prefetching) DPO method are presented, experimentally tested on static and dynamic kernel codes, and compared with other parallel reduction methods.","PeriodicalId":126071,"journal":{"name":"Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131020595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
On the impossibility of implementing perpetual failure detectors in partially synchronous systems
M. Larrea, Antonio Fernández, S. Arévalo
{"title":"On the impossibility of implementing perpetual failure detectors in partially synchronous systems","authors":"M. Larrea, Antonio Fernández, S. Arévalo","doi":"10.1109/EMPDP.2002.994241","DOIUrl":"https://doi.org/10.1109/EMPDP.2002.994241","url":null,"abstract":"In this paper we study the implementability of different classes of failure detectors in several models of partial synchrony. We show that no failure detector with perpetual accuracy (namely, P, Q, S, and W) can be implemented in any of the models of partial synchrony proposed previously in systems with even a single failure. We also show that, in these models of partial synchrony, it is necessary for a majority of correct processes to implement a failure detector of class Θ.","PeriodicalId":126071,"journal":{"name":"Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing","volume":"268 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120895683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
Removing the latency overhead of the ITB mechanism in COWs with source routing
J. Flich, Manuel P. Malumbres, P. López, J. Duato
{"title":"Removing the latency overhead of the ITB mechanism in COWs with source routing","authors":"J. Flich, Manuel P. Malumbres, P. López, J. Duato","doi":"10.1109/EMPDP.2002.994334","DOIUrl":"https://doi.org/10.1109/EMPDP.2002.994334","url":null,"abstract":"Clusters of workstations (COWs) are becoming increasingly popular as a cost-effective alternative to parallel computers. The in-transit buffer (ITB) mechanism can improve network performance when applied to COWs with irregular topology and source routing. This mechanism considerably improves the performance of this kind of network when compared to current source routing algorithms; however, it introduces a latency penalty. An implementation of this mechanism was performed, showing that the latency overhead of the mechanism may be noticeable, especially for short messages and at low network loads. In this paper, we analyze in detail the latency overhead of ITBs, proposing several mechanisms to reduce, hide and remove it. Firstly, we show, by simulation, the effect of an ITB implementation that is much slower than the one implemented. Then we propose three mechanisms that try to overcome the latency penalty. All the mechanisms are simple and can be easily implemented; also, they are out of the critical path of the ITB packet-processing procedure. 
The results show very good behaviour of the proposed mechanisms, considerably reducing or even completely removing the latency overhead.","PeriodicalId":126071,"journal":{"name":"Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121522378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Nodes bearing grudges: towards routing security, fairness, and robustness in mobile ad hoc networks
S. Buchegger, J. Boudec
{"title":"Nodes bearing grudges: towards routing security, fairness, and robustness in mobile ad hoc networks","authors":"S. Buchegger, J. Boudec","doi":"10.1109/EMPDP.2002.994321","DOIUrl":"https://doi.org/10.1109/EMPDP.2002.994321","url":null,"abstract":"Devices in mobile ad hoc networks work as network nodes and relay packets originated by other nodes. Mobile ad hoc networks can work properly only if the participating nodes cooperate in routing and forwarding. For individual nodes it might be advantageous not to cooperate. The new routing protocol extensions presented in this paper make it possible to detect and isolate misbehaving nodes, thus making denying cooperation undesirable. In the presented scheme, trust relationships and routing decisions are made based on experienced, observed, or reported routing and forwarding behavior of other nodes. A hybrid scheme of selective altruism and utilitarianism is presented to strengthen mobile ad hoc network protocols in their resistance to security attacks, while aiming at keeping network throughput high. This paper focuses particularly on the network layer using the dynamic source routing (DSR) protocol as an example.","PeriodicalId":126071,"journal":{"name":"Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134414926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 462
Geometric scheduling of 2-D UET-UCT uniform dependence loops
Ioannis Drositis, T. Andronikos, G. Manis, G. Papakonstantinou, N. Koziris
{"title":"Geometric scheduling of 2-D UET-UCT uniform dependence loops","authors":"Ioannis Drositis, T. Andronikos, G. Manis, G. Papakonstantinou, N. Koziris","doi":"10.1109/EMPDP.2002.994305","DOIUrl":"https://doi.org/10.1109/EMPDP.2002.994305","url":null,"abstract":"Finding an optimal time schedule is one of the primary tasks in the area of parallelizing uniform dependence loops. Due to the existence of dependence vectors, the index space of such a loop is split into subspaces of points that can be executed at different time instances. The geometric representation of these sets forms certain polygonal shapes called patterns, with special attributes and characteristics. In this paper we present a scheduling technique that is based on the geometric attributes of the index space and the dependence vector set. Our strategy can be applied to architectures that consider the unit execution-zero communication delay (UET) or the unit execution-unit communication (UET-UCT) model, as a new method for transforming UET-UCT problems to UET equivalent ones is presented.","PeriodicalId":126071,"journal":{"name":"Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114693023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Model oriented profiling of parallel programs
J. González, C. León, J. R. García, C. Rodríguez, J. Rodríguez, F. D. Sande, A. M. Printista
{"title":"Model oriented profiling of parallel programs","authors":"J. González, C. León, J. R. García, C. Rodríguez, J. Rodríguez, F. D. Sande, A. M. Printista","doi":"10.1109/EMPDP.2002.994212","DOIUrl":"https://doi.org/10.1109/EMPDP.2002.994212","url":null,"abstract":"The prediction analysis model presented extends BSP to cover both oblivious synchronization and group partitioning. These generalizations imply that different processors may finish the same superstep at different times. The other consideration is that, even if the numbers of individual communication or computation operations in two stages are the same, the actual times for these two stages may differ. These differences are due to the separate nature of the operations or to the particular pattern followed by the messages. Even worse, the assumption that a constant number of machine instructions takes constant time is far from the truth. Current memory hierarchies imply that memory access times vary from a few cycles to several thousand. A natural proposal is to associate a different proportionality constant with each basic block, and analogously, to associate different latencies and bandwidths with each \"communication block\". Unfortunately, to use this approach implies that the evaluation parameters not only depend on the given architecture, but also reflect algorithm characteristics. Such parameter evaluation must be done for every algorithm. This is a heavy task, implying experiment design, timing, statistics, pattern recognition and multi-parameter fitting algorithms. Software support is required. We have developed a compiler that takes as source a C program annotated with complexity formulas and produces as output an instrumented code. 
The trace files obtained from the execution of the resulting code are analyzed with an interactive interpreter giving us, among other information, the values of those parameters.","PeriodicalId":126071,"journal":{"name":"Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114963581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Flexible service provision considering specific customer resource needs
D. Thißen
{"title":"Flexible service provision considering specific customer resource needs","authors":"D. Thißen","doi":"10.1109/EMPDP.2002.994283","DOIUrl":"https://doi.org/10.1109/EMPDP.2002.994283","url":null,"abstract":"The development of global networks like the Internet has opened new possibilities for the co-operation of various organisations. A computing resource can be offered by one organisation and it can be remotely used by customers, i.e. other organisations or individuals, to perform some task or access some service on it. Such a resource not only has to be provided for a suitable price but, additionally, it has to be deployed in an efficient way, promising a good performance in service provision to satisfy the customers. Because existing infrastructures have to be integrated and used in the service provision process, it becomes necessary to develop new concepts for the management of the arising service-oriented distributed systems and the resources involved. This paper discusses a mechanism for the performance management of services in distributed environments. A service trader is used as a central component, supporting a customer in choosing a suitable service while considering the global state of the distributed system's resources using a load balancer. Management proxies encapsulate services or service groups and observe the performance and availability characteristics of the resources involved in a service usage process to fulfil the quality characteristics of a mediated service. 
This approach is designed to cause minimal involvement of service providers and customers in the selection and management process.","PeriodicalId":126071,"journal":{"name":"Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125015400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Dynamically reconfigurable system-on-programmable-chip
Heiko Kalte, D. Langen, E. Vonnahme, A. Brinkmann, U. Rückert
{"title":"Dynamically reconfigurable system-on-programmable-chip","authors":"Heiko Kalte, D. Langen, E. Vonnahme, A. Brinkmann, U. Rückert","doi":"10.1109/EMPDP.2002.994277","DOIUrl":"https://doi.org/10.1109/EMPDP.2002.994277","url":null,"abstract":"Today's high-density FPGAs and intellectual property (IP) components enable the integration of complex systems in one programmable chip. New design strategies and concepts have to be developed in order to utilize the new system-level integration facilities. The approach introduced in this paper describes the implementation of a communication infrastructure that provides a number of on-chip IP-sockets. By using the FPGA feature of partial dynamic reconfiguration, different IP components can be plugged into these sockets at run-time. This leads to a reconfigurable system that can be adapted to varying demands. In this context, we designed a 32-bit RISC processor and an AMBA (Advanced Microcontroller Bus Architecture) on-chip interconnection bus. Finally, we mapped these components on to a reconfigurable system-level FPGA. The resulting hardware sizes and the utilization of the FPGA's resources are presented.","PeriodicalId":126071,"journal":{"name":"Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing","volume":"109 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134482659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 36