Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238): latest papers, page 5

Mapping instruction sequences onto EPOM-processor arrays: a framework for parallel data processing

Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238) Pub Date : 1998-12-17 DOI: 10.1109/HIPC.1998.737977

Jean-Paul Theis, Harald Schlimper

Abstract: The paper introduces an optimized methodology for mapping instruction sequences (ISs) onto EPOM-processor arrays. Its new features result from a systematic specification and exploitation of both instruction-level and processor-level parallelism: the ultra-low granularity of ISs requires allocating and scheduling individual instructions onto the given processor array. Moreover, the methodology is complete in the sense that it considers both array bus bandwidths and processor resource constraints. It is based on two concepts: 1) instruction sequences (ISs), which represent a generalized form of directed cyclic graphs (DCGs) and allow efficient specification of algorithm parallelism, with graph nodes representing instructions from the instruction set of a target processor architecture (J.P. Theis, 1997); 2) the EPOM-processor architecture, which represents an optimized target VLIW processor architecture for parallel implementation of ISs (J.P. Theis and L. Thiele, 1996) and is especially suited for parallel image/multimedia processing (J.P. Theis and L. Thiele, 1995). Special attention is paid to optimizing the mapping of ISs onto EPOM-processor arrays, with minimization of algorithm execution time as the optimization goal. The mapping methodology is partially based on integer linear programming and heuristic techniques. The solution time complexity is substantially reduced by a two-phase hierarchical model that decouples processor array allocation from subsequent scheduling. The efficiency of the mapping methodology was validated through experimental results on ISs of well-known algorithm routines.

Citations: 0

Distributed routing balancing for interconnection network communication

Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238) Pub Date : 1998-12-17 DOI: 10.1109/HIPC.1998.737996

I. Garcés, Daniel Franco, E. Luque

Abstract: An efficient design of the interconnection network is crucial because of its impact on parallel computer performance. The design should include a high-speed routing scheme that minimises contention and avoids the formation of hot-spots. We have developed a new method to uniformly balance communication traffic over the interconnection network, called distributed routing balancing (DRB), which is based on limited and load-controlled path expansion in order to maintain a low message latency. The method uniformly distributes the communication load between all links of the interconnection network and maintains latency control, provided that total bandwidth requirements do not exceed the total available link bandwidth in the interconnection network. DRB defines how to create alternative paths to expand single paths (expanded path definition) and when to use them depending on traffic load (expanded path selection, carried out by DRB routing). Some conclusions of the experimentation and comparisons with existing methods are given. It is demonstrated that DRB effectively balances network traffic.
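The idea of load-controlled path expansion mentioned in the abstract above can be illustrated with a small sketch. This is not the DRB algorithm from the paper: the function names, the `low`/`high` latency thresholds, and the linear expansion rule are all assumptions made for illustration. It only shows the general pattern of opening more alternative paths as the observed latency on the original path rises, then sending on the least-loaded path among those currently open.

```python
def active_path_count(latency, low, high, max_paths):
    """Hypothetical rule: how many alternative paths to open for a latency.

    Below `low` only the original path is used; above `high` all paths
    are used; in between the count grows linearly with latency.
    """
    if latency <= low:
        return 1
    if latency >= high:
        return max_paths
    frac = (latency - low) / (high - low)
    return 1 + int(frac * (max_paths - 1))


def pick_path(paths, latencies, low, high):
    """Choose a path for the next message.

    `paths[0]` is the original path and the rest are alternatives;
    `latencies[i]` is the most recently observed latency on `paths[i]`.
    """
    # Expansion is driven by the latency seen on the original path.
    n = active_path_count(latencies[0], low, high, len(paths))
    # Among the open paths, send on the one with the lowest latency.
    best = min(range(n), key=lambda i: latencies[i])
    return paths[best]


if __name__ == "__main__":
    paths = ["direct", "alt1", "alt2"]
    print(pick_path(paths, [1.0, 0.5, 0.2], low=2.0, high=10.0))
    print(pick_path(paths, [6.0, 3.0, 1.0], low=2.0, high=10.0))
    print(pick_path(paths, [12.0, 3.0, 1.0], low=2.0, high=10.0))
```

Under light load the sketch stays on the original path even when an alternative looks faster, which mirrors the "limited" aspect of the expansion: extra paths are opened only when the load calls for them.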

Citations: 10

Measurement-based modeling and analysis methodology for characterizing parallel I/O performance

Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238) Pub Date : 1998-12-17 DOI: 10.1109/HIPC.1998.738013

S. Sharma, R. Iyer

Citations: 0

Extrapolation in distributed adaptive integration

Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238) Pub Date : 1998-12-17 DOI: 10.1109/HIPC.1998.737975

E. Doncker, Ajay Gupta, Rodger Zanny, J. Maile

Citations: 2

A comparative study of some network subsystem organizations

Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238) Pub Date : 1998-12-17 DOI: 10.1109/HIPC.1998.738019

D. Ponomarev, K. Ghose

Citations: 2

A simple optimal list ranking algorithm

Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238) Pub Date : 1998-12-17 DOI: 10.1109/HIPC.1998.737971

A. Ranade

Citations: 12

Performance-driven design and redesign of high-speed local area networks

Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238) Pub Date : 1998-12-17 DOI: 10.1109/HIPC.1998.738016

C. Ravikumar, Dilip R. Pandit, A. Mishra

Abstract: Although distributed computing over a network of computers has become a reality, its success mainly depends on the performance of the underlying network. In this paper, we consider the problem of designing a local area network with specified cost and performance constraints. The cost and performance of a local area network (LAN) are directly related to its topology. Using a priori knowledge of the approximate number of users of the network and the kind of communication traffic that must be supported, the designer can optimize the design of a LAN for superior performance. Design decisions include the number of LAN segments, the number of bridges, the assignment of users to segments, and the method of interconnecting the segments through bridges. In the case of ATM networks, the decisions concern the number of ATM switches, the assignment of hosts to switches, and the way to connect switches through cross-connects. While assigning too many users to the same segment may cause large delays due to the sharing of network bandwidth, splitting the LAN into too many segments will increase the cost of the LAN. We report a greedy heuristic algorithm for local area network design. We propose a method to construct good initial solutions to the topology design problem using a heuristic based on the three-opt technique for solving the travelling salesperson problem. Our experimental results indicate that the heuristic algorithm finds good solutions.
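The trade-off described in the abstract above, between overloading a segment and paying for too many segments, can be made concrete with a small greedy sketch. This is not the heuristic reported in the paper: it is a plain first-fit-decreasing assignment, and the `loads` mapping, the per-segment `capacity`, and the function name are hypothetical. Each user is placed on the first segment that still has room, and a new segment is opened only when none fits.

```python
def assign_users(loads, capacity):
    """Greedily assign users to LAN segments (first-fit decreasing).

    `loads` maps a user name to its expected traffic load and `capacity`
    is the traffic one segment can carry; both are assumed inputs.
    Returns a list of segments, each a dict with "users" and "load".
    """
    segments = []
    # Place the heaviest users first so they are not left stranded.
    for user, load in sorted(loads.items(), key=lambda kv: kv[1], reverse=True):
        if load > capacity:
            raise ValueError(f"user {user!r} exceeds segment capacity")
        for seg in segments:
            if seg["load"] + load <= capacity:
                seg["users"].append(user)
                seg["load"] += load
                break
        else:
            # No existing segment has room: open a new one (extra cost).
            segments.append({"users": [user], "load": load})
    return segments


if __name__ == "__main__":
    demo = {"a": 6, "b": 5, "c": 4, "d": 3, "e": 2}
    for seg in assign_users(demo, capacity=10):
        print(seg["users"], seg["load"])
```

A greedy pass like this keeps the segment count low while bounding per-segment load, but it ignores the traffic between specific pairs of users, which a topology design method would need to account for.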

Citations: 7

Efficient address sequence generation for two-level mappings in High Performance Fortran

Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238) Pub Date : 1998-12-17 DOI: 10.1109/HIPC.1998.737981

J. Ramanujam, A. Venkatachar, S. Dutta

Citations: 2

Strategies for parallel implementation of a global spectral atmospheric general circulation model

Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238) Pub Date : 1998-12-17 DOI: 10.1109/HIPC.1998.738021

R. Nanjundiah

Citations: 6

Skew-insensitive parallel algorithms for relational join

Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238) Pub Date : 1998-12-17 DOI: 10.1109/HIPC.1998.738010

K. Alsabti, S. Ranka

Citations: 6