{"title":"A detailed analysis of random polling dynamic load balancing","authors":"P. Sanders","doi":"10.1109/ISPAN.1994.367176","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367176","url":null,"abstract":"Dynamic load balancing is crucial for the performance of many parallel algorithms. Random polling, a simple randomized load balancing algorithm, has proved to be very efficient in practice for applications like parallel depth first search. This paper presents a detailed analysis of the algorithm taking into account many aspects of the underlying machine and the application to be load balanced. It derives tight scalability bounds which are for the first time able to explain the superior performance of random polling analytically. In some cases, the algorithm even turns out to be optimal. Some of the proof techniques employed might also be useful for the analysis of other parallel algorithms.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125046687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distributed validation of massively parallel machines","authors":"C. Aktouf, O. Benkahla, C. Robach","doi":"10.1109/ISPAN.1994.367183","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367183","url":null,"abstract":"In this paper, a distributed algorithm for validating message-passing machines is presented and evaluated. Our approach is based on adaptive distributed diagnosis of multiprocessor systems in a user environment where a full self-diagnosis is not needed. We analyze the algorithm performance using a model based on an open queueing network.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"18 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123281183","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Message-based efficient remote memory access on a highly parallel computer EM-X","authors":"Yuetsu Kodama, H. Sakane, M. Sato, S. Sakai, Y. Yamaguchi","doi":"10.1109/ISPAN.1994.367154","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367154","url":null,"abstract":"Communication latency is central to multiprocessor design. This report presents the design principles of the EM-X multiprocessor for tolerating communication latency. A multi-threading principle is built into the EM-X to overlap communication and computation for latency tolerance. In particular, we present two types of hardware support for remote memory access: (1) priority-based packet scheduling for thread invocation, and (2) a direct remote memory access mechanism. The priority-based scheduling policy extends a FIFO-ordered thread invocation policy to adapt to different computational needs. The direct remote memory access based on non-preemptive thread execution is designed to overlap remote memory operations while executing threads. We give two examples to explain our approach. The 80-processor prototype of EM-X is currently being fabricated and is expected to be operational in the near future. Preliminary evaluation indicates that the EM-X can effectively overlap computation and communication, and thereby tolerate communication latency in high-performance parallel computing.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123550046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NCIC's research and development in parallel processing","authors":"Guo-Jie Li","doi":"10.1109/ISPAN.1994.367148","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367148","url":null,"abstract":"National Research Center for Intelligent Computing Systems (NCIC) is the unique national hi-tech R&D center for advanced computing technology in China. We introduce China's Hi-Tech R&D Programme (863 programmes) and NCIC, and we report on the state of the art of parallel processing at NCIC. The article discusses the key technologies being exploited by the representative Chinese R&D teams and the wide applications of parallel computers in China. The key technologies in parallel processing we are attacking are reported and include wormhole routing and other efficient switching techniques, the Easter series, MPP systems, the Dawning series symmetric and multi-thread multiprocessor, parallel operating systems and parallel file systems, parallel compilers and efficient programming tools. Future research directions at NCIC are also mentioned.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"103 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113981721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance evaluation of high-speed self-token ring LAN","authors":"K. Tanno, A. Koyama, Said Mirza, T. Taketa, Syoichi Noguchi","doi":"10.1109/ISPAN.1994.367187","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367187","url":null,"abstract":"The fiber distributed data interface (FDDI) is now widely accepted as the follow-on LAN for IEEE 802.3 (the Ethernet) and 802.5 (the token ring) LANs. However, the advent of even higher-speed LANs is eagerly expected to support higher performance requirements. In this paper, we describe a new ring access control scheme adopting multiple tokens, referred to as the self-token protocol. In the protocol, each station has private tokens, called self-tokens, and has a fixed-length register to prevent packets on a ring from colliding. After an approximate analysis of throughput-transfer delay characteristics, we show that this protocol is attractive and suitable for a gigabit LAN. We also show that the fairness of this protocol remains good for a low number of self-tokens.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114913835","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cost-effective global fault-tolerant multiprocessors","authors":"Chu-Sing Yang, S. Wu","doi":"10.1109/ISPAN.1994.367145","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367145","url":null,"abstract":"The global design approach we propose for fault-tolerant multiprocessors is cost-effective, while assuring full spare utilization during a proper reconfiguration process. Use of task reassignment during reconfiguration helps keep system costs effective and system size easily expandable. Our scheme is topology-independent; that is, it can be applied to any multiprocessor system. As compared with previous work, the proposed scheme can achieve higher or the same reliability at less extra hardware cost.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128899250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Matrix multiplication on the MasPar using distance insensitive communication schemes","authors":"Xiao Sun, F. Lombardi","doi":"10.1109/ISPAN.1994.367179","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367179","url":null,"abstract":"Two parallel matrix multiplication algorithms are presented in this paper. These algorithms execute on a grid with toroidal connections. Their novelty is the utilization of communication schemes which theoretically are distance insensitive; the impact on the communication and computational complexities and costs, compared with a theoretical analysis, is analyzed and evaluated. The proposed algorithms have been implemented on a MasPar array. An experimental evaluation of these algorithms is performed. A comparison is made for matrix multiplication between the MasPar and the SUN-4/390.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"129 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114000908","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Contention sensitive fault-tolerant routing algorithms for hypercubes","authors":"R. Srinivasan, V. Chaudhary, S. Mahmud","doi":"10.1109/ISPAN.1994.367146","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367146","url":null,"abstract":"We present two new fault tolerant routing algorithms for hypercubes. The first algorithm requires only local knowledge of the faults whereas the second algorithm requires global knowledge. Unlike previous fault tolerant routing algorithms, our algorithms take into consideration the dynamic conditions (link contention) of the network. We have shown that checking for dynamic conditions in fault tolerant algorithms is essential. Performance evaluation by extensive simulation of our algorithms and other fault tolerant routing algorithms shows that ours are better than previous algorithms by as much as 50% and 500% in time and space, respectively. We also observed that global information about the location of faults does not give us additional benefit. This observation is true regardless of the consideration of the dynamic conditions in the network.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122749830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Overview of the JUMP-1, an MPP prototype for general-purpose parallel computations","authors":"K. Hiraki, H. Amano, M. Kuga, T. Sueyoshi, T. Kudoh, H. Nakashima, H. Nakajo, Hideo Matsuda, T. Matsumoto, S. Mori","doi":"10.1109/ISPAN.1994.367136","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367136","url":null,"abstract":"We describe the basic architecture of JUMP-1, an MPP prototype developed through a collaboration between 7 universities. The proposed architecture can exploit the high performance of coarse-grained RISC processors in connection with flexible fine-grained operations such as distributed shared memory, versatile synchronization and message communications.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129661227","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel graph isomorphism detection with identification matrices","authors":"Lin Chen","doi":"10.1109/ISPAN.1994.367158","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367158","url":null,"abstract":"In this paper, we investigate some properties of identification matrices and exhibit some uses of identification matrices in studying the graph isomorphism problem, a well-known long-standing open problem. We show that, given two m×n identification matrices representing two graphs according to a certain relation, isomorphism can be decided efficiently in parallel if an m×(n-c) submatrix, for a constant c, satisfies the consecutive/circular 1's property. The result presented here significantly broadens the class of graphs for which there are known efficient parallel isomorphism testing algorithms.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129501479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}