N. Saito, H. Tokuda, T. Hagino, S. Oikawa, A. Yonezawa, S. Matsuoka, S. Inohara, Y. Tada, H. Sunahara, S. Ishii, Etsuya Shibayama, Y. Shinoda
{"title":"Comprehensive operating system for highly parallel machine","authors":"N. Saito, H. Tokuda, T. Hagino, S. Oikawa, A. Yonezawa, S. Matsuoka, S. Inohara, Y. Tada, H. Sunahara, S. Ishii, Etsuya Shibayama, Y. Shinoda","doi":"10.1109/ISPAN.1994.367135","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367135","url":null,"abstract":"This paper describes the design and implementation of a comprehensive operating system (COS) for a highly parallel machine. This is one of the research themes in the Japanese University Highly Parallel Research Project, supported by the Special Grant of the Research Funds of the Ministry of Education. The target machine, JUMP-1, was originally developed in this project. Several design policies are discussed, and the user service classes are defined. The COS architecture is based on a microkernel and includes both a Partition OS and a Service OS. The development environment is also discussed briefly.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"196 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122599443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
T. Sakakibara, Katsuyoshi Kitai, T. Isobe, Shigeko Yazawa, Teruo Tanaka, Yoshiko Tamaki, Y. Inagami
{"title":"An interprocessor memory access arbitrating scheme for the S-3800 vector supercomputer","authors":"T. Sakakibara, Katsuyoshi Kitai, T. Isobe, Shigeko Yazawa, Teruo Tanaka, Yoshiko Tamaki, Y. Inagami","doi":"10.1109/ISPAN.1994.367140","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367140","url":null,"abstract":"We report an instruction-based variable priority scheme which achieves high sustained memory throughput on a tightly coupled multiprocessor (TCMP) vector supercomputer. We analyze the two types of priority control for arbitrating interprocessor memory access conflicts. In the case of request level priority control, mutual obstruction causes performance degradation, while in the case of fixed priority control, it is caused by memory bank occupation. Mutual obstruction is caused by requests of different instructions that interfere with each other, and memory bank occupation is caused by continuous accessing of the same memory bank by higher priority instructions. The instruction-based variable priority scheme works as follows: (1) The priority of each pipeline is usually changed at the end of an instruction. (2) The priority is changed more than once in the middle of an instruction, such as a stride multiple-of-8 or indirect access instruction, which may occupy the same memory bank by itself. This strategy reduces mutual obstruction because the priority of each pipeline is stable in the middle of an instruction. It also reduces memory bank occupation because the opportunity for memory access among different instructions is made equal by changing the priority at the end of an instruction. Moreover, it prevents memory bank occupation by a stride multiple-of-8 or indirect access instruction by changing the priority more frequently. Consequently, high sustained memory throughput can be achieved on TCMP vector supercomputers. We implemented this scheme in Hitachi's S-3800 supercomputer.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125735310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advanced fault tolerant routing in hypercubes","authors":"Q. Gu, S. Peng","doi":"10.1109/ISPAN.1994.367147","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367147","url":null,"abstract":"We study the fault tolerant properties of n-dimensional hypercubes H_n for node-to-set and set-to-set routing problems on a general fault tolerant routing model, cluster fault tolerant routing, which is a natural extension of the well studied node fault tolerant routing. A cluster of a graph G is a connected subgraph of G, and a cluster is called faulty if all nodes in the cluster are faulty. For node-to-set routing and set-to-set routing in H_n, where k (2 ≤ k ≤ n) fault free node disjoint paths are needed, we show that the maximum number of faulty clusters of diameter at most 1 that can be tolerated is n-k. We give O(kn) optimal time algorithms which find k fault free node disjoint paths of length at most n+2 for node-to-set and k fault free node disjoint paths of length at most 2n for set-to-set cluster fault tolerant routing problems in H_n, respectively. We also prove that n+2 is an optimal upper bound on the length of the routing paths for node-to-set cluster fault tolerant routing.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"107 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117249461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cube-connected modules: a family of cubic networks","authors":"Gen-Huey Chen, Hui-Ling Huang","doi":"10.1109/ISPAN.1994.367164","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367164","url":null,"abstract":"A family of cubic networks, named cube-connected modules, is proposed in this paper. The cube-connected modules network consists of modules which are interconnected as a hypercube. Any connected graph, e.g., cycle, hypercube graph, and complete graph, can serve as a module. Topological properties are investigated, and the problems of routing, broadcasting, embedding, and finding parallel routing paths are studied. We show that the problem of determining the shortest routing path is NP-hard, and it can be transformed to the asymmetric traveling salesman problem. The broadcasting algorithms on cube-connected modules can be obtained by combining broadcasting algorithms on hypercubes and broadcasting algorithms on modules. We show that if the modules are Hamiltonian, then the cube-connected modules are also Hamiltonian. Moreover, a sufficient condition is given for the existence of maximum number of parallel paths between any two nodes of cube-connected modules.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129668340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance of 4-dimensional PANDORA networks","authors":"R. F. Holt, A. B. Ruighaver","doi":"10.1109/ISPAN.1994.367161","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367161","url":null,"abstract":"The Melbourne University Optoelectronic Multicomputer Project is investigating dense optical interconnection networks capable of providing low latency data transfers of small data items. Such capabilities are useful in the exploitation of small grain parallelism. In many cases, reducing the grain size of tasks increases the amount of parallelism which can be found in the program. Our networks use an organization of data transfers called PANDORA (PArallel Newscasts on a Dense Optical Reconfigurable Array). The communication patterns on a PANDORA network are pre-determined, removing the overhead of sending and decoding addressing information. Instead the data is recognized by the time of arrival and the channel on which it arrives. Previous efforts have focused on 2-dimensional multiple broadcasting networks where each node may broadcast a different data item on the row and columns of the network. For large processor arrays, we have to reduce the density of the interconnection network as full interconnection on each row becomes too expensive. This paper discusses a 4-dimensional network which achieves a significant reduction in density with only a small increase in data transfer delays.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"66 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114036226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Undirected circulant graphs","authors":"F. P. Muga","doi":"10.1109/ISPAN.1994.367157","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367157","url":null,"abstract":"A fundamental problem in designing massively parallel computer systems and fast communication networks is the maximization of the number of nodes given a diameter and degree of a network. This maximal number is bounded above by the Moore bound. For undirected circulant graphs, an upper bound is also given, but no exact formula has been found yet for degree ≥ 6. A refinement of this upper bound is given in this paper. It is also shown that this maximal number is odd.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127898454","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Texture analysis for image processing on general-purpose parallel machines","authors":"L. Böröczky, P. Cremonesi, N. Scarabottolo","doi":"10.1109/ISPAN.1994.367169","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367169","url":null,"abstract":"The problem considered in this paper is the definition of an efficient parallel algorithm for texture analysis of an image. The target architectures are distributed-memory general-purpose MIMD parallel machines. The solutions proposed here are based on two different methods, the Statistic Feature Matrix and the Wavelet Decomposition.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"99 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128123750","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Amamiya, Masahiko Satoh, A. Makinouchi, Ken-ichi Hagiwara, T. Yuasa, H. Aida, K. Ueda, K. Araki, T. Ida, T. Baba
{"title":"Research on programming languages for massively parallel processing","authors":"M. Amamiya, Masahiko Satoh, A. Makinouchi, Ken-ichi Hagiwara, T. Yuasa, H. Aida, K. Ueda, K. Araki, T. Ida, T. Baba","doi":"10.1109/ISPAN.1994.367134","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367134","url":null,"abstract":"We are pursuing research on programming languages for massively parallel processing. In line with the top level research objective of our Massively Parallel Processing Principle Research Project, the research has two objectives: firstly, to develop a prototype of a massively parallel programming language and compiler system which is competitive with commercial language systems like data parallel C, Fortran D or HPF; and secondly, to explore a massively parallel computation model and design an experimental language as an implementation of the newly explored model.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125639001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient algorithms for conservative parallel simulation of interconnection networks","authors":"Y. M. Teo, S. Tay","doi":"10.1109/ISPAN.1994.367188","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367188","url":null,"abstract":"This paper addresses the use of parallel simulation techniques to speed up the simulation of multistage interconnection networks. The conventional null-message approach to resolving the deadlock problem in conservative simulation is based on a lookahead mechanism. For some application domains, unfortunately, the lookahead information is not available. Consequently, the simulation using null messages will be trapped in a livelock. We propose a deadlock/livelock free scheme using null messages, but without the guaranteed lookahead, to coordinate the simulation, and different partitioning techniques for mapping the simulation program onto multicomputers. A flushing mechanism to address the combinatorial explosion of null messages in conservative simulation is also discussed. Our analysis shows that the proposed flushing mechanism effectively reduces the number of null messages from exponential to linear.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131351555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A massively parallel implementation of pattern classifiers on SIMD and MIMD architectures","authors":"K. Lam","doi":"10.1109/ISPAN.1994.367170","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367170","url":null,"abstract":"Parallel multi-layer classifier architectures with an increasing hierarchical order have offered much flexibility in design to deal with a wide variety of properties. The model of pipeline processing is especially appropriate for realising such architectures. This has given hierarchical classifiers a distinct advantage in real-time applications, where they must cope with the important demand for high operating speed, in addition to a potentially better classification performance. An example application of a cascaded form of the BWS and FWS networks, both of which are representatives of the array memory based statistical classifier, is described in this paper. As with most pipelined architectures, the complex interactions between successive processing layers of the cascaded network represent a major drawback, and they impose performance bottlenecks which challenge the use of a highly parallel realisation of the classifier. This paper describes an efficient data parallel implementation of the BWS-FWS. For completeness, a brief review of the multi-layer classifiers is first presented. The new algorithm for combining the BWS and FWS networks is described and implemented on two distributed memory processor arrays, the MasPar MP-1 and a network of transputers. An analysis of the performance obtained is also presented.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132733135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}