专用汽车 Pub Date: 1994-05-02 DOI: 10.1109/MPCS.1994.367074
M. Walker
{"title":"On scientific research, applications software development, and industrial use of massively parallel computing systems","authors":"M. Walker","doi":"10.1109/MPCS.1994.367074","DOIUrl":"https://doi.org/10.1109/MPCS.1994.367074","url":null,"abstract":"A healthy massively parallel computing system industry requires a well-defined market for its products. The market for massively parallel computing systems has remained stubbornly ill-defined for at least a decade. The primary reason for this is that industrial corporations have not been able to measure value in the new computing technology, due to the absence of commercially supported application software packages for these systems. It is essential for the well-being of everyone engaged in massively parallel computing, from research institutions to industrial end users, and particularly the computer builders themselves, that ways be found to ameliorate this state of affairs. This paper analyses the situation, and proposes some procedures whose implementation would ease the transfer of newly developed application technologies from research groups into the hands of end users, who can then employ massively parallel computers to solve practical industrial problems.","PeriodicalId":64175,"journal":{"name":"专用汽车","volume":"11 1","pages":"217-219"},"PeriodicalIF":0.0,"publicationDate":"1994-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85812427","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
专用汽车 Pub Date: 1994-05-02 DOI: 10.1109/MPCS.1994.367060
H. Karatza
{"title":"Simulation of a MSIMD system with resequencing","authors":"H. Karatza","doi":"10.1109/MPCS.1994.367060","DOIUrl":"https://doi.org/10.1109/MPCS.1994.367060","url":null,"abstract":"In this work we study the performance of a multi-processor system model which executes multiple SIMD jobs. Such a parallel computer organization contains multiple Control Units (CUs) which share a resource pool of a finite number of Processing Elements (PEs) and operates multiple single-instruction-multiple-data streams (MSIMD). We assume that after CU service, resequencing of SIMD jobs takes place, which ensures that jobs leave the processing unit on a first-in-first-out basis. A closed queueing network model of an MSIMD computer system is simulated. The performance of two different queueing disciplines in conjunction with the effect of the resequencing delay is investigated for various degrees of multiprogramming and coefficients of variation of the CU service times.","PeriodicalId":64175,"journal":{"name":"专用汽车","volume":"62 1","pages":"339-342"},"PeriodicalIF":0.0,"publicationDate":"1994-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79276917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
专用汽车 Pub Date: 1994-05-02 DOI: 10.1109/MPCS.1994.367066
M. Albanesi, V. Cantoni, M. Ferretti, F. Mainieri
{"title":"A VLSI 128-processor chip for multiresolution image processing","authors":"M. Albanesi, V. Cantoni, M. Ferretti, F. Mainieri","doi":"10.1109/MPCS.1994.367066","DOIUrl":"https://doi.org/10.1109/MPCS.1994.367066","url":null,"abstract":"This paper presents the design of a multiprocessor chip integrating 128 simple processors arranged as a mesh of 8 rows by 16 columns. This mesh of PEs is intermingled with a dual mesh of switching elements that interconnect the PEs and reconfigure them into one of three topologies: i) an 8-connected standard mesh, ii) a set of 4-connected independent meshes, iii) a quad-tree. This architecture is a viable solution to the problem of embedding into silicon a quad-pyramid, a well-known multi-resolution structure for image processing. Local memory is shared between PEs of the same row and the address space increases with the level of the pyramid. The embedding supports fault-tolerance by column substitution.","PeriodicalId":64175,"journal":{"name":"专用汽车","volume":"55 1","pages":"296-307"},"PeriodicalIF":0.0,"publicationDate":"1994-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79078401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
专用汽车 Pub Date: 1994-05-02 DOI: 10.1109/MPCS.1994.367024
S. Crawford, R. Demara
{"title":"Cache coherence in a multiport memory environment","authors":"S. Crawford, R. Demara","doi":"10.1109/MPCS.1994.367024","DOIUrl":"https://doi.org/10.1109/MPCS.1994.367024","url":null,"abstract":"The effects of various cache coherence strategies are analyzed for a multiported shared memory multiprocessor. Analytical models for concurrent read exclusive write access (CREW) and concurrent read concurrent write access (CRCW) are developed including shared-not-cacheable, snooping bus, snooping bus with cache-to-cache transfers, and directory protocols. The performance of each protocol is shown as the hit rate, main memory-to-cache memory cycle time ratio, fraction of shared data, read percentage, and number of partitions are varied. Overall, results indicate that a snooping bus scheme with cache-to-cache transfers provides consistently fast access times over a wide range of execution parameters. However, nearly equivalent performance can be obtained with simpler directory-based schemes. The implications of these results on increasing port complexity and memory usage are discussed.","PeriodicalId":64175,"journal":{"name":"专用汽车","volume":"20 1","pages":"632-642"},"PeriodicalIF":0.0,"publicationDate":"1994-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81892561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
专用汽车 Pub Date: 1994-05-02 DOI: 10.1109/MPCS.1994.367019
R. Wyrzykowski, J.S. Kanevski, O. Maslennikov
{"title":"Systolic-type implementation of matrix computations based on the Faddeev algorithm","authors":"R. Wyrzykowski, J.S. Kanevski, O. Maslennikov","doi":"10.1109/MPCS.1994.367019","DOIUrl":"https://doi.org/10.1109/MPCS.1994.367019","url":null,"abstract":"Deals with the problem of enhancing the versatility of VLSI processor arrays without undue addition of hardware, time/control overhead, and software complexity. A promising approach to this problem is based on matrix computations carried out through the Faddeev algorithm. We design a fixed-size, linear array architecture with fully local communications and straightforward control requirements. This high-throughput, systolic-type architecture allows us to minimize both I/O requirements and the number of processing elements performing complicated operations like divisions. To derive the array from a formal description of the Faddeev algorithm based on Gaussian elimination with partial pivoting, we use purposive transformations of the basic dependence graph of the algorithm before its space-time mappings onto array architectures.","PeriodicalId":64175,"journal":{"name":"专用汽车","volume":"51 1","pages":"31-42"},"PeriodicalIF":0.0,"publicationDate":"1994-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85113146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
专用汽车 Pub Date: 1994-05-02 DOI: 10.1109/MPCS.1994.367057
T. Kalinowski
{"title":"Solving the mapping problem with a genetic algorithm on the MasPar-1","authors":"T. Kalinowski","doi":"10.1109/MPCS.1994.367057","DOIUrl":"https://doi.org/10.1109/MPCS.1994.367057","url":null,"abstract":"Good mapping algorithms can significantly reduce the total execution time of a program. However, the mapping problem is NP-complete. Consequently, heuristic methods should be used. Massively parallel systems allow the implementation of genetic algorithms running on large populations. In this paper, an algorithm based on a neighbourhood model is presented. The program has been implemented on a 4096-processor MasPar-1 multicomputer. Experimental results for three genetic operators are presented and compared. The influence of initialisation strategies and selection techniques is also considered. A new initialisation strategy based on grouping of adjacent tasks into approximately equal clusters is proposed.","PeriodicalId":64175,"journal":{"name":"专用汽车","volume":"44 1","pages":"370-374"},"PeriodicalIF":0.0,"publicationDate":"1994-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76435732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
专用汽车 Pub Date: 1994-05-02 DOI: 10.1109/MPCS.1994.367026
R. Squier, K. Steiglitz
{"title":"Comparing architectures using throughput-versus-cost modeling","authors":"R. Squier, K. Steiglitz","doi":"10.1109/MPCS.1994.367026","DOIUrl":"https://doi.org/10.1109/MPCS.1994.367026","url":null,"abstract":"The paper compares two parallel architectures in terms of throughput versus cost. Throughput is estimated using machine parameters extracted from detailed models of each architecture. Cost models are used to express total resource use in terms of a common unit. Performance of an architecture is then evaluated by optimizing throughput for each possible cost. Finally, these throughput-versus-cost results for the two architectures are compared for a particular class of problems: iterative computations on a 2-dimensional grid. The results show that there is a cost below which one architecture is an order of magnitude faster than the other, and above which this relationship is reversed. This approach may prove useful as a general comparison methodology.","PeriodicalId":64175,"journal":{"name":"专用汽车","volume":"1 1","pages":"611-619"},"PeriodicalIF":0.0,"publicationDate":"1994-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77329740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
专用汽车 Pub Date: 1994-05-02 DOI: 10.1109/MPCS.1994.367064
Antonio Corradi, L. Leonardi, F. Zambonelli
{"title":"Dynamic load distribution in massively parallel architectures: the parallel objects example","authors":"Antonio Corradi, L. Leonardi, F. Zambonelli","doi":"10.1109/MPCS.1994.367064","DOIUrl":"https://doi.org/10.1109/MPCS.1994.367064","url":null,"abstract":"The paper presents the mechanisms for dynamic load distribution implemented within the support for the Parallel Objects (PO for short) programming environment. PO applications evolve depending on their dynamic need of resources, enhancing application performance. The goal is to show how dynamic load distribution can be successfully applied on a massively parallel architecture.","PeriodicalId":64175,"journal":{"name":"专用汽车","volume":"14 1","pages":"318-322"},"PeriodicalIF":0.0,"publicationDate":"1994-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80703063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
专用汽车 Pub Date: 1994-05-02 DOI: 10.1109/MPCS.1994.367058
D. Crosetto
{"title":"Massively parallel-processing system with 3D-Flow processors","authors":"D. Crosetto","doi":"10.1109/MPCS.1994.367058","DOIUrl":"https://doi.org/10.1109/MPCS.1994.367058","url":null,"abstract":"The 3D-Flow is a massively parallel-processing system. Its main advantages are embodied in its architecture: the system (integrated and standardized), the assembly (modular with maximum connectivity), and the processor (programmable, powerful and fast). The combination of this architecture with a simple, high-speed processor that has several units working in parallel, with its 10 very-high-speed communication parallel ports in six directions, and the ability to operate the processor in Single Instruction Multiple Data (SIMD), or in Multiple Instruction Multiple Data (MIMD) modes, allows one to build a very versatile engine. This engine is capable of solving at very high speed with a very high degree of interconnectivity a very long algorithm (in SIMD mode), or it can perform digital filtering on high-frequency signals or pattern recognition in a very short time, using a short algorithm that can be different on each processor (in MIMD mode). The overall 3D-Flow project has passed a major design review at Fermilab. (Reviewers included experts in computers, triggering, system assembly, and electronics.)","PeriodicalId":64175,"journal":{"name":"专用汽车","volume":"58 1","pages":"355-369"},"PeriodicalIF":0.0,"publicationDate":"1994-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85053144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
专用汽车 Pub Date: 1994-05-02 DOI: 10.1109/MPCS.1994.367036
J. M. Alonso, A.A. Frutos, R. B. Palacio
{"title":"Conservative and optimistic distributed simulation in massively parallel computers: a comparative study","authors":"J. M. Alonso, A.A. Frutos, R. B. Palacio","doi":"10.1109/MPCS.1994.367036","DOIUrl":"https://doi.org/10.1109/MPCS.1994.367036","url":null,"abstract":"The main algorithms for sequential and parallel discrete event simulations are introduced. A set of different simulators is evaluated and compared using a transputer-based multicomputer: sequential, parallel conservative and parallel optimistic. For the proposed model, the conservative simulator shows better speedups. The obtained results show that a significant improvement in execution time can be obtained using parallel systems, but only if the model under simulation has properties such as coarse grain size and a high degree of self-synchronisation.","PeriodicalId":64175,"journal":{"name":"专用汽车","volume":"39 1","pages":"528-532"},"PeriodicalIF":0.0,"publicationDate":"1994-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86172704","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}