[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation: Latest Papers

An asymptotically optimal parallel bin-packing algorithm
N. S. Coleman, Pearl Y. Wang
{"title":"An asymptotically optimal parallel bin-packing algorithm","authors":"N. S. Coleman, Pearl Y. Wang","doi":"10.1109/FMPC.1992.234866","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234866","url":null,"abstract":"The authors introduce a bin-packing heuristic that is well-suited for implementation on massively parallel SIMD (single-instruction multiple-data) or MIMD (multiple-instruction multiple-data) computing systems. The average-case behavior (and the variance) of the packing technique can be predicted when the input data have a symmetric distribution. The method is asymptotically optimal, yields perfect packings, and achieves the best possible average-case behavior with high probability. The analytical result improves upon any online algorithms previously reported in the literature and is identical to the best results reported so far for offline algorithms.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126596587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
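The abstract above does not spell out the heuristic itself, so the following is only an illustrative sketch of why a symmetric size distribution can lead to perfect packings: items are sorted and the smallest remaining item is paired with the largest. This is a generic sequential pairing heuristic, not the authors' parallel algorithm; the function name `pair_pack` and the integer sizes are assumptions made for the example.

```python
def pair_pack(sizes, capacity=10):
    """Pack items into bins of the given capacity by pairing extremes.

    Hypothetical illustration only: sort the items, then try to place the
    smallest remaining item together with the largest remaining item.
    If the pair does not fit, the largest item gets a bin of its own.
    """
    items = sorted(sizes)
    lo, hi = 0, len(items) - 1
    bins = []
    while lo <= hi:
        if lo < hi and items[lo] + items[hi] <= capacity:
            # Smallest and largest fit together: one bin holds both.
            bins.append([items[lo], items[hi]])
            lo += 1
        else:
            # Largest item cannot be paired: it occupies a bin alone.
            bins.append([items[hi]])
        hi -= 1
    return bins


if __name__ == "__main__":
    # Sizes symmetric around half the capacity pair up into full bins.
    print(pair_pack([2, 8, 4, 6], capacity=10))
```

When the sizes are symmetric around half the bin capacity, each small item tends to have a large partner that fills the bin exactly, which is the intuition behind a "perfect packing" in this toy setting.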
Connection Machine model CM-5 system overview
J. Palmer, G. Steele
{"title":"Connection Machine model CM-5 system overview","authors":"J. Palmer, G. Steele","doi":"10.1109/FMPC.1992.234877","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234877","url":null,"abstract":"The Connection Machine model CM-5 provides high performance and ease of use for large data-intensive applications. The CM-5 architecture is designed to scale to teraflops performance on terabyte-sized problems. SPARC-based processing nodes, each with four vector pipes, are connected by two communications networks, the Data Network and the Control Network. The system combines the best features of SIMD (single-instruction multiple-data) and MIMD (multiple-instruction multiple-data) designs, integrating them into a single 'universal' parallel architecture. The processor nodes may be divided into independent computational partitions; each partition may be independently timeshared or devoted to batch processing. Programming languages include Fortran (with Fortran 90 array constructs) and C*, a parallel dialect of C. The PRISM programming environment supports source-level debugging, tracing, and profiling through a graphical interface based on X Windows.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114461628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 29
Simulation and performance estimation for the Rewrite Rule Machine
Hitoshi Aida, J. Goguen, Sany M. Leinwand, P. Lincoln, J. Meseguer, B. Taheri, T. Winkler
{"title":"Simulation and performance estimation for the Rewrite Rule Machine","authors":"Hitoshi Aida, J. Goguen, Sany M. Leinwand, P. Lincoln, J. Meseguer, B. Taheri, T. Winkler","doi":"10.1109/FMPC.1992.234941","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234941","url":null,"abstract":"The authors give an overview of the Rewrite Rule Machine's (RRM's) architecture and discuss performance estimates based on very detailed register-level simulations at the chip level, together with more abstract simulations and modeling for higher levels. For a 10000 ensemble RRM, the present estimates are as follows. (1) The raw peak performance is 576 trillion operations per second. (2) For general symbolic applications, ensemble Sun-relative speedup is roughly 6.7, and RRM performance with a wormhole network at 88% efficiency gives an idealized Sun-relative speedup of 59000. (3) For highly regular symbolic applications (the sorting problem is taken as a typical example), ensemble performance is a Sun-relative speedup of 127, and RRM performance is estimated at over 80% efficiency (relative to the cluster performance), yielding a Sun-relative speedup of over 91. (4) For systolic applications (a 2-D fluid flow problem is taken as a typical example), ensemble performance is a Sun-relative speedup of 400-670, and cluster-level performance, which should be attainable in practice, is at 82% efficiency.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117252667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
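The figures in estimate (2) of the abstract above can be cross-checked with simple arithmetic. The sketch below assumes, as a reading of the abstract rather than a statement from the paper, that the idealized speedup is the ensemble count times the per-ensemble speedup times the network efficiency; the function names are made up for this example. Under that assumption the product comes to about 58,960, consistent with the quoted 59000.

```python
def idealized_speedup(ensembles, per_ensemble_speedup, efficiency):
    """Assumed model: speedup scales linearly with ensemble count,
    discounted by the network efficiency."""
    return ensembles * per_ensemble_speedup * efficiency


def per_ensemble_peak(total_ops_per_second, ensembles):
    """Share of the raw peak rate attributable to a single ensemble."""
    return total_ops_per_second / ensembles


if __name__ == "__main__":
    # Estimate (2): 10000 ensembles, speedup 6.7 each, 88% efficiency.
    print(round(idealized_speedup(10_000, 6.7, 0.88)))
    # Estimate (1): 576 trillion operations per second over 10000 ensembles.
    print(per_ensemble_peak(576e12, 10_000))
```

The same division applied to estimate (1) gives roughly 57.6 billion operations per second per ensemble, again only under the assumption that the peak figure is spread evenly across ensembles.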
Data Parallel Fortran
P. Elustondo, L. A. Vazquez, O.J. Nestares, J. S. Avalos, G. A. Alvarez, C.-T. Ho, J. Sanz
{"title":"Data Parallel Fortran","authors":"P. Elustondo, L. A. Vazquez, O.J. Nestares, J. S. Avalos, G. A. Alvarez, C.-T. Ho, J. Sanz","doi":"10.1109/FMPC.1992.234909","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234909","url":null,"abstract":"The authors present Data Parallel Fortran (DPF), a set of extensions to Fortran aimed at programming scientific applications on a variety of parallel machines. DPF portrays a global name space to programmers and allows programs to be written in a clear, data-parallel style. DPF's model is based on the idea of having a single control thread that spawns parallel virtual threads with arbitrary nesting, resuming at their completion into a single global state. It also provides explicit control of which subset of the global name space is strictly accessed by each virtual processor at different points in a program. This powerful mechanism makes it possible to write programs in which communication points are handled explicitly, but without making use of message passing code. Also, DPF offers some primitives that involve communication often encountered in parallel numerical and scientific applications. DPF semantics does not depend on any particular feature of the architecture, thus providing a reasonably high-level programming methodology.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124853974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Distance between images
J. A. Gualtieri, J. Le Moigne, C. V. Packer
{"title":"Distance between images","authors":"J. A. Gualtieri, J. Le Moigne, C. V. Packer","doi":"10.1109/FMPC.1992.234956","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234956","url":null,"abstract":"The authors compare two methods which compute an approximation to the Hausdorff distance between pairs of binary images. They also implement a parallel version of one of the methods, which can provide a fast image distance algorithm to calibrate algorithms performing such tasks as image recognition, image compression, or image browsing. For this purpose, they have shown a simple application of selecting the best iteration of a region growing algorithm which yields edge images by comparing them to a Canny edge detector.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130026698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
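For reference, the exact Hausdorff distance that the two methods in the abstract above approximate can be written as a brute-force computation over foreground pixels. The sketch below is a plain sequential definition, not either of the authors' approximation methods and not their parallel implementation; representing a binary image as a nested list of 0/1 values is an assumption made for the example.

```python
from math import hypot


def foreground(image):
    """Return (row, col) coordinates of the nonzero pixels of a 2-D list."""
    return [(r, c) for r, row in enumerate(image)
            for c, value in enumerate(row) if value]


def directed_hausdorff(a, b):
    """Largest distance from any point of a to its nearest point in b."""
    return max(min(hypot(pr - qr, pc - qc) for qr, qc in b) for pr, pc in a)


def hausdorff(image1, image2):
    """Symmetric Hausdorff distance between the foreground pixel sets."""
    a, b = foreground(image1), foreground(image2)
    if not a or not b:
        raise ValueError("both images need at least one foreground pixel")
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


if __name__ == "__main__":
    img1 = [[1, 0, 0],
            [0, 0, 0],
            [0, 0, 0]]
    img2 = [[1, 0, 0],
            [0, 0, 0],
            [0, 0, 1]]
    # The extra pixel at (2, 2) is sqrt(8) away from the nearest pixel of img1.
    print(hausdorff(img1, img2))
```

The brute-force form costs time proportional to the product of the two foreground pixel counts, which is one reason approximations and parallel implementations are attractive for large images.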
Issues on the algorithm-software continuum
L. Jamieson, M. Atallah, J. Cuny, D. Gannon, J. JáJá, V. Lo, R. Miller
{"title":"Issues on the algorithm-software continuum","authors":"L. Jamieson, M. Atallah, J. Cuny, D. Gannon, J. JáJá, V. Lo, R. Miller","doi":"10.1109/FMPC.1992.234957","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234957","url":null,"abstract":"To date, highest performance on parallel systems has required expertise spanning high-level algorithm design through architecture-dependent fine tuning of the implementation. Application users who are uninformed about architecture details are not able to take advantage of (or compensate for) idiosyncrasies of the target machine; parallel processing experts are often not able to explore radically different ways of solving a physical problem in order to adopt the approach best suited to a particular architecture. Moreover, software tools have not yet succeeded in automating the realization of high-performance parallel applications. The authors therefore deal with questions about how much of an algorithm designer a user of parallel systems can/should be expected to be, and how much software support is realistic to expect.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126209307","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Input/output for fine grain multiprocessor systems
S.-Y. Lee
{"title":"Input/output for fine grain multiprocessor systems","authors":"S.-Y. Lee","doi":"10.1109/FMPC.1992.234927","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234927","url":null,"abstract":"While extensive investigations on how multiple processing elements (PEs) in a parallel system can be utilized efficiently have been carried out, the I/O (input/output) into and from the system has been ignored in most cases. However, the time for downloading input data or uploading results would not be negligible, especially when a large number of PEs such as those in a massively parallel system and/or a large volume of data are involved. Results from a preliminary study on how I/O can be efficiently realized in a fine-grain multiprocessor system without any hardware change are reported.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126462814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Pi: a parallel architecture interface
D. Wills, W. Dally
{"title":"Pi: a parallel architecture interface","authors":"D. Wills, W. Dally","doi":"10.1109/FMPC.1992.234940","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234940","url":null,"abstract":"The authors define Pi, a parallel architecture interface that separates model and machine issues, allowing them to be addressed independently. This provides greater flexibility for both the model and machine builder. Pi addresses a set of common parallel model requirements, including low-latency communication, fast task switching, low-cost synchronization, efficient storage management, the ability to exploit locality, and efficient support for sequential code. Since Pi provides generic parallel operations, it can efficiently support many parallel programming models, including hybrids of existing models. Pi also forms a basis of comparison for architectural components. The authors present an overview of Pi, and a description of several model examples which have been constructed and evaluated on the interface.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125328665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
On the parallel processing capabilities of LCA networks
I.D. Scherson, P.Y. Wang
{"title":"On the parallel processing capabilities of LCA networks","authors":"I.D. Scherson, P.Y. Wang","doi":"10.1109/FMPC.1992.234918","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234918","url":null,"abstract":"Lowest Common Ancestor networks (LCANs) are hierarchical interconnection networks for communication in SIMD and MIMD machines. The connectivity and permutational properties of specific families of LCANs have been previously studied. LCANs are built with switches in a tree-like manner. A level in the hierarchy is akin to a stage in a multistage interconnect and their topology is similar to that of hypertrees and fat trees. Their hierarchical structure lends itself to implementation in the fabrication hierarchy, namely chips, boards and backplanes. In this paper, a preliminary investigation of the algorithmic capabilities of LCANs (in terms of their parameters) is reported.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131448322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Benchmarking performance of massively parallel AI architectures
R. Demara, H. Kitano
{"title":"Benchmarking performance of massively parallel AI architectures","authors":"R. Demara, H. Kitano","doi":"10.1109/FMPC.1992.234865","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234865","url":null,"abstract":"The authors address the architectural evaluation of massively parallel machines suitable for artificial intelligence (AI). The approach is to identify the impact of specific algorithm features by measuring execution time on a SNAP-1 and a Connection Machine-2 using different knowledge base and machine configurations. Since a wide variety of parallel AI languages and processing architectures are in use, the authors developed a portable benchmark set for Parallel AI Computational Efficiency (PACE). PACE provides a representative set of processing workloads, knowledge base topologies, and performance indices. The authors also analyze speedup and scalability of fundamental AI operations in terms of the massively parallel paradigm.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115619002","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1