[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation: Latest Papers

A compiler for a massively parallel distributed memory MIMD computer
G. Sabot
{"title":"A compiler for a massively parallel distributed memory MIMD computer","authors":"G. Sabot","doi":"10.1109/FMPC.1992.234910","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234910","url":null,"abstract":"The author describes the techniques that are used by the CM Compiler Engine to map the fine-grained array parallelism of languages such as Fortran 90 and C onto the Connection Machine (CM) architectures. The same compiler is used for node-level programming of the CM-5, for global programming of the CM-5, and for global programming of the SIMD (single-instruction multiple-data) CM-2. A new compiler phase is used to generate two classes of output code: code for a scalar control processor, which executes SPARC assembler, and code aimed at a model of the CM-5's parallel-processing elements. The model is embodied in a new RISC (reduced instruction set computer)-like vector instruction set called PEAC. The control program distributes parallel data at runtime among the processor nodes of the target machine. Each of these nodes is itself superpipelined and superscalar. An innovative scheduler overlaps the execution of multiple PEAC operations, while conventional vector processing techniques keep the pipelines filled.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132672830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 21
A Grimm collection of MIMD fairy tales
T. Blank, J. Nickolls
{"title":"A Grimm collection of MIMD fairy tales","authors":"T. Blank, J. Nickolls","doi":"10.1109/FMPC.1992.234881","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234881","url":null,"abstract":"The authors present two tales about massively parallel processors: 'Who is Fairest of Us All?' and 'The SPMD Path.' With a twist of humor, the tales discuss single-instruction multiple-data (SIMD) systems, multiple-instruction multiple-data (MIMD) systems, the differences between them, and the single program multiple data (SPMD) programming model. The first tale introduces autonomous SIMD (ASIMD), and then looks at the flexibility, programmability, cost, and effectiveness of MIMD and ASIMD systems. It is shown that ASIMD systems have the flexibility to solve real applications cost-effectively. The second tale describes the simple path that SPMD provides for programming, and why an ASIMD machine works well.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"57 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120853923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 23
Architecture independent analysis of sorting and list ranking on the hierarchical PRAM model
T. Heywood, S. Ranka
{"title":"Architecture independent analysis of sorting and list ranking on the hierarchical PRAM model","authors":"T. Heywood, S. Ranka","doi":"10.1109/FMPC.1992.234932","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234932","url":null,"abstract":"The authors consider the performance of sorting and list ranking on the hierarchical parallel random access machine (H-PRAM), a model of computation which represents general degrees of locality (neighborhoods of activity), considering communication and synchronization simultaneously. The sorting result gives a significant improvement over that for the LPRAM (local-memory PRAM, i.e. unit-size neighborhoods), matches the best known hypercube algorithms when the H-PRAM's latency parameter l(P) is set to log P, and matches the best possible mesh algorithm when l(P) = sqrt(P). The list ranking algorithm demonstrates fundamental limitations of the H-PRAM for nonoblivious problems which have linear-time sequential algorithms.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116359738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
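For readers unfamiliar with the list ranking problem named in the Heywood and Ranka entry, the sketch below shows the classic textbook pointer-jumping approach, simulated sequentially. It is not the H-PRAM algorithm analyzed in the paper; the function name `list_rank` and the convention that the tail element points to itself are assumptions made for illustration.

```python
def list_rank(succ):
    """Return, for each node, its distance to the tail of a linked list.

    succ[i] is the successor of node i; the tail points to itself
    (an illustrative convention, not taken from the paper).
    Each loop iteration stands in for one synchronous parallel round.
    """
    n = len(succ)
    nxt = list(succ)
    # The tail starts at rank 0, every other node at rank 1.
    rank = [0 if nxt[i] == i else 1 for i in range(n)]
    # About log2(n) rounds suffice because pointers double their reach each round.
    for _ in range(max(1, n).bit_length()):
        new_rank = [rank[i] + rank[nxt[i]] for i in range(n)]
        new_nxt = [nxt[nxt[i]] for i in range(n)]
        rank, nxt = new_rank, new_nxt
    return rank
```

Each round reads the successor's data, which is the data-dependent (nonoblivious) access pattern the abstract refers to.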
Parallel pulse correlation and geolocation
D.K. Krecker, W. Mitchell
{"title":"Parallel pulse correlation and geolocation","authors":"D.K. Krecker, W. Mitchell","doi":"10.1109/FMPC.1992.234929","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234929","url":null,"abstract":"The identification and location of ground-based radars via orbiting receivers require the correlation of pulses, the determination of time differences of arrival, and geolocation. Data rates in emitter-rich environments would swamp single-CPU processors performing this operation. The authors present an innovative parallel algorithm developed specifically for this application on massively parallel computers. The algorithm is based on the parallel computation and analysis of a matrix containing the differences in the time of arrival of all pulses received in a time window, and on the parallel proof/disproof of hypothesized emitter locations. Output contains the number of emitters and their location and PRI (pulse repetition interval) sequence. The algorithm was tested on a 16K-processor Connection Machine.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"120 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126104879","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
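The Krecker and Mitchell abstract describes a matrix of time-of-arrival differences between all pulses in a window. As a rough, sequential illustration of that idea only, the sketch below builds such a matrix for a single receiver and picks the most frequent positive difference as a candidate pulse repetition interval. The function names, the integer-microsecond timestamps, and the "most frequent difference" heuristic are all assumptions; the paper's parallel algorithm and its geolocation step are not reproduced here.

```python
from collections import Counter

def toa_difference_matrix(toas):
    """D[i][j] = toas[j] - toas[i] for every pair of pulses in the window."""
    return [[tj - ti for tj in toas] for ti in toas]

def candidate_pri(toas):
    """Return the most frequent positive time difference (hypothetical heuristic)."""
    diffs = toa_difference_matrix(toas)
    counts = Counter(d for row in diffs for d in row if d > 0)
    return counts.most_common(1)[0][0]

# One emitter with a 10-microsecond PRI plus two stray pulses at 3 and 17.
pulses_us = [0, 3, 10, 17, 20, 30, 40]
pri = candidate_pri(pulses_us)
```

Every matrix entry is independent of the others, which is what makes the computation a natural fit for one-processor-per-entry machines such as the one the abstract mentions.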
Massively parallel computers: why not parallel computers for the masses?
G. Bell
{"title":"Massively parallel computers: why not parallel computers for the masses?","authors":"G. Bell","doi":"10.1109/FMPC.1992.234946","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234946","url":null,"abstract":"The developments in high-performance computers towards achieving the goal of a teraflops supercomputer that would operate at a peak speed of 10^12 floating-point operations per second are reviewed. The net result of the quest for parallelism as chronicled by the Gordon Bell Prize is that applications evolved 115% per year and will most likely achieve 1 teraflop in 1995. The physical characteristics of supercomputing alternatives available in 1992 are described. The progress of CMOS microprocessor technology to teraflop speeds is discussed. It is argued that the mainline general purpose computers will continue to be microprocessors in three forms: supercomputers, mainframes, and scalable MPs. The current scalable multicomputers will all evolve and become multiprocessors, but with limited coherent memories in their next generation. It is also argued that the cost and time to rewrite major applications for one-of-a-kind machines are sufficiently large to make them uneconomical.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122063427","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Hyperbanyan networks: a new class of networks for distributed-memory multiprocessors
Clayton Ferner, K. Y. Lee
{"title":"Hyperbanyan networks: a new class of networks for distributed-memory multiprocessors","authors":"Clayton Ferner, K. Y. Lee","doi":"10.1109/FMPC.1992.234951","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234951","url":null,"abstract":"A new class of connection topologies for distributed-memory multiprocessors, hyperbanyan networks, is introduced. A hyperbanyan is a combination of the topological designs of banyan and hypertree networks. Since the hypertree combines the advantages of the binary tree and the hypercube, a hyperbanyan has the features of a binary tree, a hypercube, and a banyan. The hyperbanyans have a fixed degree of five, and the diameter of an (n stages × 2^(n-1) nodes/stage) hyperbanyan is 2(n-1). A routing algorithm which is close to optimal is presented.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126990433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
Dynamic precision iterative algorithms
D. Kramer, I. Scherson
{"title":"Dynamic precision iterative algorithms","authors":"D. Kramer, I. Scherson","doi":"10.1109/FMPC.1992.234930","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234930","url":null,"abstract":"The authors address the use of DP (dynamic precision) in fixed-point iterative numerical algorithms. These algorithms are used in a wide range of numerically intensive scientific applications. One such algorithm, Muller's method, detects complex roots of an arbitrary function. This algorithm was implemented in DP on various architectures, including a MasPar MP-1 massively parallel processor and a Cray Y-MP vector processor. The results show that the use of DP can lead to a significant speedup of iterative algorithms on multiple-range architectures.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"2016 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128113770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
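Muller's method, which the Kramer and Scherson abstract uses as its example, fits a parabola through the three most recent iterates and takes the parabola's nearest root as the next iterate, so it can reach complex roots from real starting points. The sketch below is a minimal version in ordinary complex floating point. It does not implement the paper's dynamic-precision fixed-point arithmetic, and the function name, tolerance, and iteration cap are illustrative assumptions.

```python
import cmath

def muller(f, x0, x1, x2, tol=1e-12, max_iter=100):
    """Find a (possibly complex) root of f with Muller's method."""
    for _ in range(max_iter):
        h1, h2 = x1 - x0, x2 - x1
        d1 = (f(x1) - f(x0)) / h1          # first divided differences
        d2 = (f(x2) - f(x1)) / h2
        a = (d2 - d1) / (h2 + h1)          # second divided difference
        b = a * h2 + d2
        c = f(x2)
        disc = cmath.sqrt(b * b - 4 * a * c)
        # Choose the sign that gives the larger denominator, for stability.
        denom = b + disc if abs(b + disc) > abs(b - disc) else b - disc
        if denom == 0:
            raise ZeroDivisionError("degenerate parabola; try other start points")
        dx = -2 * c / denom
        x0, x1, x2 = x1, x2, x2 + dx
        if abs(dx) < tol:
            return x2
    raise RuntimeError("Muller's method did not converge")

# x**2 + 1 has no real roots; real start points still reach a root at +/- i.
root = muller(lambda x: x * x + 1, 0, 1, 2)
```

Because each iteration only needs the precision that the current error justifies, an iterative scheme of this kind is the setting in which the abstract reports speedups from varying precision.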
A fast algorithm for computing histograms on a reconfigurable mesh
J. Jang, H. Park, V. Prasanna
{"title":"A fast algorithm for computing histograms on a reconfigurable mesh","authors":"J. Jang, H. Park, V. Prasanna","doi":"10.1109/FMPC.1992.234952","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234952","url":null,"abstract":"The authors present fast parallel algorithms for computing the histogram on PARBUS and RMESH models. Compared with the approach of J. Jeng and S. Sahni (1992), the proposed algorithm improves the time complexity by using a constant amount of memory in each processing element. In the histogram modification algorithm, the entire range of h is considered. The connections used by the proposed algorithm on the PARBUS model are the same as those allowed in the MRN model. Thus, this algorithm runs on this model as well. The results obtained imply that the number of 1's in an N×N 0/1 table can be counted in O(log* N) time on an N×N reconfigurable mesh and in O(log log N) time on an N×N RMESH.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127628799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 32
Hardware support for the Seamless programming model
S. Fineberg, T. Casavant, B. H. Pease
{"title":"Hardware support for the Seamless programming model","authors":"S. Fineberg, T. Casavant, B. H. Pease","doi":"10.1109/FMPC.1992.234939","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234939","url":null,"abstract":"The communication latency problem is presented with special emphasis on RISC (reduced instruction set computer) based multiprocessors. An interprocessor communication model for parallel programs based on locality is presented. This model enables the programmer to manipulate locality at the language level and to take advantage of currently available system hardware to reduce latency. A hardware node architecture for a latency-tolerant RISC-based multiprocessor, called Seamless, that supports this model, is presented. The Seamless architecture includes the addition of a hardware locality manager to each processing element, as well as an integral runtime environment and compiler.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126524415","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
ALFA: a static data flow architecture
L. Verdoscia, R. Vaccaro
{"title":"ALFA: a static data flow architecture","authors":"L. Verdoscia, R. Vaccaro","doi":"10.1109/FMPC.1992.234943","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234943","url":null,"abstract":"The authors present the ALFA architecture, a data flow machine with 16384 functional units (FUs) grouped in 128 clusters. ALFA is based on the Backus FFP computational model and uses the static data flow execution model. This machine's behavior is deterministic and asynchronous. Consequently, after compile time, instructions and data are no longer related. In this machine, even though its behavior is deterministic, no control token is generated during the computation, but only data tokens. Furthermore, during the execution phase, no memory is required to contain the partial results exchanged among FUs. A cluster with 128 FUs has been simulated, and some results are presented.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126529044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9