[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation: Latest Publications

Scientific visualization theatre
T. Sterling
{"title":"Scientific visualization theatre","authors":"T. Sterling","doi":"10.1109/FMPC.1992.234874","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234874","url":null,"abstract":"Summary form only given. Discusses the latest in massively parallel processing (MPP) applications' results through high-resolution graphics and animation. Three themes are represented, demonstrating the relationship between massively parallel computing and scientific visualization. Results of applications computed on MPPs and visualized on graphics workstations are shown for many of the cases. Examples of result data whose image rendering is performed using parallel algorithms on MPPs are shown, and some performance measurements are given. Finally, graphical presentations of data representing the behavioral dynamics of MPPs are shown, opening the way for scientific visualization to assist in the optimization of MPP computation.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133031441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
The new frontiers: A workshop on future directions in massively parallel processing
I.D. Scherson
{"title":"The new frontiers: A workshop on future directions in massively parallel processing","authors":"I.D. Scherson","doi":"10.1109/FMPC.1992.234882","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234882","url":null,"abstract":"The task of identifying some of the basic research issues facing modern massively parallel processing is addressed. Processing element architecture, interconnection networks, languages and compilers, and software development tools are considered.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131331901","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Massively parallel sparse LU factorization
S. Kratzer
{"title":"Massively parallel sparse LU factorization","authors":"S. Kratzer","doi":"10.1109/FMPC.1992.234896","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234896","url":null,"abstract":"The multifrontal algorithm for sparse LU factorization has been expressed as a data parallel program that is suitable for massively parallel computers. A new way of mapping data and computations to processors is used, and good processor utilization is obtained even for unstructured sparse matrices. The sparse problem is decomposed into many smaller, dense subproblems, with low overhead for communications and memory access. Performance results are provided for factorization of regular and irregular finite-element grid matrices on the MasPar MP-1.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131496654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Communication overhead on the CM5: an experimental performance evaluation
R. Ponnusamy, A. Choudhary, G. Fox
{"title":"Communication overhead on the CM5: an experimental performance evaluation","authors":"R. Ponnusamy, A. Choudhary, G. Fox","doi":"10.1109/FMPC.1992.234899","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234899","url":null,"abstract":"The authors present experimental results for communication overhead on the scalable parallel machine CM-5. It is observed that the communication latency of the data network is 88 μs. It was also observed that the communication cost for messages that are a multiple of 16 bytes is much smaller than for messages that are not, and therefore, for better performance, a user should pad messages to make them a multiple of 16 bytes. The authors also studied the communication overhead of three complete exchange algorithms. For small message sizes, the recursive exchange algorithm performs the best, especially for large multiprocessors. However, for large message sizes, the pairwise exchange algorithm is preferable. Finally, the authors studied two algorithms for one-to-all broadcast: the linear broadcast algorithm and the recursive broadcast algorithm. Linear broadcast does not perform well; the recursive broadcast algorithm performs well.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132838116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 32
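The CM-5 abstract above recommends padding messages to a multiple of 16 bytes. A minimal sketch of that padding step follows; the function name `pad_to_multiple`, the zero fill byte, and the idea of tracking the original length separately are assumptions for illustration and are not taken from the paper.

```python
def pad_to_multiple(payload: bytes, block: int = 16, fill: bytes = b"\x00") -> bytes:
    """Return payload extended with fill bytes so its length is a multiple of block.

    Hypothetical helper: the abstract only says messages should be padded to a
    multiple of 16 bytes; the fill value chosen here is an assumption.
    """
    if block <= 0:
        raise ValueError("block must be positive")
    if len(fill) != 1:
        raise ValueError("fill must be exactly one byte")
    remainder = len(payload) % block
    if remainder == 0:
        # Already aligned; nothing to add.
        return payload
    return payload + fill * (block - remainder)


if __name__ == "__main__":
    message = b"x" * 20
    padded = pad_to_multiple(message)
    # The receiver needs the original length (sent separately) to strip the padding.
    print(len(message), len(padded))
```

Because the padding carries no length information, a receiver would have to learn the original message length some other way, for example from a header field.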
Parallel holographic image calculation and compression
D. M. Newman, D. Goeckel, R. D. Crawford, S. Abraham
{"title":"Parallel holographic image calculation and compression","authors":"D. M. Newman, D. Goeckel, R. D. Crawford, S. Abraham","doi":"10.1109/FMPC.1992.234923","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234923","url":null,"abstract":"The authors describe the parallel implementation of an algorithm suitable for hologram creation on a 16384-processor SIMD (single-instruction multiple-data) MasPar machine. When computing an image of typical complexity, the parallel implementation sacrifices up to 11% efficiency in data compression to gain a performance up to 250 times greater than that achieved on a uniprocessor workstation. The MasPar can achieve pattern generation more than 750 times faster than the fully optimized Sparc C code.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114330091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Throughput analysis of pipelined multiprocessor modules
S.-Y. Lee
{"title":"Throughput analysis of pipelined multiprocessor modules","authors":"S.-Y. Lee","doi":"10.1109/FMPC.1992.234926","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234926","url":null,"abstract":"A feasible form of parallel architecture would be one which consists of several pipeline stages, each of which is a multiprocessor module of a large number of processing elements (PEs). In many applications, such as real-time image processing and dynamic control, the optimized computing structure would be in this form. In the present study, the performance of a parallel processing model of such an organization has been analyzed. In particular, the effect of interstage communication on throughput of the model has been investigated to suggest an efficient way of transferring data between stages. The numerical results obtained in this study could be a useful guideline for designing a parallel computer system consisting of pipeline stages each of which contains a large number of PEs.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123192724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Traffic analysis of hypercubes and banyan-hypercubes
A. Bellaachia, A. Youssef
{"title":"Traffic analysis of hypercubes and banyan-hypercubes","authors":"A. Bellaachia, A. Youssef","doi":"10.1109/FMPC.1992.234950","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234950","url":null,"abstract":"The routing performance of banyan-hypercubes (BHs) is studied and compared with that of hypercubes. To evaluate the routing capabilities of BHs and hypercubes, a communication model is assumed. Based on this model, the traffic intensity of both networks is computed and the saturation probability of each network is determined. To compute the average time delay, the average queue length, the throughput, and the maximum queue size, extensive simulations were conducted for both networks for different sizes and different packet generation rates. The saturation probability obtained through the simulation results is very close to that computed theoretically. The simulation results showed that all of the aforementioned measures are decreased when the network size gets larger. BHs with more than two levels are shown to congest faster than a hypercube of the same size, and deliver less throughput. However, a two-level BH has better performance than a hypercube of the same size. Although the BH has a better diameter and average distance, it does not necessarily have better communication capabilities than hypercubes.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115852729","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Program transformation in massively parallel systems
T. Al-Marzooq, F. Bastani
{"title":"Program transformation in massively parallel systems","authors":"T. Al-Marzooq, F. Bastani","doi":"10.1109/FMPC.1992.234873","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234873","url":null,"abstract":"The authors present two problems in mapping highly maintainable expressive parallel code manipulating multidimensional arrays in massively parallel computers: bottlenecks due to simultaneous accesses in the EREW model, and interprocessor communication. They present a source code transformation approach to solve the expressibility-high-performance problem for the multidimensional arrays designed with a four-level hierarchical design of the data types (aggregate, abstract, logical, and physical levels). A systematic method is developed to transform parallel high-level low-performance code into efficient parallel low-level code. The method is illustrated with matrix multiplication. The method is also used to generate high-performance logical-level code for the backpropagation algorithm of neural networks that makes extensive use of matrices. The transformed code has a much higher performance than the code with a naive mapping.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129613076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Quantitative studies of processing element granularity
T. C. Marek, E. Davis
{"title":"Quantitative studies of processing element granularity","authors":"T. C. Marek, E. Davis","doi":"10.1109/FMPC.1992.234925","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234925","url":null,"abstract":"Quantitative results of experiments on PE (processing element) granularities are presented. An architecture simulation workbench has been developed for experiments on PE granularities of 1, 4, 8, and 16 bits. An analysis of the impact of various I/O (input/output) and communication path widths is also possible. Overall performance, communication balance, PE utilization, and operand lengths can be monitored to evaluate the merits of various granularities and feature sets. This workbench has been used to run a set of benchmark algorithms that cover a range of computation and communication requirements, a range of data sizes, and a range of problem array sizes. The authors report results for two of the algorithms studied by T.C. Marek (1992): image rotation and image resampling. The results obtained are counterintuitive. They indicate that bit-serial machines have performance advantages due to inherent bit-oriented activity, even when using multiple bit operands, and to inter-PE communication when paths are narrower than the processor granularity.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129763369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Automatic data distribution for nearest neighbor networks
M. Philippsen
{"title":"Automatic data distribution for nearest neighbor networks","authors":"M. Philippsen","doi":"10.1109/FMPC.1992.234890","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234890","url":null,"abstract":"An algorithm for mapping an arbitrary, multidimensional array onto an arbitrarily shaped multidimensional nearest-neighbor network of a distributed memory machine is presented. The individual dimensions of the array are labeled with high-level usage descriptors that either can be provided by the programmer or can be derived by sophisticated static compiler analysis. The presented algorithm achieves an appropriate exploitation of nearest-neighbor communication and allows for efficient address calculations. The author describes the integration of this technique into an optimizing compiler for Modula-2 and derives extensions that render efficient translation of nested parallelism possible and that provide support for thread scheduling.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116134364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9