[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation: latest papers

A localized dynamic load balancing strategy for highly parallel systems
M. Willebeek-LeMair, A. Reeves
{"title":"A localized dynamic load balancing strategy for highly parallel systems","authors":"M. Willebeek-LeMair, A. Reeves","doi":"10.1109/FMPC.1990.89487","DOIUrl":"https://doi.org/10.1109/FMPC.1990.89487","url":null,"abstract":"Two dynamic load-balancing strategies, a local diffusion (RID) and a global exchange (DEM) strategy, designed to support massively parallel systems are presented and compared. The effects of system size and task granularity are studied. Both strategies are implemented on a 32-processor iPSC/2 and a 256-processor IBM Victor. Even for low degrees of parallelism the performance of the DEM and RID strategies is very similar. The efficiency of the DEM strategy, however, depends heavily on the system interconnection topology. Furthermore, the system sizes tested were small in the context of massively parallel systems. The overhead costs of synchronization (scale as O(N)) for the DEM approach may cause a serious deterioration of performance. The RID strategy is easily embedded into simpler topologies, and can scale gracefully for larger systems. Finally, the RID scheme is able to maintain task locality, supporting a wider variety of applications that exhibit local communication dependencies between tasks. Therefore, the RID strategy may offer a superior performance when locality is important.","PeriodicalId":193332,"journal":{"name":"[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126343698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
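The abstract above contrasts a local diffusion strategy with a global exchange strategy. As a rough illustration of the local idea only, the following Python sketch balances load on a ring by having each node exchange a fraction of its load difference with its two neighbours. This is a generic textbook diffusion scheme, not the paper's RID algorithm; the ring topology, the `alpha` parameter, and the function names are all assumptions made for the example.

```python
def diffusion_step(loads, alpha=0.25):
    """One synchronous diffusion step on a ring (assumed topology).

    Each node moves a fraction `alpha` of the load difference to or from
    each of its two neighbours, so the total load is conserved.
    """
    n = len(loads)
    new_loads = []
    for i in range(n):
        left = loads[(i - 1) % n]
        right = loads[(i + 1) % n]
        new_loads.append(loads[i] + alpha * ((left - loads[i]) + (right - loads[i])))
    return new_loads


def balance(loads, steps, alpha=0.25):
    """Apply `steps` diffusion steps and return the resulting load vector."""
    for _ in range(steps):
        loads = diffusion_step(loads, alpha)
    return loads


if __name__ == "__main__":
    # All work starts on node 0; only neighbour-to-neighbour exchanges are used.
    initial = [80.0] + [0.0] * 7
    final = balance(initial, steps=200)
    print("total load:", sum(final))
    print("spread (max - min):", max(final) - min(final))
```

Because every exchange involves only adjacent nodes, a scheme of this kind needs no global synchronization, which is the property the abstract credits for scaling to larger systems.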
Data optimization: minimizing residual interprocessor data motion on SIMD machines
K. Knobe, V. Natarajan
{"title":"Data optimization: minimizing residual interprocessor data motion on SIMD machines","authors":"K. Knobe, V. Natarajan","doi":"10.1109/FMPC.1990.89492","DOIUrl":"https://doi.org/10.1109/FMPC.1990.89492","url":null,"abstract":"Basic concepts in array layout are summarized, and unhonored preferences and residual data motion are discussed. A technique for minimizing such motion is presented. For each array the source program is divided into regions, each associated with a single home. This enables efficient handling of residual data motion. The partitioning into regions is based on control flow and data dependence. Preliminary results obtained with this technique show an order-of-magnitude improvement for certain classes of programs.","PeriodicalId":193332,"journal":{"name":"[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123814071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 37
An optimal lookahead processor to prune search space
J. Gu
{"title":"An optimal lookahead processor to prune search space","authors":"J. Gu","doi":"10.1109/FMPC.1990.89462","DOIUrl":"https://doi.org/10.1109/FMPC.1990.89462","url":null,"abstract":"The discrete relaxation algorithm (DRA) is an efficient computational technique for enforcing arc consistency (AC) in a consistent labeling problem (CLP). The original sequential AC-1 algorithm suffers from O(n^3 m^3) time complexity for an n-object and m-label problem. Sample problem runs show that all these sequential algorithms are too slow to meet the need for any useful real-time CLP applications. An optimal parallel DRA5 algorithm that reaches the optimal lower bound, O(nm), for parallel AC algorithms (where the number of processors is polynomial in the problem size) is given. The algorithm has been implemented on a fine-grained, massively parallel hardware computer architecture. For problems of practical interest, 4 to 10 orders of magnitude of efficiency improvement can be reached on this hardware architecture.","PeriodicalId":193332,"journal":{"name":"[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation","volume":"359 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122749293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
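For readers unfamiliar with the sequential baseline named in the abstract above, the following Python sketch shows the classic AC-1 scheme: every arc is revised repeatedly until a full pass removes no label. It illustrates the sequential baseline only and is not the paper's parallel DRA5 algorithm. The data layout (a dict of label sets, a list of directed arcs, and a `constraint(x, a, y, b)` predicate) and the small ordering example are assumptions chosen for illustration.

```python
def revise(domains, constraint, x, y):
    """Drop every label of x that has no supporting label in y.

    Returns True if the label set of x shrank.
    """
    supported = {
        a for a in domains[x]
        if any(constraint(x, a, y, b) for b in domains[y])
    }
    changed = supported != domains[x]
    domains[x] = supported
    return changed


def ac1(domains, arcs, constraint):
    """AC-1: sweep over all arcs until one complete pass changes nothing."""
    changed = True
    while changed:
        changed = False
        for x, y in arcs:
            if revise(domains, constraint, x, y):
                changed = True
    return domains


if __name__ == "__main__":
    # Hypothetical labeling problem: objects A, B, C with labels 1..3 and A < B < C.
    order = {"A": 0, "B": 1, "C": 2}

    def less_than(x, a, y, b):
        # The object that comes earlier in `order` must carry the smaller label.
        return a < b if order[x] < order[y] else b < a

    domains = {name: {1, 2, 3} for name in order}
    arcs = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "B")]
    print(ac1(domains, arcs, less_than))
```

The repeated full sweeps are what make AC-1 expensive: a single removed label triggers another pass over every arc, which is the cost the abstract's cubic bound reflects.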
Korean character recognition using neural networks
J. Koh, G. S. Moon, K. Mehrotra, C. Mohan, Sanjay Ranka
{"title":"Korean character recognition using neural networks","authors":"J. Koh, G. S. Moon, K. Mehrotra, C. Mohan, Sanjay Ranka","doi":"10.1109/FMPC.1990.89454","DOIUrl":"https://doi.org/10.1109/FMPC.1990.89454","url":null,"abstract":"A neural network approach for recognizing printed Korean characters, based on a variant of the backpropagation algorithm, is presented. Implementation of the algorithms for neural networks with Hough transform inputs provided excellent recognition: about 81% of the training samples and 73% of the tested samples can be recognized.","PeriodicalId":193332,"journal":{"name":"[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128134159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
On single parameter characterization of parallelism
D. Marinescu, J. Rice
{"title":"On single parameter characterization of parallelism","authors":"D. Marinescu, J. Rice","doi":"10.1109/FMPC.1990.89464","DOIUrl":"https://doi.org/10.1109/FMPC.1990.89464","url":null,"abstract":"Issues pertinent to performance analysis of massively parallel systems are discussed. Attention is focused on the average parallelism of a software structure, which has been proposed as a single-parameter characterization of parallel software. It is argued that single-parameter characterization of parallel software or of parallel hardware rarely provides insight into the complex interactions among the software and hardware components of a parallel system. In particular, bounds for the speedup based on simple models of parallelism are violated when a model ignores the effects of communication delays.","PeriodicalId":193332,"journal":{"name":"[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation","volume":"94 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133347161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Designing the 3-LAP (three layers associative processor) for arithmetic and symbolic applications
C. Davarakis, D. Maritsas
{"title":"Designing the 3-LAP (three layers associative processor) for arithmetic and symbolic applications","authors":"C. Davarakis, D. Maritsas","doi":"10.1109/FMPC.1990.89471","DOIUrl":"https://doi.org/10.1109/FMPC.1990.89471","url":null,"abstract":"A variant of the MULTAP architecture, called 3-LAP, is presented. This three-layer machine is designed from the middle out, beginning with its finite-state-machine diagram and working toward its low-level processing element cell specification and its high-level algorithm applications definition. The 3-LAP's operating and control parts are defined, the estimated machine throughput performance is presented (over 100 GCOPS (giga complex operations per second)), the processing element cell is defined, and arithmetic and symbolic application primitives in 3-LAP instructions are described.","PeriodicalId":193332,"journal":{"name":"[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation","volume":"159 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114567233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Performance analysis of an implementation of the Beam and Warming implicit factored scheme on the NCUBE hypercube
P. J. Kominsky
{"title":"Performance analysis of an implementation of the Beam and Warming implicit factored scheme on the NCUBE hypercube","authors":"P. J. Kominsky","doi":"10.1109/FMPC.1990.89447","DOIUrl":"https://doi.org/10.1109/FMPC.1990.89447","url":null,"abstract":"A production 3-D Beam and Warming implicit Navier Stokes code has been implemented on the NCUBE hypercube using the grid allocation scheme of J. Bruno and P.R. Capello (see Proc. 3rd Conf. on Hypercube Concurrent Computers and Applications, p.1073-87, 1988). Predicted (32-b) performance on 1024 nodes is 67.1 MFLOPS. Efficiencies of 70% are attainable for implicit algorithms, although constant-memory scaled performance is found to decrease with increasing number of nodes, unlike the case for explicit implementations.","PeriodicalId":193332,"journal":{"name":"[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123918509","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Index domain alignment: minimizing cost of cross-referencing between distributed arrays
Jingke Li, M. Chen
{"title":"Index domain alignment: minimizing cost of cross-referencing between distributed arrays","authors":"Jingke Li, M. Chen","doi":"10.1109/FMPC.1990.89493","DOIUrl":"https://doi.org/10.1109/FMPC.1990.89493","url":null,"abstract":"The issue of data movement between processors due to cross-references between multiple distributed arrays is addressed. The problem of index domain alignment is formulated as finding a set of suitable alignment functions that map the index domains of the arrays into a common index domain so as to minimize the cost of data movement due to cross-references between the arrays. The cost function and the machine model used are abstractions of the current generation of distributed-memory machines. The problem as formulated is shown to be NP-complete. A heuristic algorithm is devised and shown to be efficient and to provide excellent results.","PeriodicalId":193332,"journal":{"name":"[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation","volume":"58 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129762774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 206
Simulation of neural networks on a massively parallel computer (DAP-510) using sparse matrix techniques
S.N. Gupta, M. Zubair, C. Grosch
{"title":"Simulation of neural networks on a massively parallel computer (DAP-510) using sparse matrix techniques","authors":"S.N. Gupta, M. Zubair, C. Grosch","doi":"10.1109/FMPC.1990.89486","DOIUrl":"https://doi.org/10.1109/FMPC.1990.89486","url":null,"abstract":"A parallel sparse matrix algorithm is proposed for the simulation of the modified Hopfield-Tank (MHT) network for solving the Traveling Salesman Problem (TSP). The MHT network using this sparse matrix algorithm has been implemented on a DAP-510, a massively parallel SIMD (single-instruction-stream, multiple-data-stream) computer consisting of 1024 processors. Problems of various sizes, ranging from eight cities up to 256 cities, have been simulated. The results show a very large speedup for the algorithm as compared with one using a standard dense matrix implementation.","PeriodicalId":193332,"journal":{"name":"[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128071476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Indirect addressing and load balancing for faster solution to Mandelbrot set on SIMD architectures
S. Tomboulian, M. Pappas
{"title":"Indirect addressing and load balancing for faster solution to Mandelbrot set on SIMD architectures","authors":"S. Tomboulian, M. Pappas","doi":"10.1109/FMPC.1990.89495","DOIUrl":"https://doi.org/10.1109/FMPC.1990.89495","url":null,"abstract":"The authors present a method for using local indirect addressing to achieve faster solutions for some problems with data-dependent convergence rates on SIMD (single-instruction-stream, multiple-data-stream) architectures. A class of problems characterized by computations on data points where the computation is identical but the convergence rate is data dependent is examined. In the absence of indirect addressing, algorithm time is governed by the maximum number of iterations. An algorithm using indirect addressing allows a processor to proceed to the next data point upon convergence. Thus the overall number of iterations will approach the mean convergence rate for a sufficiently large problem. Load-balancing techniques can be applied for additional performance improvement. These techniques are used for solving Mandelbrot sets on the MP-1 massively parallel computer.","PeriodicalId":193332,"journal":{"name":"[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation","volume":"21 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125083014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
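The timing argument in the abstract above can be illustrated with a small cost model. The Python sketch below is not the authors' MP-1 implementation; it only counts lockstep iteration steps under two assumed schedules. In the first, all processors move to their next point together, so each round costs the maximum iteration count in that round. In the second, each processor advances to its next local point as soon as its current one converges, so the total cost is the largest per-processor sum. The round-robin assignment of points to processors, the grid, and all function names are assumptions.

```python
def escape_iterations(c, max_iter=100):
    """Iterations of z -> z*z + c until |z| > 2, capped at max_iter."""
    z = 0j
    for k in range(1, max_iter + 1):
        z = z * z + c
        if abs(z) > 2.0:
            return k
    return max_iter


def split_round_robin(counts, num_procs):
    """Assign point i to processor i % num_procs (assumed layout)."""
    return [counts[p::num_procs] for p in range(num_procs)]


def lockstep_steps(counts, num_procs):
    """Without indirect addressing: each round waits for its slowest point."""
    queues = split_round_robin(counts, num_procs)
    rounds = max(len(q) for q in queues)
    return sum(max(q[r] for q in queues if r < len(q)) for r in range(rounds))


def indirect_steps(counts, num_procs):
    """With indirect addressing: a processor starts its next point on convergence."""
    queues = split_round_robin(counts, num_procs)
    return max(sum(q) for q in queues)


if __name__ == "__main__":
    # 32 x 32 grid over [-2, 1] x [-1.5, 1.5], distributed over 64 simulated processors.
    points = [complex(-2.0 + 3.0 * i / 31, -1.5 + 3.0 * j / 31)
              for j in range(32) for i in range(32)]
    counts = [escape_iterations(c) for c in points]
    print("lockstep steps:", lockstep_steps(counts, 64))
    print("indirect steps:", indirect_steps(counts, 64))
    print("mean-based lower bound:", sum(counts) / 64)
```

In this model the indirect schedule can never take more steps than the lockstep one, since a maximum of sums is at most a sum of maxima; the remaining gap to the mean-based bound is what the abstract's additional load balancing would target.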