Int. J. High Speed Comput.: Latest Articles

On Improving the Performance of Tree Machines
Int. J. High Speed Comput. Pub Date : 1995-06-01 DOI: 10.1142/S0129053395000142
Ajay K. Gupta, Hong Wang
{"title":"On Improving the Performance of Tree Machines","authors":"Ajay K. Gupta, Hong Wang","doi":"10.1142/S0129053395000142","DOIUrl":"https://doi.org/10.1142/S0129053395000142","url":null,"abstract":"In this paper we introduce a class of trees, called generalized compressed trees. Generalized compressed trees can be derived from complete binary trees by performing certain ‘contraction’ operations. A generalized compressed tree CT of height h has approximately 25% fewer nodes than a complete binary tree T of height h. We show that these trees have smaller (up to a 74% reduction) 2-dimensional and 3-dimensional VLSI layouts than the complete binary trees. We also show that algorithms initially designed for T can be simulated by CT with at most a constant slow-down. In particular, algorithms having non-pipelined computation structure and originally designed for T can be simulated by CT with no slow-down.","PeriodicalId":270006,"journal":{"name":"Int. J. High Speed Comput.","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123894904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Load Balancing: a Programmer's Approach or the Impact of Task-Length Parameters on the Load Balancing Performance of Parallel Programs
Int. J. High Speed Comput. Pub Date : 1995-06-01 DOI: 10.1142/S0129053395000178
Y. Ben-Asher, A. Schuster, J. F. Sibeyn
{"title":"Load Balancing: a Programmer's Approach or the Impact of Task-Length Parameters on the Load Balancing Performance of Parallel Programs","authors":"Y. Ben-Asher, A. Schuster, J. F. Sibeyn","doi":"10.1142/S0129053395000178","DOIUrl":"https://doi.org/10.1142/S0129053395000178","url":null,"abstract":"We consider the problem of dynamic load balancing in an n processor parallel system. The scheduling process of a parallel program is modeled by randomly throwing weighted balls into n holes. For a given program A, the ball weights (task lengths) are chosen according to a probability distribution , for which we know only some of the following parameters: the expectation μ, variance σ², maximum M and minimum m. From these parameters, we derive an upper bound for the number of tasks to be generated by A in order to achieve a load balancing ratio for which the run-time is optimal up to a factor (1+e)² for any 0<e≤0.5, with very high probability. Using the derived relations, the programmer may control the load-balancing of his program by tuning the global parameters of the generated tasks. This can be done regardless of the underlying scheduler used by the parallel machine. We also give experimental results of marine-life simulation in support of our claims.","PeriodicalId":270006,"journal":{"name":"Int. J. High Speed Comput.","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131166560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
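The balls-into-holes model in the abstract above can be illustrated with a small simulation. This is only a sketch under assumptions, not the authors' analysis: the function name `load_balance_ratio`, the uniform task-length distribution on [1, 10], and the choice of 16 processors are all hypothetical. It shows the qualitative effect the abstract describes, namely that generating more tasks tends to bring the heaviest processor's load closer to the ideal average.

```python
import random

def load_balance_ratio(num_tasks, n_procs, draw_length, rng):
    """Throw num_tasks weighted balls (tasks) into n_procs holes at random.

    Returns the heaviest load divided by the perfectly balanced load,
    so 1.0 means ideal balance and larger values mean worse balance.
    """
    loads = [0.0] * n_procs
    for _ in range(num_tasks):
        # Each task lands on a uniformly random processor.
        loads[rng.randrange(n_procs)] += draw_length(rng)
    ideal = sum(loads) / n_procs
    return max(loads) / ideal

rng = random.Random(0)

def uniform_lengths(r):
    # Hypothetical task-length distribution with minimum 1 and maximum 10.
    return r.uniform(1.0, 10.0)

few = load_balance_ratio(100, 16, uniform_lengths, rng)
many = load_balance_ratio(100_000, 16, uniform_lengths, rng)
print(f"100 tasks: ratio {few:.3f}; 100000 tasks: ratio {many:.3f}")
```

The ratio can never drop below 1.0, because the maximum load is at least the mean load. Changing `uniform_lengths` to a distribution with larger variance is one way to explore how the task-length parameters affect the number of tasks needed.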
Extensions to Cycle Shrinking
Int. J. High Speed Comput. Pub Date : 1995-06-01 DOI: 10.1142/S0129053395000154
A. Sethi, S. Biswas, A. Sanyal
{"title":"Extensions to Cycle Shrinking","authors":"A. Sethi, S. Biswas, A. Sanyal","doi":"10.1142/S0129053395000154","DOIUrl":"https://doi.org/10.1142/S0129053395000154","url":null,"abstract":"An important part of a parallelizing compiler is the restructuring phase, which extracts parallelism from a sequential program. We consider an important restructuring transformation called cycle shrinking [5], which partitions the iteration space of a loop so that the iterations within each group of the partition can be executed in parallel. The method in [5] mainly deals with dependences with constant distances. In this paper, we propose certain extensions to the cycle shrinking transformation. For dependences with constant distances, we present an algorithm which, under certain fairly general conditions, partitions the iteration space in a minimal number of groups. Under such conditions, our method is optimal while the previous methods are not. We have also proposed an algorithm to handle a large class of loops which have dependences with variable distances. This problem is considerably harder and has not been considered before in full generality.","PeriodicalId":270006,"journal":{"name":"Int. J. High Speed Comput.","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128422085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
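For the constant-distance case mentioned in the abstract above, the basic idea of cycle shrinking can be sketched on a single loop. This is a minimal illustration of the basic transformation only, not the extended algorithms proposed in the paper; the loop body `a[i] = a[i - d] + b[i]` and the function names are hypothetical. With a dependence distance of `d`, any `d` consecutive iterations are mutually independent, so the iteration space can be cut into groups of `d` that run one after another while the iterations inside a group run in parallel.

```python
def sequential(a, b, d):
    """Reference loop with a constant dependence distance d."""
    a = list(a)
    for i in range(d, len(a)):
        a[i] = a[i - d] + b[i]
    return a

def cycle_shrunk(a, b, d):
    """Same loop, partitioned into groups of d independent iterations."""
    a = list(a)
    n = len(a)
    for start in range(d, n, d):              # groups execute sequentially
        group = range(start, min(start + d, n))
        # Every iteration i in the group reads a[i - d], which lies before
        # `start`, so no iteration reads a value written inside the group.
        # Computing all results first mimics a parallel step.
        new_values = [a[i - d] + b[i] for i in group]
        for i, value in zip(group, new_values):
            a[i] = value
    return a

a0 = list(range(10))
b0 = [1] * 10
print(cycle_shrunk(a0, b0, 3) == sequential(a0, b0, 3))
```

Here the original loop has `len(a) - d` sequential iterations, while the shrunk version needs only about `(len(a) - d) / d` sequential group steps.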
Task Distribution on a Butterfly Multiprocessor
Int. J. High Speed Comput. Pub Date : 1995-03-01 DOI: 10.1142/S0129053395000026
I. Gottlieb, A. Herold
{"title":"Task Distribution on a Butterfly Multiprocessor","authors":"I. Gottlieb, A. Herold","doi":"10.1142/S0129053395000026","DOIUrl":"https://doi.org/10.1142/S0129053395000026","url":null,"abstract":"We consider the practical performance of dynamic task distribution on a multiprocessor, where overloaded processors dispense tasks to be performed on idle ones which are free to execute them. We propose a topology and an algorithm for routing packets in a network from an arbitrary subset of processors S to an arbitrary subset T, where the exact target node within T for a particular task is unimportant and therefore not specified. The method presented achieves work distribution in O(10* log N) time, where N is the number of nodes (processors). It operates on a Duplex Butterfly, and requires O(log N) size buffers. The solution is dynamic, taking into consideration real time availability of processors, and deterministic. The mechanism includes throttling of the task generation rate. “Software synchronization” in asynchronous mode ensures the insensitivity of the algorithm to hardware propagation delays of signals in large networks.","PeriodicalId":270006,"journal":{"name":"Int. J. High Speed Comput.","volume":"278 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125849815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Linear Algebra calculations on a Virtual Shared Memory Computer
Int. J. High Speed Comput. Pub Date : 1995-03-01 DOI: 10.1142/S0129053395000038
P. Amestoy, I. Duff, M. Daydé, Pierre Morère
{"title":"Linear Algebra calculations on a Virtual Shared Memory Computer","authors":"P. Amestoy, I. Duff, M. Daydé, Pierre Morère","doi":"10.1142/S0129053395000038","DOIUrl":"https://doi.org/10.1142/S0129053395000038","url":null,"abstract":"We evaluate the impact of the memory hierarchy of virtual shared memory computers on the design of algorithms for linear algebra. On classical shared memory multiprocessor computers, block algorithms are used for efficiency. We study here the potential and the limitations of such approaches on globally addressable distributed memory computers. The BBN TC2000 belongs to this class of computers and will be used to illustrate our discussion. We describe the implementation of Level 3 BLAS and examine the performance of some of the LAPACK routines. The impact of the number of processors with respect to the choice of the variants of classical matrix factorizations (for example, KJI, JKI, JIK for the LU factorization) is discussed. We also study the factorization of sparse matrices based on a multifrontal approach. The ideas introduced for the parallelization of full linear algebra codes are applied to the sparse case. We discuss and illustrate the limitations of this approach in sparse multifrontal factorization. We show that the speed-ups obtained on the BBN TC2000 for the class of methods presented here are comparable to those obtained on more classical shared memory computers, such as the Alliant FX/80, the CRAY-2 and the IBM 3090/VF.","PeriodicalId":270006,"journal":{"name":"Int. J. High Speed Comput.","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127738381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
Restriction-Free Adaptive Wormhole Routing in Multicomputer Networks
Int. J. High Speed Comput. Pub Date : 1995-03-01 DOI: 10.1142/S0129053395000063
Jai-Hoon Chung, H. Yoon, S. Maeng
{"title":"Restriction-Free Adaptive Wormhole Routing in Multicomputer Networks","authors":"Jai-Hoon Chung, H. Yoon, S. Maeng","doi":"10.1142/S0129053395000063","DOIUrl":"https://doi.org/10.1142/S0129053395000063","url":null,"abstract":"The adaptive routing approach has been expected as a promising way to improve network performance by utilizing available network bandwidth. Previous adaptive routing strategies in wormhole-routed multicomputer networks restrict the routing of messages by the routing algorithm to prevent deadlock. This results in low degree of adaptivity and low utilization of physical or virtual channels. In this paper, we examine the possibility of performing restriction-free adaptive routing in wormhole-routed networks as an approach to further improving the performance of these networks. A new flow control policy, called message cutting-in, is proposed, and two adaptive routing strategies are presented. Freedom of communication deadlock is achieved by the proposed flow control policy. The proposed adaptive routing strategies do not restrict routing and maximally utilize the physical and virtual channels. Simulation results show that the restriction-free adaptive routing approach is promising from the fact that it has the lowest latency and highest throughput depending on the number of virtual channels per physical channel and patterns of message traffic.","PeriodicalId":270006,"journal":{"name":"Int. J. High Speed Comput.","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121657368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
An Efficient Parallel Algorithm for Matrix-Vector Multiplication
Int. J. High Speed Comput. Pub Date : 1995-03-01 DOI: 10.1142/S0129053395000051
B. Hendrickson, R. Leland, S. Plimpton
{"title":"An Efficient Parallel Algorithm for Matrix-Vector Multiplication","authors":"B. Hendrickson, R. Leland, S. Plimpton","doi":"10.1142/S0129053395000051","DOIUrl":"https://doi.org/10.1142/S0129053395000051","url":null,"abstract":"The multiplication of a vector by a matrix is the kernel operation in many algorithms used in scientific computation. A fast and efficient parallel algorithm for this calculation is therefore desirable. This paper describes a parallel matrix-vector multiplication algorithm which is particularly well suited to dense matrices or matrices with an irregular sparsity pattern. Such matrices can arise from discretizing partial differential equations on irregular grids or from problems exhibiting nearly random connectivity between data structures. The communication cost of the algorithm is independent of the matrix sparsity pattern and is shown to scale as for an n×n matrix on p processors. The algorithm’s performance is demonstrated by using it within the well known NAS conjugate gradient benchmark. This resulted in the fastest run times achieved to date on both the 1024 node nCUBE 2 and the 128 node Intel iPSC/860. Additional improvements to the algorithm which are possible when integrating it with the conjugate gradient algorithm are also discussed.","PeriodicalId":270006,"journal":{"name":"Int. J. High Speed Comput.","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122637666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 75
Parallel and Pipelined Parallel Consecutive Sums on a Hypercube with Application to Ray Casting
Int. J. High Speed Comput. Pub Date : 1995-03-01 DOI: 10.1142/S0129053395000099
Jianjian Song, R. Shu
{"title":"Parallel and Pipelined Parallel Consecutive Sums on a Hypercube with Application to Ray Casting","authors":"Jianjian Song, R. Shu","doi":"10.1142/S0129053395000099","DOIUrl":"https://doi.org/10.1142/S0129053395000099","url":null,"abstract":"Communication penalty for parallel computation is related to message startup time and speed of data transmission between the host and processing elements (PEs). We propose two algorithms in this paper to show that the first factor can be alleviated by reducing the number of messages and the second by making the host-PE communication concurrent with computation on the PE array. The algorithms perform 2^n consecutive sums of 2^n numbers each on a hypercube of degree n. The first algorithm leaves one sum on each processor. It takes n steps to complete the sums and reduces the number of messages generated by a PE from 2^n to n. The second algorithm sends all the sums back to the host as the sums are generated one by one. It takes 2^n+n−1 steps to complete the sums in a pipeline so that one sum is completed every step after the initial (n−1) steps. We apply our second algorithm to the front-to-back composition for ray casting. For large number of rays, the efficiency and speedup of our algorithm are close to theoretically optimal values.","PeriodicalId":270006,"journal":{"name":"Int. J. High Speed Comput.","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134014311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
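The first algorithm in the abstract above (one sum left on each processor after n steps, with n messages per PE) matches the shape of a standard recursive-halving reduction on a hypercube, which can be simulated sequentially as follows. This is a sketch under assumptions and may differ from the authors' actual algorithm: it assumes each PE `p` initially holds one contribution `data[p][s]` to every sum `s`, and the function name and data layout are hypothetical.

```python
def consecutive_sums_hypercube(data):
    """Simulate recursive halving on a hypercube with 2**n PEs.

    data[p][s] is the number PE p contributes to sum s.
    Returns (sums, messages_sent), where sums[p] is the sum left on PE p.
    """
    size = len(data)
    n = size.bit_length() - 1
    assert size == 1 << n, "the number of PEs must be a power of two"
    # partial[p] maps a sum index to the partial sum PE p currently holds.
    partial = [dict(enumerate(row)) for row in data]
    messages_sent = [0] * size
    for k in range(n):                        # one step per dimension
        bit = 1 << k
        # Each PE sends, in one message, the partial sums whose index
        # differs from its own address in bit k.
        outgoing = []
        for p in range(size):
            outgoing.append(
                {s: v for s, v in partial[p].items() if (s ^ p) & bit})
            messages_sent[p] += 1
        # Each PE keeps the other half and adds what its neighbor sent.
        for p in range(size):
            keep = {s: v for s, v in partial[p].items()
                    if not (s ^ p) & bit}
            for s, v in outgoing[p ^ bit].items():
                keep[s] += v
            partial[p] = keep
    return [partial[p][p] for p in range(size)], messages_sent

# Example on a hypercube of degree 3 (8 PEs, 8 sums of 8 numbers each).
data = [[p * 8 + s for s in range(8)] for p in range(8)]
sums, messages = consecutive_sums_hypercube(data)
print(sums, messages)
```

In this sketch the message size halves at every step, so each PE sends n messages in total rather than one message per sum.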
A Tabu Search Approach to Task Scheduling on Heterogeneous Processors under Precedence Constraints
Int. J. High Speed Comput. Pub Date : 1995-03-01 DOI: 10.1142/S012905339500004X
S. Porto, C. Ribeiro
{"title":"A Tabu Search Approach to Task Scheduling on Heterogeneous Processors under Precedence Constraints","authors":"S. Porto, C. Ribeiro","doi":"10.1142/S012905339500004X","DOIUrl":"https://doi.org/10.1142/S012905339500004X","url":null,"abstract":"Parallel programs may be represented as a set of interrelated sequential tasks. When multiprocessors are used to execute such programs, the parallel portion of the application can be speeded up by an appropriate allocation of processors to the tasks of the application. Given a parallel application defined by a task precedence graph, the goal of task scheduling (or processor assignment) is thus the minimization of the makespan of the application. In a heterogeneous multiprocessor system, task scheduling consists of determining which tasks will be assigned to each processor, as well as the execution order of the tasks assigned to each processor. In this work, we apply the tabu search metaheuristic to the solution of the task scheduling problem on a heterogeneous multiprocessor environment under precedence constraints. The topology of the Mean Value Analysis solution package for product form queueing networks is used as the framework for performance evaluation. We show that tabu search obtains much better results, i.e., shorter completion times, improving from 20 to 30% the makespan obtained by the most appropriate algorithm previously published in the literature.","PeriodicalId":270006,"journal":{"name":"Int. J. High Speed Comput.","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124732395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 111
Implementing Linda Tuplespace on a Distributed System
Int. J. High Speed Comput. Pub Date : 1995-03-01 DOI: 10.1142/S0129053395000087
M. Feng, Yaoqing Gao, C. Yuen
{"title":"Implementing Linda Tuplespace on a Distributed System","authors":"M. Feng, Yaoqing Gao, C. Yuen","doi":"10.1142/S0129053395000087","DOIUrl":"https://doi.org/10.1142/S0129053395000087","url":null,"abstract":"Linda, a general purpose coordination language, has been used to make a language parallel. Based on a logically shared tuplespace, Linda poses difficulties to be efficiently implemented on a distributed multiprocessor system. This paper reports our approach to solve the problem: processors are divided into groups, and each group has a group manager to provide a local view of the global tuplespace, and handles the tuplespace operations incurred by processors within the group. To maintain the consistency and correctness of the Linda tuplespace operations, we propose the algorithms of a group manager. We also implement the algorithms on a transputer-based multicomputer and show the experiment results.","PeriodicalId":270006,"journal":{"name":"Int. J. High Speed Comput.","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130684099","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
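For readers unfamiliar with the tuplespace model that the abstract above builds on, the core Linda operations can be sketched in a few lines. This is a single-process illustration only and does not implement the paper's group-manager scheme: the class name, the use of `None` as a wildcard field, and the choice of the non-blocking predicate forms `rdp` and `inp` (instead of blocking `rd` and `in`) are assumptions made for brevity.

```python
class TupleSpace:
    """A minimal, single-process tuplespace with non-blocking operations."""

    def __init__(self):
        self._tuples = []

    def out(self, *fields):
        """Deposit a tuple into the space."""
        self._tuples.append(tuple(fields))

    @staticmethod
    def _matches(pattern, tup):
        # None in the pattern acts as a wildcard for that field.
        return len(pattern) == len(tup) and all(
            want is None or want == got for want, got in zip(pattern, tup))

    def rdp(self, *pattern):
        """Return a matching tuple without removing it, or None."""
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def inp(self, *pattern):
        """Remove and return a matching tuple, or None if there is none."""
        tup = self.rdp(*pattern)
        if tup is not None:
            self._tuples.remove(tup)
        return tup

space = TupleSpace()
space.out("task", 1, "render")
space.out("task", 2, "blend")
peeked = space.rdp("task", None, "blend")   # read, tuple stays in the space
taken = space.inp("task", 1, None)          # withdraw the first task
missing = space.inp("task", 1, None)        # already withdrawn
print(peeked, taken, missing)
```

In a distributed setting such as the one the paper targets, the difficulty is that this single logically shared list must be kept consistent across processors, which is the role the abstract assigns to the group managers.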