{"title":"Cost-effective modeling for natural resource distribution systems","authors":"A. Al-Ayyoub","doi":"10.1080/10637190412331295148","DOIUrl":"https://doi.org/10.1080/10637190412331295148","url":null,"abstract":"Pipe systems are in the cores of many real life applications including water, oil and gas distribution as well as air-conditioning and compressed air management. Modeling and analysis of flow in pipe networks is of great practical significance in all these areas. Pipe networks are usually made up of thousands of components such as pipes, pumps, valves, tanks and reservoirs. One common way to model these networks is by using systems of linear equations. Practical sizes for these systems usually involve exhaustive calculations that require high computational power. This work emphasizes the design and evaluation of a concurrent system for modeling pipe networks using linear algebraic methods. The proposed approach offers low-cost and high-speed alternative to traditional solutions. It uses a unified row mapping method that exploits the properties of the pipe network matrix in order to achieve a balanced load distribution. This approach is based on cluster computing as a viable alternative to the expensive massively parallel processing systems. The performance of the proposed approach is investigated on a cluster of workstations connected by general-purpose networks.","PeriodicalId":406098,"journal":{"name":"Parallel Algorithms and Applications","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132196995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A comparative study of explicit group iterative solvers on a cluster of workstations","authors":"Norhashidah Hj. Mohd Ali, Rosni Abdullah †, Kok Jun Lee ‡","doi":"10.1080/10637190412331295157","DOIUrl":"https://doi.org/10.1080/10637190412331295157","url":null,"abstract":"In this paper, a group iterative scheme based on rotated (cross) five-point finite difference discretisation, i.e. the four-point explicit decoupled group (EDG) is considered in solving a second order elliptic partial differential equation (PDE). This method was firstly introduced by Abdullah [“The four point EDG method: a fast poisson solver”, Int. J. Comput. Math., 38 (1991) 61–70], where the method was found to be more superior than the common existing methods based on the standard five-point finite difference discretisation. The method was further extended to different type of PDE's, where similar improved results were established [Ali, N.H.M., Abdullah, A.R. Four Point EDG: A Fast Solver For The Navier–Stokes Equation, M.H.Hamza (ed.) Proceedings of the IASTED International Conference on Modelling Simulation And Optimization, Gold Coast, Australia, May 6–9 (1996) (CD Rom-File 242-165.pdf), ISBN: 0-88986-197-8; Ali, N.H.M., Abdullah, A.R. New Parallel Point Iterative Solutions For the Diffusion-Convection Equation Proceedings of the International Conference on Parallel and Distributed Computing and Networks Singapore, Aug. 11–13 (1997) 136–139; Ali, N.H.M., Abdullah, A.R. “New rotated iterative algorithms for the solution of a coupled system of elliptic equations” Int. J. Comput. Math. 74 (1999) 223–251]. These new iterative algorithms had been developed to run on the Sequent Balance, a shared memory parallel computer [A.R. Abdullah, N.M. Ali, The Comparative Study of Parallel Strategies For The Solution of Elliptic PDE's Parallel Algorithms and Applications Vol. 10 (1996) 93–103; Ali, N.H.M., Abdullah, A.R. “Parallel four point explicit decoupled group (EDG) method for elliptic PDE's” Proceedings of the Seventh IASTED/ISMM International Conference on Parallel and Distributed Computing and Systems (1995) 302–304 (Washington DC); Ali, N.H.M., Abdullah, A.R. New Parallel Point Iterative Solutions For the Diffusion-Convection Equation Proceedings of the International Conference on Parallel and Distributed Computing and Networks, Singapore, Aug. 11–13 (1997) 136–139; Yousif, W.S., Evans, D.J.“Explicit decoupled group iterative methods and their parallel implementations” Parallel Algorithms and Applications 7 (1995) 53–71] where they were shown to be suitable to be implemented in parallel. In this work, the four-point group algorithm was ported to run on a cluster of Sun workstations using a parallel virtual machine (PVM) programming environment together with the four-point explicit group (EG) method [Evans, D.J., Yousif, W.S. “The implementation of the explicit block iterative methods on the balance 8000 parallel computer” Parallel Computing 16 (1990) 81–97]. We describe the parallel implementations of these methods in solving the Poisson equation and the results of some computational experiments are compared and reported. 
rosni@cs.usm.my kokjl@hotmail.com","PeriodicalId":406098,"journal":{"name":"Parallel Algorithms and Applications","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125661288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
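For readers unfamiliar with the rotated (cross) discretisation that the EDG method builds on, the sketch below runs a plain point iteration on the rotated five-point stencil for the Poisson equation on the unit square. It only illustrates the stencil: the actual EDG method groups points in fours, exploits the decoupling of the rotated grid, and was run in parallel under PVM, none of which is reproduced here; the grid size and right-hand side are made up.

```python
# Sketch: Jacobi-style iteration on the rotated (cross) five-point stencil
# for u_xx + u_yy = f on the unit square with zero Dirichlet boundaries.
# Toy problem; the paper's four-point EDG grouping and PVM code are not shown.
import numpy as np

n = 33                      # grid points per side (hypothetical size)
h = 1.0 / (n - 1)
f = np.ones((n, n))         # made-up right-hand side
u = np.zeros((n, n))        # boundary values stay zero

for _ in range(2000):
    u_new = u.copy()
    # Rotated stencil: each interior point couples only to its four diagonal
    # neighbours, which lie a distance h*sqrt(2) away.
    u_new[1:-1, 1:-1] = (u[2:, 2:] + u[2:, :-2] + u[:-2, 2:] + u[:-2, :-2]
                         - 2 * h * h * f[1:-1, 1:-1]) / 4.0
    u = u_new

print("centre value:", u[n // 2, n // 2])
```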
{"title":"Fast and scalable parallel matrix computations with reconfigurable pipelined optical buses","authors":"Keqin Li","doi":"10.1080/10637190410001700604","DOIUrl":"https://doi.org/10.1080/10637190410001700604","url":null,"abstract":"We present fast and highly scalable parallel computations for a number of important and fundamental matrix problems on linear arrays with reconfigurable pipelined optical bus systems. These problems include computing the powers, the inverse, the characteristic polynomial, the determinant, the rank and an LU- and a QR-factorization of a matrix; multiplying a chain of matrices; and solving linear systems of equations. These computations are based on efficient implementation of the fastest sequential matrix multiplication algorithm, and are highly scalable over a wide range of system size. Such fast and scalable parallel matrix computations were not seen before on distributed memory parallel computing systems.","PeriodicalId":406098,"journal":{"name":"Parallel Algorithms and Applications","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121591175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FPGA implementation of a Cholesky algorithm for a shared-memory multiprocessor architecture","authors":"Satchidanand G. Haridas, Sotirios G. Ziavras","doi":"10.1080/10637190412331279957","DOIUrl":"https://doi.org/10.1080/10637190412331279957","url":null,"abstract":"Solving a system of linear equations is a key problem in engineering and science. Matrix factorization is a key component of many methods used to solve such equations. However, the factorization process is very time consuming, so these problems have often been targeted for parallel machines rather than sequential ones. Nevertheless, commercially available supercomputers are expensive and only large institutions have the resources to purchase them. Hence, efforts are on to develop moreaffordable alternatives. In this paper, we propose such an approach. We present an implementation of a parallel version of the Cholesky matrix factorization algorithm on a single-chip multiprocessor built inside an APEX20K series Field-Programmable Gate Array (FPGA) developed by Altera. Our multiprocessor system uses an asymmetric, shared-memoryMIMD architecture and was built using the configurable Nios™ processor core which was also developed by Altera. Our system was developed using Altera's System-On-a-Programmable-Chip (SOPC) Quartus II development environment. Our Cholesky implementation is based on an algorithm described by George et al. [6]. This algorithm is scalable and uses a “queue of tasks” approach to ensure dynamic load-balancing among the processing elements. Our implementation assumes dense matrices in the input. We present performance results for uniprocessor and multiprocessor implementations. Our results show that the implementation of multiprocessors inside FPGAs can benefit matrix operations, such as matrix factorization. Further benefits result from good dynamic load-balancing techniques.","PeriodicalId":406098,"journal":{"name":"Parallel Algorithms and Applications","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126277372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application of MPI-IO in Parallel Particle Transport Monte-Carlo Simulation","authors":"Mo Ze-yao, Huang Zhengfeng","doi":"10.1080/10637190412331295166","DOIUrl":"https://doi.org/10.1080/10637190412331295166","url":null,"abstract":"Parallel computers are increasingly being used to run large-scale applications that also have huge input/output (I/O) requirements. However, many applications usually obtain poor I/O performance on parallel machines. In this paper, we will address the parallel I/O of a parallel particle transport Monte-Carlo simulation code (PTMC) on a parallel computer. This paper shows that, without careful treatments, the I/O overheads will ultimately dominate the elapsed simulation time. Fortunately, we have successfully designed the parallel MPI I/O methods for it. In particular, for a benchmark application MAP6 with 105 steps of 100,000 samples, we have elevated the speedup from 10 with 64 processors to 56 with 90 processors. Moreover, our method is scalable for a larger number of CPUs and a larger number of samples.","PeriodicalId":406098,"journal":{"name":"Parallel Algorithms and Applications","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126307270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Distributed Normalized Explicit Preconditioned Conjugate Gradient Method","authors":"G. Gravvanis, K. M. Giannoutakis, Nikolaos Missirlis","doi":"10.1080/10637190412331279975","DOIUrl":"https://doi.org/10.1080/10637190412331279975","url":null,"abstract":"A new parallel normalized explicit preconditioned conjugate gradient method in conjunction with normalized approximate inverse matrix techniques is presented for solving efficiently sparse linear systems on multi-computer systems. Application of the proposed method on a three dimensional boundary value problem is discussed and numerical results are given. The implementation and performance on a distributed, memory MIMD machine, using message passing interface (MPI) is also investigated. E-mail: nmis@di.uoa.gr","PeriodicalId":406098,"journal":{"name":"Parallel Algorithms and Applications","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122185320","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Journal of Parallel Algorithms and Applications: Special Issue on Parallel and Distributed Algorithms","authors":"G. Gravvanis, H. Arabnia","doi":"10.1080/10637190410001725445","DOIUrl":"https://doi.org/10.1080/10637190410001725445","url":null,"abstract":"The Journal of Parallel Algorithms and Applications publishes original quality research throughout various areas, including Parallel and Distributed Algorithms. The scope of the journal includes novel applications as well as fundamental contributions to the field. This Special Issue of The Journal of Parallel Algorithms and Applications contains selected articles presented at The International Multi-Conference in Computer Science and Computer Engineering; title of track: The 2003 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA 2003; June 23–26, 2003, Las Vegas, Nevada, USA). The main objective of the International Multi-Conference in Computer Science and Computer Engineering series is to create an international scientific forum for presentation and discussion of current research topics of Computer Science and Engineering. The six papers appearing in this special issue provide a variety and wealth of contributions and approaches in the field: In this special issue, Schimmler M., Schmidt B. and Lang H.W. present the design of a new bit-serial floating-point unit (FPU), which has been developed for the processors of the Instruction Systolic Array parallel computer model. The bit-serial approach requires a different data format. The proposed floating-point unit uses an IEEE compliant internal floating-point format that allows a fast least significant bit (LSB)-first arithmetic that can be efficiently implemented in hardware. Mohamed A.S. and Baydogan V.S. propose a broader generic application/language/ model independent multi-agent framework for dynamic load balancing. The framework is intended to handle varying levels of load changes in computations, I/O, and/or synchronization throughout the application run and it is an open-architecture that currently supports four multi-level parallel programming models. An open-architecture multi-agent load-balancing capability is proposed that currently makes use of a leading geometric partitioner engine at runtime. It has been shown that the framework is effective in monitoring, tuning, and rebalancing emerging computational, I/O and synchronization sources of load imbalance. Zafar B., Pinkston T.M., Bermudez A. and Duato J. discuss InfiniBand architecture which is a newly established general-purpose interconnect standard. A method for applying the Double Scheme over InfiniBand networks is proposed. The Double Scheme provides","PeriodicalId":406098,"journal":{"name":"Parallel Algorithms and Applications","volume":"85 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125718498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deadlock-free dynamic reconfiguration over InfiniBand™ NETWORKS","authors":"B. Zafar, T. Pinkston, Aurelio Bermúdez, J. Duato","doi":"10.1080/10637190410001725463","DOIUrl":"https://doi.org/10.1080/10637190410001725463","url":null,"abstract":"InfiniBand Architecture (IBA) is a newly established general-purpose interconnect standard applicable to local area, system area and storage area networking and I/O. Networks based on this standard should be capable of tolerating topological changes due to resource failures, link/switch activations, and/or hot swapping of components. In order to maintain connectivity, the network's routing function may need to be reconfigured on each topological change. Although the architecture has various mechanisms useful for configuring the network, no strategy or procedure is specified for ensuring deadlock freedom during dynamic network reconfiguration. In this paper, a method for applying the Double Scheme over InfiniBand networks is proposed. The Double Scheme provides a systematic way of reconfiguring a network dynamically while ensuring freedom from deadlocks. We show how features and mechanisms available in IBA for other purposes can also be used to implement dynamic network reconfiguration based on the Double Scheme. We also propose new mechanisms that may be considered in future versions of the IBA specification for making dynamic reconfiguration and other subnet management operations more efficient.","PeriodicalId":406098,"journal":{"name":"Parallel Algorithms and Applications","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124578584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A bit-serial floating-point unit for a massively parallel system on a chip","authors":"M. Schimmler, B. Schmidt, Hans-Werner Lang","doi":"10.1080/10637190410001725454","DOIUrl":"https://doi.org/10.1080/10637190410001725454","url":null,"abstract":"This paper presents the design of a new bit-serial floating-point unit (FPU). It has been developed for the processors of the instruction systolic array (ISA) parallel computer model. In contrast to conventional bit-parallel FPUs the bit-serial approach requires a different data format. Our FPU uses an IEEE compliant internal floating-point format that allows a fast least significant bit (LSB)-first arithmetic and can be efficiently implemented in hardware. Tel.:+49-431-880-4480. Fax:+49-431-880-4054masch@informatik.uni-kiel.de Tel.:+49-461-8051235. Fax:+49-461-8051527lang@fh-flensburg.de","PeriodicalId":406098,"journal":{"name":"Parallel Algorithms and Applications","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126568166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Locality-conscious load-balancer based on negotiations in dynamic unstructured mesh computations","authors":"A. Mohamed, Veysel S. Baydogan","doi":"10.1080/10637190412331279966","DOIUrl":"https://doi.org/10.1080/10637190412331279966","url":null,"abstract":"Recently hybrid/multi-level parallel programming models are gaining lots of momentum basically because they have proven to provide better scalability, speedup and utilization than any single parallel programming model alone. In such models, load balancing should not only mean balancing the computational loads (as it has always been perceived), but should also mean balancing I/O imbalance as well as synchronization imbalance. In this paper, we propose a broader generic application/language/model independent multi-agent framework for dynamic load balancing. It takes most of the load-balancing burden away from programmers. It is not a library but a runtime support system that is not hardwired to the parallel applications. The framework is intended to handle varying levels of load changes in computations, I/O and/or synchronization throughout the application run and it is an open-architecture that currently supports four multi-level parallel programming models. It has a clean interface to the application, runs in parallel and provides additional functionality such as determination of when to balance load and provide interface to end users. The proposed open-architecture multi-agent load-balancing capability currently makes use of a leading geometric partitioner engine (Chaco) at runtime. A mesh solver may initially create hundreds of lightweight threads, each handling a small submesh by calling Chaco partitioning engine in a pre-processing stage. This partitioner engine might be called again by these light-weight threads if a divide-and-conquer process is deemed necessary when the sub-domain (submesh) served by this thread grows out beyond certain threshold limits and thus creates an imbalance. In the proposed framework, the multi-agent is a set of SMP-based load balancers (agents) that do not have to share any data structure with the parallel application threads. They just monitor and collect system and application data frequently from the outside of the multi-threaded parallel application solver and send adjustments and negotiation plans to the SMP-load balancers and the application threads whenever a need for load balancing arises. The proposed framework has been deployed in four hybrid/multi-level parallel programming models and its capabilities of issuing corrective actions against emerging imbalances were tested in the context of an adaptive mesh refinement application. Experimental results show that the framework is effective in monitoring, tuning and rebalancing emerging computational, I/O and synchronization sources of load imbalance.","PeriodicalId":406098,"journal":{"name":"Parallel Algorithms and Applications","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131512258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}