Parallel Algorithms and Applications: Latest Articles

Defects in parallel Monte Carlo and quasi-Monte Carlo integration using the leap-frog technique
Parallel Algorithms and Applications Pub Date : 2003-05-01 DOI: 10.1080/1063719031000088021
K. Entacher, Thomas Schell, W. C. Schmid, A. Uhl
{"title":"Defects in parallel Monte Carlo and quasi-Monte Carlo integration using the leap-frog technique","authors":"K. Entacher, Thomas Schell, W. C. Schmid, A. Uhl","doi":"10.1080/1063719031000088021","DOIUrl":"https://doi.org/10.1080/1063719031000088021","url":null,"abstract":"Currently, the most efficient numerical techniques for evaluating high-dimensional integrals are based on Monte Carlo and quasi-Monte Carlo techniques. These tasks require a significant amount of computation and are therefore often executed on parallel computer systems. In order to keep the communication amount within a parallel system to a minimum, each processing element (PE) requires its own source of integration nodes. Therefore, techniques for using separately initialized and disjoint portions of a given point set on a single PE are classically employed. Using the so-called substreams may lead to dramatic errors in the results under certain circumstances. In this work, we compare the possible defects employing leaped quasi-Monte Carlo and Monte Carlo substreams. Apart from comparing the magnitude of the observed integration errors we give an overview under which circumstances (i.e. parallel programming models) such errors can occur.","PeriodicalId":406098,"journal":{"name":"Parallel Algorithms and Applications","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133254130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
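The leap-frog splitting described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' code: the base-2 van der Corput sequence stands in for a generic quasi-Monte Carlo point set, and the names `van_der_corput` and `leapfrog_substream` are hypothetical. Each processing element `pe` out of `num_pes` takes the points with indices `pe`, `pe + num_pes`, `pe + 2*num_pes`, and so on.

```python
from itertools import count, islice


def van_der_corput(i, base=2):
    """Return the i-th van der Corput point in [0, 1) (radical inverse of i)."""
    x, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        x += digit / denom
    return x


def leapfrog_substream(pe, num_pes, length):
    """First `length` points of the leaped substream owned by element `pe`.

    Assumption: PE `pe` owns indices pe, pe + num_pes, pe + 2*num_pes, ...
    """
    indices = islice(count(pe, num_pes), length)
    return [van_der_corput(i) for i in indices]


if __name__ == "__main__":
    for pe in range(2):
        print(pe, leapfrog_substream(pe, 2, 4))
```

With two PEs and base 2, the substream of PE 0 uses only even indices, whose radical inverse is always below 0.5, so that PE never samples the upper half of the interval. This is one simple way a leaped substream, used on its own, can give a badly wrong integral estimate.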
Special Issue: A systolic block-Jacobi SVD solver for processor meshes
Parallel Algorithms and Applications Pub Date : 2003-05-01 DOI: 10.1080/1063719031000088003
G. Okša, M. Vajtersic
{"title":"Special Issue: A systolic block-Jacobi SVD solver for processor meshes","authors":"G. Okša, M. Vajtersic","doi":"10.1080/1063719031000088003","DOIUrl":"https://doi.org/10.1080/1063719031000088003","url":null,"abstract":"We design the systolic version of the two-sided block-Jacobi algorithm for the singular value decomposition (SVD) of matrix A ∈ R^{m×n}, and m, n even. The algorithm involves the class CO of parallel orderings on the two-dimensional toroidal mesh with p processors. The mathematical background is based on the QR decomposition (QRD) of local data matrices and on the triangular Kogbetliantz algorithm (TKA) for local SVDs in the diagonal mesh processors. Subsequent updates of local matrices in the diagonal as well as nondiagonal mesh processors are required. We show that all updates can be realized by orthogonal modified Givens rotations. These rotations can be efficiently pipelined in parallel in the horizontal and vertical rings of processors through the toroidal mesh. Our solution requires, per one mesh processor, systolic processing elements (PEs) and additional delay elements. The time complexity can be estimated as where w is the number of global sweeps in the two-sided block-Jacobi algorithm and Δ is the length of the global synchronization time step. The VLSI area per mesh processor, measured by the number of vertical and horizontal wires required for its construction, can be estimated as and the combined VLSI area–time complexity per mesh processor is The theoretical speedup can be estimated as Using the mesh processors of fixed inner size , even, it is possible to construct the square two-dimensional toroidal mesh and to compute the SVD of matrix A, the size of the which matches the shape of mesh processors, i.e. In this sense, the systolic algorithm is scalable.","PeriodicalId":406098,"journal":{"name":"Parallel Algorithms and Applications","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125612576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
LEFTMOST EIGENVALUE OF REAL AND COMPLEX SPARSE MATRICES ON PARALLEL COMPUTER USING APPROXIMATE INVERSE PRECONDITIONING
Parallel Algorithms and Applications Pub Date : 2002-01-01 DOI: 10.1080/10637190208941433
G. Pini
{"title":"LEFTMOST EIGENVALUE OF REAL AND COMPLEX SPARSE MATRICES ON PARALLEL COMPUTER USING APPROXIMATE INVERSE PRECONDITIONING","authors":"G. Pini","doi":"10.1080/10637190208941433","DOIUrl":"https://doi.org/10.1080/10637190208941433","url":null,"abstract":"An efficient parallel approach for the computation of the eigenvalue of smallest absolute magnitude of sparse real and complex matrices is provided. The proposed strategy tries to improve the efficiency of the reverse power method. At each inverse power iteration the linear system is solved either by the conjugate gradient scheme (symmetric case) or by the Bi-CGSTAB method (nonsymmetric case). Both solvers are preconditioned employing the approximate inverse factorization and thus are easily parallelized. The satisfactory speed-ups obtained on the CRAY T3E supercomputer show the high degree of parallelization reached by the proposed algorithm.","PeriodicalId":406098,"journal":{"name":"Parallel Algorithms and Applications","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127201542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
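The inverse power iteration mentioned in the abstract above can be sketched for the symmetric case as follows. This is a serial, pure-Python illustration under stated assumptions, not the paper's implementation: the conjugate gradient solver here is unpreconditioned (the approximate inverse preconditioner is omitted), the matrix is stored densely, and the function names are hypothetical.

```python
import math


def matvec(a, v):
    """Dense matrix-vector product."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def conjugate_gradient(a, b, tol=1e-12, max_iter=200):
    """Solve a x = b for a symmetric positive definite matrix a."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - a x for the zero initial guess
    p = list(r)          # first search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if math.sqrt(rs) <= tol:
            break
        ap = matvec(a, p)
        alpha = rs / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


def smallest_eigenvalue(a, iters=100):
    """Inverse power iteration: repeatedly solve a y = x, then normalize y."""
    n = len(a)
    x = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        y = conjugate_gradient(a, x)
        norm = math.sqrt(dot(y, y))
        x = [yi / norm for yi in y]
    return dot(x, matvec(a, x))  # Rayleigh quotient of the unit vector x


if __name__ == "__main__":
    tridiag = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
    print(smallest_eigenvalue(tridiag))
```

For the 3×3 test matrix in the usage example, the exact smallest eigenvalue is 2 − √2, which gives a convenient check on the iteration.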
AN O(LOG P) PARALLEL IMPLEMENTATION OF FEEDBACK GUIDED DYNAMIC LOOP SCHEDULING
Parallel Algorithms and Applications Pub Date : 2002-01-01 DOI: 10.1080/10637190208941438
T. Tabirca, Len Freeman, S. Tabirca
{"title":"AN O(LOG P) PARALLEL IMPLEMENTATION OF FEEDBACK GUIDED DYNAMIC LOOP SCHEDULING","authors":"T. Tabirca, Len Freeman, S. Tabirca","doi":"10.1080/10637190208941438","DOIUrl":"https://doi.org/10.1080/10637190208941438","url":null,"abstract":"Feedback Guided Dynamic Loop Scheduling (FGDLS) is a recently proposed dynamic algorithm for loop scheduling. The original algorithm required an O(p) serial computation at each stage to compute the updated loop schedule. In this paper, it is shown that this computation can be implemented in O(log p) operations on p processors.","PeriodicalId":406098,"journal":{"name":"Parallel Algorithms and Applications","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130489061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
NUMERICAL SOLUTION OF DISCRETE STABLE LINEAR MATRIX EQUATIONS ON MULTICOMPUTERS
Parallel Algorithms and Applications Pub Date : 2002-01-01 DOI: 10.1080/10637190208941436
P. Benner, E. S. Quintana‐Ortí, G. Quintana-Ortí
{"title":"NUMERICAL SOLUTION OF DISCRETE STABLE LINEAR MATRIX EQUATIONS ON MULTICOMPUTERS","authors":"P. Benner, E. S. Quintana‐Ortí, G. Quintana-Ortí","doi":"10.1080/10637190208941436","DOIUrl":"https://doi.org/10.1080/10637190208941436","url":null,"abstract":"We investigate the parallel performance of numerical algorithms for solving discrete Sylvester and Stein equations as they appear for instance in discrete-time control problems, filtering, and image restoration. The methods used here are the squared Smith iteration and the sign function method on a Cayley transformation of the original equation. For Stein equations with semidefinite right-hand side these methods are modified such that the Cholesky factor of the solution can be computed directly without forming the solution matrix explicitly. We report experimental results of these algorithms on distributed-memory multicomputers","PeriodicalId":406098,"journal":{"name":"Parallel Algorithms and Applications","volume":"464 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116185894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 71
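The squared Smith iteration named in the abstract above can be sketched as follows. This is a serial illustration under stated assumptions, not the authors' parallel code: the Stein equation is taken in the form X = A X Aᵀ + Q with all eigenvalues of A inside the unit circle, matrices are plain nested lists, and the helper names are hypothetical. The iteration starts from X₀ = Q and A₀ = A, then repeats X ← X + Aₖ X Aₖᵀ and Aₖ₊₁ = Aₖ², so the number of accumulated series terms doubles at every step.

```python
def matmul(a, b):
    """Product of two dense matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def max_abs(a):
    return max(abs(x) for row in a for x in row)


def squared_smith(a, q, tol=1e-14, max_iter=60):
    """Approximate the solution X of X = A X A^T + Q (assumes A is stable)."""
    x, ak = q, a
    for _ in range(max_iter):
        update = matmul(matmul(ak, x), transpose(ak))
        x = add(x, update)
        ak = matmul(ak, ak)  # squaring A_k doubles the number of series terms
        if max_abs(update) <= tol * max(1.0, max_abs(x)):
            break
    return x


if __name__ == "__main__":
    a = [[0.5, 0.1], [0.0, 0.3]]
    q = [[1.0, 0.0], [0.0, 1.0]]
    print(squared_smith(a, q))
```

In the scalar case with A = 0.5 and Q = 1 the exact solution is 1 / (1 − 0.25) = 4/3, which the sketch reproduces after a handful of steps.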
PORTING REGULAR APPLICATIONS ON HETEROGENEOUS WORKSTATION NETWORKS: PERFORMANCE ANALYSIS AND MODELING
Parallel Algorithms and Applications Pub Date : 2002-01-01 DOI: 10.1080/01495730108941441
A. Clematis, A. Corana
{"title":"PORTING REGULAR APPLICATIONS ON HETEROGENEOUS WORKSTATION NETWORKS: PERFORMANCE ANALYSIS AND MODELING","authors":"A. Clematis, A. Corana","doi":"10.1080/01495730108941441","DOIUrl":"https://doi.org/10.1080/01495730108941441","url":null,"abstract":"Heterogeneous networks of workstations and/or personal computers (NOW) are increasingly used as a powerful platform for the execution of parallel applications. When applications previously developed for traditional parallel machines (homogeneous and dedicated) are ported to NOWs, performance worsens owing in part to less efficient communications but more often to unbalancing. In this paper, we address the problem of the efficient porting to heterogeneous NOWs of data-parallel applications originally developed using the SPMD paradigm for homogeneous parallel systems with regular topology like ring. To achieve good performance, the computation time on the various machines composing the NOW must be as balanced as possible. This can be obtained in two ways: by using a heterogeneous data partition strategy with a single process per node, or by splitting homogeneously data among processes and assigning to each node a number of processes proportional to its computing power. The first method is however more difficult, since some modifications in the code are always needed, whereas the second approach requires very few changes. We carry out a simplified but reliable analysis, and propose a simple model able to simulate performance in the various situations. Two test cases, matrix multiplication and computation of long-range interactions, are considered, obtaining a good agreement between simulated and experimental results. Our analysis shows that an efficient porting of regular homogeneous data-parallel applications on heterogeneous NOWs is possible. Particularly, the approach based on multiple processes per node turns out to be a straightforward and effective way for achieving very satisfying performance in almost all situations, even dealing with highly heterogeneous systems.","PeriodicalId":406098,"journal":{"name":"Parallel Algorithms and Applications","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127652873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
DERIVING A FAST SYSTOLIC ALGORITHM FOR THE LONGEST COMMON SUBSEQUENCE PROBLEM
Parallel Algorithms and Applications Pub Date : 2002-01-01 DOI: 10.1080/10637190208941431
Yen-Chun Lin, J. Yeh
{"title":"DERIVING A FAST SYSTOLIC ALGORITHM FOR THE LONGEST COMMON SUBSEQUENCE PROBLEM","authors":"Yen-Chun Lin, J. Yeh","doi":"10.1080/10637190208941431","DOIUrl":"https://doi.org/10.1080/10637190208941431","url":null,"abstract":"The longest common subsequence (LCS) problem is to find an LCS of two given sequences and the length of the LCS. In this paper, an efficient systolic algorithm for the LCS problem is derived. For two sequences of length m and n, where m ≥ n, the problem can be solved with only [n/2] processors in m + 2[n/2] − 1 time steps. Compared with other systolic algorithms that solve the LCS problem, our algorithm not only takes fewer time steps but also uses fewer processors. Our algorithm is better suited to implementation on multicomputers than other systolic algorithms.","PeriodicalId":406098,"journal":{"name":"Parallel Algorithms and Applications","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127810884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
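For reference, the LCS length that the systolic algorithm above computes can be obtained sequentially with the textbook dynamic program. The sketch below is that sequential baseline only, not the systolic algorithm of the paper; the function name `lcs_length` is hypothetical.

```python
def lcs_length(a, b):
    """Length of a longest common subsequence of sequences a and b."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer sequence, keep rows of length n + 1
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)   # extend the match on the diagonal
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(lcs_length("ABCBDAB", "BDCABA"))
```

The sequential version takes on the order of m·n steps for sequences of length m and n, which is the cost the systolic array spreads over its processors.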
A PARALLEL DIVIDE AND CONQUER ALGORITHM FOR NON SYMMETRIC TRIDIAGONAL TOEPLITZ SYSTEMS USING CONJUGATE GRADIENT
Parallel Algorithms and Applications Pub Date : 2002-01-01 DOI: 10.1080/01495730208941443
L. Garey, R. E. Shaw, J. Zhang
{"title":"A PARALLEL DIVIDE AND CONQUER ALGORITHM FOR NON SYMMETRIC TRIDIAGONAL TOEPLITZ SYSTEMS USING CONJUGATE GRADIENT","authors":"L. Garey, R. E. Shaw, J. Zhang","doi":"10.1080/01495730208941443","DOIUrl":"https://doi.org/10.1080/01495730208941443","url":null,"abstract":"In this paper, we consider the application of the conjugate gradient method specifically to solve non symmetric systems which are large, tridiagonal and Toeplitz. Under the condition that the system is diagonally dominant, one can pre-multiply the system by the transpose of the coefficient matrix and take advantage of the structure of the new coefficient matrix to perturb and factor it. This allows us to divide the task of solution containing pairs of tridiagonal, symmetric and Toeplitz systems and to solve the pairs of systems using a parallel implementation of conjugate gradient. Final corrections, to account for the perturbations, provide a numerical approximation to the solution.","PeriodicalId":406098,"journal":{"name":"Parallel Algorithms and Applications","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116687927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
THE LOAD DISTRIBUTION PROBLEM IN A PROCESSOR RING
Parallel Algorithms and Applications Pub Date : 2002-01-01 DOI: 10.1080/01495730108941440
F. Lau
{"title":"THE LOAD DISTRIBUTION PROBLEM IN A PROCESSOR RING","authors":"F. Lau","doi":"10.1080/01495730108941440","DOIUrl":"https://doi.org/10.1080/01495730108941440","url":null,"abstract":"Given a global picture of the system load and the average load, the load distribution problem is to find a suitable schedule, consisting of the amount of excess load to transfer along every edge, so that the system load can be balanced in minimal time by executing the schedule. We study this problem for the ring topology. We discuss some existing algorithms, show how they fall short of being able to generate optimal schedules, and present a simple algorithm that would generate an optimal schedule for any given system load instance. This simple algorithm relies on an existing algorithm to create a search window in which the optimal solution is to be found.","PeriodicalId":406098,"journal":{"name":"Parallel Algorithms and Applications","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126049067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
ON MAX CUT IN CUBIC GRAPHS
Parallel Algorithms and Applications Pub Date : 2002-01-01 DOI: 10.1080/01495730108941439
T. Calamoneri, Irene Finocchi, Y. Manoussakis, R. Petreschi
{"title":"ON MAX CUT IN CUBIC GRAPHS","authors":"T. Calamoneri, Irene Finocchi, Y. Manoussakis, R. Petreschi","doi":"10.1080/01495730108941439","DOIUrl":"https://doi.org/10.1080/01495730108941439","url":null,"abstract":"This paper is concerned with the maximum cut problem in parallel on cubic graphs. New theoretical results characterizing the cardinality of the cut are presented. These results make it possible to design a simple combinatorial O(log n) time parallel algorithm, running on a CRCW P-RAM with O(n) processors. The approximation ratio achieved by the algorithm is 1.3 and improves the best known parallel approximation ratio, i.e. 2, in the special class of cubic graphs. The algorithm also guarantees that the size of the returned cut is at least ((9g − 3)/8g)n, where g is the odd girth of the input graph. Experimental results round off the paper, showing that the solutions obtained in practice are likely to be much better than the theoretical lower bound.","PeriodicalId":406098,"journal":{"name":"Parallel Algorithms and Applications","volume":"311 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116805085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1