{"title":"Performance of level 3 BLAS kernels in a dynamically partitioned data-flow environment","authors":"P. Berger , S. Gruszka , I. Gottlieb , Y. Singer","doi":"10.1016/0956-0521(95)00050-X","DOIUrl":"10.1016/0956-0521(95)00050-X","url":null,"abstract":"<div><p>The Dynamically Partitioned Data-Flow (DPDF) model is based on an original concept for analyzing the data dependency graph at the instruction level. Instead of a breadth-first analysis, as in a classical Data-Flow Model, we execute instructions along data-dependent paths. As a consequence, data locality can be exploited by reusing results between the execution of consecutive instructions. In addition, the different paths are not statically defined but arise from a dynamic partitioning of the graph. This model has the advantage of supporting very low-cost dynamic scheduling and multitasking strategies. In order to study the efficiency of this new model, a first architecture has been defined. This architecture is currently limited to a single processor with one serial processing unit but four graph analyzing units (called prefetch units). Each of these prefetch units is able to dynamically build its own execution path inside the Data-Flow graph of an application. The efficiency of this architecture is studied on a numerical benchmark composed of a subset of the Livermore loops and of three routines of the Level 3 BLAS (GEMM, SYRK and TRSM). Our goal in these experiments is to demonstrate the ability of the four prefetch units to feed the ALU.</p></div>","PeriodicalId":100325,"journal":{"name":"Computing Systems in Engineering","volume":"6 4","pages":"Pages 357-361"},"PeriodicalIF":0.0,"publicationDate":"1995-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/0956-0521(95)00050-X","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77347200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An experimental evaluation of a peer-model monitoring system for the support of a parallel processing environment","authors":"José Magalhães Cruz , João Falcão e Cunha","doi":"10.1016/0956-0521(95)00043-7","DOIUrl":"10.1016/0956-0521(95)00043-7","url":null,"abstract":"<div><p>The process of monitoring the machines or computing nodes in a network, and of monitoring the communication traffic between them, is very important to efficiently launch and execute parallel coarse-grained applications or even classical (serial-type) applications, taking advantage of machines in the network that are not heavily used. An experimental software system, named MONSYS, that is capable of monitoring the machines in a network, is presented. MONSYS can be used in the support of an Application Manager system capable of distributing parallel tasks (or classical programs) over the machines in a local area network with the objective of achieving load balancing. It can also be used as a tool in the administration of networks. MONSYS exhibits a highly decentralized and fault tolerant architecture based on the Peer-Model, which, together with its information diffusion algorithms, constitutes its prime novelty. A set of experiments that investigate the performance and scalability of a prototype of MONSYS is presented and discussed. The experiments reported show that MONSYS offers a reasonably accurate picture of the internal state of the machines monitored, without being a burden to the network communication channels or to the machines themselves. In fact, the quantitative results obtained indicate that MONSYS can be several times more performant than an equivalent system using a multicast communication scheme for the exchange of machine state information.</p></div>","PeriodicalId":100325,"journal":{"name":"Computing Systems in Engineering","volume":"6 4","pages":"Pages 331-343"},"PeriodicalIF":0.0,"publicationDate":"1995-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/0956-0521(95)00043-7","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79028986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Porting industrial codes and developing sparse linear solvers on parallel computers","authors":"Michel J. Dayde , Iain S. Duff","doi":"10.1016/0956-0521(95)00033-X","DOIUrl":"10.1016/0956-0521(95)00033-X","url":null,"abstract":"<div><p>We address the main issues when porting existing codes from serial to parallel computers and when developing portable parallel software on MIMD multiprocessors (shared memory, virtual shared memory, and distributed memory multiprocessors, and networks of computers). We discuss the use of numerical libraries as a way of developing portable and efficient parallel code. We illustrate this by using examples from our experience in porting industrial codes and in designing parallel numerical libraries. We report in some detail on the parallelization of scientific applications coming from Centre National d'Etudes Spatiales and from Aérospatiale, and we illustrate how it is possible to develop portable and efficient numerical software by considering the parallel solution of sparse linear systems of equations.</p></div>","PeriodicalId":100325,"journal":{"name":"Computing Systems in Engineering","volume":"6 4","pages":"Pages 295-305"},"PeriodicalIF":0.0,"publicationDate":"1995-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/0956-0521(95)00033-X","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87689931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A parallel approach to the analytic hierarchy process decision support tool","authors":"Luis Dias , João Paulo Costa , João Namorado Clímaco","doi":"10.1016/0956-0521(95)00045-3","DOIUrl":"10.1016/0956-0521(95)00045-3","url":null,"abstract":"<div><p>The Multiple Criteria Decision Aiding methods dedicated to discrete problems follow different philosophies and strategies for selecting, clustering or ranking alternatives. This work presents a tool using one such method—the Analytic Hierarchy Process (AHP). The Decision Maker (DM) can structure his criteria as a hierarchy tree having the alternatives as leaf nodes. The DM must then build matrices for each node by performing pairwise comparisons between its children. The AHP finds the weights of each child concerning the parent criterion by calculating the elements of the eigenvector corresponding to the maximum eigenvalue of the comparison matrix. Weights are then combined in order to obtain the influence of each alternative on the top of the hierarchy. A DM expects that a Decision Support Tool works faster than he/she does. In order to achieve speed a parallel approach was developed. Parallel implementations described in this work follow different message-passing strategies and capitalise on the fact that the vector of weights for each matrix can be calculated independently. The authors used a network of four Inmos Transputers. 
Research will focus on finding which implementation will run faster and how the DM's options affect the speedups obtainable.</p></div>","PeriodicalId":100325,"journal":{"name":"Computing Systems in Engineering","volume":"6 4","pages":"Pages 431-436"},"PeriodicalIF":0.0,"publicationDate":"1995-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/0956-0521(95)00045-3","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83075758","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel performance of Jacobi eigenvalue solution","authors":"Makan Pourzand , Bernard Tourancheau","doi":"10.1016/0956-0521(95)00035-6","DOIUrl":"10.1016/0956-0521(95)00035-6","url":null,"abstract":"<div><p>In this paper we focus on Jacobi like resolution of the eigenproblem for a real symmetric matrix from a parallel performance point of view: we try to optimize the algorithm working on the communication intensive part of the code. We discuss several parallel implementations and propose an implementation which overlaps the communications by the computations to reach a better efficiency. We show that the overlapping implementation can lead to significant improvements. We conclude by presenting our future work.</p></div>","PeriodicalId":100325,"journal":{"name":"Computing Systems in Engineering","volume":"6 4","pages":"Pages 377-383"},"PeriodicalIF":0.0,"publicationDate":"1995-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/0956-0521(95)00035-6","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77171290","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Computer-aided multiscale modeling tools for composite materials and structures","authors":"V. Belsky, M.W. Beall, J. Fish, M.S. Shephard, S. Gomaa","doi":"10.1016/0956-0521(95)00019-V","DOIUrl":"10.1016/0956-0521(95)00019-V","url":null,"abstract":"<div><p>This paper presents recent research efforts at Rensselaer Polytechnic Institute aimed at developing computer-aided multiscale modeling tools for composite materials and structures aimed at predicting the <em>macromechanical</em> (overall) structural response, such as critical deformation, vibration and buckling modes, as well as various failure modes on the <em>mesomechanical</em> (lamina) level, such as delamination and ply buckling, and on the micromechanical (the scale of microconstituents) level, such as debonding, microbuckling, etc.</p><p>The building blocks of this technology are (i) idealization error estimators aimed at quantifying the quality of the numerical and mathematical models of composites, (ii) multigrid technology aimed at superconvergent solution of the multiscale computational models, (iii) mathematical homogenization theory aimed at constructing inter-scale transfer operators for rapid and reliable information flow between the scales, (iv) system identification for <em>in situ</em> characterization of the phases and their interface, and (v) multiscale model construction and visualization.</p></div>","PeriodicalId":100325,"journal":{"name":"Computing Systems in Engineering","volume":"6 3","pages":"Pages 213-223"},"PeriodicalIF":0.0,"publicationDate":"1995-06-01","publicationTypes":"Journal 
Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/0956-0521(95)00019-V","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75426581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Iterative and direct solvers for interface problems with Lagrange multipliers","authors":"J. Fish, V. Belsky, M. Pandheeradi","doi":"10.1016/0956-0521(95)00017-T","DOIUrl":"10.1016/0956-0521(95)00017-T","url":null,"abstract":"<div><p>Special purpose iterative and direct solvers are proposed for solving nonpositive definite symmetric linear systems arising from the three-field hybrid variational principle, which enforces compatibility between independently modeled substructures in the weak sense. The basic idea of the proposed family of solvers is to transform a nonpositive definite linear system into an equivalent positive definite system, which can then be solved by either the iterative or direct method without pivoting. The two-level iterative approach proposed in this paper consists of resolving the lower frequency response of the source problem by means of the equivalent positive definite collocation problem and capturing the higher frequency response using the projected conjugate gradient method. Numerical experiments in 2-D and shells indicate that the total CPU time of the two-level iterative process is less than 20% higher than that of a corresponding collocation problem, but the benefit from increased accuracy and modeling flexibility clearly overshadows this increased cost.</p></div>","PeriodicalId":100325,"journal":{"name":"Computing Systems in Engineering","volume":"6 3","pages":"Pages 261-273"},"PeriodicalIF":0.0,"publicationDate":"1995-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/0956-0521(95)00017-T","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80122011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The concurrent element level processing for nonlinear dynamic analysis on a massively parallel computer","authors":"Sang Y. Synn, Robert E. Fulton","doi":"10.1016/0956-0521(95)00022-R","DOIUrl":"10.1016/0956-0521(95)00022-R","url":null,"abstract":"<div><p>The goal of this paper is to explore parallel methodologies with the desired flexibility, generality and accuracy for nonlinear dynamic finite element analysis on a massively parallel computer. This paper tests the generality of the concurrent element processing approach and proposes a basic software design strategy to fully take advantage of features available in massively parallel computers having a hierarchical ring architecture. As a testbed, a large scale general purpose code, DYNA3D, was used and modified as appropriate to test proposed parallel design concepts on a KSR1 parallel computer.</p></div>","PeriodicalId":100325,"journal":{"name":"Computing Systems in Engineering","volume":"6 3","pages":"Pages 285-293"},"PeriodicalIF":0.0,"publicationDate":"1995-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/0956-0521(95)00022-R","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76891902","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The performance prediction of a parallel skyline solver and its implementation for large scale structure analysis","authors":"Sang Y. Synn, Robert E. Fulton","doi":"10.1016/0956-0521(95)00021-Q","DOIUrl":"10.1016/0956-0521(95)00021-Q","url":null,"abstract":"<div><p>In this paper, we propose simplified formulas to predict the time complexity of a parallel skyline solver using two different memory schemes (Global Shared, Shared/Local memory) on two machines (BBN, KSR1). Numerical operation counts and data communication costs are considered for the formulas. Based on these formulas, we developed a processor mapping algorithm to cover the initial computation stage and the recomputation stage of substructure analysis, and also compared the performance of the parallel global approach and the parallel substructure approach for two practical structure models.</p></div>","PeriodicalId":100325,"journal":{"name":"Computing Systems in Engineering","volume":"6 3","pages":"Pages 275-284"},"PeriodicalIF":0.0,"publicationDate":"1995-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/0956-0521(95)00021-Q","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87436945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance of iterative methods in ANSYS on Cray parallel/vector supercomputers","authors":"Eugene L. Poole , Michael A. Heroux , Pravin Vaidya , Anil Joshi","doi":"10.1016/0956-0521(95)00016-S","DOIUrl":"10.1016/0956-0521(95)00016-S","url":null,"abstract":"<div><p>This paper describes recent work using iterative methods for the solution of linear systems in the ANSYS program. The ANSYS program, a general purpose finite element code widely used in structural analysis applications, has now added an iterative solver option. The development of robust iterative solvers and their use in commercial programs is discussed. Discussion of the applicability of iterative solvers as a general purpose solver will include the topics of robustness, memory requirements and CPU performance. A new iterative solver for general purpose finite element codes which functions as a “black-box” solver using element-specific information and the underlying problem physics to construct an effective and inexpensive preconditioner is described. Some results are given from realistic examples comparing the performance of the iterative solver implemented in ANSYS with the traditional parallel/vector frontal solver used in ANSYS and a robust shifted incomplete Choleski iterative solver.</p></div>","PeriodicalId":100325,"journal":{"name":"Computing Systems in Engineering","volume":"6 3","pages":"Pages 251-259"},"PeriodicalIF":0.0,"publicationDate":"1995-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/0956-0521(95)00016-S","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72721364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}