Scientific Computing with Multicore and Accelerators: Latest Articles

Hardware-Oriented Multigrid Finite Element Solvers on GPU-Accelerated Clusters

Scientific Computing with Multicore and Accelerators Pub Date : 2010-12-07 DOI: 10.1201/B10376-17

S. Turek, Dominik Göddeke, S. Buijssen, Hilmar Wobker

Abstract: The accurate simulation of real-world phenomena in computational science is often based on an underlying mathematical model comprising a system of partial differential equations (PDEs). Important research fields that we pursue in this setting are computational solid mechanics and computational fluid dynamics (CSM and CFD, see Section 3). Practical applications range from material failure tests, for instance crash tests in the automotive industry, to fluid and gas flow of any kind, for instance in chemical or medical engineering (e.g., simulation of blood flow in the human body to predict aneurysms) or flow around cars and aircraft to minimize drag and lift forces. Moreover, the coupling of both models is essential for fluid-structure interaction (FSI) settings, which represent problem fields of very high technological importance. Such configurations include polymer processing or microfluidic problems exhibiting very complex multiscale behavior due to nonlinear rheological or non-isothermal constitutive laws, and also due to self-induced oscillations of the structural parts in the flow field. In all these cases, the fluid part is mostly laminar, but highly viscous.

Citations: 8

Efficient Parallel Scan Algorithms for Manycore GPUs

Scientific Computing with Multicore and Accelerators Pub Date : 2010-12-07 DOI: 10.1201/B10376-29

S. Sengupta, Mark J. Harris, M. Garland, John Douglas Owens

Citations: 35

Auto-Tuning Stencil Computations on Multicore and Accelerators

Scientific Computing with Multicore and Accelerators Pub Date : 2010-12-07 DOI: 10.1201/B10376-18

K. Datta, Samuel Williams, V. Volkov, J. Carter, L. Oliker, J. Shalf, K. Yelick

Editor(s): Kurzak, J; Bader, D; Dongarra, J. © 2011 by Taylor and Francis Group, LLC.

Abstract: The recent transformation from an environment where gains in computational performance came from increasing clock frequency and other hardware engineering innovations, to an environment where gains are realized through the deployment of ever-increasing numbers of modest-performance cores, has profoundly changed the landscape of scientific application programming. This exponential increase in core count represents both an opportunity and a challenge: access to petascale simulation capabilities and beyond will require that this concurrency be efficiently exploited. The problem for application programmers is further compounded by the diversity of multicore architectures that are now emerging [4]. From relatively complex out-of-order CPUs with complex cache structures, to relatively simple cores that support hardware multithreading, to chips that require explicit use of software-controlled memory, designing optimal code for these different platforms represents a serious impediment. An emerging solution to this problem is auto-tuning: the automatic generation of many versions of a code kernel that incorporate various tuning strategies, and the benchmarking of these to select the highest-performing version. Typical tuning strategies might include: maximizing in-core performance with loop unrolling and restructuring; maximizing memory bandwidth by exploiting non-uniform memory access (NUMA) and engaging prefetch by directives; and minimizing memory traffic by cache blocking or array padding. Often a key parameter is associated with each tuning strategy (e.g., the amount of loop unrolling or the cache blocking factor), and these parameters must be explored in addition to the layering of the basic strategies themselves.
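
The generate, benchmark, and select loop described in the abstract above can be illustrated with a minimal Python sketch. It is not the chapter's implementation: the kernel `stencil_blocked`, the driver `autotune`, and the candidate block sizes are hypothetical names and values chosen for illustration, and a 1D three-point averaging stencil stands in for the real kernels. In pure Python the block size has little effect on cache behavior, so the timings show only the mechanics of the search, not representative performance.

```python
import time


def stencil_blocked(src, block):
    """Apply a 3-point averaging stencil to the interior of src,
    sweeping the interior in chunks of `block` points (hypothetical kernel)."""
    n = len(src)
    dst = list(src)  # boundary values are copied unchanged
    for start in range(1, n - 1, block):
        end = min(start + block, n - 1)
        for i in range(start, end):
            dst[i] = (src[i - 1] + src[i] + src[i + 1]) / 3.0
    return dst


def autotune(kernel, data, candidates, repeats=3):
    """Time kernel(data, c) for every candidate parameter c and return
    (fastest candidate, dict of best observed time per candidate)."""
    timings = {}
    for c in candidates:
        best = float("inf")
        for _ in range(repeats):
            t0 = time.perf_counter()
            kernel(data, c)
            best = min(best, time.perf_counter() - t0)
        timings[c] = best
    best_c = min(timings, key=timings.get)
    return best_c, timings


# Assumed problem size and search space, for illustration only.
data = [float(i % 7) for i in range(20000)]
candidates = [16, 64, 256, 1024]
best_block, timings = autotune(stencil_blocked, data, candidates)
print("selected block size:", best_block)
```

Every candidate computes the same result; only the traversal order differs, which is what makes the variants safe to compare purely on speed. A fuller search would also layer several strategies (for example unrolling and padding) rather than sweeping a single parameter.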

Citations: 8

Mixed-Precision GPU-Multigrid Solvers with Strong Smoothers

Scientific Computing with Multicore and Accelerators Pub Date : 2010-12-07 DOI: 10.1201/B10376-11

Dominik Göddeke, R. Strzodka

Citations: 8

GPU Algorithms for Molecular Modeling

Scientific Computing with Multicore and Accelerators Pub Date : 2010-12-07 DOI: 10.1201/B10376-32

J. Stone, David J. Hardy, B. Isralewitz, K. Schulten

Citations: 4

Drug Design on the Cell BE

Scientific Computing with Multicore and Accelerators Pub Date : 1900-01-01 DOI: 10.1201/B10376-24

Cecilia González-Alvarez, Harald Servat, Daniel Cabrera-Benitez, Xavier Aguilar, Carles Pons, J. Fernández-Recio, Daniel Jiménez-González

Citations: 0

Data Flow Frameworks for Emerging Heterogeneous Architectures and Their Application to Biomedicine

Scientific Computing with Multicore and Accelerators Pub Date : 1900-01-01 DOI: 10.1201/b10376-27

Ümit V. Çatalyürek, Renato Ferreira, Timothy D. R. Hartley, George Teodoro, R. S. Oliveira

Citations: 0

Combinatorial Algorithm Design on the Cell/B.E. Processor

Scientific Computing with Multicore and Accelerators Pub Date : 1900-01-01 DOI: 10.1201/b10376-16

David A. Bader, Virat Agarwal, Kamesh Madduri, F. Petrini

Citations: 0

Pairwise Computations on the Cell Processor

Scientific Computing with Multicore and Accelerators Pub Date : 1900-01-01 DOI: 10.1201/b10376-22

Abhinav Sarje, J. Zola, S. Aluru

Citations: 0

Implementing FFTs on Multicore Architectures

Scientific Computing with Multicore and Accelerators Pub Date : 1900-01-01 DOI: 10.1201/b10376-14

A. Chow, G. Fossum, Daniel A. Brokenshire

Citations: 0