2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC): latest papers, page 10

Mr. Scan: Extreme scale density-based clustering using a tree-based network of GPGPU nodes

2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2013-11-17 DOI: 10.1145/2503210.2503262

Benjamin Welton, Evan Samanas, B. Miller

Citations: 54

Performance evaluation of Intel® Transactional Synchronization Extensions for high-performance computing

2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2013-11-17 DOI: 10.1145/2503210.2503232

Richard M. Yoo, C. Hughes, K. Lai, Ravi Rajwar

Citations: 277

Precimonious: Tuning assistant for floating-point precision

2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2013-11-17 DOI: 10.1145/2503210.2503296

Cindy Rubio-González, Cuong Nguyen, Hong Diep Nguyen, J. Demmel, W. Kahan, Koushik Sen, D. Bailey, Costin Iancu, David G. Hough

Abstract: Given the variety of numerical errors that can occur, floating-point programs are difficult to write, test and debug. One common practice employed by developers without an advanced background in numerical analysis is using the highest available precision. While more robust, this can degrade program performance significantly. In this paper we present Precimonious, a dynamic program analysis tool to assist developers in tuning the precision of floating-point programs. Precimonious performs a search on the types of the floating-point program variables trying to lower their precision subject to accuracy constraints and performance goals. Our tool recommends a type instantiation that uses lower precision while producing an accurate enough answer without causing exceptions. We evaluate Precimonious on several widely used functions from the GNU Scientific Library, two NAS Parallel Benchmarks, and three other numerical programs. For most of the programs analyzed, Precimonious reduces precision, which results in performance improvements as high as 41%.

Citations: 285
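
The type search described in the Precimonious abstract above (lower each variable's precision as long as an accuracy constraint still holds) can be illustrated with a small sketch. This is not the tool's implementation: the abstract does not specify the search strategy, so a simple greedy one-variable-at-a-time pass is swapped in here. The names `to_single`, `harmonic_sum`, and `tune` are hypothetical, single precision is emulated by rounding through a 32-bit encoding, and only the accuracy constraint is checked (no performance measurement).

```python
import struct


def to_single(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def rnd(x: float, precision: str) -> float:
    """Store x in the given precision: 'single' rounds, 'double' keeps x."""
    return to_single(x) if precision == "single" else x


def harmonic_sum(types: dict, n: int = 10_000) -> float:
    """Toy program (an assumption for illustration): sum of 1/k, k = 1..n.

    `types` maps each tunable variable name to 'single' or 'double'.
    """
    acc = rnd(0.0, types["acc"])
    for k in range(1, n + 1):
        term = rnd(1.0 / k, types["term"])
        acc = rnd(acc + term, types["acc"])
    return acc


def tune(program, variables, tol: float) -> dict:
    """Greedy search: try lowering one variable at a time to single
    precision and keep the change only if the relative error against
    the all-double reference stays within `tol`."""
    types = {v: "double" for v in variables}
    reference = program(types)
    for v in variables:
        trial = dict(types)
        trial[v] = "single"
        err = abs(program(trial) - reference) / abs(reference)
        if err <= tol:
            types = trial  # accept the lower precision for v
    return types


if __name__ == "__main__":
    for tol in (1e-3, 1e-9):
        print(tol, tune(harmonic_sum, ["acc", "term"], tol))
```

Running the script prints the recommended type assignment for a loose and a tight tolerance. Because a change is accepted only when the measured error is within `tol`, the returned assignment always satisfies the accuracy constraint on the input it was tested with, which mirrors the dynamic (input-dependent) nature of the analysis described in the abstract.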

2HOT: An improved parallel hashed oct-tree N-Body algorithm for cosmological simulation

2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2013-10-16 DOI: 10.1145/2503210.2503220

Michael S. Warren

Citations: 59