2012 SC Companion: High Performance Computing, Networking Storage and Analysis最新文献

筛选
英文 中文
Incremental and Parallel Analytics on Astrophysical Data Streams 天体物理数据流的增量和并行分析
2012 SC Companion: High Performance Computing, Networking Storage and Analysis Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.130
D. Mishin, T. Budavári, A. Szalay, Yanif Ahmad
{"title":"Incremental and Parallel Analytics on Astrophysical Data Streams","authors":"D. Mishin, T. Budavári, A. Szalay, Yanif Ahmad","doi":"10.1109/SC.Companion.2012.130","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.130","url":null,"abstract":"Stream processing methods and online algorithms are increasingly appealing in the scientific and large-scale data management communities due to increasing ingestion rates of scientific instruments, the ability to produce and inspect results interactively, and the simplicity and efficiency of sequential storage access over enormous datasets. This article will showcase our experiences in using off-the-shelf streaming technology to implement incremental and parallel spectral analysis of galaxies from the Sloan Digital Sky Survey (SDSS) to detect a wide variety of galaxy features. The technical focus of the article is on a robust, highly scalable principal components analysis (PCA) algorithm and its use of coordination primitives to realize consistency as part of parallel execution. Our algorithm and framework can be readily used in other domains.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"54 1","pages":"1078-1086"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82369743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
Designing a Collaborative Filtering Recommender on the Single Chip Cloud Computer 在单片云计算机上设计协同过滤推荐器
2012 SC Companion: High Performance Computing, Networking Storage and Analysis Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.118
Aalap Tripathy, Atish Patra, S. Mohan, R. Mahapatra
{"title":"Designing a Collaborative Filtering Recommender on the Single Chip Cloud Computer","authors":"Aalap Tripathy, Atish Patra, S. Mohan, R. Mahapatra","doi":"10.1109/SC.Companion.2012.118","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.118","url":null,"abstract":"Fast response requirements for big-data applications on cloud infrastructures continues to grow. At the same time, many cores on-chip have now become a reality. These developments are set to redefine infrastructure nodes of cloud data centers in the future. For this to happen, parallel programming runtimes need to be designed for many-cores on chip as the target architecture. In this paper, we show that the commonly used MapReduce programming paradigm can be adapted to run on Intel's experimental single chip cloud computer (SCC) with 48-cores on chip. We demonstrate this using a Collaborative Filtering (CF) recommender system as an application. This is a widely used technique for information filtering to predict user's preference towards an unknown item from their past ratings. These systems are typically deployed in distributed clusters and operate on large apriori datasets. We address scalability with data partitioning, combining and sorting algorithms, maximize data locality to minimize communication cost within the SCC cores. We demonstrate ~2x speedup, ~94% lower energy consumption for benchmark workloads as compared to a distributed cluster of single and multi-processor nodes.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"49 1","pages":"838-847"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82857851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
SAN Optimization for High Performance Storage with RDMA Data Transfer 基于RDMA数据传输的高性能存储SAN优化
2012 SC Companion: High Performance Computing, Networking Storage and Analysis Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.15
Jae-Woo Choi, Youngjin Yu, Hyeonsang Eom, H. Yeom, Dongin Shin
{"title":"SAN Optimization for High Performance Storage with RDMA Data Transfer","authors":"Jae-Woo Choi, Youngjin Yu, Hyeonsang Eom, H. Yeom, Dongin Shin","doi":"10.1109/SC.Companion.2012.15","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.15","url":null,"abstract":"Today's server environments consist of many machines constructing clusters for distributed computing system or storage area networks (SAN) for effectively processing or saving enormous data. In these kinds of server environments, backend-storages are usually the bottleneck of the overall system. But it is not enough to simply replace the devices with better ones to exploit their performance benefits. In other words, proper optimizations are needed to fully utilize their performance gains. In this work, we first applied a high performance device as a backend-storage to the existing SAN solution, and found that it could not utilize the low latency and high bandwidth of the device, especially in case of small sized random I/O pattern even though a high speed network was used. To address this problem, we propose a new design that contains three optimizations: 1) removing software overheads to lower I/O latency; 2) parallelism to utilize the high bandwidth of the device; 3) temporal merge mechanism to reduce network overhead. We implemented them as a prototype and found that our solution makes substantial performance improvements in terms of both the latency and bandwidth.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"14 1","pages":"24-29"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82899841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
Abstract: Hybrid Breadth First Search Implementation for Hybrid-Core Computers 摘要:混合核计算机的混合广度优先搜索实现
2012 SC Companion: High Performance Computing, Networking Storage and Analysis Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.184
Kevin R. Wadleigh, John Amelio, K. Collins, G. Edwards
{"title":"Abstract: Hybrid Breadth First Search Implementation for Hybrid-Core Computers","authors":"Kevin R. Wadleigh, John Amelio, K. Collins, G. Edwards","doi":"10.1109/SC.Companion.2012.184","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.184","url":null,"abstract":"Summary form only given. The Graph500 benchmark is designed to evaluate the suitability of supercomputing systems for graph algorithms, which are increasingly important in HPC. The timed Graph500 kernel, Breadth First Search, exhibits memory access patterns typical of these types of applications, with poor spatial locality and synchronization between multiple streams of execution. The Graph500 benchmark was ported to the Convey HC-2ex and MX-100, hybrid-core computers with an Intel host system and a coprocessor incorporating four reprogrammable Xilinx FPGAs. The computers contain a unique memory system designed to sustain high bandwidth for random memory accesses. The BFS kernel was implemented as a hybrid algorithm with concurrent processing on both the host and coprocessor. The early steps use a top-down algorithm on the host with results copied to coprocessor memory for use in a bottom-up algorithm. The coprocessor uses thousands of threads to traverse the graph. The resulting implementation runs at over 16 billion TEPS.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"5 1","pages":"1354-1354"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90383297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
The SDAV Software Frameworks for Visualization and Analysis on Next-Generation Multi-Core and Many-Core Architectures 面向下一代多核与多核架构的SDAV可视化与分析软件框架
2012 SC Companion: High Performance Computing, Networking Storage and Analysis Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.36
Christopher M. Sewell, J. Meredith, K. Moreland, T. Peterka, David E. DeMarle, Li-Ta Lo, J. Ahrens, Robert Maynard, Berk Geveci
{"title":"The SDAV Software Frameworks for Visualization and Analysis on Next-Generation Multi-Core and Many-Core Architectures","authors":"Christopher M. Sewell, J. Meredith, K. Moreland, T. Peterka, David E. DeMarle, Li-Ta Lo, J. Ahrens, Robert Maynard, Berk Geveci","doi":"10.1109/SC.Companion.2012.36","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.36","url":null,"abstract":"This paper surveys the four software frameworks being developed as part of the visualization pillar of the SDAV (Scalable Data Management, Analysis, and Visualization) Institute, one of the SciDAC (Scientific Discovery through Advanced Computing) Institutes established by the ASCR (Advanced Scientific Computing Research) Program of the U.S. Department of Energy. These frameworks include EAVL (Extreme-scale Analysis and Visualization Library), DAX (Data Analysis at Extreme), DIY (Do It Yourself), and PISTON. The objective of these frameworks is to facilitate the adaptation of visualization and analysis algorithms to take advantage of the available parallelism in emerging multi-core and many-core hardware architectures, in anticipation of the need for such algorithms to be run in-situ with LCF (leadership-class facilities) simulation codes on supercomputers.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"78 1","pages":"206-214"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83940541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 9
Poster: High Performance GPU Accelerated TSP Solver 海报:高性能GPU加速TSP求解器
2012 SC Companion: High Performance Computing, Networking Storage and Analysis Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.225
K. Rocki, R. Suda
{"title":"Poster: High Performance GPU Accelerated TSP Solver","authors":"K. Rocki, R. Suda","doi":"10.1109/SC.Companion.2012.225","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.225","url":null,"abstract":"We are presenting a high performance GPU accelerated implementation of 2-opt local search algorithm for the Traveling Salesman Problem (TSP). GPU usage greatly decreases the time needed to optimize the route, however requires a complicated and well tuned implementation. With the increasing problem size, the time spent on comparing the graph edges grows significantly. We used instances from the TSPLIB library for for testing and our results show that by using our GPU algorithm, the time needed to perform a simple local search operation can be decreased approximately 5 to 45 times compared to parallel CPU code implementation using 6 cores. The code has been implemented in CUDA as well as in OpenCL and tested on NVIDIA and AMD devices. The experimental studies have shown that the optimization algorithm using the GPU local search converges from up to 300 times faster on average compared to the sequential CPU version, depending on the problem size. The main contributions of this work are the problem division scheme exploiting data locality which allows to solve arbitrarily big problem instances using GPU and the parallel implementation of the algorithm itself.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"10 1","pages":"1413-1414"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88065052","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
The Application of High Performance Computing to Solvency and Profitability Calculations for Life Assurance Contracts 高性能计算在寿险合同偿付能力和盈利能力计算中的应用
2012 SC Companion: High Performance Computing, Networking Storage and Analysis Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.140
Mark Tucker, J. M. Bull
{"title":"The Application of High Performance Computing to Solvency and Profitability Calculations for Life Assurance Contracts","authors":"Mark Tucker, J. M. Bull","doi":"10.1109/SC.Companion.2012.140","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.140","url":null,"abstract":"In the UK, pension providers are required by law to demonstrate solvency on a regular basis; the regulations governing how solvency is demonstrated are changing. Historically, it has been sufficient to report solvency using a single `best estimate' set of assumptions. The new regulations require a Monte Carlo approach to finding a worst-case scenario that requires computing power which is outside the systems currently available in the industry. This paper aims to show that the new regulations could be met by moving away from current actuarial valuation software packages and producing well-performing ab initio code, employing a variety of HPC techniques. Using a combination of algorithmic improvements, serial optimisations and multi-core parallelism, we demonstrate a performance improvement over commercial software of a factor of over 105. We show that this brings the Monte Carlo simulations within the bounds of practicality, and we suggest possibilities for further improvements, for example using clusters of GPUs. We also identify other possible use cases for high performance solvency and profitability calculations.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"56 1","pages":"1163-1170"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86829298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
Poster: Acceleration of the BLAST Hydro Code on GPU 海报:在GPU上加速BLAST Hydro Code
2012 SC Companion: High Performance Computing, Networking Storage and Analysis Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.172
Tingxing Dong, T. Kolev, R. Rieben, V. Dobrev
{"title":"Poster: Acceleration of the BLAST Hydro Code on GPU","authors":"Tingxing Dong, T. Kolev, R. Rieben, V. Dobrev","doi":"10.1109/SC.Companion.2012.172","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.172","url":null,"abstract":"The BLAST code implements a high-order numerical algorithm that solves the equations of compressible hydrodynamics using the Finite Element Method in a moving Lagrangian frame. BLAST is coded in C++ and parallelized by MPI. We accelerate the most computationally intensive parts (80%-95%) of BLAST on an NVIDIA GPU with the CUDA programming model. Several 2D and 3D problems were tested and a maximum speedup of 4.3x was delivered. Our results demonstrate the validity and capability of GPU computing.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"53 1","pages":"1337-1337"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83570189","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
A Case for Optimistic Coordination in HPC Storage Systems 高性能计算存储系统的乐观协调问题
2012 SC Companion: High Performance Computing, Networking Storage and Analysis Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.19
P. Carns, K. Harms, D. Kimpe, R. Ross, J. Wozniak, L. Ward, M. Curry, Ruth Klundt, Geoff Danielson, Cengiz Karakoyunlu, J. Chandy, Bradley Settlemeyer, W. Gropp
{"title":"A Case for Optimistic Coordination in HPC Storage Systems","authors":"P. Carns, K. Harms, D. Kimpe, R. Ross, J. Wozniak, L. Ward, M. Curry, Ruth Klundt, Geoff Danielson, Cengiz Karakoyunlu, J. Chandy, Bradley Settlemeyer, W. Gropp","doi":"10.1109/SC.Companion.2012.19","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.19","url":null,"abstract":"High-performance computing (HPC) storage systems rely on access coordination to ensure that concurrent updates do not produce incoherent results. HPC storage systems typically employ pessimistic distributed locking to provide this functionality in cases where applications cannot perform their own coordination. This approach, however, introduces significant performance overhead and complicates fault handling. In this work we evaluate the viability of optimistic conditional storage operations as an alternative to distributed locking in HPC storage systems. We investigate design strategies and compare the two approaches in a prototype object storage system using a parallel read/modify/write benchmark. Our prototype illustrates that conditional operations can be easily integrated into distributed object storage systems and can outperform standard coordination primitives for simple update workloads. Our experiments show that conditional updates can achieve over two orders of magnitude higher performance than pessimistic locking for some parallel read/modify/write workloads.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"34 1","pages":"48-53"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79485477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 10
Communication avoiding algorithms 通信避免算法
2012 SC Companion: High Performance Computing, Networking Storage and Analysis Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.351
J. Demmel
{"title":"Communication avoiding algorithms","authors":"J. Demmel","doi":"10.1109/SC.Companion.2012.351","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.351","url":null,"abstract":"","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"1 1","pages":"1942-2000"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88259438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 14
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信