2012 SC Companion: High Performance Computing, Networking Storage and Analysis - Latest Articles

DS-CUDA: A Middleware to Use Many GPUs in the Cloud Environment
2012 SC Companion: High Performance Computing, Networking Storage and Analysis Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.146
Minoru Oikawa, A. Kawai, K. Nomura, K. Yasuoka, Kazuyuki Yoshikawa, T. Narumi
{"title":"DS-CUDA: A Middleware to Use Many GPUs in the Cloud Environment","authors":"Minoru Oikawa, A. Kawai, K. Nomura, K. Yasuoka, Kazuyuki Yoshikawa, T. Narumi","doi":"10.1109/SC.Companion.2012.146","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.146","url":null,"abstract":"GPGPU (General-purpose computing on graphics processing units) has several difficulties when used in a cloud environment, such as narrow bandwidth, higher cost, and lower security, compared with computation using only CPUs. Most high performance computing applications require huge communication between nodes, and do not fit a cloud environment, since network topology and its bandwidth are not fixed and they affect the performance of the application program. However, there are some applications for which little communication is needed, such as molecular dynamics (MD) simulation with the replica exchange method (REM). For such applications, we propose DS-CUDA (Distributed-shared compute unified device architecture), a middleware to use many GPUs in a cloud environment with lower cost and higher security. It virtualizes GPUs in a cloud such that they appear to be locally installed GPUs in a client machine. Its redundant mechanism ensures reliable calculation with consumer GPUs, which reduce the cost greatly. It also enhances the security level since no data except command and data for GPUs are stored on the cloud side. REM-MD simulation with 64 GPUs showed 58 and 36 times more speed than a locally-installed GPU via InfiniBand and the Internet, respectively.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"132 1","pages":"1207-1214"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80011066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 73
Explosive Charge Blowing a Hole in a Steel Plate Animation
2012 SC Companion: High Performance Computing, Networking Storage and Analysis Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.364
Bradley Carvey, Nathan Fabian, D. Rogers
{"title":"Explosive Charge Blowing a Hole in a Steel Plate Animation","authors":"Bradley Carvey, Nathan Fabian, D. Rogers","doi":"10.1109/SC.Companion.2012.364","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.364","url":null,"abstract":"The animation shows a simulation of an explosive charge blowing a hole in a steel plate. The simulation data was generated on Sandia National Lab's Red Sky Supercomputer. ParaView was used to export polygonal data, which was then textured and rendered using a commercial 3D rendering package.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"101 1","pages":"1576-1577"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80416795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Poster: Numeric Based Ordering for Preconditioned Conjugate Gradient
2012 SC Companion: High Performance Computing, Networking Storage and Analysis Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.309
J. Booth
{"title":"Poster: Numeric Based Ordering for Preconditioned Conjugate Gradient","authors":"J. Booth","doi":"10.1109/SC.Companion.2012.309","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.309","url":null,"abstract":"The ordering of a matrix vastly impacts the convergence rate of the preconditioned conjugate gradient method. Past ordering methods focus solely on a graph representation of the sparse matrix and do not give insight into the convergence rate, which is linked to the preconditioned eigenspectrum. This work attempts to investigate how numerical based ordering may produce a better preconditioned system in terms of faster convergence.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"91 1","pages":"1534-1534"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79963520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Load Balanced Parallel GPU Out-of-Core for Continuous LOD Model Visualization
2012 SC Companion: High Performance Computing, Networking Storage and Analysis Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.37
Chao Peng, Peng Mi, Yong Cao
{"title":"Load Balanced Parallel GPU Out-of-Core for Continuous LOD Model Visualization","authors":"Chao Peng, Peng Mi, Yong Cao","doi":"10.1109/SC.Companion.2012.37","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.37","url":null,"abstract":"Rendering massive 3D models has been recognized as a challenging task. Due to the limited size of GPU memory, a massive model with hundreds of millions of primitives cannot fit into most modern GPUs. By applying parallel Level-Of-Detail (LOD), as proposed in [1], transferring only a portion of primitives rather than the whole to the GPU is sufficient for generating a desired simplified version of the model. However, the low bandwidth in CPU-GPU communication makes data-transferring a very time-consuming process that prevents users from achieving high-performance rendering of massive 3D models on a single-GPU system. This paper explores a device-level parallel design that distributes the workloads in a multi-GPU multi-display system. Our multi-GPU out-of-core uses a load-balancing method and seamlessly integrates with the parallel LOD algorithm. Our experiments show highly interactive frame rates of the “Boeing 777” airplane model that consists of over 332 million triangles and over 223 million vertices.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"34 1","pages":"215-223"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81349994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Poster: Matrix Decomposition Based Conjugate Gradient Solver for Poisson Equation
2012 SC Companion: High Performance Computing, Networking Storage and Analysis Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.287
Hang Liu, J. Seo, R. Mittal
{"title":"Poster: Matrix Decomposition Based Conjugate Gradient Solver for Poisson Equation","authors":"Hang Liu, J. Seo, R. Mittal","doi":"10.1109/SC.Companion.2012.287","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.287","url":null,"abstract":"Finding a fast solver for the Poisson equation is important for many scientific applications. In this work, we design and develop a matrix decomposition based Conjugate Gradient (CG) solver, which leverages Graphics Processing Unit (GPU) clusters to accelerate the calculation of the Poisson equation. Our experiments show that the new CG solver is highly scalable and achieves significant speedup over a CPU-based Multi-Grid (MG) solver.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"18 1","pages":"1501-1501"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89500732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
High Performance Implementation of an Econometrics and Financial Application on GPUs
2012 SC Companion: High Performance Computing, Networking Storage and Analysis Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.138
M. Creel, M. Zubair
{"title":"High Performance Implementation of an Econometrics and Financial Application on GPUs","authors":"M. Creel, M. Zubair","doi":"10.1109/SC.Companion.2012.138","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.138","url":null,"abstract":"In this paper, we describe a GPU based implementation for an estimator based on an indirect likelihood inference method. This method relies on simulations from a model and on nonparametric density or regression function computations. The estimation application arises in various domains such as econometrics and finance, when the model is fully specified, but too complex for estimation by maximum likelihood. We implemented the estimator on a machine with two 2.67GHz Intel Xeon X5650 processors and four NVIDIA M2090 GPU devices. We optimized the GPU code by efficient use of shared memory and registers available on the GPU devices. We compared the optimized GPU code performance with a C based sequential version of the code that was executed on the host machine. We observed a speed up factor of up to 242 with four GPU devices.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"os-27 1","pages":"1147-1153"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87212408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
A Python HPC Framework: PyTrilinos, ODIN, and Seamless
2012 SC Companion: High Performance Computing, Networking Storage and Analysis Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.83
K. W. Smith, W. Spotz, S. Ross-Ross
{"title":"A Python HPC Framework: PyTrilinos, ODIN, and Seamless","authors":"K. W. Smith, W. Spotz, S. Ross-Ross","doi":"10.1109/SC.Companion.2012.83","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.83","url":null,"abstract":"We present three Python software projects: PyTrilinos, for calling Trilinos distributed memory HPC solvers from Python; Optimized Distributed NumPy (ODIN), for distributed array computing; and Seamless, for automatic, Just-in-time compilation of Python source code. We argue that these three projects in combination provide a framework for high-performance computing in Python. They provide this framework by supplying necessary features (in the case of ODIN and Seamless) and algorithms (in the case of ODIN and PyTrilinos) for a user to develop HPC applications. Together they address the principal limitations (real or imagined) ascribed to Python when applied to high-performance computing. A high-level overview of each project is given, including brief explanations as to how these projects work in conjunction to the benefit of end users.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"34 1","pages":"593-599"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89436812","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Towards Improving the Communication Performance of CRESTA's Co-Design Application NEK5000
2012 SC Companion: High Performance Computing, Networking Storage and Analysis Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.92
Michael Schliephake, E. Laure
{"title":"Towards Improving the Communication Performance of CRESTA's Co-Design Application NEK5000","authors":"Michael Schliephake, E. Laure","doi":"10.1109/SC.Companion.2012.92","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.92","url":null,"abstract":"In order to achieve exascale performance, all aspects of applications and system software need to be analysed and potentially improved. The EU FP7 project “Collaborative Research into Exascale Systemware, Tools & Applications” (CRESTA) uses co-design of advanced simulation applications and system software as well as related development tools as a key element in its approach towards exascale. In this paper we present first results of a co-design activity using the highly scalable application NEK5000. We have analysed the communication structure of NEK5000 and propose new, optimised collective communication operations that will improve the performance of NEK5000 and prepare it for use on the several millions of cores available in future HPC systems. The latency-optimised communication operations can also be beneficial in other contexts; for instance, we expect them to become an important building block for a runtime-system providing dynamic load balancing, also under development within CRESTA.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"106 1","pages":"669-674"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87902707","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
The PEPPHER Composition Tool: Performance-Aware Dynamic Composition of Applications for GPU-Based Systems
2012 SC Companion: High Performance Computing, Networking Storage and Analysis Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.97
Usman Dastgeer, Lu Li, C. Kessler
{"title":"The PEPPHER Composition Tool: Performance-Aware Dynamic Composition of Applications for GPU-Based Systems","authors":"Usman Dastgeer, Lu Li, C. Kessler","doi":"10.1109/SC.Companion.2012.97","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.97","url":null,"abstract":"The PEPPHER component model defines an environment for annotation of native C/C++ based components for homogeneous and heterogeneous multicore and manycore systems, including GPU and multi-GPU based systems. For the same computational functionality, captured as a component, different sequential and explicitly parallel implementation variants using various types of execution units might be provided, together with metadata such as explicitly exposed tunable parameters. The goal is to compose an application from its components and variants such that, depending on the run-time context, the most suitable implementation variant will be chosen automatically for each invocation. We describe and evaluate the PEPPHER composition tool, which explores the application's components and their implementation variants, generates the necessary low-level code that interacts with the runtime system, and coordinates the native compilation and linking of the various code units to compose the overall application code. With several applications, we demonstrate how the composition tool provides a high-level programming front-end while effectively utilizing the task-based PEPPHER runtime system (StarPU) underneath.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"2 1","pages":"711-720"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84090326","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 20
Towards Energy Efficient Data Intensive Computing Using IEEE 802.3az
2012 SC Companion: High Performance Computing, Networking Storage and Analysis Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.112
Dimitar Pavlov, Joris Soeurt, P. Grosso, Zhiming Zhao, K. V. D. Veldt, Hao Zhu, C. D. Laat
{"title":"Towards Energy Efficient Data Intensive Computing Using IEEE 802.3az","authors":"Dimitar Pavlov, Joris Soeurt, P. Grosso, Zhiming Zhao, K. V. D. Veldt, Hao Zhu, C. D. Laat","doi":"10.1109/SC.Companion.2012.112","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.112","url":null,"abstract":"Energy efficiency is an increasingly important requirement for computing and communication systems, especially with their increasing pervasiveness. The IEEE 802.3az protocol reduces the network energy consumption by turning active copper Ethernet links to a low power mode when no traffic exists. However, the effect of 802.3az heavily depends on the network traffic patterns, which makes system level energy optimization challenging. In clusters, distributed data intensive applications that generate heavy network traffic are common, and in turn the required network devices can consume large amounts of energy. In this research, we examined the 802.3az technology with the goal of applying it in clusters. We defined an energy budget calculator that takes energy-efficient Ethernet into account by including the energy models derived from tests of 802.3az enabled devices. The calculator is an integral tool in a global strategy to optimize the energy usage of applications in a high performance computing environment. We show a few practical examples of how real applications can better plan their execution by integrating this knowledge in their decision strategies.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"1 1","pages":"806-810"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86001164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3