Proceedings. IPDPS (Conference): Latest Publications

Predicting and Comparing the Performance of Array Management Libraries.
Proceedings. IPDPS (Conference) Pub Date: 2020-05-01 Epub Date: 2020-07-14 DOI: 10.1109/ipdps47924.2020.00097
Donghe Kang, Oliver Rübel, Suren Byna, Spyros Blanas
Abstract: Many applications are increasingly becoming I/O-bound. To improve scalability, analytical models of parallel I/O performance are often consulted to determine possible I/O optimizations. However, I/O performance modeling has predominantly focused on applications that directly issue I/O requests to a parallel file system or a local storage device. These I/O models are not directly usable by applications that access data through standardized I/O libraries, such as HDF5, FITS, and NetCDF, because a single I/O request to an object can trigger a cascade of I/O operations to different storage blocks. The I/O performance characteristics of applications that rely on these libraries are a complex function of the underlying data storage model, user-configurable parameters, and object-level access patterns. As a consequence, I/O optimization is predominantly an ad hoc process performed by application developers, who are often domain scientists with limited desire to delve into the nuances of the storage hierarchy of modern computers. This paper presents an analytical cost model to predict the end-to-end execution time of applications that perform I/O through established array management libraries. The paper focuses on the HDF5 and Zarr array libraries as examples of I/O libraries with radically different storage models: HDF5 stores every object in one file, while Zarr creates multiple files to store different objects. We find that accessing array objects via these I/O libraries introduces new overheads and optimizations. Specifically, in addition to I/O time, it is crucial to model the cost of transforming data to a particular storage layout (memory copy cost), as well as the benefit of accessing a software cache. We evaluate the model on real applications that process observations (neuroscience) and simulation results (plasma physics). The evaluation on three HPC clusters reveals that I/O accounts for as little as 10% of the execution time in some cases, and hence models that only focus on I/O performance cannot accurately capture the performance of applications that use standard array storage libraries. In parallel experiments, our model correctly predicts the fastest storage library between HDF5 and Zarr 94% of the time, in contrast with 70% of the time for a cutting-edge I/O model.
Volume: 2020, Pages: 906-915
Citations: 5
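
The cost decomposition described in the abstract above (storage I/O plus a memory-copy cost, offset by a software cache) can be illustrated with a minimal sketch. The function below is a toy model with made-up parameter names and a simple linear form; it is an assumption-laden illustration, not the paper's actual formulation.

```python
import math

# Minimal sketch of an end-to-end cost estimate for a chunked array read,
# in the spirit of the abstract above; all names and the linear decomposition
# are illustrative assumptions, not the paper's model.
def predict_read_time(bytes_requested, chunk_bytes, io_bandwidth, io_latency,
                      memcpy_bandwidth, cache_hit_ratio=0.0):
    """Estimate seconds to read `bytes_requested` through a chunked array library.

    io_bandwidth / memcpy_bandwidth are in bytes per second, io_latency is the
    per-chunk request overhead in seconds, and cache_hit_ratio is the fraction
    of chunks served from a software cache (no storage I/O, only a memory copy).
    """
    chunks_touched = math.ceil(bytes_requested / chunk_bytes)
    chunks_from_storage = chunks_touched * (1.0 - cache_hit_ratio)

    io_time = chunks_from_storage * (io_latency + chunk_bytes / io_bandwidth)
    # Data still has to be rearranged from the chunked storage layout into the
    # caller's in-memory layout, whether or not it came from the cache.
    memcpy_time = chunks_touched * chunk_bytes / memcpy_bandwidth
    return io_time + memcpy_time

# Example: 1 GiB read, 4 MiB chunks, 2 GB/s storage, 8 GB/s memcpy, 25% cache hits.
t = predict_read_time(2**30, 4 * 2**20, 2e9, 1e-4, 8e9, cache_hit_ratio=0.25)
print(f"predicted read time: {t:.3f} s")
```
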
High-throughput Analysis of Large Microscopy Image Datasets on CPU-GPU Cluster Platforms.
Proceedings. IPDPS (Conference) Pub Date: 2013-05-01 DOI: 10.1109/IPDPS.2013.11
George Teodoro, Tony Pan, Tahsin M Kurc, Jun Kong, Lee A D Cooper, Norbert Podhorszki, Scott Klasky, Joel H Saltz
Abstract: Analysis of large pathology image datasets offers significant opportunities for the investigation of disease morphology, but the resource requirements of analysis pipelines limit the scale of such studies. Motivated by a brain cancer study, we propose and evaluate a parallel image analysis application pipeline for high-throughput computation of large datasets of high-resolution pathology tissue images on distributed CPU-GPU platforms. To achieve efficient execution on these hybrid systems, we have built runtime support that allows us to express the cancer image analysis application as a hierarchical data processing pipeline. The application is implemented as a coarse-grain pipeline of stages, where each stage may be further partitioned into another pipeline of fine-grain operations. The fine-grain operations are efficiently managed and scheduled for computation on CPUs and GPUs using performance-aware scheduling techniques along with several optimizations, including architecture-aware process placement, data-locality-conscious task assignment, data prefetching, and asynchronous data copy. These optimizations are employed to maximize the utilization of the aggregate computing power of CPUs and GPUs and minimize data copy overheads. Our experimental evaluation shows that the cooperative use of CPUs and GPUs achieves significant improvements on top of GPU-only versions (up to 1.6×) and that the execution of the application as a set of fine-grain operations provides more opportunities for runtime optimizations and attains better performance than the coarser-grain, monolithic implementations used in other works. An implementation of the cancer image analysis pipeline using the runtime support was able to process an image dataset consisting of 36,848 4K×4K-pixel image tiles (about 1.8 TB uncompressed) in less than 4 minutes (150 tiles/second) on 100 nodes of a state-of-the-art hybrid cluster system.
Volume: 2013, Pages: 103-114
Citations: 0
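
As an illustration of the hierarchical pipeline idea in the entry above, the sketch below decomposes a coarse-grain stage into fine-grain operations that CPU- and GPU-bound workers drain from a shared queue. The "GPU" worker is only a thread placeholder, and none of the paper's runtime system, prefetching, or locality optimizations are reproduced.

```python
import queue
import threading

tasks = queue.Queue()

def segment_stage(tile):
    # Coarse-grain stage: decompose one image tile into fine-grain operations.
    for op in ("threshold", "morph_open", "label", "features"):
        tasks.put((op, tile))

def run_op(device, op, tile):
    # Placeholder for dispatching the operation to a real CPU or GPU kernel.
    print(f"{device}: {op} on tile {tile}")

def worker(device_name):
    while True:
        item = tasks.get()
        if item is None:          # poison pill: shut the worker down
            tasks.task_done()
            break
        op, tile = item
        run_op(device_name, op, tile)
        tasks.task_done()

# One CPU worker and one (simulated) GPU worker drain the same queue.
workers = [threading.Thread(target=worker, args=(dev,)) for dev in ("cpu-0", "gpu-0")]
for w in workers:
    w.start()
for tile in range(4):
    segment_stage(tile)
tasks.join()                      # wait for all fine-grain operations
for _ in workers:
    tasks.put(None)
for w in workers:
    w.join()
```
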
Accelerating Large Scale Image Analyses on Parallel, CPU-GPU Equipped Systems.
Proceedings. IPDPS (Conference) Pub Date: 2012-05-01 DOI: 10.1109/IPDPS.2012.101
George Teodoro, Tahsin M Kurc, Tony Pan, Lee A D Cooper, Jun Kong, Patrick Widener, Joel H Saltz
Abstract: The past decade has witnessed a major paradigm shift in high performance computing with the introduction of accelerators as general purpose processors. These computing devices make very high parallel computing power available at low cost and power consumption, transforming current high performance platforms into heterogeneous CPU-GPU equipped systems. Although the theoretical performance achieved by these hybrid systems is impressive, taking practical advantage of this computing power remains a very challenging problem. Most applications are still deployed to either the GPU or the CPU, leaving the other resource under- or un-utilized. In this paper, we propose, implement, and evaluate a performance-aware scheduling technique along with optimizations to make efficient collaborative use of CPUs and GPUs on a parallel system. In the context of feature computations in large scale image analysis applications, our evaluations show that intelligently co-scheduling CPUs and GPUs can significantly improve performance over GPU-only or multi-core CPU-only approaches.
Volume: 2012, Pages: 1093-1104
Citations: 0
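
A minimal sketch of performance-aware CPU-GPU assignment in the spirit of the entry above: given measured per-operation GPU speedups, the GPU is handed the work it accelerates most while the CPU absorbs the rest. The speedup numbers and the greedy split below are illustrative assumptions, not the paper's scheduler.

```python
# Illustrative performance-aware assignment: the GPU gets the operation types
# it accelerates most, so neither device sits idle. All numbers are made up.
def assign_operations(ops, gpu_speedup, gpu_share=0.5):
    """Split (op_type, cpu_cost) work items between CPU and GPU.

    gpu_speedup maps op_type -> measured GPU/CPU speedup; gpu_share is the
    fraction of total CPU-equivalent work the GPU should take.
    """
    # Consider the most GPU-friendly work first for GPU placement.
    ordered = sorted(ops, key=lambda o: gpu_speedup.get(o[0], 1.0), reverse=True)
    total = sum(cost for _, cost in ops)
    cpu_tasks, gpu_tasks, gpu_load = [], [], 0.0
    for op_type, cost in ordered:
        if gpu_load < gpu_share * total and gpu_speedup.get(op_type, 1.0) > 1.0:
            gpu_tasks.append((op_type, cost))
            gpu_load += cost
        else:
            cpu_tasks.append((op_type, cost))
    return cpu_tasks, gpu_tasks

ops = [("color_deconv", 2.0), ("watershed", 5.0), ("features", 3.0)] * 4
speedup = {"color_deconv": 6.0, "watershed": 1.2, "features": 3.5}
cpu_tasks, gpu_tasks = assign_operations(ops, speedup)
print(len(cpu_tasks), "ops on CPU,", len(gpu_tasks), "ops on GPU")
```
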
Parallel Mapping Approaches for GNUMAP.
Proceedings. IPDPS (Conference) Pub Date: 2011-01-01 DOI: 10.1109/ipdps.2011.184
Nathan L Clement, Mark J Clement, Quinn Snell, W Evan Johnson
Abstract: Mapping short next-generation reads to reference genomes is an important element in SNP calling and expression studies. Major limitations to large-scale whole-genome mapping are the large memory requirements of the algorithm and the long run time necessary for accurate studies. Several parallel implementations have been developed to distribute memory across different processors and to share the processing requirements equally. These approaches are compared with respect to their memory footprint, load balancing, and accuracy. When using MPI with multi-threading, linear speedup can be achieved for up to 256 processors.
Volume: 2011, Pages: 435-443
Citations: 11
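
The MPI-plus-multithreading pattern the abstract describes can be sketched with mpi4py: reads are scattered across ranks and each rank maps its share with a local thread pool. The map_read stub and the round-robin partition are hypothetical stand-ins for GNUMAP's actual alignment kernel and data distribution.

```python
# Run with e.g.: mpiexec -n 4 python map_reads.py   (requires mpi4py)
from concurrent.futures import ThreadPoolExecutor
from mpi4py import MPI

def map_read(read):
    # Placeholder for probabilistic alignment against the shared reference.
    return (read, hash(read) % 1000)   # fake "position"

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

if rank == 0:
    reads = [f"read_{i}" for i in range(1000)]
    chunks = [reads[i::size] for i in range(size)]   # round-robin partition
else:
    chunks = None

my_reads = comm.scatter(chunks, root=0)              # distribute the workload
with ThreadPoolExecutor(max_workers=4) as pool:      # threads share one index
    my_hits = list(pool.map(map_read, my_reads))

all_hits = comm.gather(my_hits, root=0)              # collect results at root
if rank == 0:
    print(sum(len(h) for h in all_hits), "reads mapped")
```
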
Optimization of Applications with Non-blocking Neighborhood Collectives via Multisends on the Blue Gene/P Supercomputer.
Proceedings. IPDPS (Conference) Pub Date: 2010-04-19 DOI: 10.1109/IPDPS.2010.5470407
Sameer Kumar, Philip Heidelberger, Dong Chen, Michael Hines
Abstract: We explore the multisend interface as a data mover interface to optimize applications with neighborhood collective communication operations. One of the limitations of the current MPI 2.1 standard is that the vector collective calls require counts and displacements (zero and nonzero bytes) to be specified for all the processors in the communicator. Further, all the collective calls in MPI 2.1 are blocking and do not permit overlap of communication with computation. We present the record-replay persistent optimization to the multisend interface that minimizes the processor overhead of initiating the collective. We present four different case studies with the multisend API on Blue Gene/P: (i) 3D-FFT, (ii) 4D nearest-neighbor exchange as used in Quantum Chromodynamics, (iii) NAMD, and (iv) the neural network simulator NEURON. Performance results show 1.9× speedup with 32³ 3D-FFTs, 1.9× speedup for the 4D nearest-neighbor exchange with the 2⁴ problem, 1.6× speedup in NAMD, and almost 3× speedup in NEURON with 256K cells and 1k connections/cell.
Volume: 2010, Pages: 1-11
Citations: 0
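
The multisend interface is specific to Blue Gene/P, but MPI-3 later standardized non-blocking neighborhood collectives that capture the same overlap idea. The sketch below (assuming mpi4py on top of an MPI-3 library) posts a halo exchange on a Cartesian communicator, overlaps it with interior work, and then waits; it is a rough analogue, not the paper's API.

```python
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
dims = MPI.Compute_dims(comm.Get_size(), 2)           # 2-D process grid
cart = comm.Create_cart(dims, periods=[True, True])   # periodic torus, 4 neighbors

n = 1024                                               # halo elements per neighbor
sendbuf = np.full((4, n), float(cart.Get_rank()))      # one row per neighbor
recvbuf = np.empty((4, n))

req = cart.Ineighbor_alltoall(sendbuf, recvbuf)        # non-blocking halo exchange
interior = np.random.rand(512, 512).sum()              # overlap: interior compute
req.Wait()                                             # halos are now in recvbuf

if cart.Get_rank() == 0:
    print("interior checksum", interior, "first halo value", recvbuf[0, 0])
```
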
Architectural Implications for Spatial Object Association Algorithms.
Proceedings. IPDPS (Conference) Pub Date: 2009-01-01 DOI: 10.1109/IPDPS.2009.5161078
Vijay S Kumar, Tahsin Kurc, Joel Saltz, Ghaleb Abdulla, Scott R Kohn, Celeste Matarazzo
Abstract: Spatial object association, also referred to as crossmatch of spatial datasets, is the problem of identifying and comparing objects in two or more datasets based on their positions in a common spatial coordinate system. In this work, we evaluate two crossmatch algorithms that are used for astronomical sky surveys on the following database system architecture configurations: (1) Netezza Performance Server®, a parallel database system with active-disk-style processing capabilities, (2) MySQL Cluster, a high-throughput network database system, and (3) a hybrid configuration consisting of a collection of independent database system instances with data replication support. Our evaluation provides insights into how the architectural characteristics of these systems affect the performance of the spatial crossmatch algorithms. We conducted our study using real use-case scenarios borrowed from a large-scale astronomy application known as the Large Synoptic Survey Telescope (LSST).
Pages: 1-12
Citations: 0
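
A zone or grid index is the standard way to avoid an all-pairs comparison in a spatial crossmatch: index one catalog on a coarse sky grid, then probe only nearby cells for each object of the other catalog. The sketch below is a deliberately simplified, single-node illustration of that idea (it ignores RA wrap-around and polar distortion) and is not either of the algorithms or database configurations evaluated in the paper.

```python
import math
from collections import defaultdict

def build_grid(catalog, cell_deg):
    # Bucket catalog objects by coarse (ra, dec) grid cell.
    grid = defaultdict(list)
    for obj_id, ra, dec in catalog:
        grid[(int(ra // cell_deg), int(dec // cell_deg))].append((obj_id, ra, dec))
    return grid

def angular_sep_deg(ra1, dec1, ra2, dec2):
    # Great-circle separation via the haversine formula, in degrees.
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a)))

def crossmatch(cat_a, cat_b, radius_deg):
    grid = build_grid(cat_b, cell_deg=radius_deg)
    matches = []
    for id_a, ra, dec in cat_a:
        cx, cy = int(ra // radius_deg), int(dec // radius_deg)
        for dx in (-1, 0, 1):                 # probe the 3x3 cell neighborhood
            for dy in (-1, 0, 1):
                for id_b, rb, db in grid.get((cx + dx, cy + dy), []):
                    if angular_sep_deg(ra, dec, rb, db) <= radius_deg:
                        matches.append((id_a, id_b))
    return matches

cat_a = [("a1", 10.001, -5.002), ("a2", 200.5, 33.3)]
cat_b = [("b1", 10.0015, -5.0025), ("b2", 180.0, 10.0)]
print(crossmatch(cat_a, cat_b, radius_deg=0.01))
```
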
Translational Research Design Templates, Grid Computing, and HPC.
Proceedings. IPDPS (Conference) Pub Date: 2008-05-01 DOI: 10.1109/IPDPS.2008.4536089
Joel Saltz, Scott Oster, Shannon Hastings, Stephen Langella, Renato Ferreira, Justin Permar, Ashish Sharma, David Ervin, Tony Pan, Umit Catalyurek, Tahsin Kurc
Abstract: Design templates that involve discovery, analysis, and integration of information resources commonly occur in many scientific research projects. In this paper we present examples of design templates from the biomedical translational research domain and discuss the requirements they impose on Grid middleware infrastructures. Using caGrid, a Grid middleware system based on the model-driven architecture (MDA) and service-oriented architecture (SOA) paradigms, as a starting point, we discuss architecture directions for MDA- and SOA-based systems like caGrid to support common design templates.
Volume: 2008, Pages: 1-15
Citations: 0
Comparison of Current BLAST Software on Nucleotide Sequences.
Proceedings. IPDPS (Conference) Pub Date: 2005-04-04 DOI: 10.1109/IPDPS.2005.145
I Elizabeth Cha, Eric C Rouchka
Abstract: The computational power needed for searching exponentially growing databases, such as GenBank, has increased dramatically. Three different implementations of the most widely used sequence alignment tool, known as BLAST (Basic Local Alignment Search Tool), are studied for their efficiency on nucleotide-nucleotide comparisons. The performance of these implementations is evaluated using target databases and query sequences of varying lengths and numbers of entries constructed from human genomic and EST sequences. In general, WU BLAST was found to be most efficient when the database and query composition are unknown. NCBI BLAST appears to work best when the database contains a small number of sequences, while mpiBLAST shows the power of database distribution when the number of bases per target database is large. The optimal number of compute nodes in mpiBLAST varies depending upon the database, yet in the cases studied it remains surprisingly low.
Volume: 19, Pages: 8
Citations: 0
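
A comparison like the one above comes down to timing each implementation on the same query/database pairs. The harness below is a hedged sketch: the blastn flags shown are from the current NCBI BLAST+ toolkit, and the commented mpiBLAST line is a placeholder, since the 2005-era NCBI/WU/mpiBLAST binaries and flags differ from what is installed today and should be adapted locally.

```python
import subprocess
import time

def time_command(cmd, repeats=3):
    """Run `cmd` several times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best

query, db = "query.fa", "est_human"          # placeholders for real inputs
commands = {
    "ncbi-blast+": ["blastn", "-query", query, "-db", db],
    # "mpiblast": ["mpirun", "-np", "8", "mpiblast", ...],  # fill in locally
}

for name, cmd in commands.items():
    print(f"{name}: {time_command(cmd):.1f} s")
```
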
Orientation Refinement of Virus Structures with Unknown Symmetry.
Proceedings. IPDPS (Conference) Pub Date: 2003-04-22 DOI: 10.1109/IPDPS.2003.1213138
Yongchang Ji, Dan C Marinescu, Wei Zhang, Timothy S Baker
Abstract: Structural biology, in particular the structure determination of viruses and other large macromolecular complexes, leads to data- and compute-intensive problems that require resources well beyond those available on a single system. Thus, there is an imperative need to develop parallel algorithms and programs for clusters and computational grids. We present one of the most challenging computational problems posed by the three-dimensional structure determination of viruses: the orientation refinement.
Volume: 2003
Citations: 18
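
Orientation refinement is commonly cast as projection matching: score each particle image against projections of the current 3-D map over candidate orientations and keep the best. The mpi4py sketch below distributes images across ranks to illustrate that generic pattern only; project() is a stub, the data are random, and the paper's algorithm for structures with unknown symmetry is not reproduced.

```python
import numpy as np
from mpi4py import MPI

rng = np.random.default_rng(0)

def project(density, angles):
    # Placeholder; a real code would rotate and integrate the 3-D map.
    return rng.standard_normal((32, 32))

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

density = np.zeros((32, 32, 32))                        # current 3-D model
orientations = [(t, p, 0.0) for t in range(0, 180, 15) for p in range(0, 360, 30)]
my_images = [rng.standard_normal((32, 32)) for _ in range(100 // size + 1)]

refined = []
for img in my_images:
    # Correlation score of this image against each candidate projection.
    scores = [float(np.vdot(img, project(density, o))) for o in orientations]
    refined.append(orientations[int(np.argmax(scores))])

all_refined = comm.gather(refined, root=0)              # root collects assignments
if rank == 0:
    print(sum(len(r) for r in all_refined), "images assigned orientations")
```
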