Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing: Latest Publications

Preparing for Supercomputing's Sixth Wave
J. Vetter
{"title":"Preparing for Supercomputing's Sixth Wave","authors":"J. Vetter","doi":"10.1145/2907294.2911994","DOIUrl":"https://doi.org/10.1145/2907294.2911994","url":null,"abstract":"After five decades of sustained progress, Moore's law appears to be reaching its limits. In order to sustain the dramatic improvements to which we have become accustomed, computing will need to transform to Kurzweil's sixth wave of computing. The supercomputing community will likely need to re-think most of its fundamental technologies and tools, spanning innovative materials and devices, circuits, system architectures, programming systems, system software, and applications. We already see evidence of this transition in the move to new architectures that employ heterogeneous processing, non-volatile memory, multimode memory hierarchies, and optical interconnection networks. In this talk, I will recap progress in these areas over the past three decades, discuss current solutions, and contemplate various future technologies that our community will need for continued progress in supercomputing.","PeriodicalId":20515,"journal":{"name":"Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing","volume":"76 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85521054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Wiera: Towards Flexible Multi-Tiered Geo-Distributed Cloud Storage Instances
Kwangsung Oh, A. Chandra, J. Weissman
{"title":"Wiera: Towards Flexible Multi-Tiered Geo-Distributed Cloud Storage Instances","authors":"Kwangsung Oh, A. Chandra, J. Weissman","doi":"10.1145/2907294.2907322","DOIUrl":"https://doi.org/10.1145/2907294.2907322","url":null,"abstract":"Geo-distributed cloud storage systems must tame complexity at many levels: uniform APIs for storage access, supporting flexible storage policies that meet a wide array of application metrics, handling uncertain network dynamics and access dynamism, and operating across many levels of heterogeneity both within and across data-centers. In this paper, we present an integrated solution called Wiera. Wiera extends our earlier cloud storage system, Tiera, that is targeted to multi-tiered policy-based single cloud storage, to the wide-area and multiple data-centers (even across different providers). Wiera enables the specification of global data management policies built on top of local Tiera policies. Such policies enable the user to optimize for cost, performance, reliability, durability, and consistency, both within and across data-centers, and to express their tradeoffs. A key aspect of Wiera is first-class support for dynamism due to network, workload, and access patterns changes. Wiera policies can adapt to changes in user workload, poorly performing data tiers, failures, and changes in user metrics (e.g., cost). Wiera allows unmodified applications to reap the benefits of flexible data/storage policies by externalizing the policy specification. As far as we know, Wiera is the first geo-distributed cloud storage system which handles dynamism actively at run-time. We show how Wiera enables a rich specification of dynamic policies using a concise notation and describe the design and implementation of the system. We have implemented a Wiera prototype on multiple cloud environments, AWS and Azure, that illustrates potential benefits from managing dynamics and in using multiple cloud storage tiers both within and across data-centers.","PeriodicalId":20515,"journal":{"name":"Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing","volume":"74 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75306171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 11
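Purely as an illustration of the kind of multi-tier, multi-data-center policy the abstract alludes to, the sketch below expresses one as plain data. The abstract does not show Wiera's actual notation, so every field name, value, and trigger here is a hypothetical assumption.

```python
# Hypothetical policy sketch; not Wiera's actual notation or API.
policy = {
    "objective": {"optimize_for": "latency", "monthly_cost_budget_usd": 500},
    "placement": [
        {"datacenter": "aws-us-east-1", "tiers": ["memory", "ssd"]},
        {"datacenter": "azure-westeurope", "tiers": ["ssd", "object_store"]},
    ],
    "consistency": "eventual",
    "replication_factor": 2,
    # first-class dynamism: run-time triggers that re-evaluate data placement
    "adapt_on": [
        {"event": "tier_get_latency_ms_above", "threshold": 50,
         "action": "promote_hot_objects_to_memory"},
        {"event": "monthly_cost_above_budget",
         "action": "demote_cold_objects_to_object_store"},
    ],
}
```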
Faster and Cheaper: Parallelizing Large-Scale Matrix Factorization on GPUs
Wei Tan, Liangliang Cao, L. Fong
{"title":"Faster and Cheaper: Parallelizing Large-Scale Matrix Factorization on GPUs","authors":"Wei Tan, Liangliang Cao, L. Fong","doi":"10.1145/2907294.2907297","DOIUrl":"https://doi.org/10.1145/2907294.2907297","url":null,"abstract":"Matrix factorization (MF) is used by many popular algorithms such as collaborative filtering. GPU with massive cores and high memory bandwidth sheds light on accelerating MF much further when appropriately exploiting its architectural characteristics. This paper presents cuMF, a CUDA-based matrix factorization library that optimizes alternate least square (ALS) method to solve very large-scale MF. CuMF uses a set of techniques to maximize the performance on single and multiple GPUs. These techniques include smart access of sparse data leveraging GPU memory hierarchy, using data parallelism in conjunction with model parallelism, minimizing the communication overhead among GPUs, and a novel topology-aware parallel reduction scheme. With only a single machine with four Nvidia GPU cards, cuMF can be 6-10 times as fast, and 33-100 times as cost-efficient, compared with the state-of-art distributed CPU solutions. Moreover, cuMF can solve the largest matrix factorization problem ever reported in current literature, with impressively good performance.","PeriodicalId":20515,"journal":{"name":"Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing","volume":"53 70 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78112394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 52
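To make the ALS approach concrete, here is a minimal CPU sketch in NumPy of the alternating updates that cuMF parallelizes on GPUs. This is not cuMF's implementation (which is CUDA-based and exploits sparse storage and the GPU memory hierarchy); the function name, matrix shapes, and regularization value are illustrative assumptions.

```python
import numpy as np

def als_step(R, X, Y, reg):
    """One ALS half-step: solve for X given fixed Y, so that R ~= X @ Y.T."""
    k = Y.shape[1]
    for u in range(R.shape[0]):
        rated = R[u].nonzero()[0]           # columns with observed ratings
        if rated.size == 0:
            continue
        Yu = Y[rated]                        # factors of the rated columns
        A = Yu.T @ Yu + reg * np.eye(k)      # k x k normal equations
        b = Yu.T @ R[u, rated]
        X[u] = np.linalg.solve(A, b)
    return X

# Usage: alternate the two half-steps until convergence.
rng = np.random.default_rng(0)
R = rng.random((100, 80)) * (rng.random((100, 80)) < 0.1)   # sparse-ish ratings
X = rng.standard_normal((100, 8))
Y = rng.standard_normal((80, 8))
for _ in range(10):
    X = als_step(R, X, Y, reg=0.1)      # update user factors
    Y = als_step(R.T, Y, X, reg=0.1)    # update item factors
```

Each row solve is independent, which is what makes the update amenable to the massive data parallelism cuMF targets on GPUs.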
Interpretation of Chinese Address Information Based on Multi-factor Inference
Xiaolin Li, Yanhui Duan, Huabing Zhou, Yi Zhang
{"title":"Interpretation of Chinese Address Information Based on Multi-factor Inference","authors":"Xiaolin Li, Yanhui Duan, Huabing Zhou, Yi Zhang","doi":"10.1109/ISPDC.2016.72","DOIUrl":"https://doi.org/10.1109/ISPDC.2016.72","url":null,"abstract":"","PeriodicalId":20515,"journal":{"name":"Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing","volume":"12 1","pages":"420-424"},"PeriodicalIF":0.0,"publicationDate":"2016-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72996168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
A Continuous Self-Checking Validation Framework on Processor Exceptions
Jian Tan, Daifeng Li
{"title":"A Continuous Self-Checking Validation Framework on Processor Exceptions","authors":"Jian Tan, Daifeng Li","doi":"10.1109/ISPDC.2016.52","DOIUrl":"https://doi.org/10.1109/ISPDC.2016.52","url":null,"abstract":"","PeriodicalId":20515,"journal":{"name":"Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing","volume":"25 1","pages":"314-318"},"PeriodicalIF":0.0,"publicationDate":"2016-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74455236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Resource Efficiency to Partition Big Streamed Graphs
Víctor Medel Gracia, Unai Arronategui Arribalzaga
{"title":"Resource Efficiency to Partition Big Streamed Graphs","authors":"Víctor Medel Gracia, Unai Arronategui Arribalzaga","doi":"10.1109/ISPDC.2015.21","DOIUrl":"https://doi.org/10.1109/ISPDC.2015.21","url":null,"abstract":"Real time streaming and processing of big graphs is a relevant and challenging application to be executed in a Cloud infrastructure. We have analysed the amount of resources needed to partition large streamed graphs with different distributed architectures. We have improved state of the art limitations proposing a decentralised and scalable model which is more efficient in memory usage, network traffic and number of processing machines. The improvement has been achieved summarising incoming vertices of the graph and accessing to local information of the already partitioned graph. Classical approaches need all information about the previous vertices. In our system, local information is updated in a feedback scheme periodically. Our experimental results show that current architectures cannot process large scale streamed graphs due to memory limitations. We have proved that our architecture reduces the number of needed machines by seven because it accesses to local memory instead of a distributed one. The total memory size has been also reduced. Finally, our model allows to adjust the quality of the partition solution to the desired amount of memory and network traffic.","PeriodicalId":20515,"journal":{"name":"Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing","volume":"38 1","pages":"120-129"},"PeriodicalIF":0.0,"publicationDate":"2015-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78958158","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
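As a rough illustration of the kind of streaming partitioner the abstract describes, the sketch below assigns each arriving vertex using only a local summary of the existing partitions (neighbor counts and partition loads) rather than the full partitioned graph. The scoring rule, function names, and parameters are assumptions made for illustration, not the paper's algorithm.

```python
from collections import defaultdict

def assign_vertex(v, neighbors, part_of, loads, capacity, k):
    """Assign streamed vertex v to one of k partitions using local summaries only."""
    neigh_count = defaultdict(int)
    for u in neighbors:                  # only neighbors already seen and assigned
        p = part_of.get(u)
        if p is not None:
            neigh_count[p] += 1
    best, best_score = 0, float("-inf")
    for p in range(k):
        # favor partitions that already hold v's neighbors, penalize loaded ones;
        # capacity acts as a soft load-balance target in this sketch
        score = neigh_count[p] - loads[p] / capacity
        if score > best_score:
            best, best_score = p, score
    part_of[v] = best
    loads[best] += 1
    return best

# Usage on a tiny vertex stream with 2 partitions:
part_of, loads = {}, [0, 0]
stream = [(1, [0]), (2, [1]), (3, [1, 2]), (0, [1]), (4, [3])]
for v, nbrs in stream:
    assign_vertex(v, nbrs, part_of, loads, capacity=3, k=2)
print(part_of)   # -> {1: 0, 2: 0, 3: 0, 0: 0, 4: 1}
```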
Improving GPU Performance Through Resource Sharing
Vishwesh Jatala, Jayvant Anantpur, Amey Karkare
{"title":"Improving GPU Performance Through Resource Sharing","authors":"Vishwesh Jatala, Jayvant Anantpur, Amey Karkare","doi":"10.1145/2907294.2907298","DOIUrl":"https://doi.org/10.1145/2907294.2907298","url":null,"abstract":"Graphics Processing Units (GPUs) consisting of Streaming Multiprocessors (SMs) achieve high throughput by running a large number of threads and context switching among them to hide execution latencies. The number of thread blocks, and hence the number of threads that can be launched on an SM, depends on the resource usage--e.g. number of registers, amount of shared memory--of the thread blocks. Since the allocation of threads to an SM is at the thread block granularity, some of the resources may not be used up completely and hence will be wasted. We propose an approach that shares the resources of SM to utilize the wasted resources by launching more thread blocks. We show the effectiveness of our approach for two resources: register sharing, and scratchpad (shared memory) sharing. We further propose optimizations to hide long execution latencies, thus reducing the number of stall cycles. We implemented our approach in GPGPU-Sim simulator and experimentally validated it on 19 applications from 4 different benchmark suites: GPGPU-Sim, Rodinia, CUDA-SDK, and Parboil. We observed that applications that underutilize register resource show a maximum improvement of 24% and an average improvement of 11% with register sharing. Similarly, the applications that underutilize scratchpad resource show a maximum improvement of 30% and an average improvement of 12.5% with scratchpad sharing. The remaining applications, which do not waste any resources, perform similar to the baseline approach.","PeriodicalId":20515,"journal":{"name":"Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing","volume":"89 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77406955","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 8
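The sketch below is a back-of-the-envelope calculation of the block-granularity waste the paper targets. The SM limits used (64K registers, 48 KB shared memory, 16 resident blocks) and the example kernel's resource usage are illustrative assumptions, not tied to any specific GPU or to the paper's experimental setup.

```python
def blocks_per_sm(regs_per_thread, threads_per_block, smem_per_block,
                  sm_regs=65536, sm_smem=49152, sm_max_blocks=16):
    """Blocks resident on one SM, limited by registers, scratchpad, and a block cap."""
    by_regs = sm_regs // (regs_per_thread * threads_per_block)
    by_smem = sm_smem // smem_per_block if smem_per_block else sm_max_blocks
    return min(by_regs, by_smem, sm_max_blocks)

# Example kernel: 48 registers/thread, 256 threads/block, 8 KB scratchpad/block.
n = blocks_per_sm(48, 256, 8 * 1024)
idle_regs = 65536 - n * 48 * 256
idle_smem = 49152 - n * 8 * 1024
print(n, idle_regs, idle_smem)   # -> 5 4096 8192
# Only 5 blocks fit because registers run out; 4096 registers and 8 KB of
# scratchpad sit idle, yet they are not enough for a sixth full block.
# Register sharing in the paper's sense lets an extra block launch by sharing
# part of the register allocation with an already resident block.
```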
E-NEXT: Network of Excellence - Emerging Network Technologies
D. Grigoras
{"title":"E-NEXT: Network of Excellence - Emerging Network Technologies","authors":"D. Grigoras","doi":"10.1109/ISPDC.2005.22","DOIUrl":"https://doi.org/10.1109/ISPDC.2005.22","url":null,"abstract":"E-NEXT is an EU FP6 network of excellence that focuses on Internet protocols and services. This short paper presents an overview of the network's goals, organization and achievements","PeriodicalId":20515,"journal":{"name":"Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing","volume":"84 1","pages":"9-10"},"PeriodicalIF":0.0,"publicationDate":"2005-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83452622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
New Challenges in Parallel Optimization
E. Alba
{"title":"New Challenges in Parallel Optimization","authors":"E. Alba","doi":"10.1109/ISPDC.2005.36","DOIUrl":"https://doi.org/10.1109/ISPDC.2005.36","url":null,"abstract":"Parallelism and Optimization are two disciplines that are used together in numerous applications. Solving complex problems in optimization often means to face complex search landscapes, what needs time-consuming operations. Exact and heuristic techniques are being used nowadays to get solutions to problems in mathematics, logistics, bioinformatics, telecommunications, and many other relevant fields. For these tasks it is mandatory to deal with cluster computing in many cases, multiprocessors, and even with computational grids. In this talk I will address the basic challenges of using parallel tools, software, and hardware for extending existing optimization procedures to work in a parallel environment. I will present some basic optimization algorithms, especially heuristic ones, and discuss the application of parallelism to them. Also, I will show how new techniques become possible due to parallelism, giving birth to a whole new class of algorithms and new research lines.","PeriodicalId":20515,"journal":{"name":"Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing","volume":"10 1","pages":"5"},"PeriodicalIF":0.0,"publicationDate":"2005-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81984892","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
A New Era in Computing: Moving Services onto Grid
Ian T Foster
{"title":"A New Era in Computing: Moving Services onto Grid","authors":"Ian T Foster","doi":"10.1109/ISPDC.2005.7","DOIUrl":"https://doi.org/10.1109/ISPDC.2005.7","url":null,"abstract":"The Grid seems to be everywhere, with announcements appearing almost every day of Grid products, sales, and deployments from major vendors. However, in spite of the popularity of the term, there is often confusion as to what the Grid is and what problems it solves. Is there any \"there there\" or is it all just marketing hype? In this talk, I will address these questions, describing what the Grid is, what problems it solves, and what technology has been developed to build Grid infrastructure and create Grid applications. I will review the current status of Grid infrastructure and deployment and give examples of where Grid technology is being used not only to perform current tasks better, but to provide fundamentally new capabilities that are not possible otherwise.","PeriodicalId":20515,"journal":{"name":"Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing","volume":"1 1","pages":"3"},"PeriodicalIF":0.0,"publicationDate":"2005-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72863701","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3