IEEE International Symposium on High-Performance Parallel Distributed Computing: Latest Publications

Exclusive squashing for thread-level speculation
IEEE International Symposium on High-Performance Parallel Distributed Computing Pub Date: 2011-06-08 DOI: 10.1145/1996130.1996172
Álvaro García-Yágüez, D. Ferraris, Arturo González-Escribano
Abstract: Speculative parallelization is a runtime technique that optimistically executes sequential code in parallel, checking that no dependence violations appear. In this paper, we address the problem of minimizing the number of threads that should be restarted when a data dependence violation is found. We present a new mechanism that keeps track of inter-thread dependencies in order to selectively stop and restart offending threads, and all threads that have consumed data from them. Results show a reduction of 38.5% to 81.8% in the number of restarted threads for real application loops and up to a 10% speedup, depending on the amount of local computation.
Citations: 3
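The abstract above describes restarting the offending thread together with every thread that consumed data from it. A minimal Python sketch of that selection step follows, assuming the runtime records, for each thread, which threads read values it forwarded. The `consumers` dictionary and the `threads_to_squash` name are illustrative assumptions, not the paper's data structures or API.

```python
from collections import deque


def threads_to_squash(consumers, offender):
    """Return the offender plus every thread that transitively consumed its data.

    consumers: dict mapping a thread id to the set of thread ids that read
    values it produced (a hypothetical representation of the tracked
    inter-thread dependencies).
    """
    to_restart = {offender}
    queue = deque([offender])
    while queue:
        producer = queue.popleft()
        # Any thread that read data from a squashed thread holds stale values.
        for consumer in consumers.get(producer, ()):
            if consumer not in to_restart:
                to_restart.add(consumer)
                queue.append(consumer)
    return to_restart


# Thread 2 forwarded data to 3, and 3 forwarded data to 5; thread 4 is independent.
consumers = {2: {3}, 3: {5}, 4: set()}
squashed = threads_to_squash(consumers, offender=2)
```

Under this bookkeeping, a violation in thread 2 selects threads 2, 3, and 5 for restart and leaves thread 4 running, whereas a scheme that restarts every later thread would stop thread 4 as well.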
Interactivity on the grid: limitations and opportunities
IEEE International Symposium on High-Performance Parallel Distributed Computing Pub Date: 2011-06-08 DOI: 10.1145/1996130.1996171
M. Meoni
Abstract: Grid computing has the intrinsic disadvantage of a batch system: the undetermined delay between the time a job is submitted and the time it is completed. Our approach to this problem is iGrid, a framework that proposes to change the Grid from a batch system to a more interactive distributed platform. Agents are deployed on nodes and wait to be contacted by an interactive job to start the computation, thus bypassing the regular Grid scheduler. A CPU fairshare mechanism allows users to get their fair iGrid machine share over a long period. Our prototype implementation scales to all 400 Grid nodes we were granted, executing on top of software for data-intensive parallel computing.
Citations: 0
Energy-efficient E-puting everywhere
IEEE International Symposium on High-Performance Parallel Distributed Computing Pub Date: 2011-06-08 DOI: 10.1145/1996130.1996146
Wu-chun Feng
Abstract: Throughout the 1990s and much of the 2000s, the halls of high-performance computing (HPC) echoed with sentiments like the following: "In HPC, no one cares about energy efficiency or power consumption, and no one ever will." While such extreme talk has subsided, computational performance (or speed) via parallelism still rules the roost. Conversely, one could argue that the consumer electronics space has taken a complementary approach, where energy efficiency and power consumption have been first-order design constraints, with speed only needing to be "good enough" for ordinary daily tasks. However, the increasing computational demands that end users will place on (consumer) electronics, such as computations for personalized medicine, point to the need for "supercomputing in small spaces" (http://sss.cs.vt.edu/). This trend, in turn, will elevate performance to be a first-order design constraint in consumer electronics, on par with energy efficiency and power consumption.
This talk will discuss how a "trickle-up" approach will deliver supercomputing in small spaces via an increasingly converged world of energy-efficient (consumer) electronics and computing, or e-puting.
Citations: 0
Toward high performance computing in unconventional computing environments
IEEE International Symposium on High-Performance Parallel Distributed Computing Pub Date: 2010-06-21 DOI: 10.1145/1851476.1851569
Brent Rood, N. Gnanasambandam, M. Lewis, Naveen Sharma
Abstract: Parallel computing on volatile distributed resources requires schedulers that consider job and resource characteristics. We study unconventional computing environments containing devices spread throughout a single large organization. The devices are not necessarily typical general purpose machines; instead, they could be processors dedicated to special purpose tasks (for example printing and document processing), but capable of being leveraged for distributed computations. Harvesting their idle cycles can simultaneously help resources cooperate to perform their primary task and enable additional functionality and services. A new burstiness metric characterizes the volatility of the high-priority native tasks. A burstiness-aware scheduling heuristic opportunistically introduces grid jobs (a lower priority workload class) to avoid the higher-priority native applications, and effectively harvests idle cycles.
Simulations based on real workload traces indicate that this approach improves makespan by an average of 18.3% over random scheduling, and comes within 7.6% of the theoretical upper bound.
Citations: 2
GPU-based parallel householder bidiagonalization
IEEE International Symposium on High-Performance Parallel Distributed Computing Pub Date: 2010-06-21 DOI: 10.1145/1851476.1851512
Fangbing Liu, F. Seinstra
Abstract: In this paper, we discuss the GPU-based implementation and optimization of Householder bidiagonalization, a matrix factorization method which is an integral part of full Singular Value Decomposition (SVD), an important algorithm for many problems in the research domain of Multimedia Content Analysis (MMCA). On cluster computers, complex adaptive run-time techniques often must be implemented to overcome the growing negative performance impact of load imbalances and to ensure reasonable speedup. We show that the nature of the many-core platform can avoid the necessity of applying such complex run-time parallelization techniques in software while achieving a performance of 64 gigaflops/s on a single-GPU GTX 295 in double precision, 82% of the theoretical peak performance.
Citations: 4
Cluster-wide context switch of virtualized jobs
IEEE International Symposium on High-Performance Parallel Distributed Computing Pub Date: 2010-06-21 DOI: 10.1145/1851476.1851574
Fabien Hermenier, A. Lèbre, Jean-Marc Menaud
Abstract: Clusters are mostly used through Resources Management Systems (RMS) with a static allocation of resources for a bounded amount of time. Those approaches are known to be insufficient for an efficient use of clusters. To provide a finer RMS, job preemption, migration and dynamic allocation of resources are required. However, due to the complexity of developing and using such mechanisms, advanced scheduling strategies have rarely been deployed. This trend is currently evolving thanks to the use of migration and preemption capabilities of Virtual Machines (VMs). However, although manipulating jobs composed of VMs makes it possible to change the state of the jobs according to the scheduling objective, changing the state and the location of numerous VMs at each decision is tedious and degrades the overall performance. In addition to the scheduling policy implementation, developers have to focus on the feasibility of the actions while executing them in the most efficient way.
In this paper, we argue such an operation is independent from the policy itself and can be addressed through a generic mechanism, the cluster-wide context switch. Thanks to it, developers can implement sophisticated algorithms to schedule jobs without handling the issues related to their manipulations. They only focus on the implementation of their algorithm to select the jobs to run while the cluster-wide context switch system performs the necessary actions to switch from the current to the new situation.
As a proof of concept, we evaluate the interest of the cluster-wide context switch through a sample scheduler that executes jobs as early as possible, even partially, according to their current resource requirements and their priority.
Citations: 36
GatorShare: a file system framework for high-throughput data management
IEEE International Symposium on High-Performance Parallel Distributed Computing Pub Date: 2010-06-21 DOI: 10.1145/1851476.1851588
Jiangyan Xu, R. Figueiredo
Abstract: Voluntary Computing systems or Desktop Grids (DGs) enable sharing of commodity computing resources across the globe and have gained tremendous popularity among scientific research communities. Data management is one of the major challenges of adopting the Voluntary Computing paradigm for large data-intensive applications. To date, middleware for supporting such applications either lacks an efficient cooperative data distribution scheme or cannot easily accommodate existing data-intensive applications due to the requirement for using middleware-specific APIs.
To address this challenge, in this paper we introduce GatorShare, a data management framework that offers a file system interface and an extensible architecture designed to support multiple data transfer protocols, including BitTorrent, based on which we implement a cooperative data distribution service for DGs. It eases the integration with Desktop Grids and enables high-throughput data management for unmodified data-intensive applications. To improve the performance of BitTorrent in Desktop Grids, we have enhanced BitTorrent by making it fully decentralized and capable of supporting partial file downloading in an on-demand fashion.
To justify this approach we present a quantitative evaluation of the framework in terms of data distribution efficiency.
Experimental results show that the framework significantly improves the data dissemination performance for unmodified data-intensive applications compared to a traditional client/server architecture.
Citations: 10
DistriBit: a distributed dynamic binary translator system for thin client computing
IEEE International Symposium on High-Performance Parallel Distributed Computing Pub Date: 2010-06-21 DOI: 10.1145/1851476.1851577
Haibing Guan, Yindong Yang, Kai Chen, Y. Ge, Liang Liu, Ying Chen
Abstract: Although dynamic binary translators (DBT) are gaining popularity in modern virtual execution environments (VEE), the processing and memory resources that DBTs require have seriously hampered the performance of the host platform. In this paper, we propose a distributed DBT system, DistriBit, for resource-limited thin clients to overcome these challenges.
Since a thin client always has a small memory and cannot cache all translated code, we divide its unified cache into a 2-level cache and design a dual locality cache management scheme to help the thin client manage its translated code. Meanwhile, to improve the execution efficiency of the thin client and reduce the overhead of network transmission, we not only optimize translated code on the server but also use a prediction scheme to predict the code the thin client will require.
Experimental results show that our DistriBit system could effectively improve a thin client's performance on SPECint2000 by 2%~26% relative to a monolithic thin client, and that our dual locality cache management scheme reduces misses by about 1.41%~20.6% for a thin client with a 2-level cache over a thin client with a unified cache.
Citations: 6
Towards optimising distributed data streaming graphs using parallel streams
IEEE International Symposium on High-Performance Parallel Distributed Computing Pub Date: 2010-06-21 DOI: 10.1145/1851476.1851583
C. Liew, M. Atkinson, Jano van Hemert, Liangxiu Han
Abstract: Modern scientific collaborations have opened up the opportunity of solving complex problems that involve multi-disciplinary expertise and large-scale computational experiments. These experiments usually involve large amounts of data that are located in distributed data repositories running various software systems, and managed by different organisations. A common strategy to make the experiments more manageable is executing the processing steps as a workflow. In this paper, we look into the implementation of fine-grained data-flow between computational elements in a scientific workflow as streams. We model the distributed computation as a directed acyclic graph where the nodes represent the processing elements that incrementally implement specific subtasks. The processing elements are connected in a pipelined streaming manner, which allows task executions to overlap. We further optimise the execution by splitting pipelines across processes and by introducing extra parallel streams. We identify performance metrics and design a measurement tool to evaluate each enactment. We conducted experiments to evaluate our optimisation strategies with a real world problem in the Life Sciences: EURExpress-II. The paper presents our distributed data-handling model, the optimisation and instrumentation strategies and the evaluation experiments.
We demonstrate linear speed up and argue that this use of data-streaming to enable both overlapped pipeline and parallelised enactment is a generally applicable optimisation strategy.
Citations: 57
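The abstract above describes processing elements connected in a pipelined streaming manner, so that each element starts work as soon as the first item reaches it. A minimal Python sketch of that idea using generators follows. The stage names (`source`, `scale`, `offset`) are hypothetical, and within a single process generators only interleave the stages; the enactment described in the abstract splits pipelines across processes, which this sketch does not attempt.

```python
def source(items):
    """Emit input items one at a time (the first node of the graph)."""
    for item in items:
        yield item


def scale(stream, factor):
    """Processing element: multiply each incoming value as it arrives."""
    for value in stream:
        yield value * factor


def offset(stream, delta):
    """Processing element: add a constant to each incoming value."""
    for value in stream:
        yield value + delta


# Chain the elements; no stage waits for the previous one to finish the
# whole data set, because each value flows through as soon as it is produced.
pipeline = offset(scale(source(range(5)), factor=10), delta=1)
results = list(pipeline)
```

Each value passes through `scale` and `offset` before `source` produces the next one, which is the incremental behaviour the abstract attributes to streamed data-flow.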
Comparison of resource platform selection approaches for scientific workflows
IEEE International Symposium on High-Performance Parallel Distributed Computing Pub Date: 2010-06-21 DOI: 10.1145/1851476.1851541
Yogesh L. Simmhan, L. Ramakrishnan
Abstract: Cloud computing is increasingly considered as an additional computational resource platform for scientific workflows. The cloud offers the opportunity to scale out applications from desktops and local cluster resources. Each platform has different properties (e.g., queue wait times in high performance systems, virtual machine startup overhead in clouds) and characteristics (e.g., custom environments in cloud) that make choosing from these diverse resource platforms for a workflow execution a challenge for scientists. Scientists are often faced with deciding resource platform selection trade-offs with limited information on the actual workflows. While many workflow planning methods have explored resource selection or task scheduling, these methods often require fine-scale characterization of the workflow that is onerous for a scientist. In this paper, we describe our early exploratory work in using blackbox characteristics for a cost-benefit analysis of using different resource platforms. In our blackbox method, we use only limited high-level information on the workflow length, width, and data sizes. The length and width are indicative of the workflow duration and parallelism. We compare the effectiveness of this approach to other resource selection models using two exemplar scientific workflows on desktop, local cluster, HPC center, and cloud platforms. Early results suggest that the blackbox model often makes the same resource selections as a more fine-grained whitebox model.
We believe the simplicity of the blackbox model can help inform a scientist on the applicability of a new resource platform, such as cloud resources, even before porting an existing workflow.
Citations: 19