2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum: Latest Publications

Systematic Reduction of Data Movement in Algebraic Multigrid Solvers
Hormozd Gahvari, W. Gropp, K. E. Jordan, M. Schulz, U. Yang
{"title":"Systematic Reduction of Data Movement in Algebraic Multigrid Solvers","authors":"Hormozd Gahvari, W. Gropp, K. E. Jordan, M. Schulz, U. Yang","doi":"10.1109/IPDPSW.2013.164","DOIUrl":"https://doi.org/10.1109/IPDPSW.2013.164","url":null,"abstract":"Algebraic Multigrid (AMG) solvers find wide use in scientific simulation codes. Their ideal computational complexity makes them especially attractive for solving large problems on parallel machines. However, they also involve a substantial amount of data movement, posing challenges to performance and scalability. In this paper, we present an algorithm that provides a systematic means of reducing data movement in AMG. The algorithm operates by gathering and redistributing the problem data to reduce the need to move it on the communication-intensive coarse grid portion of AMG. The data is gathered in a way that ensures data locality by keeping data movement confined to specific regions of the machine. Any decision to gather data is made systematically through the means of a performance model. This approach results in substantial speedups on a multicore cluster when using AMG to solve a variety of test problems.","PeriodicalId":234552,"journal":{"name":"2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133710910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Adaptive Power and Resource Management Techniques for Multi-threaded Workloads
Can Hankendi, A. Coskun
{"title":"Adaptive Power and Resource Management Techniques for Multi-threaded Workloads","authors":"Can Hankendi, A. Coskun","doi":"10.1109/IPDPSW.2013.258","DOIUrl":"https://doi.org/10.1109/IPDPSW.2013.258","url":null,"abstract":"As today's computing trends are moving towards the cloud, meeting the increasing computational demand while minimizing the energy costs in data centers has become essential. This work introduces two adaptive techniques to reduce the energy consumption of the computing clusters through power and resource management on multi-core processors. We first present a novel power capping technique to constrain the power consumption of computing nodes. Our technique combines Dynamic Voltage-Frequency Scaling (DVFS) and thread allocation on multi-core systems. By utilizing machine learning techniques, our power capping method is able to meet the power budgets 82% of the time without requiring any power measurement device and reduces the energy consumption by 51.6% on average in comparison to the state-of-the-art techniques. We then introduce an autonomous resource management technique for consolidated multi-threaded workloads running on multi-core servers. Our technique first classifies applications according to their energy efficiency measure, then proportionally allocates resources for co-scheduled applications to improve the energy efficiency. The proposed technique improves the energy efficiency by 17% in comparison to state-of-the-art co-scheduling policies.","PeriodicalId":234552,"journal":{"name":"2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133915763","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
Design Optimization of Energy- and Security-Critical Distributed Real-Time Embedded Systems
Xia Zhang, Jinyu Zhan, Wei Jiang, Yuexi Ma, Ke Jiang
{"title":"Design Optimization of Energy- and Security-Critical Distributed Real-Time Embedded Systems","authors":"Xia Zhang, Jinyu Zhan, Wei Jiang, Yuexi Ma, Ke Jiang","doi":"10.1109/IPDPSW.2013.24","DOIUrl":"https://doi.org/10.1109/IPDPSW.2013.24","url":null,"abstract":"In this paper, we approach the design of energy-and security-critical distributed real-time embedded systems from the early mapping and scheduling phases. Modern Distributed Embedded Systems (DESs) are common to be connected to external networks, which is beneficial for various purposes, but also opens up the gate for potential security attacks. However, security protections in DESs result in significant time and energy overhead. In this work, we focus on the problem of providing the best confidentiality protection of internal communication in DESs under time and energy constraints. The complexity of finding the optimal solution grows exponentially as problem size grows. Therefore, we propose an efficient genetic algorithm based heuristic for solving the problem. Extensive experiments demonstrate the efficiency of the proposed technique.","PeriodicalId":234552,"journal":{"name":"2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134475960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Task Scheduling Greedy Heuristics for GPU Heterogeneous Cluster Involving the Weights of the Processor
Keliang Zhang, Baifeng Wu
{"title":"Task Scheduling Greedy Heuristics for GPU Heterogeneous Cluster Involving the Weights of the Processor","authors":"Keliang Zhang, Baifeng Wu","doi":"10.1109/IPDPSW.2013.38","DOIUrl":"https://doi.org/10.1109/IPDPSW.2013.38","url":null,"abstract":"Modern GPUs are gradually used by more and more cluster computing systems as the high performance computing units due to their outstanding computational power, whereas bringing system-level (among different nodes) architectural heterogeneity to cluster. In this paper, based on MPI and CUDA programming model, we aim to investigate task scheduling for GPU heterogeneous cluster by taking into account the system-level heterogeneous characteristics and also involving the weights of the processor (both CPUs and GPUs). At first, based on our GPU heterogeneous cluster, we classify executing tasks to six major classifications according to their parallelism degrees, input data sizes, and processing workloads. Then, aiming to realize the approximately optimal mapping between tasks and computing resources, a task scheduling strategy is presented. In this paper, we present the WSLSA greedy heuristic which can involve the weights of the processor. Besides, we also define two measurement factors for the task assignments. One is the maximum value of total workloads for all task assignments to consider the maximum workloads for the GPU heterogeneity cluster. The other is the distribution of task assignments which can determine the load balance of the task assignments for the GPU heterogeneity cluster.","PeriodicalId":234552,"journal":{"name":"2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133863491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Comparing Provisioning and Scheduling Strategies for Workflows on Clouds
M. Frîncu, S. Genaud, J. Gossa
{"title":"Comparing Provisioning and Scheduling Strategies for Workflows on Clouds","authors":"M. Frîncu, S. Genaud, J. Gossa","doi":"10.1109/IPDPSW.2013.55","DOIUrl":"https://doi.org/10.1109/IPDPSW.2013.55","url":null,"abstract":"Cloud computing is emerging as a leading solution for deploying on-demand applications in both the industry and the scientific community. An important problem which needs to be considered is that of scheduling tasks on existing resources. Since clouds are linked to grid systems much of the work done on the latter can be ported with some modifications due to specific aspects that concern clouds, e.g., virtualization, scalability and on-demand provisioning. Two types of applications are usually considered for cloud migration: bag-of-tasks and workflows. This paper deals with the second case and investigates the impact virtual machine provisioning policies have on the scheduling strategy when various workflow types and execution times are used. Five provisioning methods are proposed and tested on well known workflow scheduling algorithms such as CPA, Gain and HEFT. We show that some correlation between the application characteristics and provisioning method exists. This result paves the way for adaptive scheduling in which based on the workflow properties a specific provisioning can be applied in order to optimize execution times or costs.","PeriodicalId":234552,"journal":{"name":"2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum","volume":"462 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113989294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 28
Parallel Algorithms for Graph Optimization Using Tree Decompositions
Blair D. Sullivan, Dinesh Weerapurage, Chris Groër
{"title":"Parallel Algorithms for Graph Optimization Using Tree Decompositions","authors":"Blair D. Sullivan, Dinesh Weerapurage, Chris Groër","doi":"10.1109/IPDPSW.2013.242","DOIUrl":"https://doi.org/10.1109/IPDPSW.2013.242","url":null,"abstract":"Although many NP-hard graph optimization problems can be solved in polynomial time on graphs of bounded tree-width, the adoption of these techniques into mainstream scientific computation has been limited due to the high memory requirements of the dynamic programming tables and excessive runtimes of sequential implementations. This work addresses both challenges by proposing a set of new parallel algorithms for all steps of a tree decomposition-based approach to solve the maximum weighted independent set problem. A hybrid OpenMP/MPI implementation includes a highly scalable parallel dynamic programming algorithm leveraging the MADNESS task based runtime, and computational results demonstrate scaling. This work enables a significant expansion of the scale of graphs on which exact solutions to maximum weighted independent set can be obtained, and forms a framework for solving additional graph optimization problems with similar techniques.","PeriodicalId":234552,"journal":{"name":"2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum","volume":"153 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115707593","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
Operation Synchronization Technique on Pipeline-Based Hardware Synthesis Applying Stream-Based Computing Framework
S. Yamagiwa, Ryoyu Watanabe, K. Wada
{"title":"Operation Synchronization Technique on Pipeline-Based Hardware Synthesis Applying Stream-Based Computing Framework","authors":"S. Yamagiwa, Ryoyu Watanabe, K. Wada","doi":"10.1109/IPDPSW.2013.61","DOIUrl":"https://doi.org/10.1109/IPDPSW.2013.61","url":null,"abstract":"Increasing the needs for real-time processing of information flood from environment surrounded us acquired by the advanced sensing technologies, any application acquiring such information requires a processing ability for large input dataflow that must be processed within a restricted short time. In order to achieve the required processing performance, we often consider pipeline-based hardware implementation. The timing for activating operators for input data in the design must be scheduled carefully arranging the timings for I/O among operators. However, when we need to revise the algorithm itself or any parameters, it is very hard to reschedule the activation timings of operators considering the design goals regarding the maximum frequency and the resource size. This paper shows a technique to schedule operation timings synchronizing the input data in a processing pipeline of hardware applying the stream-based computing framework using OpenCL kernel description. This paper especially proposes a new compiler-based approach for synthesizing the pipeline-based hardware applying a novel technique for calculating the timings at the operators' activation called the Pipeline Timing Adjustment (PTA). This paper mainly discusses the algorithm of the PTA and show the effect of the algorithm.","PeriodicalId":234552,"journal":{"name":"2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114106818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Toward Abstracting the Communication Intent in Applications to Improve Portability and Productivity
T. Mintz, Oscar R. Hernandez, Christos Kartsaklis, D. Bernholdt, M. Eisenbach, S. Pophale
{"title":"Toward Abstracting the Communication Intent in Applications to Improve Portability and Productivity","authors":"T. Mintz, Oscar R. Hernandez, Christos Kartsaklis, D. Bernholdt, M. Eisenbach, S. Pophale","doi":"10.1109/IPDPSW.2013.66","DOIUrl":"https://doi.org/10.1109/IPDPSW.2013.66","url":null,"abstract":"Programming with communication libraries such as the Message Passing Interface (MPI) obscures the high-level intent of the communication in an application and makes static communication analysis difficult to do. Compilers are unaware of communication libraries' specifics, leading to the exclusion of communication patterns from any automated analysis and optimizations. To overcome this, communication patterns can be expressed at higher-levels of abstraction and incrementally added to existing MPI applications. In this paper, we propose the use of directives to clearly express the communication intent of an application in a way that is not specific to a given communication library. Our communication directives allow programmers to express communication among processes in a portable way, giving hints to the compiler on regions of computations that can be overlapped with communication and relaxing communication constraints on the ordering, completion and synchronization of the communication imposed by specific libraries such as MPI. The directives can then be translated by the compiler into message passing calls that efficiently implement the intended pattern and be targeted to multiple communication libraries. Thus far, we have used the directives to express point-to-point communication patterns in C, C++ and Fortran applications, and have translated them to MPI and SHMEM.","PeriodicalId":234552,"journal":{"name":"2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum","volume":"1 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114218068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Scheduling Tightly-Coupled Applications on Heterogeneous Desktop Grids
H. Casanova, F. Dufossé, Y. Robert, F. Vivien
{"title":"Scheduling Tightly-Coupled Applications on Heterogeneous Desktop Grids","authors":"H. Casanova, F. Dufossé, Y. Robert, F. Vivien","doi":"10.1109/IPDPSW.2013.10","DOIUrl":"https://doi.org/10.1109/IPDPSW.2013.10","url":null,"abstract":"Platforms that comprise volatile processors, such as desktop grids, have been traditionally used for executing independent-task applications. In this work we study the scheduling of tightly-coupled iterative master-worker applications onto volatile processors. The main challenge is that workers must be simultaneously available for the application to make progress. We consider three additional complications: (i) one should take into account that workers can become temporarily reclaimed and, for data-intensive applications; (ii) one should account for the limited bandwidth between the master and the workers; and (iii) workers are strongly heterogeneous, with different computing speeds and availability probability distributions. In this context, our first contribution is a theoretical study of the scheduling problem in its off-line version, i.e., when processor availability is known in advance. Even in this case the problem is NP-hard. Our second contribution is an analytical approximation of the expectation of the time needed by a set of workers to complete a set of tasks and of the probability of success of this computation. This approximation relies on a Markovian assumption for the temporal availability of processors. Our third contribution is a set of heuristics, some of which use the above approximation to favor reliable processors in a sensible manner. We evaluate these heuristics in simulation. We identify some heuristics that significantly outperform their competitors and derive heuristic design guidelines.","PeriodicalId":234552,"journal":{"name":"2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114431941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
EDA and ML -- A Perfect Pair for Large-Scale Data Analysis
R. Hafen, T. Critchlow
{"title":"EDA and ML -- A Perfect Pair for Large-Scale Data Analysis","authors":"R. Hafen, T. Critchlow","doi":"10.1109/IPDPSW.2013.118","DOIUrl":"https://doi.org/10.1109/IPDPSW.2013.118","url":null,"abstract":"In this position paper, we discuss how Exploratory Data Analysis (EDA) and Machine Learning (ML) can work together in large-scale data analysis environments. In particular, we describe how applying EDA techniques and ML methods in a complementary fashion can be used to address some of the challenges faced when applying ML techniques to large, real world data sets, and discuss tools that help do the job. This iterative approach is demonstrated with a simple example of how extracting events from a historical sensor data set was enabled by iteratively identifying and filtering various types of erroneous data.","PeriodicalId":234552,"journal":{"name":"2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum","volume":"356 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123429388","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3