Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications: Latest Articles

A Performance Optimization Framework for the Simultaneous Heterogeneous Computing Platforms
S. Li
{"title":"A Performance Optimization Framework for the Simultaneous Heterogeneous Computing Platforms","authors":"S. Li","doi":"10.1145/2916026.2916029","DOIUrl":"https://doi.org/10.1145/2916026.2916029","url":null,"abstract":"Heterogeneous computing platforms with multicore host systems and many-core accelerator devices have taken a major step forward in the mainstream HPC computing market this year with the announcement of the HP Apollo 6000 System's ProLiant XL250a server, which features Intel® Xeon Phi™ coprocessors. Although many application developers attempt to use them in the same way as GPGPU acceleration platforms, doing so forfeits the processing capability of the multicore host processors and introduces power inefficiency in business operations. In this paper, we propose an application optimization framework to turn sequential legacy applications into highly parallel applications that make use of the hardware resources both on the host CPU and on the accelerator devices to enable simultaneous heterogeneous computing. As a case study, we look at how to apply this framework and adopt a structured methodology to develop option pricing applications that take advantage of a heterogeneous computing environment.","PeriodicalId":409042,"journal":{"name":"Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115361766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications
Atul Kumar, S. Sarkar, M. Gerndt
{"title":"Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications","authors":"Atul Kumar, S. Sarkar, M. Gerndt","doi":"10.1145/2916026","DOIUrl":"https://doi.org/10.1145/2916026","url":null,"abstract":"It is our great pleasure to welcome you to the Workshop on Software Engineering Methods for Parallel and High Performance Applications - SEM4HPC 2016.\n\nThe workshop aims to discuss parallel computing beyond traditional scientific computing and its use in developing enterprise and industrial applications. Compared to the traditional sequential computing paradigm, the software development, analysis and migration tools for parallel and high performance applications are far less mature, which holds the IT industry back from shifting towards the new computing paradigm. The mission of this workshop is to bring together global industry and academic experts in this area to identify the research challenges that exist in software engineering methods for parallel and high performance application development, maintenance and migration. The workshop also aims to bring out the current state of the art and practice of the software engineering methods through case studies, novel research ideas, and keynote and invited talks.\n\nThe call for papers attracted submissions from Germany, India, Spain, and the United States. We received eleven full technical papers, out of which five were selected, for an acceptance ratio of 45%.\n\nWe also encourage attendees to attend the keynote and invited talk presentations. These valuable and insightful talks can and will guide us to a better understanding of challenges in this area:\nKeynote: Challenges in Transition, Kazuaki Ishizaki (IBM Research -- Tokyo, Japan)\nInvited Talk: The READEX project for Dynamic Energy Efficiency Tuning, Michael Gerndt (Technical University of Munich, Germany)\nInvited Talk: Developer Productivity in HPC Application Development: An Overview of Recent Techniques, Santonu Sarkar (BITS Pilani -- Goa Campus, India)","PeriodicalId":409042,"journal":{"name":"Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126495283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
LUT Optimization In Implementation Of Combinational Karatsuba Ofman On Virtex-6 FPGA
D. Kapoor, Rahul Yamasani, S. Saurav, Abhishek Bajpai
{"title":"LUT Optimization In Implementation Of Combinational Karatsuba Ofman On Virtex-6 FPGA","authors":"D. Kapoor, Rahul Yamasani, S. Saurav, Abhishek Bajpai","doi":"10.1145/2916026.2916030","DOIUrl":"https://doi.org/10.1145/2916026.2916030","url":null,"abstract":"This paper discusses different approaches that allow optimizing the combinational logic used in multipliers for generic ECC (Elliptic Curve Cryptography) implementation in the Galois field GF(2^n). First, a combinational multiplier using Karatsuba Ofman logic with 2*2 as a base multiplier has been studied. Proper utilization of Look Up Tables (LUTs) at the base level results in effective optimization of the hardware resources. Hence, in order to optimize LUT utilization, designs for combinational logic with 3*3 base and 2*3 base have been explored, keeping the LUT structure of the Virtex-6 FPGA in mind. Comparisons have shown that 3*3 base multipliers designed using the Karatsuba Ofman algorithm outperform 2*2 and 2*3 base multipliers in terms of resource utilization. To further maximize utilization of hardware resources, the exploration has been carried further using the Shift and Add Algorithm (SAA), and it has been found that SAA remains optimized for lower length operands. Algorithmic and platform oriented optimization results in efficient hardware implementations. The final proposed design is a Hybrid Karatsuba Algorithm, which uses SAA at the lower level and Karatsuba Ofman logic at the higher level. Here again, using a 3*3 bit multiplier with the SAA configuration is better than the other two. This approach is a step closer to efficient implementations of fast algorithms in hardware based applications, as this hybrid multiplier is found to use the least number of FPGA resources. All the operations in this paper have been performed on the Virtex-6 ML605 using XILINX 12.1 as the ESD tool.","PeriodicalId":409042,"journal":{"name":"Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128763650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
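The hybrid scheme outlined in the abstract above (Karatsuba Ofman splitting at the higher level, shift-and-add at the lower level) can be illustrated in software. The following is only a minimal sketch, not the authors' hardware design: it assumes polynomials over GF(2) are packed into Python integers (bit i holds the coefficient of x^i), it omits the reduction modulo an irreducible polynomial that a full GF(2^n) field multiplier needs, and the function names and the `base_bits` parameter are hypothetical. Because operands are halved recursively, the leaf multipliers are at most `base_bits` wide rather than exactly 3*3.

```python
def shift_and_add_mul(a: int, b: int) -> int:
    """Carry-less (GF(2)) polynomial product using shift and add (XOR)."""
    result = 0
    while b:
        if b & 1:
            result ^= a  # addition in GF(2) is XOR
        a <<= 1
        b >>= 1
    return result


def hybrid_karatsuba_mul(a: int, b: int, n_bits: int, base_bits: int = 3) -> int:
    """Karatsuba Ofman split above base_bits, shift-and-add at or below it.

    a and b must each fit in n_bits; base_bits must be at least 1.
    """
    if n_bits <= base_bits:
        return shift_and_add_mul(a, b)
    half = (n_bits + 1) // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    lo = hybrid_karatsuba_mul(a_lo, b_lo, half, base_bits)
    hi = hybrid_karatsuba_mul(a_hi, b_hi, n_bits - half, base_bits)
    # Three sub-products instead of four: the cross term is recovered from
    # (a_lo + a_hi)(b_lo + b_hi) + lo + hi, where + is XOR in GF(2).
    mid = hybrid_karatsuba_mul(a_lo ^ a_hi, b_lo ^ b_hi, half, base_bits)
    mid ^= lo ^ hi
    return lo ^ (mid << half) ^ (hi << (2 * half))
```

For example, `hybrid_karatsuba_mul(0b1011, 0b0110, 4)` multiplies x^3 + x + 1 by x^2 + x and agrees with the plain shift-and-add product.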
Session details: Afternoon Session 1
S. Sarkar
{"title":"Session details: Afternoon Session 1","authors":"S. Sarkar","doi":"10.1145/3248634","DOIUrl":"https://doi.org/10.1145/3248634","url":null,"abstract":"","PeriodicalId":409042,"journal":{"name":"Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115487903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
The READEX Project for Dynamic Energy Efficiency Tuning
M. Gerndt
{"title":"The READEX Project for Dynamic Energy Efficiency Tuning","authors":"M. Gerndt","doi":"10.1145/2916026.2916033","DOIUrl":"https://doi.org/10.1145/2916026.2916033","url":null,"abstract":"High Performance Computing (HPC) systems consume a lot of energy. The overall energy consumption is one of the biggest challenges on the way towards exascale computers. Therefore, energy reduction techniques have to be applied on all levels from the basic chip technology up to the data center infrastructure. The READEX project explores the potential of dynamically switching application and system parameters, such as the clock frequency of the cores, to reduce the overall energy consumption of applications. An analysis is performed during application design time to precompute a tuning model that is then input to the runtime tuning library. This library switches the application and system configuration at runtime to adapt to varying application characteristics.","PeriodicalId":409042,"journal":{"name":"Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115898897","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Session details: Afternoon Session 2
M. Gerndt
{"title":"Session details: Afternoon Session 2","authors":"M. Gerndt","doi":"10.1145/3248635","DOIUrl":"https://doi.org/10.1145/3248635","url":null,"abstract":"","PeriodicalId":409042,"journal":{"name":"Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114281546","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Adaptive GPU Array Layout Auto-Tuning
Nicolas Weber, M. Goesele
{"title":"Adaptive GPU Array Layout Auto-Tuning","authors":"Nicolas Weber, M. Goesele","doi":"10.1145/2916026.2916031","DOIUrl":"https://doi.org/10.1145/2916026.2916031","url":null,"abstract":"Optimal performance is an important goal in compute intensive applications. For GPU applications, this requires a lot of experience and knowledge about the algorithms and the underlying hardware, making them an ideal target for auto-tuning approaches. We present an auto-tuner which optimizes array layouts in CUDA applications. Depending on the data and program parameters, kernels can have varying optimal configurations. We thus adjust array layouts adaptively at runtime and achieve or even exceed performance of hand optimized code. We automatically detect data characteristics to identify different performance scenarios without user input or additional programming. We perform an empirical analysis of the application in order to construct our decision models. Our adaptive optimization requires in principle profiling data for an extremely high number of scenarios which cannot be exhaustively evaluated for complex applications. We solve this by extending a previously published method that is able to efficiently profile single kernel calls and enhance it to find application-wide optimal solutions. Our method is able to optimize applications in a few minutes, reaching speed ups of up to 20% compared to hand optimized code.","PeriodicalId":409042,"journal":{"name":"Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115120237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
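To make concrete what an "array layout" choice means in the abstract above, the sketch below compares two common layouts, array-of-structures (AoS) and structure-of-arrays (SoA). It is an illustration under stated assumptions, not the authors' auto-tuner: the abstract does not name the layouts it tunes, the sizes and function names here are hypothetical, and plain Python stands in for CUDA index arithmetic. It shows the flat memory offsets touched when consecutive threads read the same field of consecutive elements.

```python
NUM_ELEMENTS = 8  # hypothetical number of records in the array
NUM_FIELDS = 3    # hypothetical number of fields per record (e.g. x, y, z)


def aos_offset(element: int, field: int) -> int:
    """Array-of-structures: all fields of one record are stored together."""
    return element * NUM_FIELDS + field


def soa_offset(element: int, field: int) -> int:
    """Structure-of-arrays: each field is stored in its own contiguous array."""
    return field * NUM_ELEMENTS + element


def offsets_for_field(layout, field: int, threads: int = 4) -> list:
    """Offsets read when thread t accesses `field` of element t."""
    return [layout(t, field) for t in range(threads)]


if __name__ == "__main__":
    # AoS gives strided accesses; SoA gives consecutive accesses.
    print("AoS:", offsets_for_field(aos_offset, 0))
    print("SoA:", offsets_for_field(soa_offset, 0))
```

Which layout performs better depends on the access pattern of each kernel, which is consistent with the abstract's point that the best configuration varies with data and program parameters.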
Implementing an Efficient Path Based Equivalence Checker for Parallel Programs
S. Bandyopadhyay, K. Banerjee
{"title":"Implementing an Efficient Path Based Equivalence Checker for Parallel Programs","authors":"S. Bandyopadhyay, K. Banerjee","doi":"10.1145/2916026.2916027","DOIUrl":"https://doi.org/10.1145/2916026.2916027","url":null,"abstract":"User written programs, when transformed by optimizing and parallelizing compilers, can be incorrect, if the compiler is not trusted. So, establishing the validity of these transformations is a crucial and challenging task. For program verification, the PRES+ (Petri net Representation of Embedded Systems) is now well accepted as a model to capture the data and control flow of a program. In this paper, an efficient path based equivalence checking method using a simple PRES+ model (which is easier to generate from a program) for validating several optimizing and parallelizing transformations is proposed. The experimental results demonstrate the efficiency of the method.","PeriodicalId":409042,"journal":{"name":"Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132151370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Session details: Morning Session
Atul Kumar
{"title":"Session details: Morning Session","authors":"Atul Kumar","doi":"10.1145/3248633","DOIUrl":"https://doi.org/10.1145/3248633","url":null,"abstract":"","PeriodicalId":409042,"journal":{"name":"Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125591553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Developer Productivity in HPC Application Development: An Overview of Recent Techniques
S. Sarkar
{"title":"Developer Productivity in HPC Application Development: An Overview of Recent Techniques","authors":"S. Sarkar","doi":"10.1145/2916026.2916034","DOIUrl":"https://doi.org/10.1145/2916026.2916034","url":null,"abstract":"Increasing computing power with evolving hardware architectures has led to a change in the programming paradigm from serial to parallel. Unlike its sequential counterpart, application building for High Performance Computing (HPC) is extremely challenging for developers. In order to improve programmer productivity, it is necessary to address challenges such as: i) How to abstract the hardware and low level complexities to make programming easier? ii) What features should a design assistance tool have to simplify application development? iii) How should the programming languages be enhanced for HPC? iv) What sort of prediction techniques can be developed to assist programmers to predict potential speedup? v) Can refactoring techniques solve the issue of parallelizing existing serial code? In this talk we attempt to present a landscape of the existing approaches to assist the software building process in HPC from a developer's point of view, and highlight some important research questions. We also discuss the state of practice in the industry and some of the application specific tools developed for HPC.","PeriodicalId":409042,"journal":{"name":"Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132081203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0