Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications: Latest Publications

A Performance Optimization Framework for the Simultaneous Heterogeneous Computing Platforms

Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications Pub Date : 2016-05-31 DOI: 10.1145/2916026.2916029

S. Li

Citations: 0

Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications

Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications Pub Date : 2016-05-31 DOI: 10.1145/2916026

Atul Kumar, S. Sarkar, M. Gerndt

Abstract: It is our great pleasure to welcome you to the Workshop on Software Engineering Methods for Parallel and High Performance Applications (SEM4HPC 2016).

The workshop aims to discuss parallel computing beyond traditional scientific computing and its use in developing enterprise and industrial applications. Compared to the traditional sequential computing paradigm, the software development, analysis, and migration tools for parallel and high performance applications are still too immature for the IT industry to shift towards the new computing paradigm. The mission of this workshop is to bring together global industry and academic experts in this area to identify the research challenges that exist in software engineering methods for parallel and high performance application development, maintenance, and migration. The workshop also aims to bring out the current state of the art and practice of these software engineering methods through case studies, novel research ideas, and keynote and invited talks.

The call for papers attracted submissions from Germany, India, Spain, and the United States. We received eleven full technical papers, of which five were selected, an acceptance ratio of 45%.

We also encourage attendees to attend the keynote and invited talk presentations. These valuable and insightful talks can and will guide us to a better understanding of the challenges in this area:
Keynote: Challenges in Transition, Kazuaki Ishizaki (IBM Research - Tokyo, Japan)
Invited Talk: The READEX project for Dynamic Energy Efficiency Tuning, Michael Gerndt (Technical University of Munich, Germany)
Invited Talk: Developer Productivity in HPC Application Development: An Overview of Recent Techniques, Santonu Sarkar (BITS Pilani - Goa Campus, India)

Citations: 0

LUT Optimization In Implementation Of Combinational Karatsuba Ofman On Virtex-6 FPGA

Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications Pub Date : 2016-05-31 DOI: 10.1145/2916026.2916030

D. Kapoor, Rahul Yamasani, S. Saurav, Abhishek Bajpai

Abstract: This paper discusses approaches to optimizing the combinational logic used in multipliers for generic ECC (Elliptic Curve Cryptography) implementations over the Galois field GF(2^n). First, a combinational multiplier using Karatsuba-Ofman logic with a 2*2 base multiplier is studied. Proper utilization of the Look-Up Tables (LUTs) at the base level results in effective optimization of the hardware resources. Hence, to optimize LUT utilization, combinational designs with 3*3 and 2*3 bases are explored, keeping the LUT structure of the Virtex-6 FPGA in mind. Comparisons show that 3*3 base multipliers designed using the Karatsuba-Ofman algorithm outperform the 2*2 and 2*3 base multipliers in terms of resource utilization. To further improve the utilization of hardware resources, the exploration is extended to the Shift and Add Algorithm (SAA), which is found to remain efficient for shorter operands. Algorithmic and platform-oriented optimization results in efficient hardware implementations. The final proposed design is a Hybrid Karatsuba Algorithm, which uses SAA at the lower level and Karatsuba-Ofman logic at the higher level. Here again, a 3*3-bit multiplier in the SAA configuration is better than the other two. This approach is a step towards efficient hardware implementations of fast algorithms, as the hybrid multiplier is found to use the fewest FPGA resources. All operations in this paper were performed on a Virtex-6 ML605 using XILINX 12.1 as the ESD tool.
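The hybrid scheme described in the abstract (Karatsuba-Ofman splitting at the higher levels, shift-and-add below a small base size) can be modelled in software roughly as follows. This is only a minimal sketch, not the authors' hardware design: it assumes operands are Python integers whose bit i is the coefficient of x^i, it computes the plain polynomial product over GF(2) without the final reduction by an irreducible polynomial, and the names `shift_and_add_mul`, `hybrid_karatsuba_mul`, and `base_bits` are hypothetical.

```python
def shift_and_add_mul(a: int, b: int) -> int:
    """Carry-less product over GF(2): addition of partial products is XOR."""
    result = 0
    while b:
        if b & 1:
            result ^= a   # add (XOR) the current shifted copy of a
        a <<= 1
        b >>= 1
    return result


def hybrid_karatsuba_mul(a: int, b: int, n: int, base_bits: int = 3) -> int:
    """Carry-less product of two operands of at most n bits each.

    Operands wider than base_bits (an assumed threshold) are split
    Karatsuba-Ofman style; smaller ones fall back to shift-and-add.
    """
    if n <= base_bits:
        return shift_and_add_mul(a, b)

    half = (n + 1) // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half

    lo = hybrid_karatsuba_mul(a_lo, b_lo, half, base_bits)
    hi = hybrid_karatsuba_mul(a_hi, b_hi, n - half, base_bits)
    # Three multiplications instead of four: over GF(2) the cross term
    # a_lo*b_hi + a_hi*b_lo equals (a_lo+a_hi)*(b_lo+b_hi) + lo + hi.
    mid = hybrid_karatsuba_mul(a_lo ^ a_hi, b_lo ^ b_hi, half, base_bits)

    return lo ^ ((mid ^ lo ^ hi) << half) ^ (hi << (2 * half))


if __name__ == "__main__":
    # (x^3 + x + 1) * (x^2 + x) over GF(2)
    assert hybrid_karatsuba_mul(0b1011, 0b0110, 4) == 0b111010
```

Changing `base_bits` moves the point at which the recursion hands over to shift-and-add, which is the software analogue of comparing different base multipliers; it says nothing about LUT usage on an actual FPGA.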

Citations: 1

Session details: Afternoon Session 1

Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications Pub Date : 2016-05-31 DOI: 10.1145/3248634

S. Sarkar

Citations: 0

The READEX Project for Dynamic Energy Efficiency Tuning

Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications Pub Date : 2016-05-31 DOI: 10.1145/2916026.2916033

M. Gerndt

Citations: 0

Session details: Afternoon Session 2

Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications Pub Date : 2016-05-31 DOI: 10.1145/3248635

M. Gerndt

Citations: 0

Adaptive GPU Array Layout Auto-Tuning

Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications Pub Date : 2016-05-31 DOI: 10.1145/2916026.2916031

Nicolas Weber, M. Goesele

Abstract: Optimal performance is an important goal in compute-intensive applications. For GPU applications, this requires a lot of experience and knowledge about the algorithms and the underlying hardware, making them an ideal target for auto-tuning approaches. We present an auto-tuner which optimizes array layouts in CUDA applications. Depending on the data and program parameters, kernels can have varying optimal configurations. We thus adjust array layouts adaptively at runtime and achieve or even exceed the performance of hand-optimized code. We automatically detect data characteristics to identify different performance scenarios without user input or additional programming. We perform an empirical analysis of the application in order to construct our decision models. Our adaptive optimization in principle requires profiling data for an extremely high number of scenarios, which cannot be exhaustively evaluated for complex applications. We solve this by extending a previously published method that efficiently profiles single kernel calls so that it finds application-wide optimal solutions. Our method is able to optimize applications in a few minutes, reaching speedups of up to 20% compared to hand-optimized code.
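To make the notion of an array layout and its empirical selection concrete, the sketch below contrasts two common layouts, array-of-structures (AoS) and structure-of-arrays (SoA), and picks one by timing a simple access pattern. It is an illustration under stated assumptions, not the authors' tuner: the abstract does not say which layouts are considered, the paper builds decision models rather than timing every candidate, and timings in CPython say little about GPU memory behaviour. All names (`aos_index`, `soa_index`, `build`, `sum_field`, `pick_layout`) are hypothetical.

```python
import time
from typing import Callable, Dict, List

# Maps (element i, field f, element count n, fields per element) to a flat offset.
IndexFn = Callable[[int, int, int, int], int]


def aos_index(i: int, f: int, n: int, num_fields: int) -> int:
    """Array-of-structures: the fields of one element are adjacent."""
    return i * num_fields + f


def soa_index(i: int, f: int, n: int, num_fields: int) -> int:
    """Structure-of-arrays: all values of one field are adjacent."""
    return f * n + i


LAYOUTS: Dict[str, IndexFn] = {"AoS": aos_index, "SoA": soa_index}


def build(index: IndexFn, n: int, num_fields: int) -> List[float]:
    """Fill a flat buffer so that element i, field f holds the value i + f."""
    data = [0.0] * (n * num_fields)
    for i in range(n):
        for f in range(num_fields):
            data[index(i, f, n, num_fields)] = float(i + f)
    return data


def sum_field(data: List[float], index: IndexFn, field: int,
              n: int, num_fields: int) -> float:
    """The 'kernel' being tuned: read one field of every element."""
    return sum(data[index(i, field, n, num_fields)] for i in range(n))


def pick_layout(n: int, num_fields: int, field: int, repeats: int = 3) -> str:
    """Time the kernel under each candidate layout and return the fastest."""
    timings: Dict[str, float] = {}
    for name, index in LAYOUTS.items():
        data = build(index, n, num_fields)
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            sum_field(data, index, field, n, num_fields)
            best = min(best, time.perf_counter() - start)
        timings[name] = best
    return min(timings, key=timings.__getitem__)
```

Both layouts hold the same values, so `sum_field` returns the same result under either; only the memory access pattern differs, which is the property a layout tuner exploits.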

Citations: 7

Implementing an Efficient Path Based Equivalence Checker for Parallel Programs

Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications Pub Date : 2016-05-31 DOI: 10.1145/2916026.2916027

S. Bandyopadhyay, K. Banerjee

Citations: 2

Session details: Morning Session

Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications Pub Date : 2016-05-31 DOI: 10.1145/3248633

Atul Kumar

Citations: 0

Developer Productivity in HPC Application Development: An Overview of Recent Techniques

Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications Pub Date : 2016-05-31 DOI: 10.1145/2916026.2916034

S. Sarkar

Abstract: Increasing computing power with evolving hardware architectures has led to a change in programming paradigm from serial to parallel. Unlike its sequential counterpart, application building for High Performance Computing (HPC) is extremely challenging for developers. To improve programmer productivity, it is necessary to address challenges such as: i) How to abstract the hardware and low-level complexities to make programming easier? ii) What features should a design assistance tool have to simplify application development? iii) How should programming languages be enhanced for HPC? iv) What sort of prediction techniques can be developed to help programmers predict potential speedup? v) Can refactoring techniques solve the issue of parallelizing existing serial code? In this talk we attempt to present a landscape of the existing approaches that assist the software building process in HPC from a developer's point of view, and highlight some important research questions. We also discuss the state of practice in the industry and some of the application-specific tools developed for HPC.

Citations: 0