Proceedings of the 2017 Workshop on Software Engineering Methods for Parallel and High Performance Applications最新文献

筛选
英文 中文
MPI Acceleration of Image Classification: Are We Seeing the Resurgence of MPI in Solving Big Data Problems? 图像分类的MPI加速:我们是否看到MPI在解决大数据问题中的复苏?
Sameer Kumar
{"title":"MPI Acceleration of Image Classification: Are We Seeing the Resurgence of MPI in Solving Big Data Problems?","authors":"Sameer Kumar","doi":"10.1145/3085158.3091993","DOIUrl":"https://doi.org/10.1145/3085158.3091993","url":null,"abstract":"Recent work has shown the effectiveness of the MPI programming paradigm in accelerating image classification via the Stochastic Gradient Descent optimization technique. Applications such as Caffe, Torch and Tensor Flow, that use Graphic Processing Unit accelerators within the SMP node, have been extended to use MPI across nodes with scalable speedups. In this talk, we will briefly review convolutional neural networks and the stochastic gradient technique to explore optimized solutions for the image classification problem. Next, I will review opportunities and challenges in parallel and distributed asynchronous stochastic gradient descent and the benefits from using MPI libraries. I will also present possible future directions for MPI based deep learning and other Big Data applications.","PeriodicalId":425891,"journal":{"name":"Proceedings of the 2017 Workshop on Software Engineering Methods for Parallel and High Performance Applications","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128549535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
How Effective is Design Abstraction in Thrust?: An Empirical Evaluation 设计抽象在推力中有多有效?实证评价
Ajai V. George, Sankar Manoj, S. Gupte, S. Sarkar
{"title":"How Effective is Design Abstraction in Thrust?: An Empirical Evaluation","authors":"Ajai V. George, Sankar Manoj, S. Gupte, S. Sarkar","doi":"10.1145/3085158.3086159","DOIUrl":"https://doi.org/10.1145/3085158.3086159","url":null,"abstract":"High performance computing applications are far more difficult to write, therefore, practitioners expect a well-tuned software to last long and provide optimized performance even when the hardware is upgraded. It may also be necessary to write software using sufficient abstraction over the hardware so that it is capable of running on heterogeneous architecture. A good design abstraction paradigm strikes a balance between the abstraction and visibility over the hardware. This allows the programmer to write applications without having to understand the hardware nuances while exploiting the computing power optimally. In this paper we have analyzed the power of design abstraction of a popular design abstraction framework called Thrust both from ease of programming and performance perspectives. We have shown that while Thrust framework is good in describing an algorithm compared to the native CUDA or OpenMP version but it has quite a few design limitations. With respect to CUDA it does not provide any abstraction over the shared, texture or constant memory usage to the programmer. We have compared the performance of a Thrust application code in CUDA, OpenMP and the CPP backends with respect to the native versions (implementing exactly same algorithm), written for these backends and found that the current Thrust version performs poorly in most of the cases. While we conclude that the framework is not ready for writing applications that can exploit the optimal performance from the hardware, we also highlight the improvements necessary for the framework to make the performance comparable.","PeriodicalId":425891,"journal":{"name":"Proceedings of the 2017 Workshop on Software Engineering Methods for Parallel and High Performance Applications","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130900587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
Session details: Session 1 会话详细信息:会话1
Atul Kumar
{"title":"Session details: Session 1","authors":"Atul Kumar","doi":"10.1145/3248714","DOIUrl":"https://doi.org/10.1145/3248714","url":null,"abstract":"","PeriodicalId":425891,"journal":{"name":"Proceedings of the 2017 Workshop on Software Engineering Methods for Parallel and High Performance Applications","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123863281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Session details: Session 2 会话详情:会话2
S. Sarkar
{"title":"Session details: Session 2","authors":"S. Sarkar","doi":"10.1145/3248715","DOIUrl":"https://doi.org/10.1145/3248715","url":null,"abstract":"","PeriodicalId":425891,"journal":{"name":"Proceedings of the 2017 Workshop on Software Engineering Methods for Parallel and High Performance Applications","volume":"291 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116111350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Using High Level GPU Tasks to Explore Memory and Communications Options on Heterogeneous Platforms 使用高级GPU任务探索异构平台上的内存和通信选项
Chao Liu, J. Bhimani, M. Leeser
{"title":"Using High Level GPU Tasks to Explore Memory and Communications Options on Heterogeneous Platforms","authors":"Chao Liu, J. Bhimani, M. Leeser","doi":"10.1145/3085158.3086160","DOIUrl":"https://doi.org/10.1145/3085158.3086160","url":null,"abstract":"Heterogeneous computing platforms that use GPUs for acceleration are becoming prevalent. Developing parallel applications for GPU platforms and optimizing GPU related applications for good performance is important. In this work, we develop a set of applications based on a high level task design, which ensures a well defined structure for portability improvement. Together with the GPU task implementation, we utilize a uniform interface to allocate and manage memory blocks that are used by both host and device. In this way we can choose the appropriate types of memory for host/device communication easily and flexibly in GPU tasks. Through asynchronous task execution and CUDA streams, we can explore concurrent GPU kernels for performance improvement when running multiple tasks. We developed a test benchmark set containing nine different kernel applications. Through tests we can learn that pinned memory can improve host/device data transfer for GPU platforms. The performance of unified memory differs a lot on different GPU architectures and is not a good choice if performance is the main focus. The multiple task tests show that applications based on our GPU tasks can effectively make use of the concurrent kernel ability of modern GPUs for better resource utilization.","PeriodicalId":425891,"journal":{"name":"Proceedings of the 2017 Workshop on Software Engineering Methods for Parallel and High Performance Applications","volume":"99 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126492067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
READEX Tool Suite for Energy-efficiency Tuning of HPC Applications READEX工具套件的能源效率调整的HPC应用程序
Anamika Chowdhury, Madhura Kumaraswamy, M. Gerndt
{"title":"READEX Tool Suite for Energy-efficiency Tuning of HPC Applications","authors":"Anamika Chowdhury, Madhura Kumaraswamy, M. Gerndt","doi":"10.1145/3085158.3091994","DOIUrl":"https://doi.org/10.1145/3085158.3091994","url":null,"abstract":"The European Union Horizon 2020 READEX project is developing a tool suite for dynamic energy tuning of HPC applications. The tool suite performs an analysis during design-time before production run to construct a tuning model encapsulated with the best-found configurations that are then fed to the runtime tuning library. The library switches the configurations at runtime to adapt the application for energy-efficiency.","PeriodicalId":425891,"journal":{"name":"Proceedings of the 2017 Workshop on Software Engineering Methods for Parallel and High Performance Applications","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134515003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
PRESGen: A Fully Automatic Equivalence Checker for Validating Optimizing and Parallelizing Transformations PRESGen:一个用于验证优化和并行化转换的全自动等效检查器
S. Bandyopadhyay, K. Banerjee
{"title":"PRESGen: A Fully Automatic Equivalence Checker for Validating Optimizing and Parallelizing Transformations","authors":"S. Bandyopadhyay, K. Banerjee","doi":"10.1145/3085158.3086158","DOIUrl":"https://doi.org/10.1145/3085158.3086158","url":null,"abstract":"Petri net has been a popular choice of model of computation (MoC) for representing parallel programs. PRES+ is an extension of the traditional Petri net model which is specially equipped to precisely model embedded systems. Since multi-core and multiprocessor systems have proliferated in the domain of embedded systems as well, it has become critical to validate the optimizing and parallelizing transformations which embedded system specifications go through before being implemented in the hardware. PRES+ model based equivalence checkers for validating such transformations already exist. However, construction of the PRES+ models from the original and the translated codes in these equivalence checkers was not done in an automated manner; thus, leaving scope for inaccurate representation of the PRES+ models since they had to be done manually. Moreover, PRES+ model tends to grow more rapidly with the program size when compared to other MoCs, such as FSMD. To tackle these problems, we propose a method for automated construction of PRES+ models from high-level language programs and using an existing translation scheme to convert PRES+ models to FSMD models, we validate the transformations using a state-of-the-art FSMD equivalence checker. Thus, we have effectively composed an end-to-end fully automatic equivalence checker for validating optimizing and parallelizing transformations. The experimental results demonstrate the practical applicability of our method.","PeriodicalId":425891,"journal":{"name":"Proceedings of the 2017 Workshop on Software Engineering Methods for Parallel and High Performance Applications","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122171318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
Proceedings of the 2017 Workshop on Software Engineering Methods for Parallel and High Performance Applications 2017年并行和高性能应用软件工程方法研讨会论文集
{"title":"Proceedings of the 2017 Workshop on Software Engineering Methods for Parallel and High Performance Applications","authors":"","doi":"10.1145/3085158","DOIUrl":"https://doi.org/10.1145/3085158","url":null,"abstract":"","PeriodicalId":425891,"journal":{"name":"Proceedings of the 2017 Workshop on Software Engineering Methods for Parallel and High Performance Applications","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122015060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信