Proceedings of the 11th ACM Conference on Computing Frontiers: Latest Publications

SKOPE: a framework for modeling and exploring workload behavior
Proceedings of the 11th ACM Conference on Computing Frontiers Pub Date : 2014-05-20 DOI: 10.1145/2597917.2597928
Jiayuan Meng, Xingfu Wu, V. Morozov, V. Vishwanath, Kalyan Kumaran, V. Taylor
Abstract: Understanding workload behavior plays an important role in performance studies. The growing complexity of applications and architectures has increased the gap among application developers, performance engineers, and hardware designers. To reduce this gap, we propose SKOPE, a SKeleton framework for Performance Exploration, that produces a descriptive model about the semantic behavior of a workload, which can infer potential transformations and help users understand how workloads may interact with and adapt to emerging hardware. SKOPE models can be shared, annotated, and studied by a community of performance engineers and system designers; they offer readability in the frontend and versatility in the backend. SKOPE can be used for performance analysis, tuning, and projection. We provide two example use cases. First, we project GPU performance from CPU code without GPU programming or access to the hardware; we are able to automatically explore transformations, and the projected best-achievable performance deviates from the measured performance by 18% on average. Second, we project the multi-node scaling trends of two scientific workloads, and are able to achieve a projection accuracy of 95%.
Citations: 24
Feature-based device selection in heterogeneous computing systems
Proceedings of the 11th ACM Conference on Computing Frontiers Pub Date : 2014-05-20 DOI: 10.1145/2597917.2597927
Ayman Tarakji, Niels Ole Salscheider, Stephan Alt, Jan Heiducoff
Abstract: With the advent of accelerator-based heterogeneous parallel systems, the need for a solution to the task-device matching problem is increasing. Due to the enormously growing diversity in existing computing architectures, optimal matching promises to deliver high performance at reduced energy costs. By means of OpenCL and particularly the LLVM compiler infrastructure, our approach makes the task-device matching decisions taking into account the characteristics and particularities of the different processing hardware. We evaluate our approach using a set of OpenCL-based real-world applications and well-established benchmarks, which are run on different hardware platforms and architectures. Our results indicate that our model makes highly accurate predictions during the matching procedure.
Citations: 4
SAMO: store aware memory optimizations
Proceedings of the 11th ACM Conference on Computing Frontiers Pub Date : 2014-05-20 DOI: 10.1145/2597917.2597940
K. Raghavendra, Tripti S. Warrier, M. Mutyam
Abstract: Cache optimizations and DRAM scheduling play an important role in determining the performance of a system, given that the demand for memory is ever increasing. In this paper we track stores both at cache and main memory and apply three different optimizations: first, at the cache level, so that stores are serviced faster and hence load-store queue block cycles are reduced; second, at the miss handling architecture, where we remove entries containing only store requests, thereby reducing the cache stall cycles; and third, at the main memory, where stores are serviced with lower priority so that actual reads get serviced faster. Combined, these three memory optimizations (the store aware memory optimization, or SAMO, framework) increase the performance of the system on average and can be augmented with any previously proposed memory optimization techniques. SAMO speeds up the workloads on 4- and 8-core systems by a geometric mean of 5.0% and 7.4%, respectively, with maximum speed-ups of 21.9% and 17.8% on 4- and 8-core systems, respectively.
Citations: 0
High throughput genetic sequence analysis
Proceedings of the 11th ACM Conference on Computing Frontiers Pub Date : 2014-05-20 DOI: 10.1145/2597917.2597957
H. Lam, S. Cunningham, S. Sreevatsan, Daniel Boley
Abstract: We present an application paradigm in which an unsupervised machine learning approach is applied to high dimensional influenza sequence datasets: (1) human A/H3N2, (2) avian H5, and (3) North American swine influenza H3N2 virus. Interesting visual patterns observed in the A/H3N2 influenza virus led us to hypothesize that vaccination could be one of the driving forces in the evolution of the human A/H3N2 influenza virus. We provide a simulation study and statistical results to support our finding that the influenza virus evolves differently in a protected environment than it evolves in the wild. In the swine H3N2 case, our result suggests that the diversification of North American swine influenza virus can be attributed to the mutations at two positively selected sites on the hemagglutinin protein.
Citations: 0
On generating multicast routes for SpiNNaker
Proceedings of the 11th ACM Conference on Computing Frontiers Pub Date : 2014-05-20 DOI: 10.1145/2597917.2597938
J. Navaridas, M. Luján, L. Plana, S. Temple, S. Furber
Abstract: The human brain is an immense biological neural network characterized by high degrees of connectivity among neurons. Any system designed to simulate biologically plausible spiking neuronal networks needs to support such connectivity and the associated communication traffic in the form of spike events. This paper demonstrates the adequacy of multicast communications to achieve such a demanding goal and introduces a collection of algorithms to generate multicast routes. These algorithms target the SpiNNaker interconnect, a two-dimensional triangular toroidal mesh with support for selective multicast. As generating multicast routes is an NP-complete problem, these algorithms are an essential ingredient for the efficient operation of SpiNNaker. Although multicast networks have been studied in the literature, existing algorithms cannot be applied efficiently to SpiNNaker. A comprehensive evaluation analyzing the largest configuration of the SpiNNaker system (over 1 million ARM cores) shows that each algorithm provides diverse benefits and drawbacks which can be exploited to avoid possible bottlenecks. Results show that the communication infrastructure of SpiNNaker will be able to support the high communication pressure exerted by simulating biologically plausible spiking neural applications in real time.
Citations: 4
Hardware support for address mapping in PGAS languages: a UPC case study
Proceedings of the 11th ACM Conference on Computing Frontiers Pub Date : 2013-09-09 DOI: 10.1145/2597917.2597945
O. Serres, Abdullah Kayi, Ahmad Anbar, T. El-Ghazawi
Abstract: The Partitioned Global Address Space (PGAS) programming model strikes a balance between the explicit, locality-aware, message-passing model and the locality-agnostic, but easy-to-use, shared memory model (e.g. OpenMP). However, the PGAS memory model comes at a performance cost which limits both scalability and performance. Compiler optimizations are often not sufficient, and manual optimizations are needed which considerably limit the productivity advantage. This paper proposes hardware architectural support for PGAS, which allows the processor to efficiently handle shared addresses through new instructions. A prototype compiler is realized that allows the support to be used with unmodified code, preserving the PGAS productivity advantage. Speedups of up to 5.5x are demonstrated on the unmodified NAS Parallel Benchmarks using the Gem5 full system simulator.
Citations: 3
Proceedings of the 11th ACM Conference on Computing Frontiers
Proceedings of the 11th ACM Conference on Computing Frontiers Pub Date : 1900-01-01 DOI: 10.1145/2597917
Citations: 0