Proceedings of the 15th ACM International Conference on Computing Frontiers: Latest Articles

A decoupled access-execute architecture for reconfigurable accelerators
Proceedings of the 15th ACM International Conference on Computing Frontiers Pub Date: 2018-05-08 DOI: 10.1145/3203217.3203267
George Charitopoulos, Charalampos Vatsolakis, Grigorios Chrysos, D. Pnevmatikatos
Abstract: Mapping computationally intensive applications onto reconfigurable technology for acceleration requires two main implementation parts: (a) the data plane, i.e., efficient interconnected units that accelerate processing, and (b) the access plane, i.e., efficient ways to access data and transfer them to and from the accelerator. Data plane construction is well understood, and mature tools, such as High Level Synthesis (HLS), that produce efficient reconfigurable architectures exist. The access plane, however, is more challenging: data fetching for big-data and high-performance computing applications is even more complex and time consuming than processing. To this end, we present DAER, a Decoupled Access-Execute architecture and framework for Reconfigurable accelerators. Our approach maps the code to be accelerated into two separate parts: (a) the fetch unit, responsible for fetching data to the accelerator and storing results back in memory, and (b) the processing unit, which processes the fetched data in a streaming way. This approach offers the user a structured and well-defined way of mapping applications onto an FPGA. Additionally, it combines well with other hardware-based optimization techniques, e.g., pipelining, custom processing, and data prefetching, which hide the memory data access latency. We use the DAER framework and HLS mapping tools on five applications and show that the proposed DAER framework achieves an order-of-magnitude performance speed-up compared to unmodified applications, and as much as 2x performance improvement compared to their optimized HLS versions. We also map the DAER-based architectures on HPC platforms, showing the performance advantages of our approach on real-world platforms.
Citations: 11
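The split between a fetch unit and a processing unit described in this abstract can be illustrated with a small software analogy. The following is only a sketch under stated assumptions, not the DAER framework or its HLS output: the function names `fetch_unit`, `processing_unit`, and `run_decoupled`, the squaring operation, and the FIFO depth are all hypothetical, and Python threads stand in for hardware units.

```python
import queue
import threading

def fetch_unit(memory, indices, out_q):
    """Access side: resolve addresses, read memory, stream operands out."""
    for i in indices:
        out_q.put(memory[i])
    out_q.put(None)  # sentinel marking the end of the stream

def processing_unit(in_q, results):
    """Execute side: consume operands in arrival order; no addressing logic here."""
    while True:
        item = in_q.get()
        if item is None:
            break
        results.append(item * item)  # hypothetical computation: square each operand

def run_decoupled(memory, indices, depth=4):
    """Connect the two units with a bounded FIFO so fetching can run ahead."""
    fifo = queue.Queue(maxsize=depth)
    results = []
    t_fetch = threading.Thread(target=fetch_unit, args=(memory, indices, fifo))
    t_exec = threading.Thread(target=processing_unit, args=(fifo, results))
    t_fetch.start()
    t_exec.start()
    t_fetch.join()
    t_exec.join()
    return results
```

Calling `run_decoupled([1, 2, 3, 4, 5], [4, 0, 2])` streams the values at indices 4, 0, and 2 through the FIFO. The bounded queue is the design point of interest: the access side may run ahead of the execute side by at most `depth` elements, which is how decoupling can hide access latency.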
An FPGA framework for edge-centric graph processing
Proceedings of the 15th ACM International Conference on Computing Frontiers Pub Date: 2018-05-08 DOI: 10.1145/3203217.3203233
Shijie Zhou, R. Kannan, Hanqing Zeng, V. Prasanna
Abstract: Many emerging real-world applications require fast processing of large-scale data represented in the form of graphs. In this paper, we design a Field-Programmable Gate Array (FPGA) framework to accelerate graph algorithms based on the edge-centric paradigm. Our design is flexible for accelerating general graph algorithms with various vertex attributes and update propagation functions, such as Sparse Matrix Vector Multiplication (SpMV), PageRank (PR), Single Source Shortest Path (SSSP), and Weakly Connected Component (WCC). The target platform consists of large external memory to store the graph data and an FPGA to accelerate the processing. By taking an edge-centric graph algorithm and hardware resource constraints as inputs, our framework can determine the optimal design parameters and produce an optimized Register-Transfer Level (RTL) FPGA accelerator design. To improve data locality and increase parallelism, we partition the input graph into non-overlapping partitions. This enables our framework to efficiently buffer vertex data in the on-chip memory of the FPGA and exploit both inter-partition and intra-partition parallelism. Further, we propose an optimized data layout to improve external memory performance and reduce data communication between the FPGA and external memory. Based on our design methodology, we accelerate two fundamental graph algorithms for performance evaluation: Sparse Matrix Vector Multiplication (SpMV) and PageRank (PR). Experimental results show that our accelerators sustain a high throughput of up to 2250 Million Traversed Edges Per Second (MTEPS) and 2487 MTEPS for SpMV and PR, respectively. Compared with several highly optimized multi-core designs, our FPGA framework achieves up to 20.5× speedup for SpMV and 17.7× speedup for PR; compared with two state-of-the-art FPGA frameworks, our designs demonstrate up to 5.3× and 1.8× throughput improvement for SpMV and PR, respectively.
Citations: 41
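The edge-centric paradigm named in this abstract can be sketched in software as a scatter phase that streams over the edge list followed by a gather phase that applies the resulting updates to destination vertices. The example below is a minimal sketch assuming a plain in-memory edge list; it is not the paper's RTL design, it omits the graph partitioning and data layout optimizations, and the function name and parameters are hypothetical. It also assumes every vertex has at least one outgoing edge.

```python
def pagerank_edge_centric(num_vertices, edges, damping=0.85, iterations=20):
    """PageRank written as scatter (per edge) and gather (per update) phases."""
    out_degree = [0] * num_vertices
    for src, _ in edges:
        out_degree[src] += 1

    rank = [1.0 / num_vertices] * num_vertices
    for _ in range(iterations):
        # Scatter: stream over all edges and emit one update per edge.
        updates = [(dst, rank[src] / out_degree[src]) for src, dst in edges]

        # Gather: accumulate the updates into their destination vertices.
        acc = [0.0] * num_vertices
        for dst, value in updates:
            acc[dst] += value

        rank = [(1.0 - damping) / num_vertices + damping * a for a in acc]
    return rank
```

For a three-vertex cycle such as `[(0, 1), (1, 2), (2, 0)]`, every vertex keeps the same rank by symmetry. Because the scatter phase reads edges sequentially rather than chasing per-vertex adjacency lists, the access pattern is a stream, which is the property that makes the paradigm attractive for external-memory accelerators.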
Taming irregular applications via advanced dynamic parallelism on GPUs
Proceedings of the 15th ACM International Conference on Computing Frontiers Pub Date: 2018-05-08 DOI: 10.1145/3203217.3203243
Jing Zhang, Ashwin M. Aji, Michael L. Chu, Hao Wang, Wu-chun Feng
Abstract: On recent GPU architectures, dynamic parallelism, which enables the launching of kernels from the GPU without CPU involvement, provides a way to improve the performance of irregular applications by generating child kernels dynamically to reduce workload imbalance and improve GPU utilization. However, in practice, dynamic parallelism does not improve performance due to high kernel launch overhead and low child kernel occupancy. Consequently, most existing studies focus on mitigating the kernel launch overhead. As the kernel launch overhead has decreased due to algorithmic redesigns and hardware architectural innovations, the organization of subtasks into child kernels becomes a new performance bottleneck. We present an in-depth characterization of existing software approaches for dynamic parallelism optimizations on the latest GPUs. We observe that current approaches to subtask aggregation, which use a "one-size-fits-all" method by treating all subtasks equally, can under-utilize resources and degrade overall performance, as different subtasks require different configurations for optimal performance. To address this problem, we leverage statistical and machine-learning techniques and propose a performance modeling and task scheduling tool that can (1) analyze the performance characteristics of subtasks to identify the critical performance factors, (2) predict the performance of new subtasks, and (3) generate the optimal aggregation strategy for new subtasks. Experimental results show that our approach with the optimal subtask aggregation strategy can achieve up to a 1.8-fold speedup over the existing task aggregation approach for dynamic parallelism.
Citations: 9
Distributed learning-based state prediction for multi-agent systems with reduced communication effort
Proceedings of the 15th ACM International Conference on Computing Frontiers Pub Date: 2018-05-08 DOI: 10.1145/3203217.3203230
Daniel Hinkelmann, A. Schmeink, Guido Dartmann
Abstract: A novel distributed event-triggered communication scheme for multi-agent systems is presented. Each agent predicts its future states via an artificial neural network, where the prediction is based solely on its own past states. The approach is therefore scalable with the number of agents. A communication is triggered if the discrepancy between the actual and the predicted state exceeds a threshold. Numerical results show that this approach reduces the communication effort remarkably compared to existing methods.
Citations: 0
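The triggering rule in this abstract (transmit only when the actual state deviates from the predicted state by more than a threshold) can be sketched as follows. This is a sketch under stated assumptions, not the authors' method: a linear extrapolation (`linear_predictor`) stands in for the artificial neural network, states are scalars, and the sketch assumes that sender and receivers run the same predictor on a shared history that is corrected whenever a message is sent. All names are hypothetical.

```python
def linear_predictor(history):
    """Stand-in for the neural network: extrapolate from the last two states."""
    if len(history) < 2:
        return history[-1]
    return 2 * history[-1] - history[-2]

def simulate_agent(states, threshold, predictor=linear_predictor):
    """Return (steps at which a message was sent, state history seen by receivers)."""
    shared = [states[0]]  # history known to both the sender and the receivers
    sent = [0]            # the initial state is always transmitted
    for k in range(1, len(states)):
        predicted = predictor(shared)
        if abs(states[k] - predicted) > threshold:
            shared.append(states[k])  # trigger: transmit the true state
            sent.append(k)
        else:
            shared.append(predicted)  # no message: everyone keeps the prediction
    return sent, shared
```

With the trajectory `[0, 1, 2, 3, 4, 10, 16, 22]` and a threshold of 0.5, messages are needed only at the start, at the first step (where no slope is known yet), and at the step where the slope changes; every other state is reproduced by the predictor without communication.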
Gathering and analyzing identity leaks for a proactive warning of affected users
Proceedings of the 15th ACM International Conference on Computing Frontiers Pub Date: 2018-05-08 DOI: 10.1145/3203217.3203269
Timo Malderle, Matthias Wübbeling, S. Knauer, Arnold Sykosch, M. Meier
Abstract: Identity theft is a common consequence of successful cyber-attacks. Criminals steal identity data in order to either (mis)use the data themselves or sell entire collections of such data to other parties. Warning the victims of identity theft is crucial to avoid or limit the damage caused by identity misuse. However, in order to provide proactive warnings to victims in a timely fashion, the leaked identity data has to be available. In this paper we present a methodology to gather and analyze leaked identity data to enable proactive warnings of victims.
Citations: 10
Vulnerability analysis of Android auto infotainment apps
Proceedings of the 15th ACM International Conference on Computing Frontiers Pub Date: 2018-05-08 DOI: 10.1145/3203217.3203278
A. K. Mandal, Agostino Cortesi, Pietro Ferrara, F. Panarotto, F. Spoto
Abstract: With over 2 billion active mobile users and a large array of features, Android is the most popular operating system for mobile devices. Android Auto allows such devices to connect with a compatible in-car infotainment system, and it has become a popular choice as well. However, as the trend of connecting car dashboards to the Internet or other devices grows, so does the potential for security threats. In this paper, a set of potential security threats is identified, and a static analyzer for the Android Auto infotainment system is presented. All the infotainment apps available in the Google Play Store have been checked against that list of possible exposure scenarios. Results show that almost 80% of the apps are potentially vulnerable, out of which 25% pose security threats related to the execution of JavaScript.
Citations: 27
Horizon: a multi-abstraction framework for graph analytics
Proceedings of the 15th ACM International Conference on Computing Frontiers Pub Date: 2018-05-08 DOI: 10.1145/3203217.3203270
Adnan Haider, Fabio Checconi, Xinyu Que, L. Schneidenbach, Daniele Buono, Xian-He Sun
Abstract: A graph application written using a distributed graph processing framework can perform over an order of magnitude slower than its high-performance, native counterpart. This issue stems from the aim, common to most graph frameworks, of restricting the scope of application development to specific graph constructs, such as vertex or edge programs. In this paper we present Horizon, a distributed graph processing framework that achieves close to native performance without penalizing productivity by providing a multi-layer, multi-abstraction model of computation. Compared to current frameworks, Horizon extends the scope of computation by exposing two notions usually relegated to implementations: graph data models and communication models. Horizon can reduce execution time by an average of 5.3× across different applications and datasets and process graphs an order of magnitude larger when compared to the state of the art.
Citations: 0
The D.A.V.I.D.E. big-data-powered fine-grain power and performance monitoring support
Proceedings of the 15th ACM International Conference on Computing Frontiers Pub Date: 2018-05-08 DOI: 10.1145/3203217.3205863
Andrea Bartolini, Andrea Borghesi, Antonio Libri, Francesco Beneventi, D. Gregori, S. Tinti, Cosimo Gianfreda, Piero Altoe
Abstract: In the race toward exascale, supercomputing systems face important challenges that limit their efficiency. Above all, power and energy consumption, fueled by the end of Dennard scaling, have started to limit supercomputers' peak performance and cost effectiveness. In this paper we present and describe a new methodology based on a set of HW and SW extensions for fine-grain monitoring of power and aggregation of the measurements for fast analysis and visualization. We propose a turn-key system which uses an MQTT communication layer, a NoSQL database, fine-grain monitoring and, in the future, AI technology to measure and control power and performance. This methodology is shown as an integrated feature of the D.A.V.I.D.E. supercomputing machine.
Citations: 22
The SAGE project: a storage centric approach for exascale computing: invited paper
Proceedings of the 15th ACM International Conference on Computing Frontiers Pub Date: 2018-05-08 DOI: 10.1145/3203217.3205341
Sai B. Narasimhamurthy, N. Danilov, S. Wu, G. Umanesan, Steven W. D. Chien, Sergio Rivas-Gomez, I. Peng, E. Laure, S. D. Witt, D. Pleiter, S. Markidis
Abstract: SAGE (Percipient StorAGe for Exascale Data Centric Computing) is a European Commission funded project towards the era of Exascale computing. Its goal is to design and implement a Big Data/Extreme Computing (BDEC) capable infrastructure with an associated software stack. The SAGE system follows a storage centric approach, as it is capable of storing and processing large data volumes at the Exascale regime. SAGE addresses the convergence of Big Data Analysis and HPC in an era of next-generation data centric computing. This convergence is driven by the proliferation of massive data sources, such as large, dispersed scientific instruments and sensors, where data needs to be processed, analyzed and integrated into simulations to derive scientific and innovative insights. A first prototype of the SAGE system has been implemented and installed at the Jülich Supercomputing Center. The SAGE storage system consists of multiple types of storage device technologies in a multi-tier I/O hierarchy, including flash, disk, and non-volatile memory technologies. The main SAGE software component is the Seagate Mero Object Storage, which is accessible via the Clovis API and higher level interfaces. The SAGE project also includes scientific applications for the validation of the SAGE concepts. The objective of this paper is to present the SAGE project concepts and the prototype of the SAGE platform, and to discuss the software architecture of the SAGE system.
Citations: 20
Comprehensive assessment of run-time hardware-supported malware detection using general and ensemble learning
Proceedings of the 15th ACM International Conference on Computing Frontiers Pub Date: 2018-05-08 DOI: 10.1145/3203217.3203264
H. Sayadi, Sai Manoj Pudukotai Dinakarrao, A. Houmansadr, S. Rafatirad, H. Homayoun
Abstract: Recent studies have demonstrated the effectiveness of Hardware Performance Counters (HPCs) for detecting patterns of malicious applications. Hardware-supported detectors utilize Machine Learning (ML) classifiers for malware detection by analyzing a large number of HPC features, more than the very limited number of HPC registers available in modern microprocessors. Obtaining more HPCs requires running the application (malware or benign) more than once to collect the required data, which in turn makes the solution less practical for run-time detection of malware. In response to this challenge, in this work, we first identify the critical HPC features required for malware detection. Next, we explore the use of various ML techniques to classify benign and malware applications using the selected HPCs at run-time. Further, we investigate the effectiveness of ensemble learning in improving the performance of ML classifiers. For this purpose, we apply AdaBoost to all general ML classifiers. We thoroughly compare the general and ensemble ML classifiers in terms of accuracy, robustness, performance, and hardware overhead. The experimental results indicate that ensemble learning enhances the performance of malware detection for rule-based and tree-based algorithms by up to 13%. However, it diminishes the performance of neural network and Bayesian network-based detectors by 6% and 4%, respectively.
Citations: 30
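The AdaBoost step mentioned in this abstract can be sketched in a few lines. This is a generic illustration under stated assumptions, not the authors' experimental setup: decision stumps serve as the base learner (the paper boosts several classifier families), labels are +1 for malware and -1 for benign, and the two feature columns are hypothetical counter readings invented for the example.

```python
import math

def train_stump(X, y, w):
    """Pick the (feature, threshold, polarity) stump with the lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for pol in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (pol if row[f] >= thr else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best

def stump_predict(stump, row):
    _, f, thr, pol = stump
    return pol if row[f] >= thr else -pol

def adaboost_fit(X, y, rounds=5):
    """Train a weighted vote of stumps, reweighting samples after each round."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # clamp to keep alpha finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Increase the weight of misclassified samples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def adaboost_predict(model, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1

# Hypothetical samples: [branch_misses, cache_misses], label +1 = malware, -1 = benign.
X = [[120, 5], [130, 7], [110, 6], [300, 40], [280, 35], [310, 50]]
y = [-1, -1, -1, 1, 1, 1]
model = adaboost_fit(X, y)
```

The per-round weight `alpha` grows as the weighted error of a stump shrinks, so accurate weak learners dominate the final vote. Whether boosting helps or hurts depends on the base classifier, which is the comparison the abstract reports.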