Proceedings of the 18th ACM International Conference on Computing Frontiers: latest papers, page 2

Performance prediction for convolutional neural networks on edge GPUs

Proceedings of the 18th ACM International Conference on Computing Frontiers Pub Date : 2021-05-11 DOI: 10.1145/3457388.3458666

Halima Bouzidi, Hamza Ouarnoughi, S. Niar, Abdessamad Ait El Cadi

Abstract: Edge computing is increasingly used for Artificial Intelligence (AI) purposes to meet latency, privacy, and energy challenges. Convolutional Neural networks (CNN) are more frequently deployed on Edge devices for several applications. However, due to their constrained computing resources and energy budget, Edge devices struggle to meet CNN's latency requirements while maintaining good accuracy. It is, therefore, crucial to choose the CNN with the best accuracy and latency trade-off while respecting hardware constraints. This paper presents and compares five of the widely used Machine Learning (ML) based approaches to predict CNN's inference execution time on Edge GPUs. For these 5 methods, in addition to their prediction accuracy, we also explore the time needed for their training and their hyperparameters' tuning. Finally, we compare times to run the prediction models on different platforms. The use of these methods will highly facilitate design space exploration by quickly providing the best CNN on a target Edge GPU. Experimental results show that XGBoost provides an interesting average prediction error even for unexplored and unseen CNN architectures. Random Forest depicts comparable accuracy but needs more effort and time to be trained. The other 3 approaches (OLS, MLP, and SVR) are less accurate for CNN performance estimation.

Citations: 9
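As a rough illustration of the simplest of the five predictors named in the abstract above (OLS), the following sketch fits a one-feature least-squares line from a workload measure to inference latency and scores it with a mean absolute percentage error. The single feature (GFLOPs), the function names, and the sample numbers are assumptions made up for illustration; they are not the paper's features, code, or measurements.

```python
from statistics import mean


def fit_ols(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (one feature)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def mean_abs_pct_error(y_true, y_pred):
    """Average of |true - predicted| / true, expressed in percent."""
    return 100.0 * mean([abs(t - p) / t for t, p in zip(y_true, y_pred)])


# Hypothetical, made-up training points: (GFLOPs of a CNN, latency in ms).
gflops = [1.0, 2.0, 3.0, 4.0]
latency_ms = [12.0, 22.0, 32.0, 42.0]

slope, intercept = fit_ols(gflops, latency_ms)
predicted = [slope * x + intercept for x in gflops]
error_pct = mean_abs_pct_error(latency_ms, predicted)
```

A design-space exploration loop could call such a predictor on candidate architectures instead of running each one on the device; the abstract reports that tree-based models (XGBoost, Random Forest) were more accurate than OLS for this purpose.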

Scaling of multi-core quantum architectures: a communications-aware structured gap analysis

Proceedings of the 18th ACM International Conference on Computing Frontiers Pub Date : 2021-05-11 DOI: 10.1145/3457388.3458674

Santiago Rodrigo, Medina Bandic, S. Abadal, Hans van Someren, E. Alarcón, C. G. Almudever

Abstract: In the quest of large-scale quantum computers, multi-core distributed architectures are considered a compelling alternative to be explored. A crucial aspect in such approach is the stringent demand on communication among cores when qubits need to interact, which conditions the scalability potential of these architectures. In this work, we address the question of how the cost of the communication among cores impacts on the viability of the quantum multi-core approach. Methodologically, we consider a design space in which architectural variables (number of cores, number of qubits per core), application variables for several quantum benchmarks (number of qubits, number of gates, percentage of two-qubit gates) and inter-core communication latency are swept along with the definition of a figure of merit. This approach yields both a qualitative understanding of trends in the design space and companion dimensioning guidelines for the architecture, including optimal points, as well as quantitative answers to the question of beyond which communication performance levels the multi-core architecture pays off. Our results allow to determine the thresholds for inter-core communication latency in order for multi-core architectures to outperform single-core quantum processors.

Citations: 5

The Italian research on HPC key technologies across EuroHPC

Proceedings of the 18th ACM International Conference on Computing Frontiers Pub Date : 2021-05-11 DOI: 10.1145/3457388.3458508

Marco Aldinucci, G. Agosta, A. Andreini, C. Ardagna, Andrea Bartolini, A. Cilardo, Biagio Cosenza, M. Danelutto, Roberto Esposito, W. Fornaciari, R. Giorgi, D. Lengani, R. Montella, M. Olivieri, S. Saponara, D. Simoni, M. Torquati

Abstract: High-Performance Computing (HPC) is one of the strategic priorities for research and innovation worldwide due to its relevance for industrial and scientific applications. We envision HPC as composed of three pillars: infrastructures, applications, and key technologies and tools. While infrastructures are by construction centralized in large-scale HPC centers, and applications are generally within the purview of domain-specific organizations, key technologies fall in an intermediate case where coordination is needed, but design and development are often decentralized. A large group of Italian researchers has started a dedicated laboratory within the National Interuniversity Consortium for Informatics (CINI) to address this challenge. The laboratory, albeit young, has managed to succeed in its first attempts to propose a coordinated approach to HPC research within the EuroHPC Joint Undertaking, participating in the calls 2019-20 to five successful proposals for an aggregate total cost of 95M€. In this paper, we outline the working group's scope and goals and provide an overview of the five funded projects, which become fully operational in March 2021, and cover a selection of key technologies provided by the working group partners, highlighting their usage development within the projects.

Citations: 4

On resilience of security-oriented error detecting architectures against power attacks: a theoretical analysis

Proceedings of the 18th ACM International Conference on Computing Frontiers Pub Date : 2021-05-11 DOI: 10.1145/3457388.3458867

O. Keren, I. Polian

Citations: 0

EVOLVE: HPC and cloud enhanced testbed for extracting value from large-scale diverse data

Proceedings of the 18th ACM International Conference on Computing Frontiers Pub Date : 2021-05-11 DOI: 10.1145/3457388.3458621

A. Chazapis, Jean-Thomas Acquaviva, A. Bilas, G. Gardikis, C. Kozanitis, S. Louloudakis, H. Nguyen, Christian Pinto, A. Scharl, D. Soudris

Citations: 3

TEA-fed: time-efficient asynchronous federated learning for edge computing

Proceedings of the 18th ACM International Conference on Computing Frontiers Pub Date : 2021-05-11 DOI: 10.1145/3457388.3458655

Chen Zhou, Hao Tian, Hong Zhang, Jin Zhang, M. Dong, Juncheng Jia

Abstract: Federated learning (FL) has attracted more and more attention recently. The integration of FL and edge computing makes the edge system more efficient and intelligent. FL usually uses the server to actively select certain edge devices to participate in the global model training. However, the selected edge devices may be stragglers, or even crash during training. Meanwhile, the unselected idle edge devices cannot be fully utilized for training. Therefore, besides the widely studied communication efficiency and data heterogeneity issues in FL, we also take the above time efficiency into consideration, and propose a time-efficient asynchronous federated learning protocol, TEA-Fed, to solve these problems. With TEA-Fed, idle edge devices actively apply for training tasks and participate in model training asynchronously once assigned tasks. Considering that there may be a huge number of edge devices in edge computing, we introduce control parameters to limit the number of devices participating in training the identical model at the same time. Meanwhile, we also introduce caching mechanism and weighted averaging with respect to model staleness in the model aggregation step to reduce the adverse effects of model staleness and further improve the accuracy of the global model. Finally, the experimental results show that the protocol can accelerate the convergence of model training, improve the accuracy, and has robustness to heterogeneous data.

Citations: 15
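The TEA-Fed abstract above mentions weighted averaging with respect to model staleness but does not give the formula. The sketch below shows one plausible form of that idea, under stated assumptions: staleness is the number of global versions that elapsed while the client was training, and the mixing weight decays polynomially with it. The function names, the decay rule, and the default parameters are hypothetical and should not be read as the protocol's actual aggregation rule.

```python
def staleness_weight(staleness, base_alpha=0.5, decay=0.5):
    """Mixing weight for a client update; shrinks as the update gets staler.

    Assumed rule (not from the paper): base_alpha * (1 + staleness) ** -decay.
    """
    if staleness < 0:
        raise ValueError("staleness must be non-negative")
    return base_alpha * (1 + staleness) ** (-decay)


def aggregate(global_model, client_model, global_version, client_version,
              base_alpha=0.5, decay=0.5):
    """Blend a client model into the global model, parameter by parameter.

    client_version is the global version the client started training from.
    Models are plain lists of floats to keep the sketch dependency-free.
    """
    alpha = staleness_weight(global_version - client_version, base_alpha, decay)
    return [(1 - alpha) * g + alpha * c
            for g, c in zip(global_model, client_model)]


# A fresh update (staleness 0) is mixed in with the full base weight,
# while an update that is 3 versions old contributes half as much.
fresh = aggregate([0.0, 2.0], [1.0, 4.0], global_version=3, client_version=3)
stale = aggregate([0.0, 2.0], [1.0, 4.0], global_version=3, client_version=0)
```

Down-weighting stale updates in this way limits how far a slow device can pull the global model back toward an outdated state, which is the adverse effect the abstract says the aggregation step is meant to reduce.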

cREAtIve: reconfigurable embedded artificial intelligence

Proceedings of the 18th ACM International Conference on Computing Frontiers Pub Date : 2021-05-11 DOI: 10.1145/3457388.3458857

Poona Bahrebar, Leon Denis, Maxim Bonnaerens, Kristof Coddens, J. Dambre, W. Favoreel, I. Khvastunov, A. Munteanu, Hung Nguyen-Duc, S. Schulte, D. Stroobandt, Ramses Valvekens, N. V. D. Broeck, Geert Verbruggen

Citations: 0

Intelligent UAV-aided controller placement scheme for software-defined vehicular networks

Proceedings of the 18th ACM International Conference on Computing Frontiers Pub Date : 2021-05-11 DOI: 10.1145/3457388.3458809

Na Lin, Qi Zhao, Liang Zhao

Citations: 0

Architecting more than Moore: wireless plasticity for massive heterogeneous computer architectures (WiPLASH)

Proceedings of the 18th ACM International Conference on Computing Frontiers Pub Date : 2021-05-11 DOI: 10.1145/3457388.3458859

Joshua Klein, A. Levisse, G. Ansaloni, David Atienza Alonso, Marina Zapater, M. Dazzi, G. Karunaratne, I. Boybat, A. Sebastian, D. Rossi, Francesco Conti, Elana Pereira de Santana, P. Bolívar, M. Saeed, R. Negra, Zhenxing Wang, Kun-Ta Wang, M. Lemme, Akshay Jain, Robert Guirado, H. Taghvaee, S. Abadal

Citations: 0

Ultra-compact binary neural networks for human activity recognition on RISC-V processors

Proceedings of the 18th ACM International Conference on Computing Frontiers Pub Date : 2021-05-11 DOI: 10.1145/3457388.3458656

Francesco Daghero, Chenhao Xie, D. J. Pagliari, A. Burrello, Marco Castellano, Luca Gandolfi, A. Calimera, E. Macii, M. Poncino

Abstract: Human Activity Recognition (HAR) is a relevant inference task in many mobile applications. State-of-the-art HAR at the edge is typically achieved with lightweight machine learning models such as decision trees and Random Forests (RFs), whereas deep learning is less common due to its high computational complexity. In this work, we propose a novel implementation of HAR based on deep neural networks, and precisely on Binary Neural Networks (BNNs), targeting low-power general purpose processors with a RISC-V instruction set. BNNs yield very small memory footprints and low inference complexity, thanks to the replacement of arithmetic operations with bit-wise ones. However, existing BNN implementations on general purpose processors impose constraints tailored to complex computer vision tasks, which result in over-parametrized models for simpler problems like HAR. Therefore, we also introduce a new BNN inference library, which targets ultra-compact models explicitly. With experiments on a single-core RISC-V processor, we show that BNNs trained on two HAR datasets obtain higher classification accuracy compared to a state-of-the-art baseline based on RFs. Furthermore, our BNN reaches the same accuracy of a RF with either less memory (up to 91%) or more energy-efficiency (up to 70%), depending on the complexity of the features extracted by the RF.

Citations: 11
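The BNN abstract above attributes the small footprint and low inference cost to replacing arithmetic operations with bit-wise ones. A common way to do this, sketched here as an assumption rather than as the paper's library, is to pack weights and activations in {-1, +1} into machine words (bit 1 for +1, bit 0 for -1) and compute a dot product from the number of mismatching bits: dot = n - 2 * popcount(a XOR b). The helper names are hypothetical.

```python
def pack(values):
    """Pack a sequence of -1/+1 values into an int: bit i is 1 when values[i] == +1."""
    bits = 0
    for i, v in enumerate(values):
        if v == 1:
            bits |= 1 << i
        elif v != -1:
            raise ValueError("values must be -1 or +1")
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed {-1, +1} vectors of length n.

    Matching bits contribute +1 and mismatching bits contribute -1,
    so the result is n - 2 * (number of mismatches).
    """
    mask = (1 << n) - 1  # ignore any bits beyond the vector length
    mismatches = bin((a_bits ^ b_bits) & mask).count("1")
    return n - 2 * mismatches


activations = [1, -1, 1, 1]
weights = [1, 1, -1, 1]

# Bit-wise result and the plain multiply-accumulate reference should agree.
fast = binary_dot(pack(activations), pack(weights), len(weights))
reference = sum(a * w for a, w in zip(activations, weights))
```

On a 32-bit processor, one XOR plus one population count handles 32 multiply-accumulate steps at once, and each weight occupies a single bit, which is consistent with the memory and complexity benefits the abstract describes.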