Improving Energy-Efficiency of Capsule Networks on Modern GPUs

IF 1.4; Zone 3 (Computer Science); JCR Q4, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IEEE Computer Architecture Letters Pub Date : 2024-02-23 DOI:10.1109/LCA.2024.3365149

Mohammad Hafezan;Ehsan Atoofian

IEEE Computer Architecture Letters, vol. 23, no. 1, pp. 49-52. https://ieeexplore.ieee.org/document/10444758/

Citations: 0

Abstract


Convolutional neural networks (CNNs) have become a compelling solution in machine learning applications, as they surpass human-level accuracy on certain tasks. Despite this success, CNNs classify images by identifying specific features and, because of the pooling layer, ignore the spatial relationships between those features. The capsule network (CapsNet) architecture, proposed by the Google Brain team, attempts to address this drawback by grouping several neurons into a single capsule and learning the spatial correlations between different input features. Thus, a CapsNet identifies not only the presence of a feature but also its relationship to other features. However, this success comes at the cost of resource underutilization when a CapsNet runs on a modern GPU equipped with tensor cores (TCs). Because of the structure of capsules, functional units in a TC are often underutilized, which prolongs the execution of capsule layers and increases energy consumption. In this work, we propose an architecture that eliminates ineffectual operations and improves the energy efficiency of GPUs. Experimental measurements over a set of state-of-the-art datasets show that the proposed approach improves energy efficiency by 15% while maintaining the accuracy of CapsNets.
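To give a rough feel for the underutilization the abstract describes, the sketch below estimates what fraction of a tensor-core tile does useful work when a small matrix product is padded up to the tile size. The abstract does not give the paper's own analysis, so everything here is an assumption for illustration only: the 16x16x16 tile shape, the example capsule dimensions, and the helper name `tile_utilization` are all hypothetical.

```python
import math


def tile_utilization(m, n, k, tile=(16, 16, 16)):
    """Fraction of multiply-accumulates in the padded tiles that are useful.

    (m x k) @ (k x n) is padded so each dimension is a multiple of the
    assumed tile shape; the padded positions do no useful work.
    """
    tm, tn, tk = tile
    padded_m = math.ceil(m / tm) * tm
    padded_n = math.ceil(n / tn) * tn
    padded_k = math.ceil(k / tk) * tk
    return (m * n * k) / (padded_m * padded_n * padded_k)


# Hypothetical capsule-layer product: one 8-element capsule vector
# multiplied by an 8 x 16 weight matrix, i.e. (1 x 8) @ (8 x 16).
small = tile_utilization(1, 16, 8)

# A product that exactly fills one assumed tile, for comparison.
full = tile_utilization(16, 16, 16)

print(f"small capsule product: {small:.2%} of the tile is useful")
print(f"tile-sized product:    {full:.2%} of the tile is useful")
```

Under these assumed shapes, most of the tile is spent on padding, which is the kind of ineffectual work the abstract says the proposed architecture aims to eliminate. Real utilization depends on how the capsule layers are actually batched and mapped to the hardware.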


Source journal

IEEE Computer Architecture Letters (COMPUTER SCIENCE, HARDWARE & ARCHITECTURE)

CiteScore: 4.60

Self-citation rate: 4.30%

About the journal: IEEE Computer Architecture Letters is a rigorously peer-reviewed forum for publishing early, high-impact results in the areas of uni- and multiprocessor computer systems, computer architecture, microarchitecture, workload characterization, performance evaluation and simulation techniques, and power-aware computing. Submissions are welcomed on any topic in computer architecture, especially but not limited to: microprocessor and multiprocessor systems, microarchitecture and ILP processors, workload characterization, performance evaluation and simulation techniques, compiler-hardware and operating system-hardware interactions, interconnect architectures, memory and cache systems, power and thermal issues at the architecture level, I/O architectures and techniques, independent validation of previously published results, analysis of unsuccessful techniques, domain-specific processor architectures (e.g., embedded, graphics, network, etc.), real-time and high-availability architectures, reconfigurable systems.