Faster and Stronger: Unleashing Data Processing Potential Through Hardware Heterogeneity

IF 8.9 · CAS Zone 1 (Computer Science) · JCR Q1 (Computer Science, Information Systems)

IEEE Internet of Things Journal · Pub Date: 2025-01-07 · DOI: 10.1109/JIOT.2025.3526662

Cong Wang; Yang Luo; Wenzhuo Du; Ke Wang; Naijie Gu; Jun Yu

Vol. 12, No. 10, pp. 14559-14576 · https://ieeexplore.ieee.org/document/10829854/

Citations: 0

Abstract



With the rapid advancement of AI technology, demand for computational resources has surged. In deep learning, machine learning, and large-scale data analysis in particular, processing extensive datasets requires exceptionally high computational efficiency and speed. Conventional homogeneous computing platforms, which rely predominantly on central processing units (CPUs), struggle to meet the escalating demands of high-performance computing. This study therefore advocates heterogeneous hardware acceleration: strategically migrating data operations from the CPU to other hardware components, e.g., graphics processing units (GPUs) and neural processing units (NPUs), to improve processing efficiency and computational performance during the data preprocessing phase. We conducted experiments to evaluate how heterogeneous hardware acceleration affects data processing speed under various workloads and system hardware configurations. By adjusting parameters such as batch size and CPU utilization, we compared frameworks that support hardware heterogeneity with popular deep learning frameworks (e.g., PyTorch and TensorFlow) across various hardware configurations and neural network models. The empirical results show that the framework optimized with heterogeneous hardware acceleration improved preprocessing speed in every experimental environment tested, indicating broad applicability and superior performance. Code is available at https://github.com/mindspore-ai/mindspore.
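The abstract compares preprocessing speed while varying parameters such as batch size. As a rough illustration of that kind of measurement (not the authors' benchmark), the sketch below times a preprocessing callable at several batch sizes and reports samples per second. All names here (`normalize_batch`, `measure_throughput`, `sweep_batch_sizes`) are hypothetical, and the toy CPU normalization only stands in for a real pipeline; an accelerator-backed pipeline would be passed in as the `preprocess` callable instead.

```python
import time
from typing import Callable, Dict, List, Sequence

Batch = List[List[float]]


def normalize_batch(batch: Batch) -> Batch:
    """Toy CPU preprocessing step (assumed): scale 8-bit values to [0, 1]."""
    return [[px / 255.0 for px in sample] for sample in batch]


def measure_throughput(
    preprocess: Callable[[Batch], Batch],
    samples: Sequence[List[float]],
    batch_size: int,
) -> float:
    """Return samples processed per second for one pass over `samples`."""
    start = time.perf_counter()
    for i in range(0, len(samples), batch_size):
        preprocess(list(samples[i:i + batch_size]))
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very small inputs.
    return len(samples) / max(elapsed, 1e-9)


def sweep_batch_sizes(
    preprocess: Callable[[Batch], Batch],
    samples: Sequence[List[float]],
    batch_sizes: Sequence[int],
) -> Dict[int, float]:
    """Measure throughput of the same pipeline at each batch size."""
    return {bs: measure_throughput(preprocess, samples, bs) for bs in batch_sizes}


if __name__ == "__main__":
    # Synthetic data: 512 samples of 1024 values each (illustrative only).
    data = [[float(v % 256) for v in range(1024)] for _ in range(512)]
    for bs, rate in sweep_batch_sizes(normalize_batch, data, [16, 64, 256]).items():
        print(f"batch_size={bs:4d}  {rate:,.0f} samples/s")
```

Running the same sweep with two different `preprocess` callables (for example, a CPU pipeline and a device-offloaded one) gives directly comparable throughput figures per batch size. A real study would also repeat runs and control CPU utilization, which this sketch omits.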


Source journal

IEEE Internet of Things Journal (Computer Science, Information Systems)

CiteScore: 17.60
Self-citation rate: 13.20%
Articles published: 1982

Journal description: The IEEE Internet of Things (IoT) Journal publishes articles and review articles covering various aspects of IoT, including IoT system architecture, IoT enabling technologies, IoT communication and networking protocols such as network coding, and IoT services and applications. Topics encompass IoT's impacts on sensor technologies, big data management, and future Internet design for applications like smart cities and smart homes. Fields of interest include IoT architectures such as things-centric, data-centric, and service-oriented IoT architecture; IoT enabling technologies and systematic integration such as sensor technologies, big sensor data management, and future Internet design for IoT; IoT services, applications, and test-beds such as IoT service middleware, IoT application programming interfaces (APIs), IoT application design, and IoT trials/experiments; and IoT standardization activities and technology development in different standard development organizations (SDOs) such as IEEE, IETF, ITU, 3GPP, and ETSI.