Exploring GPU sharing techniques for edge AI smart city applications

IF 1.6 | Zone 4, Computer Science | JCR Q3, ENGINEERING, ELECTRICAL & ELECTRONIC
Sooyeon Woo, Jihwan Yeo, Jinhong Kim, Kyungwoon Lee
ETRI Journal, vol. 47, no. 5, pp. 855-864. Published 2025-09-30. DOI: 10.4218/etrij.2025-0065
https://onlinelibrary.wiley.com/doi/10.4218/etrij.2025-0065
Citations: 0

Abstract


The growing adoption of edge AI in smart city applications such as traffic management, surveillance, and environmental monitoring necessitates efficient computational strategies to satisfy the requirements for low latency and high accuracy. This study investigated GPU sharing techniques to improve resource utilization and throughput when running multiple AI applications simultaneously on edge devices. Using the NVIDIA Jetson AGX Orin platform and object detection workloads with the YOLOv8 model, we explored the performance tradeoffs of the threading and multiprocessing approaches. Our findings reveal distinct advantages and limitations. Threading minimizes memory usage by sharing CUDA contexts, whereas multiprocessing achieves higher GPU utilization and shorter inference times by leveraging independent CUDA contexts. However, scalability challenges arise from resource contention and synchronization overheads. This study provides insights into optimizing GPU sharing for edge AI applications, highlighting key tradeoffs and opportunities for enhancing performance in resource-constrained environments.
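
The threading versus multiprocessing comparison described above can be sketched roughly as follows. This is not the authors' benchmark harness, which is not shown here; it is a minimal standard-library illustration of the two launch modes. `fake_inference`, `run_app`, and `run_concurrently` are hypothetical names, and the sleep call merely stands in for a real YOLOv8 inference. In a real setup, threads in one process would typically share a single CUDA context, while each spawned process would create its own.

```python
import multiprocessing as mp
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from typing import Dict, Tuple


def fake_inference(frame_id: int) -> int:
    """Hypothetical placeholder for one model call (e.g. a YOLOv8 detection)."""
    time.sleep(0.001)  # stand-in for GPU work; no real model is loaded here
    return frame_id


def run_app(app_id: int, n_frames: int) -> Tuple[int, float]:
    """One 'application': process n_frames, return (app_id, mean seconds per frame)."""
    start = time.perf_counter()
    for frame_id in range(n_frames):
        fake_inference(frame_id)
    elapsed = time.perf_counter() - start
    return app_id, elapsed / max(n_frames, 1)


def run_concurrently(n_apps: int, n_frames: int, mode: str) -> Dict[int, float]:
    """Run n_apps applications at once, as threads or as separate processes."""
    if mode == "thread":
        # Threads live in one process, so they would share one CUDA context.
        pool = ThreadPoolExecutor(max_workers=n_apps)
    elif mode == "process":
        # Separate processes would each hold an independent CUDA context.
        # "spawn" is used because CUDA is generally not safe to use after fork.
        pool = ProcessPoolExecutor(
            max_workers=n_apps, mp_context=mp.get_context("spawn")
        )
    else:
        raise ValueError("mode must be 'thread' or 'process'")
    with pool:
        futures = [pool.submit(run_app, app_id, n_frames) for app_id in range(n_apps)]
        return dict(future.result() for future in futures)
```

Calling `run_concurrently(4, 100, "thread")` and `run_concurrently(4, 100, "process")` returns per-application mean frame times that can be compared across modes. The process mode should be invoked from under an `if __name__ == "__main__":` guard in a script file, since the spawn start method re-imports the main module in each child. With the placeholder workload the numbers say nothing about GPU behavior; they only show where real measurements would plug in.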

Source journal
ETRI Journal (Engineering Technology, Telecommunications)
CiteScore: 4.00
Self-citation rate: 7.10%
Articles published: 98
Review time: 6.9 months
Journal description: ETRI Journal is an international, peer-reviewed multidisciplinary journal published bimonthly in English. The main focus of the journal is to provide an open forum to exchange innovative ideas and technology in the fields of information, telecommunications, and electronics. Key topics of interest include high-performance computing, big data analytics, cloud computing, multimedia technology, communication networks and services, wireless communications and mobile computing, material and component technology, as well as security. With an international editorial committee and experts from around the world as reviewers, ETRI Journal publishes high-quality research papers on the latest and best developments from the global community.