Latency-aware service placement for GenAI at the edge

Bipul Thapa, Lena Mashayekhy
{"title":"针对边缘 GenAI 的延迟感知服务布局","authors":"Bipul Thapa, Lena Mashayekhy","doi":"10.1117/12.3013437","DOIUrl":null,"url":null,"abstract":"In the rapidly evolving landscape of artificial intelligence, Large Language Models (LLMs) and Generative AI (GenAI) have emerged as front-runners in shaping the next generation of intelligent applications, where human-like data generation is necessary. While their capabilities have shown transformative potential in centralized computing environments, there is a growing shift towards decentralized edge AI models, where computations are orchestrated closer to data sources to provide immediate insights, faster response times, and localized intelligence without the overhead of cloud communication. For latency-critical applications like autonomous vehicle driving, GenAI at the edge is vital, allowing vehicles to instantly generate and adapt driving strategies based on ever-changing road conditions and traffic patterns. In this paper, we propose a latency-aware service placement approach, designed for the seamless deployment of GenAI services on these cloudlets. We represent GenAI as a Direct Acyclic Graph, where GenAI operations represent the nodes and the dependencies between these operations represent the edges. We propose an Ant Colony Optimization approach that guides the placement of GenAI services at the edge based on capabilities of cloudlets and network conditions. Through experimental validation, we achieve notable GenAI performance at the edge with lower latency and efficient resource utilization. This advancement is expected to revolutionize and innovate in the field of GenAI, paving the way for more efficient and transformative applications at the edge.","PeriodicalId":178341,"journal":{"name":"Defense + Commercial Sensing","volume":"60 2","pages":"130580G - 130580G-14"},"PeriodicalIF":0.0000,"publicationDate":"2024-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Latency-aware service placement for GenAI at the edge\",\"authors\":\"Bipul Thapa, Lena Mashayekhy\",\"doi\":\"10.1117/12.3013437\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In the rapidly evolving landscape of artificial intelligence, Large Language Models (LLMs) and Generative AI (GenAI) have emerged as front-runners in shaping the next generation of intelligent applications, where human-like data generation is necessary. While their capabilities have shown transformative potential in centralized computing environments, there is a growing shift towards decentralized edge AI models, where computations are orchestrated closer to data sources to provide immediate insights, faster response times, and localized intelligence without the overhead of cloud communication. For latency-critical applications like autonomous vehicle driving, GenAI at the edge is vital, allowing vehicles to instantly generate and adapt driving strategies based on ever-changing road conditions and traffic patterns. In this paper, we propose a latency-aware service placement approach, designed for the seamless deployment of GenAI services on these cloudlets. We represent GenAI as a Direct Acyclic Graph, where GenAI operations represent the nodes and the dependencies between these operations represent the edges. We propose an Ant Colony Optimization approach that guides the placement of GenAI services at the edge based on capabilities of cloudlets and network conditions. 
Through experimental validation, we achieve notable GenAI performance at the edge with lower latency and efficient resource utilization. This advancement is expected to revolutionize and innovate in the field of GenAI, paving the way for more efficient and transformative applications at the edge.\",\"PeriodicalId\":178341,\"journal\":{\"name\":\"Defense + Commercial Sensing\",\"volume\":\"60 2\",\"pages\":\"130580G - 130580G-14\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-06-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Defense + Commercial Sensing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1117/12.3013437\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Defense + Commercial Sensing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1117/12.3013437","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

In the rapidly evolving landscape of artificial intelligence, Large Language Models (LLMs) and Generative AI (GenAI) have emerged as front-runners in shaping the next generation of intelligent applications, where human-like data generation is necessary. While their capabilities have shown transformative potential in centralized computing environments, there is a growing shift towards decentralized edge AI models, where computations are orchestrated closer to data sources to provide immediate insights, faster response times, and localized intelligence without the overhead of cloud communication. For latency-critical applications such as autonomous driving, GenAI at the edge is vital, allowing vehicles to instantly generate and adapt driving strategies based on ever-changing road conditions and traffic patterns. In this paper, we propose a latency-aware service placement approach designed for the seamless deployment of GenAI services on edge cloudlets. We represent a GenAI service as a Directed Acyclic Graph, where GenAI operations are the nodes and the dependencies between these operations are the edges. We propose an Ant Colony Optimization approach that guides the placement of GenAI services at the edge based on the capabilities of cloudlets and network conditions. Through experimental validation, we achieve notable GenAI performance at the edge with lower latency and efficient resource utilization. This advancement is expected to drive innovation in the field of GenAI, paving the way for more efficient and transformative applications at the edge.
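
The abstract describes two technical ingredients: a GenAI service modeled as a Directed Acyclic Graph (operations as nodes, dependencies as edges) and an Ant Colony Optimization heuristic that places those operations on edge cloudlets according to cloudlet capability and network conditions. The sketch below is a minimal, hypothetical illustration of that idea, not the authors' implementation; the operation names, cloudlet speeds, link latencies, and the latency scoring model are all assumed for demonstration.

```python
"""Illustrative sketch only: ACO placement of a GenAI operation DAG onto
edge cloudlets to minimize estimated end-to-end latency. All values and
names are hypothetical assumptions, not from the paper."""
import random

# GenAI service as a DAG: workload[op] = compute demand, deps[op] = upstream ops.
workload = {"tokenize": 1.0, "encode": 4.0, "generate": 8.0, "rank": 2.0, "decode": 3.0}
deps = {"tokenize": [], "encode": ["tokenize"], "generate": ["encode"],
        "rank": ["encode"], "decode": ["generate", "rank"]}

# Hypothetical edge infrastructure: cloudlet speeds and inter-cloudlet link delay (ms).
speed = {"c0": 2.0, "c1": 4.0, "c2": 8.0}
link = {(a, b): (0.0 if a == b else 5.0) for a in speed for b in speed}

def latency(placement):
    """Finish time of each op = max upstream finish + transfer delay + compute time."""
    finish = {}
    for op in workload:  # dict insertion order is a valid topological order here
        compute = workload[op] / speed[placement[op]]
        arrive = max((finish[d] + link[(placement[d], placement[op])] for d in deps[op]),
                     default=0.0)
        finish[op] = arrive + compute
    return max(finish.values())

def aco_place(n_ants=20, n_iters=50, evap=0.1, seed=0):
    rng = random.Random(seed)
    ops, clouds = list(workload), list(speed)
    pher = {(op, c): 1.0 for op in ops for c in clouds}  # pheromone on (op, cloudlet) pairs
    best, best_lat = None, float("inf")
    for _ in range(n_iters):
        for _ in range(n_ants):
            # Each ant samples a placement, biased by pheromone and a speed heuristic.
            plc = {op: rng.choices(clouds,
                                   weights=[pher[(op, c)] * speed[c] for c in clouds])[0]
                   for op in ops}
            lat = latency(plc)
            if lat < best_lat:
                best, best_lat = plc, lat
        # Evaporate pheromone, then reinforce the best placement found so far.
        for key in pher:
            pher[key] *= (1.0 - evap)
        for op, c in best.items():
            pher[(op, c)] += 1.0 / best_lat
    return best, best_lat

if __name__ == "__main__":
    placement, total_ms = aco_place()
    print(f"placement: {placement}  estimated latency: {total_ms:.2f} ms")
```

Running the script prints one candidate placement and its estimated end-to-end latency; the paper's actual formulation, constraints, and experimental evaluation are in the full text at the DOI above.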