Cascade: A Platform for Delay-Sensitive Edge Intelligence

arXiv - CS - Operating Systems | Pub Date: 2023-11-29 | DOI: arxiv-2311.17329

Weijia Song, Thiago Garrett, Yuting Yang, Mingzhao Liu, Edward Tremel, Lorenzo Rosa, Andrea Merlina, Roman Vitenberg, Ken Birman

Citations: 0

Abstract

Interactive intelligent computing applications are increasingly prevalent, creating a need for AI/ML platforms optimized to reduce per-event latency while maintaining high throughput and efficient resource management. Yet many intelligent applications run on AI/ML platforms that optimize for high throughput even at the cost of high tail latency. Cascade is a new AI/ML hosting platform intended to untangle this puzzle. Innovations include a legacy-friendly storage layer that moves data with minimal copying and a "fast path" that collocates data and computation to maximize responsiveness. Our evaluation shows that Cascade reduces latency by orders of magnitude with no loss of throughput.



