EmBARDiment: an Embodied AI Agent for Productivity in XR

Riccardo Bovo, Steven Abreu, Karan Ahuja, Eric J Gonzalez, Li-Te Cheng, Mar Gonzalez-Franco
arXiv:2408.08158 · arXiv - CS - Multiagent Systems · Published 2024-08-15
Citations: 0

Abstract

XR devices running chat-bots powered by Large Language Models (LLMs) have tremendous potential as always-on agents that can enable much better productivity scenarios. However, screen-based chat-bots do not take advantage of the full suite of natural inputs available in XR, including inward-facing sensor data; instead, they over-rely on explicit voice or text prompts, sometimes paired with multi-modal data dropped in as part of the query. We propose a solution that leverages an attention framework to derive context implicitly from user actions, eye-gaze, and contextual memory within the XR environment. This minimizes the need for engineered explicit prompts, fostering grounded and intuitive interactions that glean user insights for the chat-bot. Our user studies demonstrate the imminent feasibility and transformative potential of our approach to streamlining user interaction with chat-bots in XR, while offering insights for the design of future XR-embodied LLM agents.