EmBARDiment: an Embodied AI Agent for Productivity in XR

Riccardo Bovo, Steven Abreu, Karan Ahuja, Eric J Gonzalez, Li-Te Cheng, Mar Gonzalez-Franco
arXiv - CS - Multiagent Systems · arXiv:2408.08158 · Published 2024-08-15
Citations: 0

Abstract

XR devices running chat-bots powered by Large Language Models (LLMs) have tremendous potential as always-on agents that can enable much better productivity scenarios. However, screen-based chat-bots do not take advantage of the full suite of natural inputs available in XR, including inward-facing sensor data; instead, they over-rely on explicit voice or text prompts, sometimes paired with multi-modal data dropped as part of the query. We propose a solution that leverages an attention framework that derives context implicitly from user actions, eye-gaze, and contextual memory within the XR environment. This minimizes the need for engineered explicit prompts, fostering grounded and intuitive interactions that glean user insights for the chat-bot. Our user studies demonstrate the imminent feasibility and transformative potential of our approach to streamline user interaction in XR with chat-bots, while offering insights for the design of future XR-embodied LLM agents.
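The abstract does not detail how gaze-derived context is folded into a query, so the following is only an illustrative sketch, not the paper's implementation: one plausible reading of "deriving context implicitly from eye-gaze" is to accumulate dwell time per gazed-at element and prepend the most-attended elements to the user's utterance. All class, method, and element names below are hypothetical.

```python
from collections import defaultdict


class GazeContextBuffer:
    """Hypothetical sketch: rank UI elements by accumulated gaze dwell
    time and surface the top-k as implicit context for a chat-bot query.
    This is an assumption-laden illustration, not the paper's system."""

    def __init__(self, top_k: int = 3):
        self.top_k = top_k
        self.dwell = defaultdict(float)  # element label -> seconds gazed

    def record_gaze(self, element: str, seconds: float) -> None:
        # Each gaze sample adds to the element's cumulative dwell time.
        self.dwell[element] += seconds

    def implicit_context(self) -> list:
        # Sort element labels by dwell time, descending; keep the top-k.
        ranked = sorted(self.dwell, key=self.dwell.get, reverse=True)
        return ranked[: self.top_k]

    def build_prompt(self, user_utterance: str) -> str:
        # Inject the attended elements ahead of the explicit utterance,
        # so "summarize this" can be grounded without an engineered prompt.
        context = ", ".join(self.implicit_context())
        return f"[User is looking at: {context}]\n{user_utterance}"


buf = GazeContextBuffer(top_k=2)
buf.record_gaze("budget_spreadsheet", 4.2)
buf.record_gaze("email_draft", 1.1)
buf.record_gaze("budget_spreadsheet", 2.0)
print(buf.build_prompt("summarize this"))
```

A real system would also decay old dwell values and fuse gaze with the user-action and contextual-memory signals the abstract mentions; this sketch covers only the gaze-ranking step.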