Attentional object spotting by integrating multimodal input

Chen Yu, D. Ballard, Shenghuo Zhu
{"title":"Attentional object spotting by integrating multimodal input","authors":"Chen Yu, D. Ballard, Shenghuo Zhu","doi":"10.1109/ICMI.2002.1167008","DOIUrl":null,"url":null,"abstract":"An intelligent human-computer interface is expected to allow computers to work with users in a cooperative manner. To achieve this goal, computers need to be aware of user attention and provide assistance without explicit user requests. Cognitive studies of eye movements suggest that in accomplishing well-learned tasks, the performer's focus of attention is locked onto ongoing work and more than 90% of eye movements are closely related to the objects being manipulated in the tasks. In light of this, we have developed an attentional object spotting system that integrates multimodal data consisting of eye position, head position and video from the \"first-person\" perspective. To detect the user's focus of attention, we modeled eye gaze and head movements using a hidden Markov model (HMM) representation. For each attentional point in time, the object of user interest is automatically extracted and recognized. We report the results of experiments on finding attentional objects in the natural task of \"making a peanut-butter sandwich\".","PeriodicalId":208377,"journal":{"name":"Proceedings. Fourth IEEE International Conference on Multimodal Interfaces","volume":"35 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2002-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"11","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings. Fourth IEEE International Conference on Multimodal Interfaces","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMI.2002.1167008","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 11

Abstract

An intelligent human-computer interface is expected to allow computers to work with users in a cooperative manner. To achieve this goal, computers need to be aware of user attention and provide assistance without explicit user requests. Cognitive studies of eye movements suggest that in accomplishing well-learned tasks, the performer's focus of attention is locked onto ongoing work and more than 90% of eye movements are closely related to the objects being manipulated in the tasks. In light of this, we have developed an attentional object spotting system that integrates multimodal data consisting of eye position, head position and video from the "first-person" perspective. To detect the user's focus of attention, we modeled eye gaze and head movements using a hidden Markov model (HMM) representation. For each attentional point in time, the object of user interest is automatically extracted and recognized. We report the results of experiments on finding attentional objects in the natural task of "making a peanut-butter sandwich".
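The abstract only states that eye gaze and head movements are modeled with an HMM, without specifying the features or state topology. As a minimal sketch of that idea, the code below fits a two-state Gaussian HMM over per-frame gaze and head features and labels the low-gaze-velocity state as "attentional". The feature construction, the two-state topology, and the use of the third-party `hmmlearn` library are all assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (NOT the paper's implementation): segment frames into
# attentional vs. non-attentional states with a Gaussian HMM, assuming
# per-frame eye positions and head-pose features are available.
import numpy as np
from hmmlearn import hmm  # assumed dependency; any HMM library would do

def segment_attention(eye_xy, head_pose, n_states=2, seed=0):
    """Label each frame with a fixation-like ("attending") state.

    eye_xy    : (T, 2) array of gaze coordinates.
    head_pose : (T, 3) array of head position/orientation features.
    Returns a boolean mask that is True where the frame falls in the
    HMM state with the lowest mean gaze speed.
    """
    # Per-frame gaze speed is a simple stand-in for richer velocity
    # features a real eye tracker would provide.
    gaze_speed = np.linalg.norm(
        np.diff(eye_xy, axis=0, prepend=eye_xy[:1]), axis=1)
    X = np.column_stack([eye_xy, head_pose, gaze_speed])

    model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag",
                            n_iter=100, random_state=seed)
    model.fit(X)
    states = model.predict(X)

    # Treat the state whose mean gaze speed is smallest as "attending".
    speed_col = X.shape[1] - 1
    attending = int(np.argmin(model.means_[:, speed_col]))
    return states == attending

# Usage on synthetic data: 100 near-stationary frames (fixation-like)
# followed by 100 fast-moving frames (saccade-like).
rng = np.random.default_rng(0)
eye = np.vstack([rng.normal(0, 0.5, (100, 2)),
                 np.cumsum(rng.normal(0, 5, (100, 2)), axis=0)])
head = rng.normal(0, 0.1, (200, 3))
mask = segment_attention(eye, head)
print("attentional frames:", int(mask.sum()), "of", len(mask))
```

Once attentional intervals are identified this way, the corresponding first-person video frames can be cropped around the gaze point and passed to an object recognizer, which is the extraction-and-recognition step the abstract describes.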