Hand-Priming in Object Localization for Assistive Egocentric Vision.

Kyungjun Lee, Abhinav Shrivastava, Hernisa Kacorri
{"title":"Hand-Priming in Object Localization for Assistive Egocentric Vision.","authors":"Kyungjun Lee, Abhinav Shrivastava, Hernisa Kacorri","doi":"10.1109/wacv45572.2020.9093353","DOIUrl":null,"url":null,"abstract":"<p><p>Egocentric vision holds great promises for increasing access to visual information and improving the quality of life for people with visual impairments, with object recognition being one of the daily challenges for this population. While we strive to improve recognition performance, it remains difficult to identify which object is of interest to the user; the object may not even be included in the frame due to challenges in camera aiming without visual feedback. Also, gaze information, commonly used to infer the area of interest in egocentric vision, is often not dependable. However, blind users often tend to include their hand either interacting with the object that they wish to recognize or simply placing it in proximity for better camera aiming. We propose localization models that leverage the presence of the hand as the contextual information for priming the center area of the object of interest. In our approach, hand segmentation is fed to either the entire localization network or its last convolutional layers. Using egocentric datasets from sighted and blind individuals, we show that the hand-priming achieves higher precision than other approaches, such as fine-tuning, multi-class, and multi-task learning, which also encode hand-object interactions in localization.</p>","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2020-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7423407/pdf/nihms-1609047.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/wacv45572.2020.9093353","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2020/5/14 0:00:00","PubModel":"Epub","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Egocentric vision holds great promises for increasing access to visual information and improving the quality of life for people with visual impairments, with object recognition being one of the daily challenges for this population. While we strive to improve recognition performance, it remains difficult to identify which object is of interest to the user; the object may not even be included in the frame due to challenges in camera aiming without visual feedback. Also, gaze information, commonly used to infer the area of interest in egocentric vision, is often not dependable. However, blind users often tend to include their hand either interacting with the object that they wish to recognize or simply placing it in proximity for better camera aiming. We propose localization models that leverage the presence of the hand as the contextual information for priming the center area of the object of interest. In our approach, hand segmentation is fed to either the entire localization network or its last convolutional layers. Using egocentric datasets from sighted and blind individuals, we show that the hand-priming achieves higher precision than other approaches, such as fine-tuning, multi-class, and multi-task learning, which also encode hand-object interactions in localization.

辅助性眼心视觉物体定位中的手部定位
以自我为中心的视觉在增加视觉信息的获取和改善视障人士的生活质量方面大有可为,而物体识别则是视障人士面临的日常挑战之一。在我们努力提高识别性能的同时,识别用户感兴趣的物体仍然很困难;在没有视觉反馈的情况下,由于相机瞄准方面的挑战,物体甚至可能不在画面中。此外,在自我中心视觉中通常用于推断感兴趣区域的注视信息往往并不可靠。然而,盲人用户通常倾向于将他们的手与他们希望识别的物体进行互动,或者只是将其放在附近以更好地瞄准摄像机。我们提出的定位模型可以利用手的存在作为背景信息,以引出感兴趣物体的中心区域。在我们的方法中,手部分割信息被输入到整个定位网络或其最后的卷积层中。通过使用视力正常和失明人士的自我中心数据集,我们证明了手部定位比其他方法(如微调、多类和多任务学习)实现了更高的精确度,其他方法也在定位中编码了手与物体之间的相互作用。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信