An audiovisual cognitive optimization strategy guided by salient object ranking for intelligent visual prosthesis systems.

Junling Liang, Heng Li, Xinyu Chai, Qi Gao, Meixuan Zhou, Tianruo Guo, Yao Chen, Liqing Di
Journal of Neural Engineering, published 2024-11-19.
DOI: 10.1088/1741-2552/ad94a4 (https://doi.org/10.1088/1741-2552/ad94a4)

Abstract

Objective: Visual prostheses are effective tools for restoring vision, yet real-world complexity poses ongoing challenges. Progress in AI has given rise to the concept of intelligent visual prostheses with auditory support, which leverage deep learning to create practical artificial visual perception rather than merely restoring natural sight to the blind.

Approach: This study introduces an object-based attention mechanism that simulates the human gaze, which shifts across salient physical regions when observing the external world. By recasting this mechanism as a ranking problem over salient entity regions, we incorporate prior visual-attention cues to build a new salient object ranking dataset, and propose a salient object ranking (SaOR) network intended to provide depth perception for prosthetic vision. Furthermore, we propose a SaOR-guided image description method that aligns with human observation patterns, providing additional visual information through auditory feedback. Finally, integrating these two algorithms yields an audiovisual cognitive optimization strategy for prosthetic vision.

Main results: Through psychophysical experiments based on scene description tasks under simulated prosthetic vision, we verify that the SaOR method improves subjects' performance in identifying objects and understanding the relationships among them. Additionally, the cognitive optimization strategy incorporating image description further enhances their prosthetic visual cognition.

Significance: This work offers valuable technical insights for designing next-generation intelligent visual prostheses and lays theoretical groundwork for developing their visual information processing strategies. Code will be made publicly available.
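The core of the SaOR step described in the Approach, ordering detected salient regions so the most attention-worthy objects come first, can be sketched minimally. This is a hypothetical illustration, not the authors' implementation: the `Region` type and the saliency scores stand in for the outputs of a detection/ranking network.

```python
# Hypothetical sketch: salient object ranking reduces to ordering
# detected regions by a model-predicted saliency score.
from dataclasses import dataclass

@dataclass
class Region:
    label: str
    saliency: float  # predicted saliency score; higher = more salient

def rank_salient_objects(regions):
    """Return regions ordered from most to least salient."""
    return sorted(regions, key=lambda r: r.saliency, reverse=True)

# Toy detections standing in for a network's output on one scene.
regions = [Region("cup", 0.35), Region("person", 0.91), Region("table", 0.60)]
ranking = rank_salient_objects(regions)
print([r.label for r in ranking])  # most salient first
```

In the described pipeline, a ranking like this would determine both which regions are emphasized in the phosphene rendering and the order in which the SaOR-guided image description mentions objects in the auditory feedback.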
