Eye Tracking-Enhanced Deep Learning for Medical Image Analysis: A Systematic Review on Data Efficiency, Interpretability, and Multimodal Integration.

Impact factor 3.7 · CAS Zone 3 (Medicine) · JCR Q2 (Engineering, Biomedical)

Bioengineering Pub Date : 2025-09-05 DOI:10.3390/bioengineering12090954

Jiangxia Duan, Meiwei Zhang, Minghui Song, Xiaopan Xu, Hongbing Lu

Bioengineering, vol. 12, no. 9. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12467291/pdf/

Citations: 0

Abstract

Deep learning (DL) has revolutionized medical image analysis (MIA), enabling early anomaly detection, precise lesion segmentation, and automated disease classification. However, its clinical integration faces two major challenges: reliance on limited, narrowly annotated datasets that inadequately capture real-world patient diversity, and the inherent "black-box" nature of DL decision-making, which complicates physician scrutiny and accountability. Eye tracking (ET) technology offers a transformative solution by capturing radiologists' gaze patterns to generate supervisory signals. These signals enhance DL models through two key mechanisms: providing weak supervision to improve feature recognition and diagnostic accuracy, particularly when labeled data are scarce, and enabling direct comparison between machine and human attention to bridge interpretability gaps and build clinician trust. This approach also extends effectively to multimodal learning models (MLMs) and vision-language models (VLMs), supporting the alignment of machine reasoning with clinical expertise by grounding visual observations in diagnostic context, refining attention mechanisms, and validating complex decision pathways. Conducted in accordance with the PRISMA statement and registered in PROSPERO (ID: CRD42024569630), this review synthesizes state-of-the-art strategies for ET-DL integration. We further propose a unified framework in which ET innovatively serves as a data efficiency optimizer, a model interpretability validator, and a multimodal alignment supervisor. This framework paves the way for clinician-centered AI systems that prioritize verifiable reasoning, seamless workflow integration, and intelligible performance, thereby addressing key implementation barriers and outlining a path for future clinical deployment.
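The two mechanisms named in the abstract, gaze as a weak supervisory signal and direct comparison of machine and human attention, can be illustrated with a minimal sketch. This is not taken from the review: the function names, the Gaussian fixation model, the KL-divergence alignment term, and the `weight` parameter are all assumptions chosen for illustration, and a real pipeline would operate on tensors inside a training loop rather than on nested lists.

```python
import math

def normalize(grid, eps=1e-12):
    """Scale a 2D map so its entries sum to (approximately) 1."""
    total = sum(sum(row) for row in grid) + eps
    return [[v / total for v in row] for row in grid]

def gaze_heatmap(fixations, height, width, sigma=1.5):
    """Turn (row, col, duration) fixations into a normalized heatmap.

    Assumption: each fixation is spread as an isotropic Gaussian
    weighted by its duration.
    """
    grid = [[0.0] * width for _ in range(height)]
    for r0, c0, dur in fixations:
        for r in range(height):
            for c in range(width):
                d2 = (r - r0) ** 2 + (c - c0) ** 2
                grid[r][c] += dur * math.exp(-d2 / (2 * sigma ** 2))
    return normalize(grid)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two normalized maps of equal shape."""
    return sum(
        pv * math.log((pv + eps) / (qv + eps))
        for prow, qrow in zip(p, q)
        for pv, qv in zip(prow, qrow)
    )

def gaze_guided_loss(task_loss, model_attention, human_gaze, weight=0.1):
    """Hypothetical combined objective: task loss plus an attention-alignment
    penalty that grows as model attention departs from the human gaze map."""
    penalty = kl_divergence(human_gaze, normalize(model_attention))
    return task_loss + weight * penalty

# Usage: two fixations on an 8x8 image, compared against uniform attention.
gaze = gaze_heatmap([(2, 2, 0.4), (5, 6, 0.8)], height=8, width=8)
uniform_attention = [[1.0] * 8 for _ in range(8)]
mismatch = kl_divergence(gaze, normalize(uniform_attention))
loss = gaze_guided_loss(0.7, uniform_attention, gaze)
```

The same divergence serves both roles in this sketch: added to the training loss it acts as weak supervision, and computed after training it gives a simple score for how closely model attention matches where the reader looked.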


Source journal

Bioengineering (Chemical Engineering: Bioengineering)

CiteScore: 4.00

Self-citation rate: 8.70%

Articles published: 661

About the journal

Aims

Bioengineering (ISSN 2306-5354) provides an advanced forum for the science and technology of bioengineering. It publishes original research papers, comprehensive reviews, communications and case reports. Our aim is to encourage scientists to publish their experimental and theoretical results in as much detail as possible. All aspects of bioengineering are welcome, from theoretical concepts to education and applications. There is no restriction on the length of the papers. The full experimental details must be provided so that the results can be reproduced.

There are, in addition, four key features of this journal:

● We are introducing a new concept in scientific and technical publications: "The Translational Case Report in Bioengineering". It is a descriptive explanatory analysis of a transformative or translational event. Understanding that the goal of bioengineering scholarship is to advance towards a transformative or clinical solution to an identified transformative/clinical need, the translational case report is used to explore causation in order to find underlying principles that may guide other similar transformative/translational undertakings.
● Manuscripts regarding research proposals and research ideas are particularly welcome.
● Electronic files and software regarding the full details of the calculation and experimental procedure, if unable to be published in a normal way, can be deposited as supplementary material.
● We also accept manuscripts communicating to a broader audience with regard to research projects financed with public funds.

Scope

● Bionics and biological cybernetics: implantology; bio–abio interfaces
● Bioelectronics: wearable electronics; implantable electronics; "more than Moore" electronics; bioelectronics devices
● Bioprocess and biosystems engineering and applications: bioprocess design; biocatalysis; bioseparation and bioreactors; bioinformatics; bioenergy; etc.
● Biomolecular, cellular and tissue engineering and applications: tissue engineering; chromosome engineering; embryo engineering; cellular, molecular and synthetic biology; metabolic engineering; bio-nanotechnology; micro/nano technologies; genetic engineering; transgenic technology
● Biomedical engineering and applications: biomechatronics; biomedical electronics; biomechanics; biomaterials; biomimetics; biomedical diagnostics; biomedical therapy; biomedical devices; sensors and circuits; biomedical imaging and medical information systems; implants and regenerative medicine; neurotechnology; clinical engineering; rehabilitation engineering
● Biochemical engineering and applications: metabolic pathway engineering; modeling and simulation
● Translational bioengineering