Multimodal AI for Home Wound Patient Referral Decisions From Images With Specialist Annotations

IF 4.4 | Zone 3 (Medicine) | JCR Q2, ENGINEERING, BIOMEDICAL

IEEE Journal of Translational Engineering in Health and Medicine-Jtehm Pub Date : 2025-07-11 DOI:10.1109/JTEHM.2025.3588427

Reza Saadati Fard;Emmanuel Agu;Palawat Busaranuvong;Deepak Kumar;Shefalika Gautam;Bengisu Tulu;Diane Strong

Volume 13, pages 341-353. Publication type: Journal Article. Full text: https://ieeexplore.ieee.org/document/11078373/

Citations: 0

Abstract


Multimodal AI for Home Wound Patient Referral Decisions From Images With Specialist Annotations

Chronic wounds affect 8.5 million Americans, especially the elderly and patients with diabetes. As regular care is critical for proper healing, many patients receive care in their homes from visiting nurses and caregivers with variable wound expertise. Problematic, non-healing wounds should be referred to experts in wound clinics to avoid adverse outcomes such as limb amputations. Unfortunately, due to the lack of wound expertise, referral decisions made in non-clinical settings can be erroneous, delayed or unnecessary. This paper proposes the Deep Multimodal Wound Assessment Tool (DM-WAT), a novel machine learning framework to support visiting nurses by recommending wound referral decisions from smartphone-captured wound images and associated clinical notes. DM-WAT extracts visual features from wound images using DeiT-Base-Distilled, a Vision Transformer (ViT) architecture. Distillation-based training facilitates representation learning and knowledge transfer from a larger teacher model to DeiT-Base, enabling robust performance on our small wound image dataset of 205 wound images. DM-WAT extracts text features from clinical notes using DeBERTa-base, which comprehends context by disentangling content and position information from clinical notes. Visual and text features are combined using an intermediate fusion approach. To overcome the challenges posed by a small and imbalanced dataset, DM-WAT integrates image and text augmentation along with transfer learning via pre-trained feature extractors to achieve high performance. In rigorous evaluation, DM-WAT achieved an accuracy of 77% ± 3% and an F1 score of 70% ± 2%, outperforming the prior state of the art and all baseline single-modality and multimodal approaches. Additionally, to interpret DM-WAT’s recommendations, the Score-CAM and Captum interpretation algorithms provided insights into the specific parts of the image and text inputs that the model focused on during decision-making.
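The intermediate fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the features are fused, so concatenation followed by a small classifier is an assumption here, as are the class name `IntermediateFusionHead`, the 768-dimensional feature sizes, the hidden width, the three output classes, and the random stand-in feature vectors. In a real pipeline the two vectors would come from the pre-trained image and text encoders.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output value per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(x):
    # Subtract the maximum for numerical stability.
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (untrained, illustration only)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class IntermediateFusionHead:
    """Hypothetical fusion head: concatenate per-modality features, then classify."""

    def __init__(self, image_dim, text_dim, hidden_dim, num_classes, seed=0):
        rng = random.Random(seed)
        self.image_dim = image_dim
        self.text_dim = text_dim
        self.w1, self.b1 = init_layer(rng, hidden_dim, image_dim + text_dim)
        self.w2, self.b2 = init_layer(rng, num_classes, hidden_dim)

    def forward(self, image_features, text_features):
        if len(image_features) != self.image_dim or len(text_features) != self.text_dim:
            raise ValueError("feature vector has unexpected length")
        # Fusion happens at the feature level (assumed here to be concatenation),
        # after each encoder has run but before any decision is made.
        fused = list(image_features) + list(text_features)
        hidden = relu(linear(fused, self.w1, self.b1))
        return softmax(linear(hidden, self.w2, self.b2))


# Stand-in feature vectors; sizes and class count are assumptions.
rng = random.Random(1)
image_features = [rng.gauss(0.0, 1.0) for _ in range(768)]
text_features = [rng.gauss(0.0, 1.0) for _ in range(768)]

head = IntermediateFusionHead(image_dim=768, text_dim=768, hidden_dim=32, num_classes=3)
probs = head.forward(image_features, text_features)
print(probs)
```

Fusing at the feature level, rather than averaging the outputs of two separate classifiers, lets the classifier learn interactions between what the image shows and what the note says. The weights above are untrained, so the printed probabilities carry no clinical meaning.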


Source journal

IEEE Journal of Translational Engineering in Health and Medicine-Jtehm (Engineering: Biomedical Engineering)

CiteScore: 7.40

Self-citation rate: 2.90%

Articles published: (no value given)

Review time: 27 weeks

About the journal: The IEEE Journal of Translational Engineering in Health and Medicine is an open access product that bridges the engineering and clinical worlds, focusing on detailed descriptions of advanced technical solutions to a clinical need along with clinical results and healthcare relevance. The journal provides a platform for state-of-the-art technology directions in the interdisciplinary field of biomedical engineering, embracing engineering, life sciences and medicine. A unique aspect of the journal is its ability to foster collaboration between physicians and engineers in presenting broad and compelling real-world technological and engineering solutions that can be implemented in the interest of improving quality of patient care and treatment outcomes, thereby reducing costs and improving efficiency. The journal provides an active forum for clinical research and relevant state-of-the-art technology for members of all the IEEE societies that have an interest in biomedical engineering, and it reaches out directly to physicians and the medical community through the American Medical Association (AMA) and other clinical societies. The scope of the journal includes, but is not limited to, topics on: medical devices, healthcare delivery systems, global healthcare initiatives, and ICT-based services; technological relevance to healthcare cost reduction; technology affecting healthcare management, decision-making, and policy; advanced technical work that is applied to solving specific clinical needs.