{"title":"Multi-modality multiorgan image segmentation using continual learning with enhanced hard attention to the task.","authors":"Ming-Long Wu, Yi-Fan Peng","doi":"10.1002/mp.17842","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Enabling a deep neural network (DNN) to learn multiple tasks using the concept of continual learning potentially better mimics human brain functions. However, current continual learning studies for medical image segmentation are mostly limited to single-modality images at identical anatomical locations.</p><p><strong>Purpose: </strong>To propose and evaluate a continual learning method termed eHAT (enhanced hard attention to the task) for performing multi-modality, multiorgan segmentation tasks using a DNN.</p><p><strong>Methods: </strong>Four public datasets covering the lumbar spine, heart, and brain acquired by magnetic resonance imaging (MRI) and computed tomography (CT) were included to segment the vertebral bodies, the right ventricle, and brain tumors, respectively. Three-task (spine CT, heart MRI, and brain MRI) and four-task (spine CT, heart MRI, brain MRI, and spine MRI) models were tested for eHAT, with the three-task results compared with state-of-the-art continual learning methods. The effectiveness of multitask performance was measured using the forgetting rate, defined as the average difference in Dice coefficients and Hausdorff distances between multiple-task and single-task models. The ability to transfer knowledge to different tasks was evaluated using backward transfer (BWT).</p><p><strong>Results: </strong>The forgetting rates were -2.51% to -0.60% for the three-task eHAT models with varying task orders, substantially better than the -18.13% to -3.59% using original hard attention to the task (HAT), while those in four-task models were -2.54% to -1.59%. In addition, four-task U-net models with eHAT using only half the number of channels (1/4 parameters) yielded nearly equal performance with or without regularization. A retrospective model comparison showed that eHAT with fixed or automatic regularization had significantly superior BWT (-3% to 0%) compared to HAT (-22% to -4%).</p><p><strong>Conclusion: </strong>We demonstrate for the first time that eHAT effectively achieves continual learning of multi-modality, multiorgan segmentation tasks using a single DNN, with improved forgetting rates compared with HAT.</p>","PeriodicalId":94136,"journal":{"name":"Medical physics","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2025-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Medical physics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1002/mp.17842","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Background: Enabling a deep neural network (DNN) to learn multiple tasks using the concept of continual learning potentially better mimics human brain functions. However, current continual learning studies for medical image segmentation are mostly limited to single-modality images at identical anatomical locations.
Purpose: To propose and evaluate a continual learning method termed eHAT (enhanced hard attention to the task) for performing multi-modality, multiorgan segmentation tasks using a DNN.
Methods: Four public datasets covering the lumbar spine, heart, and brain, acquired by magnetic resonance imaging (MRI) and computed tomography (CT), were included to segment the vertebral bodies, the right ventricle, and brain tumors, respectively. Three-task (spine CT, heart MRI, and brain MRI) and four-task (spine CT, heart MRI, brain MRI, and spine MRI) models were tested for eHAT, and the three-task results were compared with state-of-the-art continual learning methods. Multitask performance was quantified using the forgetting rate, defined as the average difference in Dice coefficients and Hausdorff distances between the multiple-task and single-task models. The ability to transfer knowledge across tasks was evaluated using backward transfer (BWT).
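As a concrete reading of these two metrics (a sketch only: the abstract gives verbal definitions, the notation below is ours, and BWT is taken in its standard continual-learning form), let $D_i^{\text{multi}}$ and $D_i^{\text{single}}$ denote the Dice coefficient on task $i$ for the multiple-task and single-task models, and let $R_{j,i}$ denote performance on task $i$ after training through task $j$ of $T$ tasks:

$$\mathrm{FR} = \frac{1}{T}\sum_{i=1}^{T}\bigl(D_i^{\text{multi}} - D_i^{\text{single}}\bigr), \qquad \mathrm{BWT} = \frac{1}{T-1}\sum_{i=1}^{T-1}\bigl(R_{T,i} - R_{i,i}\bigr)$$

Under these conventions, values near 0% indicate little forgetting and more negative values indicate greater degradation on earlier tasks; an analogous forgetting rate can be formed from Hausdorff distances, with the sign interpretation reversed because smaller distances are better.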
Results: The forgetting rates were -2.51% to -0.60% for the three-task eHAT models with varying task orders, substantially better than the -18.13% to -3.59% obtained with the original hard attention to the task (HAT), while those of the four-task models were -2.54% to -1.59%. In addition, four-task U-Net models with eHAT using only half the number of channels (one quarter of the parameters) yielded nearly equal performance with or without regularization. A retrospective model comparison showed that eHAT with fixed or automatic regularization had significantly superior BWT (-3% to 0%) compared with HAT (-22% to -4%).
Conclusion: We demonstrate for the first time that eHAT effectively achieves continual learning of multi-modality, multiorgan segmentation tasks using a single DNN, with improved forgetting rates compared with HAT.