{"title":"基于持续学习的多模态多器官图像分割,增强了对任务的关注。","authors":"Ming-Long Wu, Yi-Fan Peng","doi":"10.1002/mp.17842","DOIUrl":null,"url":null,"abstract":"<div>\n \n \n <section>\n \n <h3> Background</h3>\n \n <p>Enabling a deep neural network (DNN) to learn multiple tasks using the concept of continual learning potentially better mimics human brain functions. However, current continual learning studies for medical image segmentation are mostly limited to single-modality images at identical anatomical locations.</p>\n </section>\n \n <section>\n \n <h3> Purpose</h3>\n \n <p>To propose and evaluate a continual learning method termed eHAT (enhanced hard attention to the task) for performing multi-modality, multiorgan segmentation tasks using a DNN.</p>\n </section>\n \n <section>\n \n <h3> Methods</h3>\n \n <p>Four public datasets covering the lumbar spine, heart, and brain acquired by magnetic resonance imaging (MRI) and computed tomography (CT) were included to segment the vertebral bodies, the right ventricle, and brain tumors, respectively. Three-task (spine CT, heart MRI, and brain MRI) and four-task (spine CT, heart MRI, brain MRI, and spine MRI) models were tested for eHAT, with the three-task results compared with state-of-the-art continual learning methods. The effectiveness of multitask performance was measured using the forgetting rate, defined as the average difference in Dice coefficients and Hausdorff distances between multiple-task and single-task models. The ability to transfer knowledge to different tasks was evaluated using backward transfer (BWT).</p>\n </section>\n \n <section>\n \n <h3> Results</h3>\n \n <p>The forgetting rates were −2.51% to −0.60% for the three-task eHAT models with varying task orders, substantially better than the −18.13% to −3.59% using original hard attention to the task (HAT), while those in four-task models were −2.54% to −1.59%. In addition, four-task U-net models with eHAT using only half the number of channels (1/4 parameters) yielded nearly equal performance with or without regularization. A retrospective model comparison showed that eHAT with fixed or automatic regularization had significantly superior BWT (−3% to 0%) compared to HAT (−22% to −4%).</p>\n </section>\n \n <section>\n \n <h3> Conclusion</h3>\n \n <p>We demonstrate for the first time that eHAT effectively achieves continual learning of multi-modality, multiorgan segmentation tasks using a single DNN, with improved forgetting rates compared with HAT.</p>\n </section>\n </div>","PeriodicalId":18384,"journal":{"name":"Medical physics","volume":"52 7","pages":""},"PeriodicalIF":3.2000,"publicationDate":"2025-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Multi-modality multiorgan image segmentation using continual learning with enhanced hard attention to the task\",\"authors\":\"Ming-Long Wu, Yi-Fan Peng\",\"doi\":\"10.1002/mp.17842\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div>\\n \\n \\n <section>\\n \\n <h3> Background</h3>\\n \\n <p>Enabling a deep neural network (DNN) to learn multiple tasks using the concept of continual learning potentially better mimics human brain functions. 
However, current continual learning studies for medical image segmentation are mostly limited to single-modality images at identical anatomical locations.</p>\\n </section>\\n \\n <section>\\n \\n <h3> Purpose</h3>\\n \\n <p>To propose and evaluate a continual learning method termed eHAT (enhanced hard attention to the task) for performing multi-modality, multiorgan segmentation tasks using a DNN.</p>\\n </section>\\n \\n <section>\\n \\n <h3> Methods</h3>\\n \\n <p>Four public datasets covering the lumbar spine, heart, and brain acquired by magnetic resonance imaging (MRI) and computed tomography (CT) were included to segment the vertebral bodies, the right ventricle, and brain tumors, respectively. Three-task (spine CT, heart MRI, and brain MRI) and four-task (spine CT, heart MRI, brain MRI, and spine MRI) models were tested for eHAT, with the three-task results compared with state-of-the-art continual learning methods. The effectiveness of multitask performance was measured using the forgetting rate, defined as the average difference in Dice coefficients and Hausdorff distances between multiple-task and single-task models. The ability to transfer knowledge to different tasks was evaluated using backward transfer (BWT).</p>\\n </section>\\n \\n <section>\\n \\n <h3> Results</h3>\\n \\n <p>The forgetting rates were −2.51% to −0.60% for the three-task eHAT models with varying task orders, substantially better than the −18.13% to −3.59% using original hard attention to the task (HAT), while those in four-task models were −2.54% to −1.59%. In addition, four-task U-net models with eHAT using only half the number of channels (1/4 parameters) yielded nearly equal performance with or without regularization. A retrospective model comparison showed that eHAT with fixed or automatic regularization had significantly superior BWT (−3% to 0%) compared to HAT (−22% to −4%).</p>\\n </section>\\n \\n <section>\\n \\n <h3> Conclusion</h3>\\n \\n <p>We demonstrate for the first time that eHAT effectively achieves continual learning of multi-modality, multiorgan segmentation tasks using a single DNN, with improved forgetting rates compared with HAT.</p>\\n </section>\\n </div>\",\"PeriodicalId\":18384,\"journal\":{\"name\":\"Medical physics\",\"volume\":\"52 7\",\"pages\":\"\"},\"PeriodicalIF\":3.2000,\"publicationDate\":\"2025-04-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Medical physics\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1002/mp.17842\",\"RegionNum\":2,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Medical physics","FirstCategoryId":"3","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/mp.17842","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING","Score":null,"Total":0}
Multi-modality multiorgan image segmentation using continual learning with enhanced hard attention to the task
Background
Enabling a deep neural network (DNN) to learn multiple tasks using the concept of continual learning potentially better mimics human brain functions. However, current continual learning studies for medical image segmentation are mostly limited to single-modality images at identical anatomical locations.
Purpose
To propose and evaluate a continual learning method termed eHAT (enhanced hard attention to the task) for performing multi-modality, multiorgan segmentation tasks using a DNN.
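For context, eHAT builds on the hard attention to the task (HAT) mechanism, in which a learnable per-task embedding is passed through a steep sigmoid to produce a near-binary mask that gates each layer's units for that task, protecting the units used by earlier tasks. The sketch below is a minimal, hypothetical PyTorch illustration of such a task-conditioned gate; the class name, layer sizes, and scaling factor s are assumptions for illustration and do not reproduce the authors' eHAT implementation.

import torch
import torch.nn as nn

class TaskGatedLayer(nn.Module):
    """Illustrative HAT-style gate: one learnable embedding per task, squashed by a
    scaled sigmoid into a near-binary mask that multiplies the layer's activations."""
    def __init__(self, num_tasks: int, in_features: int, out_features: int):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.task_embedding = nn.Embedding(num_tasks, out_features)

    def forward(self, x: torch.Tensor, task_id: int, s: float = 400.0) -> torch.Tensor:
        # A large s makes the mask close to binary, so units claimed by earlier
        # tasks can be frozen when gradients for later tasks are applied.
        mask = torch.sigmoid(s * self.task_embedding(torch.tensor(task_id)))
        return self.linear(x) * mask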
Methods
Four public datasets covering the lumbar spine, heart, and brain, acquired with magnetic resonance imaging (MRI) and computed tomography (CT), were included to segment the vertebral bodies, the right ventricle, and brain tumors, respectively. Three-task (spine CT, heart MRI, and brain MRI) and four-task (spine CT, heart MRI, brain MRI, and spine MRI) models were evaluated with eHAT, and the three-task results were compared with state-of-the-art continual learning methods. Multitask performance was quantified using the forgetting rate, defined as the average difference in Dice coefficients and Hausdorff distances between the multiple-task and single-task models. The ability to transfer knowledge across tasks was evaluated using backward transfer (BWT).
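As a rough illustration of the two metrics named above (not the authors' exact evaluation code), the forgetting rate can be read as the mean per-task score difference between the continual model and dedicated single-task baselines, and BWT follows the standard continual-learning definition. The function names and example values below are hypothetical.

from typing import List

def forgetting_rate(multi_task_scores: List[float], single_task_scores: List[float]) -> float:
    # Average difference (e.g., in Dice) between the multi-task model and the
    # corresponding single-task models; negative values indicate forgetting.
    diffs = [m - s for m, s in zip(multi_task_scores, single_task_scores)]
    return sum(diffs) / len(diffs)

def backward_transfer(R: List[List[float]]) -> float:
    # Standard BWT: mean over tasks i < T of R[T-1][i] - R[i][i],
    # where R[j][i] is the score on task i after training through task j.
    T = len(R)
    return sum(R[T - 1][i] - R[i][i] for i in range(T - 1)) / (T - 1)

# Hypothetical three-task example:
# forgetting_rate([0.90, 0.88, 0.85], [0.92, 0.89, 0.86])  # -> about -0.013 (-1.3%)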
Results
The forgetting rates were −2.51% to −0.60% for the three-task eHAT models with varying task orders, substantially better than the −18.13% to −3.59% obtained with the original hard attention to the task (HAT), while those of the four-task models were −2.54% to −1.59%. In addition, four-task U-Net models with eHAT using only half the number of channels (one quarter of the parameters) yielded nearly equal performance with or without regularization. A retrospective model comparison showed that eHAT with fixed or automatic regularization achieved significantly better BWT (−3% to 0%) than HAT (−22% to −4%).
Conclusion
We demonstrate for the first time that eHAT effectively achieves continual learning of multi-modality, multiorgan segmentation tasks using a single DNN, with improved forgetting rates compared with HAT.
Journal Introduction
Medical Physics publishes original, high-impact physics, imaging science, and engineering research that advances patient diagnosis and therapy through contributions in: 1) basic science developments with high potential for clinical translation; 2) clinical applications of cutting-edge engineering and physics innovations; and 3) broadly applicable and innovative clinical physics developments.
Medical Physics is a journal of global scope and reach. By publishing in Medical Physics, your research will reach an international, multidisciplinary audience including practicing medical physicists as well as physics- and engineering-based translational scientists. We work closely with authors of promising articles to improve their quality.