Performance evaluation of deformable image registration algorithms: target registration error and its correlation to Dice similarity coefficient.

IF 4.2
Yun Ming Wong, Wen Siang Lew, James Cheow Lei Lee, Hong Qi Tan
Journal: Zeitschrift fur medizinische Physik
Publication date: 2025-09-12
Publication type: Journal Article
DOI: 10.1016/j.zemedi.2025.09.001 (https://doi.org/10.1016/j.zemedi.2025.09.001)

Abstract

Objective: Deformable image registration (DIR) is now widely used, which makes quality assurance essential for reliable clinical translation. This work aimed to compare the performance of four DIR software packages in terms of voxel mapping accuracy, quantified through target registration error (TRE), and to examine the organ-wise correlation of TRE with the Dice similarity coefficient (DSC), a widely used segmentation metric.
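
For readers less familiar with the metric, DSC can be illustrated with a minimal Python sketch. It assumes each contour is represented as a set of voxel indices; the function name `dice` and the convention for two empty masks are illustrative choices, not taken from the study.

```python
def dice(mask_a, mask_b):
    """Dice similarity coefficient between two collections of voxel indices.

    DSC = 2 * |A intersect B| / (|A| + |B|), ranging from 0 (no overlap)
    to 1 (identical masks).
    """
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        # Assumed convention: two empty masks are treated as identical.
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


# Hypothetical example: two 2-voxel masks sharing one voxel.
reference = {(0, 0, 0), (0, 0, 1)}
deformed = {(0, 0, 1), (0, 0, 2)}
overlap = dice(reference, deformed)
```

Because DSC only measures contour overlap, it says nothing directly about where individual voxels inside the contour were mapped, which is why its relationship to TRE is worth testing.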

Methods: CT scans were acquired for one static scenario and four deformation scenarios simulated with an in-house deformable anthropomorphic pelvis phantom. The CT numbers of these scans were overridden based on an actual patient scan, and the overridden scans served as the input images in this study. Four DIR software packages were tested: RayStation v10B, Velocity v4.1, Slicer, and Plastimatch. Multiple DIRs were performed with each package, using different algorithm options or parameters. TRE was quantified as the difference between the true and mapped marker positions. Pearson correlation tests were then used to examine the correlation between DSC and mean TRE, separately for bladder, prostate, rectum, and all organs combined. Similar analyses were conducted for prostate alone, to gain more insight into a homogeneous medium. Additionally, DSC was used to predict whether the mean TRE exceeded 3 mm. Classification performance was assessed using accuracy, precision, recall, F1-score, specificity, and area under the receiver operating characteristic curve (AUC).
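
The analysis chain described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: TRE is taken as the Euclidean distance between paired marker positions, a case is predicted to exceed the 3 mm limit when its DSC is at or below a chosen threshold, and all function names and sample values are hypothetical.

```python
import math


def mean_tre(true_markers, mapped_markers):
    """Mean Euclidean distance (mm) between paired marker positions (assumed TRE definition)."""
    dists = [math.dist(t, m) for t, m in zip(true_markers, mapped_markers)]
    return sum(dists) / len(dists)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def classification_scores(dsc, tre_mm, dsc_threshold, tre_limit=3.0):
    """Score the rule 'DSC <= dsc_threshold predicts mean TRE > tre_limit'."""
    tp = fp = tn = fn = 0
    for d, t in zip(dsc, tre_mm):
        predicted, actual = d <= dsc_threshold, t > tre_limit
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif not predicted and actual:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        # Undefined scores (zero denominator) are reported as NaN.
        return num / den if den else float("nan")

    precision, recall = ratio(tp, tp + fp), ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical per-registration values, not data from the study.
dsc_values = [0.95, 0.90, 0.80, 0.70]
tre_values = [1.0, 2.0, 3.5, 5.0]
r = pearson_r(dsc_values, tre_values)
scores = classification_scores(dsc_values, tre_values, dsc_threshold=0.85)
```

A negative `r` means higher contour overlap goes with smaller marker error, which is the direction needed for DSC to serve as a surrogate for TRE.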

Results: Among the four software packages tested, RayStation achieved the lowest mean TRE in all deformation scenarios, with values between 1.48 mm and 3.06 mm. Pearson correlation tests revealed an exceptionally strong negative correlation between DSC and mean TRE for SlicerElastix, with correlation coefficients ranging from -0.901 to -0.987. Consistent with having the strongest correlation, SlicerElastix also achieved the highest classification performance scores overall. For all three organs, the scores at the corresponding best DSC threshold were mostly higher than 0.80, and the AUCs were close to 1.
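
To make the "best DSC threshold" and AUC figures concrete, here is a small Python sketch. The abstract does not state how the best threshold was selected, so the sketch assumes Youden's J (sensitivity + specificity - 1) as the criterion; the AUC is computed through its rank interpretation. Function names and sample values are hypothetical.

```python
def auc_low_dsc(dsc, exceeds):
    """AUC for 'low DSC flags high TRE'.

    Equals the probability that a case exceeding the TRE limit has a lower
    DSC than a case that does not, with ties counted as one half.
    """
    pos = [d for d, e in zip(dsc, exceeds) if e]
    neg = [d for d, e in zip(dsc, exceeds) if not e]
    if not pos or not neg:
        raise ValueError("need at least one case in each class")
    wins = sum(1.0 if p < n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_dsc_threshold(dsc, exceeds):
    """Return (threshold, J) maximizing Youden's J over the observed DSC values.

    Assumed rule: a case is predicted to exceed the TRE limit when DSC <= threshold.
    """
    n_pos = sum(1 for e in exceeds if e)
    n_neg = len(exceeds) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case in each class")
    best_thr, best_j = None, -1.0
    for thr in sorted(set(dsc)):
        tp = sum(1 for d, e in zip(dsc, exceeds) if e and d <= thr)
        tn = sum(1 for d, e in zip(dsc, exceeds) if not e and d > thr)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_thr, best_j = thr, j
    return best_thr, best_j


# Hypothetical values: the two lowest-DSC cases exceed the 3 mm limit.
dsc_values = [0.95, 0.90, 0.80, 0.70]
exceeds_3mm = [False, False, True, True]
auc = auc_low_dsc(dsc_values, exceeds_3mm)
threshold, youden_j = best_dsc_threshold(dsc_values, exceeds_3mm)
```

An AUC close to 1, as reported for SlicerElastix, means that nearly every registration exceeding the TRE limit had a lower DSC than those within it, so a single DSC cut-off separates the two groups well.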

Conclusion: This work quantified and compared four DIR software packages in terms of voxel mapping accuracy and its correlation with DSC in the major organs relevant to prostate radiotherapy.
