Co-Learning Multimodality PET-CT Features via a Cascaded CNN-Transformer Network

Impact Factor 4.6, Q1 (Radiology, Nuclear Medicine & Medical Imaging)
Lei Bi, Xiaohang Fu, Qiufang Liu, Shaoli Song, David Dagan Feng, Michael Fulham, Jinman Kim
{"title":"Co-Learning Multimodality PET-CT Features via a Cascaded CNN-Transformer Network","authors":"Lei Bi;Xiaohang Fu;Qiufang Liu;Shaoli Song;David Dagan Feng;Michael Fulham;Jinman Kim","doi":"10.1109/TRPMS.2024.3417901","DOIUrl":null,"url":null,"abstract":"<italic>Background:</i>\n Automated segmentation of multimodality positron emission tomography—computed tomography (PET-CT) data is a major challenge in the development of computer-aided diagnosis systems (CADs). In this context, convolutional neural network (CNN)-based methods are considered as the state-of-the-art. These CNN-based methods, however, have difficulty in co-learning the complementary PET-CT image features and in learning the global context when focusing solely on local patterns. \n<italic>Methods:</i>\n We propose a cascaded CNN-transformer network (CCNN-TN) tailored for PET-CT image segmentation. We employed a transformer network (TN) because of its ability to establish global context via self-attention and embedding image patches. We extended the TN definition by cascading multiple TNs and CNNs to learn the global and local contexts. We also introduced a hyper fusion branch that iteratively fuses the separately extracted complementary image features. We evaluated our approach, when compared to current state-of-the-art CNN methods, on three datasets: two nonsmall cell lung cancer (NSCLC) and one soft tissue sarcoma (STS). \n<italic>Results:</i>\n Our CCNN-TN method achieved a dice similarity coefficient (DSC) score of 72.25% (NSCLC), 67.11% (NSCLC), and 66.36% (STS) for segmentation of tumors. Compared to other methods the DSC was higher for our CCNN-TN by 4.5%, 1.31%, and 3.44%. \n<italic>Conclusion:</i>\n Our experimental results demonstrate that CCNN-TN, when compared to the existing methods, achieved more generalizable results across different datasets and has consistent performance across various image fusion strategies and network backbones.","PeriodicalId":46807,"journal":{"name":"IEEE Transactions on Radiation and Plasma Medical Sciences","volume":null,"pages":null},"PeriodicalIF":4.6000,"publicationDate":"2024-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Radiation and Plasma Medical Sciences","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10570071/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING","Score":null,"Total":0}
引用次数: 0

Abstract

Background: Automated segmentation of multimodality positron emission tomography-computed tomography (PET-CT) data is a major challenge in the development of computer-aided diagnosis (CAD) systems. In this context, convolutional neural network (CNN)-based methods are considered the state of the art. These CNN-based methods, however, have difficulty co-learning the complementary PET-CT image features and capturing global context when they focus solely on local patterns.

Methods: We propose a cascaded CNN-transformer network (CCNN-TN) tailored for PET-CT image segmentation. We employed a transformer network (TN) because of its ability to establish global context via self-attention over embedded image patches. We extended the TN design by cascading multiple TNs and CNNs to learn both global and local contexts. We also introduced a hyper-fusion branch that iteratively fuses the separately extracted, complementary image features. We evaluated our approach against current state-of-the-art CNN methods on three datasets: two non-small cell lung cancer (NSCLC) datasets and one soft-tissue sarcoma (STS) dataset.

Results: Our CCNN-TN method achieved Dice similarity coefficient (DSC) scores of 72.25% (NSCLC), 67.11% (NSCLC), and 66.36% (STS) for tumor segmentation. Compared with the other methods, the DSC of our CCNN-TN was higher by 4.5%, 1.31%, and 3.44%, respectively.

Conclusion: Our experimental results demonstrate that CCNN-TN, compared with existing methods, achieves more generalizable results across different datasets and consistent performance across various image fusion strategies and network backbones.
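The abstract describes the architecture only at a high level. As a rough illustration of the general idea (per-modality CNN encoders for local features, transformer stages applying self-attention over patch tokens for global context, and a fusion branch that iteratively merges the two streams), the following is a minimal PyTorch sketch. All class names, channel widths, stage counts, and the additive fusion rule are assumptions for illustration only and do not reproduce the authors' CCNN-TN; the dice_score helper computes the reported evaluation metric, DSC = 2|A ∩ B| / (|A| + |B|).

```python
# Minimal sketch of a cascaded CNN + transformer co-learning design for
# PET-CT segmentation, in the spirit of the abstract above. This is NOT
# the authors' CCNN-TN: module names, channel widths, the number of
# cascade stages, and the additive fusion rule are illustrative assumptions.
import torch
import torch.nn as nn


class ModalityCNN(nn.Module):
    """Small CNN encoder extracting local features for one modality."""

    def __init__(self, out_channels: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, out_channels, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class PatchTransformer(nn.Module):
    """Transformer stage: flattens CNN feature maps into patch tokens and
    applies self-attention to model global context."""

    def __init__(self, channels: int = 64, num_layers: int = 2, nhead: int = 4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=channels, nhead=nhead, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        b, c, h, w = feats.shape
        tokens = feats.flatten(2).transpose(1, 2)   # (B, H*W, C) patch tokens
        tokens = self.encoder(tokens)               # global self-attention
        return tokens.transpose(1, 2).reshape(b, c, h, w)


class CascadedPETCTSegmenter(nn.Module):
    """Per-modality CNN encoders cascaded with transformer stages, plus a
    fusion branch that iteratively merges the complementary PET/CT streams."""

    def __init__(self, channels: int = 64, num_stages: int = 2):
        super().__init__()
        self.pet_cnn = ModalityCNN(channels)
        self.ct_cnn = ModalityCNN(channels)
        self.pet_stages = nn.ModuleList(
            [PatchTransformer(channels) for _ in range(num_stages)]
        )
        self.ct_stages = nn.ModuleList(
            [PatchTransformer(channels) for _ in range(num_stages)]
        )
        self.fusers = nn.ModuleList(
            [nn.Conv2d(2 * channels, channels, kernel_size=1)
             for _ in range(num_stages)]
        )
        self.head = nn.Conv2d(channels, 1, kernel_size=1)  # tumor mask logits

    def forward(self, pet: torch.Tensor, ct: torch.Tensor) -> torch.Tensor:
        p, c = self.pet_cnn(pet), self.ct_cnn(ct)
        fused = None
        for pet_tn, ct_tn, fuse in zip(self.pet_stages, self.ct_stages, self.fusers):
            p, c = pet_tn(p), ct_tn(c)                 # refine each stream
            stage = fuse(torch.cat([p, c], dim=1))     # co-learn PET+CT features
            fused = stage if fused is None else fused + stage
        return self.head(fused)


def dice_score(pred: torch.Tensor, target: torch.Tensor,
               eps: float = 1e-6) -> torch.Tensor:
    """Dice similarity coefficient: DSC = 2|A∩B| / (|A| + |B|)."""
    inter = (pred * target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)


if __name__ == "__main__":
    model = CascadedPETCTSegmenter()
    pet = torch.randn(1, 1, 64, 64)                    # toy PET slice
    ct = torch.randn(1, 1, 64, 64)                     # co-registered CT slice
    pred = (model(pet, ct).sigmoid() > 0.5).float()    # (1, 1, 32, 32) mask
    target = torch.randint(0, 2, pred.shape).float()   # dummy ground truth
    print("DSC:", dice_score(pred, target).item())
```

The published network presumably operates on 3D PET-CT volumes with deeper backbones and a decoder path; this 2D toy model only mirrors the cascade-and-fuse structure the abstract describes.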
Source Journal

IEEE Transactions on Radiation and Plasma Medical Sciences (Radiology, Nuclear Medicine & Medical Imaging)
CiteScore: 8.00
Self-citation rate: 18.20%
Articles published: 109