Wenshuai Zhang, Lei Wang, Pengcheng Dai, Zhiyao Liu, Juan Wang, Qun Liu
International Journal of Imaging Systems and Technology, Volume 35, Issue 5. Published 2025-09-22. DOI: 10.1002/ima.70208. URL: https://onlinelibrary.wiley.com/doi/10.1002/ima.70208
SuperFormer: Unet-Like Super Token Transformer for Medical Image Segmentation
Computer-aided diagnosis is becoming increasingly widespread in medicine. Multi-organ segmentation in clinical abdominal CT images and cardiac MRI images remains a challenging task, yet accurate segmentation of multiple organs is a crucial prerequisite for disease diagnosis and treatment planning. In this paper, we introduce SuperFormer, a multi-organ segmentation method for CT or MRI images. SuperFormer is a hierarchical encoder-decoder network with two key designs: (1) It introduces the super token transformer block into the U-shaped encoder-decoder structure, making it easier to extract global information while significantly improving computational efficiency. (2) It presents a channel-based multi-scale Transformer context bridge that extracts correlations between global dependencies and local context in the multi-scale features generated by the hierarchical Transformer encoder. This bridge guides the efficient connection of fused multi-scale channel information to the decoder features, eliminating the semantic gap. In medical image segmentation, SuperFormer demonstrates a strong ability to capture more discriminative dependencies and context. Experimental results on multi-organ segmentation and cardiac segmentation tasks demonstrate the algorithm's superiority, effectiveness, and robustness. Notably, SuperFormer trained from scratch even surpasses state-of-the-art methods pretrained on ImageNet, and its core design can be extended to other visual segmentation tasks.
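The efficiency claim in design (1) can be illustrated with a rough count of attention pairs. The sketch below is not the authors' implementation: it assumes that each square patch of `grid` x `grid` tokens is merged into one super token, and it ignores the cost of mapping tokens to super tokens and back. The function name, the parameters, and the 56 x 56 token map are hypothetical choices for illustration only.

```python
def attention_pair_counts(height: int, width: int, grid: int) -> tuple[int, int]:
    """Count query-key pairs for full attention vs. attention over super tokens.

    height, width: size of the token map (hypothetical values).
    grid: side length, in tokens, of the patch merged into one super token
          (an assumption of this sketch, not a value from the paper).
    """
    if grid <= 0 or height % grid or width % grid:
        raise ValueError("grid must be positive and divide both height and width")
    num_tokens = height * width
    num_super_tokens = (height // grid) * (width // grid)
    # Full self-attention scores every token against every token.
    full_pairs = num_tokens ** 2
    # Attention among super tokens scores every super token against every super token.
    super_pairs = num_super_tokens ** 2
    return full_pairs, super_pairs


full, reduced = attention_pair_counts(56, 56, 8)
print(full, reduced, full // reduced)  # 9834496 2401 4096
```

Under these assumptions, the number of scored pairs shrinks by a factor of `grid ** 4`, which suggests why attending over a small set of super tokens can keep global context affordable on high-resolution feature maps.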
About the journal:
The International Journal of Imaging Systems and Technology (IMA) is a forum for the exchange of ideas and results relevant to imaging systems, including imaging physics and informatics. The journal covers all imaging modalities in humans and animals.
IMA accepts technically sound and scientifically rigorous research in the interdisciplinary field of imaging, including relevant algorithmic research and hardware and software development, and their applications relevant to medical research. The journal provides a platform to publish original research in structural and functional imaging.
The journal is also open to imaging studies of humans and animals that describe novel diagnostic imaging and analysis methods. Technical, theoretical, and clinical research in both normal and clinical populations is encouraged. Submissions describing methods, software, databases, and replication studies, as well as negative results, are also considered.
The scope of the journal includes, but is not limited to, the following in the context of biomedical research:
Imaging and neuro-imaging modalities: structural MRI, functional MRI, PET, SPECT, CT, ultrasound, EEG, MEG, NIRS, etc.;
Neuromodulation and brain stimulation techniques such as TMS and tDCS;
Software and hardware for imaging, especially related to human and animal health;
Image segmentation in normal and clinical populations;
Pattern analysis and classification using machine learning techniques;
Computational modeling and analysis;
Brain connectivity and connectomics;
Systems-level characterization of brain function;
Neural networks and neurorobotics;
Computer vision, based on human/animal physiology;
Brain-computer interface (BCI) technology;
Big data, databasing and data mining.