Hybrid-MedNet: a hybrid CNN-transformer network with multi-dimensional feature fusion for medical image segmentation.
Authors: Yumna Memon, Feng Zeng
Journal: Physics in Medicine and Biology (impact factor 3.4; JCR Q2, Engineering, Biomedical)
Publication date: 2025-09-30
Publication type: Journal Article
DOI: https://doi.org/10.1088/1361-6560/ae0976
Twin-to-twin transfusion syndrome (TTTS) is a complex prenatal condition in which monochorionic twins experience an imbalance in blood flow due to abnormal vascular connections in the shared placenta. Fetoscopic laser photocoagulation is the first-line treatment for TTTS, aimed at coagulating these abnormal connections. However, the procedure is complicated by a limited field of view, occlusions, poor-quality endoscopic images, and distortions caused by artifacts. To improve the visualization of placental vessels during surgical procedures, we propose Hybrid-MedNet, a novel hybrid CNN-transformer network that incorporates multi-dimensional deep feature learning techniques. The network introduces a BiPath tokenization module that enhances vessel boundary detection by capturing both channel dependencies and spatial features through parallel attention mechanisms. A context-aware transformer block addresses the weak inductive bias problem in traditional transformers while preserving spatial relationships crucial for accurate vessel identification in distorted fetoscopic images. Furthermore, we develop a multi-scale trifusion module that integrates multi-dimensional features to capture rich vascular representations from the encoder and facilitate precise vessel information transfer to the decoder for improved segmentation accuracy. Experimental results show that our approach achieves a Dice score of 95.40% on fetoscopic images, outperforming ten state-of-the-art segmentation methods. Consistently superior performance across four segmentation tasks and ten distinct datasets confirms the robustness and effectiveness of our method for diverse and complex medical imaging applications.
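The abstract reports segmentation quality as a Dice score. As a reminder of what that figure measures, the following is a minimal sketch of the standard Dice coefficient, 2|A∩B| / (|A| + |B|), for two binary masks. The function name `dice_score`, the smoothing term `eps`, and the toy masks are illustrative assumptions; none of them come from the paper, and the authors' evaluation code may differ in details such as thresholding or per-image averaging.

```python
from typing import Sequence


def dice_score(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    eps: float = 1e-7,
) -> float:
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks of equal shape."""
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0  # pixels labelled vessel in both masks
    pred_sum = 0      # pixels labelled vessel in the prediction
    target_sum = 0    # pixels labelled vessel in the reference
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            intersection += 1 if (p and t) else 0
            pred_sum += 1 if p else 0
            target_sum += 1 if t else 0

    # eps (an assumed smoothing constant) keeps the ratio defined when both
    # masks are empty; in that case the score evaluates to 1.0.
    return (2.0 * intersection + eps) / (pred_sum + target_sum + eps)


# Toy 3x4 masks (hypothetical data, not taken from the paper).
predicted = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
reference = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
]

print(f"Dice = {dice_score(predicted, reference):.4f}")
```

In this toy case each mask has five foreground pixels and four of them overlap, so the score is 2·4 / (5 + 5) = 0.8. A reported value of 95.40% corresponds to 0.954 on this scale.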
About the journal:
Physics in Medicine and Biology covers the development and application of theoretical, computational and experimental physics to medicine, physiology and biology. Topics covered are: therapy physics (including ionizing and non-ionizing radiation); biomedical imaging (e.g. x-ray, magnetic resonance, ultrasound, optical and nuclear imaging); image-guided interventions; image reconstruction and analysis (including kinetic modelling); artificial intelligence in biomedical physics and analysis; nanoparticles in imaging and therapy; radiobiology; radiation protection and patient dose monitoring; radiation dosimetry.