Mid-Net: Rethinking efficient network architectures for small-sample vascular segmentation
Dongxin Zhao, Jianhua Liu, Peng Geng, Jiaxin Yang, Ziqian Zhang, Yin Zhang
Information Fusion, volume 115, Article 102777. Published 2024-10-28. JCR: Q1 (Computer Science, Artificial Intelligence). Impact factor: 14.7.
DOI: 10.1016/j.inffus.2024.102777
https://www.sciencedirect.com/science/article/pii/S1566253524005554
Citations: 0
Abstract
Deep learning-based medical image segmentation methods have demonstrated significant clinical applications. However, training these methods on small-sample vascular datasets remains challenging due to the scarcity of labeled data and severe category imbalance. To address this, this paper proposes Mid-Net, which fully exploits the often-overlooked feature representation potential of the middle-layer network through cross-layer guidance to improve model learning efficiency in data-constrained environments. Mid-Net consists of three core components: the encoding path, the guidance path, and the calibration path. In the encoding path, a feature pyramid structure with large kernel convolutions is used to extract semantic information at different scales. The guidance path combines the sensitivity of the shallow-layer network to spatial details with the global perceptual abilities of the deep-layer network to provide more discriminative guidance to the middle-layer network in a feature-decoupled manner. The calibration path further calibrates the spatial location information of the middle-layer network through end-to-end supervised learning. Experiments conducted on the publicly available retinal vascular datasets DRIVE, STARE, and CHASE_DB1, as well as coronary angiography datasets DCA1 and CHUAC, demonstrate that Mid-Net achieves superior segmentation results with lower computational resource requirements compared to state-of-the-art methods.
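The category imbalance mentioned in the abstract can be illustrated with a small sketch. This is not code from the paper: the 100-pixel mask, the 10% vessel ratio, and the function names are assumptions chosen for illustration. It shows why plain pixel accuracy can look good on a vessel mask even when no vessel is found, while an overlap measure such as the Dice score does not.

```python
def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label matches the ground truth."""
    matches = sum(p == t for p, t in zip(pred, target))
    return matches / len(target)


def dice_score(pred, target):
    """Dice overlap for the foreground (vessel) class; labels are 0 or 1."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Convention used here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical flattened mask: 100 pixels, 10 of them vessel (label 1).
target = [1] * 10 + [0] * 90
all_background = [0] * 100           # predicts no vessel at all
half_vessels = [1] * 5 + [0] * 95    # recovers half of the vessel pixels

acc_bg = pixel_accuracy(all_background, target)
dice_bg = dice_score(all_background, target)
acc_half = pixel_accuracy(half_vessels, target)
dice_half = dice_score(half_vessels, target)

print(f"all background: accuracy={acc_bg:.2f}, dice={dice_bg:.2f}")
print(f"half vessels:   accuracy={acc_half:.2f}, dice={dice_half:.2f}")
```

In this toy setup, the prediction that finds no vessel still matches 90 of 100 pixels, so accuracy alone would hide the failure, whereas its Dice score is zero.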
About the journal:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems will be welcome.