Semi-supervised medical image segmentation based on dual swap data mixing and cross EMA strategies

Authors: Licheng Zheng, Lihui Wang, Yingfeng Ou, Li Wang, Caiqing Jian, Yuemin Zhu
Journal: Medical Physics (journal article), published 2025-04-11
DOI: 10.1002/mp.17809 (https://doi.org/10.1002/mp.17809)
Background: Semi-supervised medical image segmentation methods based on the mean teacher (MT) framework offer a promising way to address dense prediction problems when annotated images are scarce and unlabeled images are plentiful. However, two issues limit further gains in segmentation performance: the confirmation bias caused by the distribution difference between labeled and unlabeled data, and the parameter coupling problem of MT.
Purpose: To reduce confirmation bias and alleviate the parameter coupling problem in the MT framework, this work proposes a novel data augmentation strategy and a cross exponential moving average (cross EMA) architecture.
Methods: First, a dual swap mixing data augmentation method was proposed, which exchanges patches between labeled and unlabeled images twice to reduce the confirmation bias caused by distribution divergence. Next, a novel architecture was designed in which both the student and teacher networks have two structurally identical decoders, one of which applies dropout. Labeled, unlabeled, and mixed images were fed into this MT architecture. For unlabeled data, the pseudo-labels generated by the two decoders of the teacher network supervised the predictions of the corresponding decoders of the student network. For mixed data, the real labels of the labeled data were mixed with the teacher-predicted pseudo-labels of the unlabeled data to form the supervisory signal, which was used to constrain the consistency between student and teacher predictions on mixed data. To overcome the parameter coupling problem between the student and teacher networks, the encoder parameters of the teacher network were updated using an exponential moving average (EMA) strategy, while its two decoders were updated using a cross EMA strategy: the perturbed decoder of the teacher network was updated from the non-perturbed decoder of the student network, and vice versa.
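The two mechanisms in the Methods paragraph can be illustrated with a minimal pure-Python sketch. The abstract does not say how patches are chosen, what exactly "twice" means, or what EMA decay is used, so everything here is an assumption for illustration: `swap_patch`, `dual_swap`, `ema_update`, and `cross_ema_step` are hypothetical names, the two swaps are taken to be rectangular boxes at two non-overlapping locations, parameters are plain name-to-float dictionaries, and `alpha=0.99` is a placeholder decay. The update rule is the standard mean teacher EMA, teacher = alpha * teacher + (1 - alpha) * student.

```python
import copy


def swap_patch(a, b, top, left, height, width):
    """Return copies of 2D grids a and b with one rectangular patch exchanged."""
    a_new, b_new = copy.deepcopy(a), copy.deepcopy(b)
    for r in range(top, top + height):
        for c in range(left, left + width):
            a_new[r][c], b_new[r][c] = b[r][c], a[r][c]
    return a_new, b_new


def dual_swap(labeled, unlabeled, box1, box2):
    """Assumed reading of 'dual swap': exchange a patch at box1, then one at box2.

    Each box is (top, left, height, width). The same boxes can be applied to
    the real label map and the teacher's pseudo-label map to build the mixed
    supervision for the mixed images.
    """
    mixed_a, mixed_b = swap_patch(labeled, unlabeled, *box1)
    return swap_patch(mixed_a, mixed_b, *box2)


def ema_update(teacher, student, alpha):
    """In-place EMA: teacher <- alpha * teacher + (1 - alpha) * student."""
    for name, s_val in student.items():
        teacher[name] = alpha * teacher[name] + (1.0 - alpha) * s_val


def cross_ema_step(teacher, student, alpha=0.99):
    """Assumed cross EMA: plain EMA for the encoder, crossed EMA for decoders."""
    ema_update(teacher["encoder"], student["encoder"], alpha)
    # Crossed pairing: each teacher decoder tracks the *other* student decoder.
    ema_update(teacher["decoder_plain"], student["decoder_dropout"], alpha)
    ema_update(teacher["decoder_dropout"], student["decoder_plain"], alpha)
```

The crossed pairing only works because the two decoders are structurally identical, so their parameter names and shapes line up. In a real network the dictionaries would hold tensors rather than floats, and the boxes would typically be sampled at random per batch.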
Results: In comparisons with several state-of-the-art (SOTA) semi-supervised segmentation methods on four publicly available datasets, the proposed method outperformed existing models. The Dice similarity coefficient (DSC) and volume similarity (VS) improved by at least 2.33% and 1.86%, respectively, over the corresponding second-best methods. Ablation experiments showed that the proposed dual swap strategy can reduce the distributional differences between unlabeled data and labeled+mixed data, and that the cross EMA strategy can avoid early convergence of the student and teacher networks.
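For readers unfamiliar with the two reported metrics, the sketch below computes them for binary masks using their commonly used definitions: DSC = 2|P ∩ T| / (|P| + |T|) and VS = 1 - ||P| - |T|| / (|P| + |T|). The abstract does not state how the authors computed or averaged these values, so the function names, the flat 0/1 mask representation, and the choice to return 1.0 when both masks are empty are assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """DSC = 2|P ∩ T| / (|P| + |T|) for flat binary masks (sequences of 0/1)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def volume_similarity(pred, target):
    """VS = 1 - ||P| - |T|| / (|P| + |T|); compares sizes only, not overlap."""
    total = sum(pred) + sum(target)
    return 1.0 if total == 0 else 1.0 - abs(sum(pred) - sum(target)) / total
```

Note that VS depends only on the segmented volumes, so a prediction can score a high VS with poor overlap, which is why it is usually reported alongside DSC rather than on its own.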
Conclusions: The proposed strategies can alleviate both the confirmation bias caused by the distribution discrepancy between labeled and unlabeled data in semi-supervised learning and the parameter coupling between the student and teacher networks in the MT architecture, thereby providing a promising approach to semi-supervised medical image segmentation.