Fusing multi-stage clicks with deep feedback aggregation for interactive image segmentation
Jianwu Long, Yuanqin Liu, Shaoyi Wang, Shuang Chen, Qi Luo
{"title":"融合多阶段点击与深度反馈聚合的交互式图像分割","authors":"Jianwu Long, Yuanqin Liu, Shaoyi Wang, Shuang Chen, Qi Luo","doi":"10.1016/j.cag.2025.104445","DOIUrl":null,"url":null,"abstract":"<div><div>The objective of interactive image segmentation is to generate a segmentation mask for the target object using minimal user interaction. During the interaction process, segmentation results from previous iterations are typically used as feedback to guide subsequent user input. However, existing approaches often concatenate user interactions, feedback, and low-level image features as direct inputs to the network, overlooking the high-level semantic information contained in the feedback and the issue of information dilution from click signals. To address these limitations, we propose a novel interactive image segmentation model called Multi-stage Click Fusion with deep Feedback Aggregation(MCFA). MCFA introduces a new information fusion strategy. Specifically, for feedback information, it refines previous-round feedback using deep features and integrates the optimized feedback into the feature representation. For user clicks, MCFA performs multi-stage fusion to enhance click propagation while constraining its direction through the refined feedback. Experimental results demonstrate that MCFA consistently outperforms existing methods across five benchmark datasets: GrabCut, Berkeley, SBD, DAVIS and CVC-ClinicDB.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"133 ","pages":"Article 104445"},"PeriodicalIF":2.8000,"publicationDate":"2025-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Fusing multi-stage clicks with deep feedback aggregation for interactive image segmentation\",\"authors\":\"Jianwu Long, Yuanqin Liu, Shaoyi Wang, Shuang Chen, Qi Luo\",\"doi\":\"10.1016/j.cag.2025.104445\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>The objective of interactive image segmentation is to generate a segmentation mask for the target object using minimal user interaction. During the interaction process, segmentation results from previous iterations are typically used as feedback to guide subsequent user input. However, existing approaches often concatenate user interactions, feedback, and low-level image features as direct inputs to the network, overlooking the high-level semantic information contained in the feedback and the issue of information dilution from click signals. To address these limitations, we propose a novel interactive image segmentation model called Multi-stage Click Fusion with deep Feedback Aggregation(MCFA). MCFA introduces a new information fusion strategy. Specifically, for feedback information, it refines previous-round feedback using deep features and integrates the optimized feedback into the feature representation. For user clicks, MCFA performs multi-stage fusion to enhance click propagation while constraining its direction through the refined feedback. 
Experimental results demonstrate that MCFA consistently outperforms existing methods across five benchmark datasets: GrabCut, Berkeley, SBD, DAVIS and CVC-ClinicDB.</div></div>\",\"PeriodicalId\":50628,\"journal\":{\"name\":\"Computers & Graphics-Uk\",\"volume\":\"133 \",\"pages\":\"Article 104445\"},\"PeriodicalIF\":2.8000,\"publicationDate\":\"2025-09-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computers & Graphics-Uk\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0097849325002869\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers & Graphics-Uk","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0097849325002869","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
引用次数: 0
Abstract
The objective of interactive image segmentation is to generate a segmentation mask for the target object using minimal user interaction. During the interaction process, segmentation results from previous iterations are typically used as feedback to guide subsequent user input. However, existing approaches often concatenate user interactions, feedback, and low-level image features as direct inputs to the network, overlooking the high-level semantic information contained in the feedback and the issue of information dilution affecting click signals. To address these limitations, we propose a novel interactive image segmentation model called Multi-stage Click Fusion with deep Feedback Aggregation (MCFA). MCFA introduces a new information fusion strategy. Specifically, for feedback information, it refines previous-round feedback using deep features and integrates the optimized feedback into the feature representation. For user clicks, MCFA performs multi-stage fusion to enhance click propagation while constraining its direction through the refined feedback. Experimental results demonstrate that MCFA consistently outperforms existing methods across five benchmark datasets: GrabCut, Berkeley, SBD, DAVIS, and CVC-ClinicDB.
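
The abstract describes two architectural ideas: refining the previous-round mask (the feedback) with deep features, and re-injecting click maps at multiple network stages while using the refined feedback to constrain their propagation. The PyTorch sketch below only illustrates that general pattern under assumed interfaces; the module names (FeedbackRefiner, MultiStageClickFusion), channel layouts, click encoding, and gating formulation are assumptions made for exposition and do not reproduce the authors' implementation.

# Illustrative sketch only; names, shapes, and operations are assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeedbackRefiner(nn.Module):
    """Refine the previous-round segmentation mask (feedback) with deep features."""

    def __init__(self, feat_ch: int):
        super().__init__()
        # Hypothetical fusion head: deep features + 1-channel feedback mask -> refined mask.
        self.proj = nn.Conv2d(feat_ch + 1, 1, kernel_size=3, padding=1)

    def forward(self, feedback: torch.Tensor, feat: torch.Tensor) -> torch.Tensor:
        # Bring the full-resolution feedback mask down to the feature resolution,
        # then let the deep features correct it.
        fb = F.interpolate(feedback, size=feat.shape[-2:], mode="bilinear", align_corners=False)
        return torch.sigmoid(self.proj(torch.cat([feat, fb], dim=1)))


class MultiStageClickFusion(nn.Module):
    """Inject click maps at several encoder stages, gated by the refined feedback."""

    def __init__(self, stage_channels):
        super().__init__()
        # Two click channels are assumed: positive and negative clicks encoded as maps.
        self.click_convs = nn.ModuleList(
            [nn.Conv2d(2, c, kernel_size=3, padding=1) for c in stage_channels]
        )

    def forward(self, feats, clicks, refined_feedback):
        fused = []
        for feat, conv in zip(feats, self.click_convs):
            clk = F.interpolate(clicks, size=feat.shape[-2:], mode="bilinear", align_corners=False)
            gate = F.interpolate(refined_feedback, size=feat.shape[-2:], mode="bilinear", align_corners=False)
            # Re-inject the click signal at every stage (countering dilution) and
            # use the refined feedback as a soft gate on where it propagates.
            fused.append(feat + gate * conv(clk))
        return fused


if __name__ == "__main__":
    # Toy shapes: a 4-stage feature pyramid for a 256x256 input.
    feats = [torch.randn(1, c, 256 // s, 256 // s)
             for c, s in zip((64, 128, 256, 512), (4, 8, 16, 32))]
    clicks = torch.zeros(1, 2, 256, 256)      # positive/negative click maps
    feedback = torch.zeros(1, 1, 256, 256)    # previous-round mask

    refiner = FeedbackRefiner(feat_ch=512)
    refined = refiner(feedback, feats[-1])    # refine feedback with the deepest features

    fusion = MultiStageClickFusion((64, 128, 256, 512))
    fused_feats = fusion(feats, clicks, refined)
    print([f.shape for f in fused_feats])

The design intent mirrored in this sketch is that re-injecting the click maps at every stage counteracts the dilution of the click signal as it passes through the encoder, while gating the injected signal with the refined feedback biases where the clicks propagate.
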
Journal Introduction:
Computers & Graphics is dedicated to disseminate information on research and applications of computer graphics (CG) techniques. The journal encourages articles on:
1. Research and applications of interactive computer graphics. We are particularly interested in novel interaction techniques and applications of CG to problem domains.
2. State-of-the-art papers on late-breaking, cutting-edge research on CG.
3. Information on innovative uses of graphics principles and technologies.
4. Tutorial papers on both teaching CG principles and innovative uses of CG in education.