Domain Information Mining and State-Guided Adaptation Network for Multispectral Image Segmentation

IF 8.9, Zone 1 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Neural Networks and Learning Systems. Pub Date: 2025-07-22. DOI: 10.1109/tnnls.2025.3589574

Boyu Zhao, Mengmeng Zhang, Wei Li, Yunhao Gao, Junjie Wang

Citations: 0

Abstract

Domain Information Mining and State-Guided Adaptation Network for Multispectral Image Segmentation.

The Segment Anything Model (SAM), a prompt-based foundation model for image segmentation, demonstrates strong task versatility and domain generalization (DG) capabilities, providing a new direction for solving cross-scene segmentation tasks. However, SAM still has limitations in multispectral cross-domain segmentation tasks: 1) insufficient information utilization, reflected in the neglect of nonvisible spectral information and of the shift information contained in source domain (SD) and target domain (TD) samples; and 2) a lack of cross-domain strategies, which leads to insufficient cross-domain adaptation (DA) ability in downstream tasks. To address these challenges, we combine the respective advantages of the masked autoencoder (MAE) and cross-domain strategies and propose an improved SAM DA network structure called the domain information mining and state-guided adaptation network (DSAnet), which aims to enhance SAM's performance in multispectral cross-domain segmentation tasks at both the data and task levels. At the data level, DSAnet incorporates a style masking learning component, which randomly masks image features and replaces them with domain-specific learnable tokens; combined with an image reconstruction task, this mines the style information and domain invariance of the image itself. At the task level, DSAnet introduces domain state learning and style-guided segmentation. Domain state learning uses a state sequence modeling approach to design specific state representations for SD and TD that capture interdomain differences, thereby reducing task shift; the learned domain state information can be directly applied at the inference stage. Style prompt segmentation guides the segmentation training of SD images with TD style prompts, improving SAM's adaptability in downstream cross-domain multispectral segmentation tasks. Extensive experiments on three multitemporal multispectral image (MSI) datasets demonstrate the superiority of the proposed method over state-of-the-art cross-domain strategies and SAM variants.
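The style masking step described in the abstract (randomly masking image features and replacing them with domain-specific learnable tokens, paired with a reconstruction task) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `style_mask` and `masked_mse`, the fixed token values, the mask ratio, and the choice to score reconstruction only on masked positions (as is common in MAE-style training) are all assumptions made for illustration. In a real network the tokens would be trainable parameters and the features would come from the SAM image encoder.

```python
import random
from typing import Dict, List, Tuple

Vector = List[float]


def style_mask(
    features: List[Vector],
    domain_tokens: Dict[str, Vector],
    domain: str,
    mask_ratio: float = 0.5,
    seed: int = 0,
) -> Tuple[List[Vector], List[bool]]:
    """Replace a random subset of feature vectors with the token of `domain`.

    Hypothetical helper: `domain_tokens` stands in for learnable per-domain
    tokens (e.g. one for SD, one for TD); here they are plain constants.
    """
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must lie in [0, 1]")
    rng = random.Random(seed)
    n_masked = int(mask_ratio * len(features))
    masked_idx = set(rng.sample(range(len(features)), n_masked))
    token = domain_tokens[domain]
    mask = [i in masked_idx for i in range(len(features))]
    masked = [list(token) if m else list(f) for f, m in zip(features, mask)]
    return masked, mask


def masked_mse(pred: List[Vector], target: List[Vector], mask: List[bool]) -> float:
    """Mean squared error over masked positions only (an assumed loss choice)."""
    errs = [
        (p - t) ** 2
        for pv, tv, m in zip(pred, target, mask)
        if m
        for p, t in zip(pv, tv)
    ]
    return sum(errs) / len(errs) if errs else 0.0


# Toy usage: four 2-D feature vectors from a target-domain (TD) image.
features = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
tokens = {"SD": [1.0, 1.0], "TD": [-1.0, -1.0]}
masked, mask = style_mask(features, tokens, domain="TD", mask_ratio=0.5)

# A perfect reconstruction has zero loss; the untouched masked input does not.
perfect_loss = masked_mse(features, features, mask)
untrained_loss = masked_mse(masked, features, mask)
```

Because each domain has its own token, a decoder trained to reconstruct the masked positions has to rely on the surrounding context plus the domain token, which is one plausible reading of how the component could separate style information from domain-invariant content.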

Source journal

IEEE Transactions on Neural Networks and Learning Systems (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; COMPUTER SCIENCE, HARDWARE & ARCHITECTURE)

CiteScore: 23.80
Self-citation rate: 9.60%
Number of publications: 2102
Review time: 3-8 weeks

Journal description: The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.