Boyu Zhao, Mengmeng Zhang, Wei Li, Yunhao Gao, Junjie Wang
Journal: IEEE Transactions on Neural Networks and Learning Systems (Journal Article; impact factor 8.9; JCR Q1, Computer Science, Artificial Intelligence)
Published: 2025-07-22
DOI: 10.1109/tnnls.2025.3589574 (https://doi.org/10.1109/tnnls.2025.3589574)
Domain Information Mining and State-Guided Adaptation Network for Multispectral Image Segmentation.
The Segment Anything Model (SAM), a prompt-based foundation model for image segmentation, demonstrates strong task versatility and domain generalization (DG) capabilities, offering a new direction for cross-scene segmentation. However, SAM still has two limitations in multispectral cross-domain segmentation: 1) insufficient information utilization, in that it neglects both nonvisible spectral information and the shift information contained in source domain (SD) and target domain (TD) samples; and 2) a lack of cross-domain strategies, which leads to insufficient cross-domain adaptation (DA) ability in downstream tasks. To address these challenges, we combine the respective advantages of the masked autoencoder (MAE) and cross-domain strategies and propose an improved SAM-based DA network, the domain information mining and state-guided adaptation network (DSAnet), which aims to enhance SAM's performance in multispectral cross-domain segmentation at both the data level and the task level. At the data level, DSAnet incorporates a style masking learning component that randomly masks image features and replaces them with domain-specific learnable tokens; combined with an image reconstruction task, this mines the style information and domain invariance of the image itself. At the task level, DSAnet introduces domain state learning and style-guided segmentation. Domain state learning uses state sequence modeling to design specific state representations for the SD and TD that capture interdomain differences, thereby reducing task shift; the learned domain state information can also be applied directly at the inference stage. Style-guided (style prompt) segmentation guides the segmentation training of SD images with TD style prompts, improving SAM's adaptability in cross-domain multispectral segmentation downstream tasks. Extensive experiments on three multitemporal multispectral image (MSI) datasets demonstrate the superiority of the proposed method over state-of-the-art cross-domain strategies and SAM variants.
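The style masking step described in the abstract (randomly masking image features and replacing them with a domain-specific token) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `style_mask`, the `mask_ratio` parameter, the fixed token values, and the use of plain Python lists in place of learnable tensors are all assumptions made for illustration, and the reconstruction task that would follow is omitted.

```python
import random


def style_mask(features, domain_token, mask_ratio=0.5, seed=None):
    """Replace a random subset of feature tokens with one domain token.

    features:     list of feature tokens, each a list of floats.
    domain_token: list of floats standing in for a learnable SD or TD token
                  (hypothetical; a real model would train this vector).
    Returns (masked_features, mask), where mask[i] is True if token i
    was replaced.
    """
    rng = random.Random(seed)
    num_tokens = len(features)
    num_masked = round(mask_ratio * num_tokens)
    # Pick distinct token positions to mask.
    masked_idx = set(rng.sample(range(num_tokens), num_masked))
    mask = [i in masked_idx for i in range(num_tokens)]
    # Masked slots receive a copy of the domain token; others are kept.
    masked = [list(domain_token) if m else list(tok)
              for tok, m in zip(features, mask)]
    return masked, mask


# Eight toy feature tokens of dimension 2.
features = [[float(i), float(i) + 0.5] for i in range(8)]
sd_token = [-1.0, -1.0]  # hypothetical source-domain token
td_token = [-2.0, -2.0]  # hypothetical target-domain token

masked_sd, mask_sd = style_mask(features, sd_token, mask_ratio=0.25, seed=0)
masked_td, mask_td = style_mask(features, td_token, mask_ratio=0.25, seed=1)
```

In this sketch, source-domain and target-domain inputs would each be masked with their own token, so a reconstruction objective applied afterwards would have to rely on the token to recover domain-specific style.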
About the journal:
The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.