Fine-Tuning SAM for Forward-Looking Sonar With Collaborative Prompts and Embedding

Jiayuan Li;Zhen Wang;Nan Xu;Zhuhong You
Journal: IEEE Geoscience and Remote Sensing Letters, vol. 22, pp. 1-5
Publication date: 2025-04-18
Publication type: Journal Article
DOI: 10.1109/LGRS.2025.3562182
URL: https://ieeexplore.ieee.org/document/10969803/
Citation count: 0

Abstract
The Segment Anything Model (SAM) represents a significant advance in semantic segmentation, particularly for natural images, but it shows notable limitations when applied to forward-looking sonar (FLS) images. The primary challenges are the inherent boundary ambiguity of FLS images, which makes it hard for prompt strategies to delineate boundaries accurately, and the lack of effective interaction between prompts and image features. In this letter, we introduce a collaborative prompting (CP) strategy that addresses these issues by generating dense prompt embeddings and sonar tokens focused on contour and boundary features; these replace SAM's original dense prompt embedding and intersection over union (IoU) token. To further improve segmentation, we use embedding compensation techniques based on Mamba and the Kolmogorov–Arnold network (KAN), which add boundary information to the image embeddings and improve the fusion of prompts with the image embeddings. We conducted comprehensive experiments, including comparative analyses and ablation studies, to validate the superiority of the proposed approach. Results show that our method significantly improves segmentation performance on FLS images, effectively addressing boundary ambiguity and improving prompt utilization. The source code and dataset will be available at https://github.com/darkseid-arch/FLSSAM
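For readers unfamiliar with the intersection over union (IoU) measure that SAM's IoU token is used to estimate, the following is a minimal sketch in plain Python of IoU between two binary masks. It is illustrative only and is not taken from the authors' code: the function name `mask_iou`, the toy masks, and the convention that two empty masks score 1.0 are all assumptions made for this example.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]  # rows of 0/1 pixel labels (hypothetical toy format)


def mask_iou(pred: Mask, target: Mask) -> float:
    """Return intersection over union of two equally sized binary masks."""
    same_rows = len(pred) == len(target)
    if not same_rows or any(len(p) != len(t) for p, t in zip(pred, target)):
        raise ValueError("masks must have the same shape")

    intersection = 0  # pixels set in both masks
    union = 0         # pixels set in at least one mask
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1

    # Assumption: two empty masks agree perfectly, so define their IoU as 1.0.
    return 1.0 if union == 0 else intersection / union


# Toy 3x3 masks: 3 pixels overlap, 5 pixels are covered by either mask.
pred = [[0, 1, 1],
        [0, 1, 1],
        [0, 0, 0]]
target = [[0, 1, 0],
          [0, 1, 1],
          [0, 0, 1]]
print(mask_iou(pred, target))  # 3 / 5 = 0.6
```

A blurred object boundary shifts exactly the pixels that separate intersection from union, which is one way to see why boundary ambiguity in FLS images weighs so heavily on segmentation scores.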