Enhancing SAM-based digital rock image segmentation via edge-semantics fusion

Impact factor 3.2, JCR Q2 (Computer Science, Interdisciplinary Applications)

Applied Computing and Geosciences, Pub Date: 2025-09-20, DOI: 10.1016/j.acags.2025.100292

Ziqiang Wang, Zhiyu Hou, Danping Cao

Volume 28, Article 100292. Full text: https://www.sciencedirect.com/science/article/pii/S2590197425000746

Citations: 0

Abstract

The Segment Anything Model (SAM) demonstrates strong segmentation capabilities. However, its application to digital rock images faces challenges from subtle transitions between matrix minerals and pore structures, as well as inherent heterogeneity, which result in mis-segmentation and discontinuities that affect petrophysical characterization and numerical modeling of subsurface reservoir properties. To address these challenges, we introduce ESF-SAM (Edge-Semantics Fusion-SAM), a novel approach that enhances SAM's segmentation fidelity by integrating edge and semantic features. Specifically, in ESF-SAM, semantic features from SAM's image encoder are processed through an edge decoder enhanced by progressive dilated convolutions to extract detailed structural boundaries. The resulting edge and original semantic features are adaptively fused through a dual-attention mechanism, where spatial gating attention dynamically balances their contributions across locations, and channel attention recalibrates feature importance to enrich the representation. This spatial–channel attention framework enriches feature representations, enabling targeted fine-tuning within the SAM decoder and thereby preserving global segmentation capability while significantly improving local boundary delineation in two-phase segmentation tasks. Experimental results demonstrate that ESF-SAM improves segmentation detail, leading to more accurate derivation of key rock properties such as elastic modulus and pore geometry parameters, with results that more closely align with labeled data compared to the original SAM. Trained on only a small number of annotated sandstone images, ESF-SAM effectively adapts to the target domain and exhibits strong generalization when applied to carbonate rock images without additional fine-tuning. This work exemplifies how integrating priors into foundation models can substantially enhance their applicability to complex scientific imaging tasks.
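The dual-attention fusion described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give layer definitions, so the gate form (a sigmoid over a per-pixel linear projection), the channel re-weighting (a sigmoid over each channel's global average), and all names and weights below (`spatial_gate_fuse`, `channel_attention`, `w_sem`, `w_edge`, `w_chan`) are assumptions chosen only to show the idea. Feature maps are plain nested lists indexed `[channel][row][col]`.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def spatial_gate_fuse(sem, edge, w_sem, w_edge, bias=0.0):
    """Blend semantic and edge features with one gate value per pixel.

    w_sem and w_edge are hypothetical per-channel weights standing in
    for a learned 1x1 projection; they are not taken from the paper.
    """
    C, H, W = len(sem), len(sem[0]), len(sem[0][0])
    fused = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for i in range(H):
        for j in range(W):
            logit = bias + sum(
                w_sem[c] * sem[c][i][j] + w_edge[c] * edge[c][i][j]
                for c in range(C)
            )
            g = sigmoid(logit)  # gate in (0, 1): share given to edge features
            for c in range(C):
                fused[c][i][j] = g * edge[c][i][j] + (1.0 - g) * sem[c][i][j]
    return fused


def channel_attention(feat, w_chan):
    """Rescale each channel by a weight derived from its global average."""
    out = []
    for c, plane in enumerate(feat):
        n = len(plane) * len(plane[0])
        mean = sum(sum(row) for row in plane) / n  # squeeze: global average
        a = sigmoid(w_chan[c] * mean)              # excite: weight in (0, 1)
        out.append([[a * v for v in row] for row in plane])
    return out


# Toy 1-channel, 2x2 example with zero weights (assumed values).
sem = [[[1.0, 1.0], [1.0, 1.0]]]
edge = [[[3.0, 3.0], [3.0, 3.0]]]
fused = spatial_gate_fuse(sem, edge, w_sem=[0.0], w_edge=[0.0])
recalibrated = channel_attention(fused, w_chan=[0.0])
```

With zero weights the gate is 0.5 at every pixel, so `fused` is the midpoint of the two inputs. In a trained model the gate would vary by location, which is what lets edge features dominate near pore-matrix boundaries while semantic features dominate elsewhere.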

Source journal