PDSAM: Prompt-Driven SAM for Track Defect Detection

IF 5.9 | Zone 2, Engineering & Technology | JCR Q1, ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Instrumentation and Measurement Pub Date : 2025-06-26 DOI:10.1109/TIM.2025.3583378

Yu Fang;Pan Tao;Tianrui Li;Fan Min

Volume 74, pages 1-17. Journal Article. IEEE Xplore: https://ieeexplore.ieee.org/document/11052718/

Citations: 0

Abstract



Track defect detection is critical to the safety and reliability of railway systems. Existing machine-vision approaches are hindered by three key issues: high time complexity stemming from end-to-end network training, limited training data (only a few hundred labeled images), and suboptimal prediction precision. To address these challenges, this article introduces the prompt-driven segment anything model (PDSAM), a novel semantic image segmentation framework that changes how the problem is formulated. Its core contribution is to reformulate the segmentation task as a prompt generation problem, which offers two related advantages. First, a simplified prompt generation network needs less training time and less data than a standalone segmentation network. Second, an upscaling and visual prompting technique restores spatial resolution and mitigates the risk of local optima in feature optimization, enabling more precise, fine-grained segmentation outputs. Experimental evaluations on benchmark datasets show that PDSAM outperforms state-of-the-art methods in both prediction accuracy and computational efficiency for railway track defect detection. The source code and pretrained models (PTMs) are publicly available to support reproducibility and further research: https://github.com/FreddyDylan/PDSAM/
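The idea of treating segmentation as prompt generation can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the nearest-neighbour upscaling, the 0.5 threshold, and the rule for picking points are all assumptions chosen for illustration. The sketch takes a coarse defect-probability map, as a small prompt network might produce, upscales it to image resolution, and derives point and box prompts that a promptable segmenter could consume.

```python
import numpy as np


def upscale_nearest(prob, factor):
    """Nearest-neighbour upscaling of a 2-D map by an integer factor."""
    return np.repeat(np.repeat(prob, factor, axis=0), factor, axis=1)


def prompts_from_coarse_map(prob, factor=4, threshold=0.5, num_points=3):
    """Turn a coarse probability map into point and box prompts.

    Hypothetical helper, not part of PDSAM. Returns a tuple
    (point_coords, point_labels, box) in full-resolution pixel
    coordinates, with points as (x, y) and the box as
    (x_min, y_min, x_max, y_max), or None if no pixel reaches
    the threshold.
    """
    full = upscale_nearest(prob, factor)
    ys, xs = np.nonzero(full >= threshold)
    if ys.size == 0:
        return None  # nothing to prompt on

    # Tight bounding box around all above-threshold pixels.
    box = np.array([xs.min(), ys.min(), xs.max(), ys.max()])

    # Use the highest-scoring pixels as positive point prompts.
    scores = full[ys, xs]
    order = np.argsort(-scores, kind="stable")[:num_points]
    point_coords = np.stack([xs[order], ys[order]], axis=1)
    point_labels = np.ones(len(order), dtype=int)  # 1 marks a foreground point
    return point_coords, point_labels, box


# Example: a 4x4 coarse map with two confident cells, upscaled by 2.
coarse = np.zeros((4, 4))
coarse[1, 2] = 0.9
coarse[2, 2] = 0.6
result = prompts_from_coarse_map(coarse, factor=2)
```

Under this formulation only the small prompt-producing stage would need training, which is the intuition behind the reduced training time and data requirements claimed in the abstract. How PDSAM actually generates and applies its prompts is described in the paper and its repository.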


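Source journal

IEEE Transactions on Instrumentation and Measurement (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 9.00

Self-citation rate: 23.20%

Articles published: 1294

Review time: 3.9 months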

Journal description: Papers are sought that address innovative solutions to the development and use of electrical and electronic instruments and equipment to measure, monitor and/or record physical phenomena for the purpose of advancing measurement science, methods, functionality and applications. The scope of these papers may encompass: (1) theory, methodology, and practice of measurement; (2) design, development and evaluation of instrumentation and measurement systems and components used in generating, acquiring, conditioning and processing signals; (3) analysis, representation, display, and preservation of the information obtained from a set of measurements; and (4) scientific and technical support for the establishment and maintenance of technical standards in the field of Instrumentation and Measurement.