Huajun Liu;Yuan Jiang;Cailing Wang;Suting Chen;Hui Kong
IEEE Transactions on Instrumentation and Measurement, vol. 74, pp. 1-15, published 2025-09-01. DOI: 10.1109/TIM.2025.3604145. IEEE Xplore: https://ieeexplore.ieee.org/document/11145882/. Impact factor: 5.9; JCR: Q1 (Engineering, Electrical & Electronic).
CrackMDM: Masked Modeling on DCT Domain for Efficient Pavement Crack Segmentation
Existing pavement crack inspection methods rely heavily on large numbers of annotated samples and demand substantial computational power, which edge devices cannot afford. Recent methods mostly focus on spatial features, which cannot effectively capture long, continuous, slender, and thin crack features. Masked image modeling (MIM) is an effective way to rebuild primitive crack features by applying a masking strategy to unlabeled data, which reduces the dependence on annotated data, while convolution in the frequency domain provides a flexible and efficient way to capture continuous, slender crack features. Inspired by masked frequency modeling (MFM), we propose a masked discrete cosine transform (DCT)-domain modeling strategy, named crack masked DCT-domain modeling (CrackMDM), for efficient pavement crack segmentation. Specifically, the CrackMDM model uses a DCT-domain masked modeling method that combines the advantages of separable convolutions and spectral convolutions (SP-Convs) in the DCT domain to extract continuous, slender crack structures. Additionally, we introduce self-supervised pretraining with a masking strategy in the DCT domain, using unlabeled crack samples to build crack primitives; in the fine-tuning phase, the encoder and decoder parameters are then tuned on labeled crack data to refine the crack features. CrackMDM is evaluated on three public benchmarks, CFD, YCD, and GAPs, and achieves state-of-the-art (SOTA) performance with superior inference speed. Code is available at https://github.com/Jyuan357/CrackMDM
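To make the idea of masking in the DCT domain concrete, the following is a minimal sketch and not the authors' implementation (see the linked repository for that). It assumes a single-channel image, an orthonormal 2-D DCT-II, and a uniformly random coefficient mask; the abstract does not specify the masking pattern, so the names `dct_matrix`, `dct2`, `idct2`, `mask_dct`, and the `mask_ratio` parameter are all hypothetical. In a pretraining setup of this kind, a network would be trained to recover the original image (or its spectrum) from the corrupted input.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis matrix C, so that C @ x is the 1-D DCT of x."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale factor
    return c


def dct2(image: np.ndarray) -> np.ndarray:
    """2-D DCT: apply the 1-D transform along rows and columns."""
    ch, cw = dct_matrix(image.shape[0]), dct_matrix(image.shape[1])
    return ch @ image @ cw.T


def idct2(coeffs: np.ndarray) -> np.ndarray:
    """Inverse 2-D DCT; the basis is orthonormal, so the inverse is the transpose."""
    ch, cw = dct_matrix(coeffs.shape[0]), dct_matrix(coeffs.shape[1])
    return ch.T @ coeffs @ cw


def mask_dct(image: np.ndarray, mask_ratio: float = 0.5, rng=None):
    """Zero out a random fraction of DCT coefficients (assumed masking scheme).

    Returns the corrupted image and the boolean mask of kept coefficients.
    """
    rng = np.random.default_rng() if rng is None else rng
    coeffs = dct2(image)
    keep = rng.random(coeffs.shape) >= mask_ratio  # True where a coefficient survives
    corrupted = idct2(coeffs * keep)
    return corrupted, keep


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((8, 8))  # stand-in for an unlabeled crack patch
    corrupted, keep = mask_dct(image, mask_ratio=0.5, rng=rng)
    print("kept coefficients:", int(keep.sum()), "of", keep.size)
    print("reconstruction target error:", float(np.abs(image - corrupted).mean()))
```

Because the basis is orthonormal, a mask ratio of 0 returns the input unchanged and a ratio of 1 removes all spectral content, which gives two easy sanity checks for the transform pair.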
About the journal:
Papers are sought that address innovative solutions to the development and use of electrical and electronic instruments and equipment to measure, monitor and/or record physical phenomena for the purpose of advancing measurement science, methods, functionality and applications. The scope of these papers may encompass: (1) theory, methodology, and practice of measurement; (2) design, development and evaluation of instrumentation and measurement systems and components used in generating, acquiring, conditioning and processing signals; (3) analysis, representation, display, and preservation of the information obtained from a set of measurements; and (4) scientific and technical support to establishment and maintenance of technical standards in the field of Instrumentation and Measurement.