A versatile synthetic data generation framework for crack detection
Authors: Jiawei Xie, Baolin Chen, Anna Giacomini, Hongyu Guo, Umair Iqbal, Jinsong Huang
Journal: Engineering Structures, volume 344, Article 121428 (impact factor 6.4; JCR Q1, Engineering, Civil)
Publication date: 2025-09-23
DOI: 10.1016/j.engstruct.2025.121428
URL: https://www.sciencedirect.com/science/article/pii/S014102962501819X
Deep learning has shown remarkable promise in automating crack detection for civil infrastructure inspection, yet collecting and annotating large, high-quality datasets remains a major bottleneck. To address this challenge, we propose CrackGen, a versatile, fully controllable framework for synthetic data generation. By integrating fine-tuned Stable Diffusion models with ControlNet, CrackGen synthesizes an unlimited number of crack images that closely mimic real-world variability in material textures, crack patterns, and morphological features. A pivotal innovation of our approach is that the same control conditions used to steer image creation naturally serve as ground truth labels, eliminating the need for labor-intensive post-generation annotation. This feature is especially critical for large-scale dataset construction. Because it is fine-tuned on extremely small training sets, CrackGen inherits the learned knowledge of Stable Diffusion, enabling users to adjust crack widths, orientations, and background materials through simple prompt changes. Our experiments systematically analyze how different parameter settings affect the fidelity and diversity of generated images. Additionally, we propose novel crack sketching methods, including a Rapidly-exploring Random Trees (RRT) algorithm that emulates real-world crack propagation paths to produce complex, fractal-like crack networks. Extensive experiments demonstrate that models trained purely on CrackGen-generated data achieve consistent, high-quality results on real-world crack detection tasks, validating the synthetic data's robustness and practicality. This robustness was further confirmed through cross-domain validation on rock surfaces. The source code and datasets are freely available from the GitHub repository (https://github.com/GEO-ATLAS/CrackGen).
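The RRT-based crack sketching mentioned in the abstract can be illustrated with a minimal, standard-library Python sketch. This is not the authors' implementation (see the CrackGen repository for that); it is a plain obstacle-free RRT, and the function names `rrt_crack_sketch` and `_draw_segment`, the image size, the step length, and the choice to start from the image centre are all assumptions made for illustration. The returned binary mask plays the dual role described above: it could be passed to a conditional generator as the control image, and it is also the pixel-level label.

```python
import math
import random


def _draw_segment(mask, p, q):
    """Mark pixels along the straight segment p -> q as crack (1)."""
    n = int(math.ceil(math.dist(p, q))) * 2 + 1  # oversample to avoid gaps
    for k in range(n + 1):
        x = p[0] + (q[0] - p[0]) * k / n
        y = p[1] + (q[1] - p[1]) * k / n
        mask[int(round(y))][int(round(x))] = 1


def rrt_crack_sketch(width=64, height=64, n_nodes=80, step=4.0, seed=0):
    """Grow an RRT from the image centre and rasterize its edges.

    Hypothetical helper: parameters and defaults are illustrative only.
    Returns (mask, nodes, edges), where mask is a height x width list of 0/1.
    """
    rng = random.Random(seed)
    nodes = [(width / 2.0, height / 2.0)]  # tree vertices as (x, y)
    edges = []                             # (parent_index, child_index)
    mask = [[0] * width for _ in range(height)]

    while len(nodes) < n_nodes:
        # 1. Sample a random target point inside the image.
        target = (rng.uniform(0, width - 1), rng.uniform(0, height - 1))
        # 2. Find the existing tree node nearest to the target.
        i_near = min(range(len(nodes)),
                     key=lambda i: math.dist(nodes[i], target))
        near = nodes[i_near]
        d = math.dist(near, target)
        if d < 1e-9:
            continue  # degenerate sample, draw again
        # 3. Extend from that node toward the target by at most `step` pixels.
        t = min(1.0, step / d)
        new = (near[0] + t * (target[0] - near[0]),
               near[1] + t * (target[1] - near[1]))
        nodes.append(new)
        edges.append((i_near, len(nodes) - 1))
        _draw_segment(mask, near, new)
    return mask, nodes, edges


if __name__ == "__main__":
    mask, nodes, edges = rrt_crack_sketch()
    # Print a coarse ASCII preview of the branching pattern.
    for row in mask[::2]:
        print("".join("#" if v else "." for v in row[::2]))
```

Because each new branch attaches to the nearest existing node, the tree spreads outward with irregular forks, which is the property that makes RRT a plausible stand-in for crack propagation paths. Varying `step`, `n_nodes`, or the seed changes the density and shape of the network; line thickness would need a separate dilation step that is not shown here.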
About the journal:
Engineering Structures provides a forum for a broad blend of scientific and technical papers to reflect the evolving needs of the structural engineering and structural mechanics communities. Particularly welcome are contributions dealing with applications of structural engineering and mechanics principles in all areas of technology. The journal aspires to a broad and integrated coverage of the effects of dynamic loadings and of the modelling techniques whereby the structural response to these loadings may be computed.
The scope of Engineering Structures encompasses, but is not restricted to, the following areas: infrastructure engineering; earthquake engineering; structure-fluid-soil interaction; wind engineering; fire engineering; blast engineering; structural reliability/stability; life assessment/integrity; structural health monitoring; multi-hazard engineering; structural dynamics; optimization; expert systems; experimental modelling; performance-based design; multiscale analysis; value engineering.
Topics of interest include: tall buildings; innovative structures; environmentally responsive structures; bridges; stadiums; commercial and public buildings; transmission towers; television and telecommunication masts; foldable structures; cooling towers; plates and shells; suspension structures; protective structures; smart structures; nuclear reactors; dams; pressure vessels; pipelines; tunnels.
Engineering Structures also publishes review articles, short communications and discussions, book reviews, and a diary on international events related to any aspect of structural engineering.