M A Mohammed Eltoum, Ehtesham Iqbal, Yahya Zweiri, Brain Moyo, Yusra Abdulrahman
Scientific Data, 12(1), 1268. Published 2025-07-19. Journal Article. Impact factor 6.9; JCR Q1 (Multidisciplinary Sciences).
DOI: 10.1038/s41597-025-05563-y (https://doi.org/10.1038/s41597-025-05563-y)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12276216/pdf/
BladeSynth: A High-Quality Rendering-Based Synthetic Dataset for Aero Engine Blade Defect Inspection.
The integration of artificial intelligence in industry is crucial for realizing Industry 4.0; however, the lack of industrial datasets remains a significant challenge. While several generative AI methods have been proposed to create synthetic data, these approaches are often inefficient and require a large volume of training data to function effectively. In this study, we utilize a physics-based rendering procedure to generate a synthetic dataset of aeroengine blades. This dataset is then used to train a defect inspection model, thereby addressing data scarcity and enhancing defect detection accuracy in industrial applications. The dataset generation process begins with preparing Computer-Aided Design (CAD) models and material textures, then constructing a realistic inspection scene incorporating domain-randomized camera settings, lighting, and background elements. The generated data is assessed for effectiveness in both supervised and unsupervised defect detection tasks. Additionally, sim-to-real transferability is examined, demonstrating that models trained on the generated synthetic data can effectively detect and classify defects in real blade images.
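As a rough illustration of the domain-randomized camera, lighting, and background settings mentioned in the abstract, the sketch below samples one set of scene parameters per rendered image. It is not the authors' pipeline: the `SceneParams` fields, the numeric ranges, and the helper names `sample_scene` and `camera_position` are all hypothetical, chosen only to show the general idea of drawing each rendering condition from a distribution rather than fixing it.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class SceneParams:
    """One randomized rendering configuration (all fields are illustrative)."""
    camera_distance_m: float
    camera_azimuth_deg: float
    camera_elevation_deg: float
    focal_length_mm: float
    light_intensity_w: float
    light_color_temp_k: float
    background_id: int


def sample_scene(rng: random.Random, n_backgrounds: int = 10) -> SceneParams:
    """Draw every scene setting independently from a uniform range.

    The ranges below are assumptions for the sake of the example,
    not values reported in the paper.
    """
    return SceneParams(
        camera_distance_m=rng.uniform(0.3, 1.0),
        camera_azimuth_deg=rng.uniform(0.0, 360.0),
        camera_elevation_deg=rng.uniform(10.0, 80.0),
        focal_length_mm=rng.uniform(24.0, 85.0),
        light_intensity_w=rng.uniform(100.0, 1000.0),
        light_color_temp_k=rng.uniform(3000.0, 6500.0),
        background_id=rng.randrange(n_backgrounds),
    )


def camera_position(p: SceneParams) -> tuple[float, float, float]:
    """Convert spherical camera settings to Cartesian coordinates,
    with the blade assumed to sit at the origin."""
    az = math.radians(p.camera_azimuth_deg)
    el = math.radians(p.camera_elevation_deg)
    r = p.camera_distance_m
    return (
        r * math.cos(el) * math.cos(az),
        r * math.cos(el) * math.sin(az),
        r * math.sin(el),
    )


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so a dataset can be regenerated
    for _ in range(3):
        params = sample_scene(rng)
        print(params, camera_position(params))
```

In a real pipeline each sampled configuration would be handed to a renderer together with the CAD model and material textures; seeding the generator keeps the resulting dataset reproducible.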
About the journal:
Scientific Data is an open-access journal focused on data, publishing descriptions of research datasets and articles on data sharing across natural sciences, medicine, engineering, and social sciences. Its goal is to enhance the sharing and reuse of scientific data, encourage broader data sharing, and acknowledge those who share their data.
The journal primarily publishes Data Descriptors, which offer detailed descriptions of research datasets, including data collection methods and technical analyses validating data quality. These descriptors aim to facilitate data reuse rather than testing hypotheses or presenting new interpretations, methods, or in-depth analyses.