The Creation of Artificial Data for Training a Neural Network Using the Example of a Conveyor Production Line for Flooring.

Impact factor: 2.7. JCR quartile: Q3 (Imaging Science & Photographic Technology)

Journal of Imaging, vol. 11, no. 5. Publication date: 2025-05-20. DOI: 10.3390/jimaging11050168

Alexey Zaripov, Roman Kulshin, Anatoly Sidorov



Abstract

This work develops a system for generating artificial data to train neural networks used in conveyor-based production processes. It surveys the application areas of computer vision (CV) and finds that traditional methods of data collection and annotation, such as video recording and manual image labeling, carry high time and financial costs, which limits their efficiency. Synthetic data offers an alternative that can significantly reduce the time and expense of building training datasets. The paper reviews modern methods for generating synthetic images with tools ranging from game engines to generative neural networks. As a tool platform, it considers the concept of digital twins for simulating technological processes, within which synthetic data is used. Based on the review, a generalized model for synthetic data generation is proposed and tested on quality control of floor coverings on a conveyor line. The developed system generates photorealistic and diverse images suitable for training neural network models. In a comparative analysis, the YOLOv8 model trained on synthetic data significantly outperformed the model trained on real images: mAP50 reached 0.95 versus 0.36, respectively. This result demonstrates the high adequacy of the model built on the synthetic dataset and highlights the potential of synthetic data to improve the quality of computer vision models when access to real data is limited.
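For readers unfamiliar with the mAP50 metric quoted above, the following is a minimal sketch of how average precision at an IoU threshold of 0.5 is commonly computed for a single class; mAP50 is then the mean of this value over all classes. It is not the authors' evaluation code, and YOLOv8 tooling uses its own implementation with a different interpolation scheme. The function names `iou` and `ap50`, the box layout `(x_min, y_min, x_max, y_max)`, and the input structures are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def ap50(preds: List[Tuple[str, float, Box]], truths: Dict[str, List[Box]]) -> float:
    """Average precision for one class at an IoU threshold of 0.5.

    preds  : (image_id, confidence, box) for every detection (hypothetical layout).
    truths : image_id -> list of ground-truth boxes (hypothetical layout).
    """
    total_truths = sum(len(boxes) for boxes in truths.values())
    if total_truths == 0:
        return 0.0
    matched = {img: [False] * len(boxes) for img, boxes in truths.items()}
    hits: List[bool] = []
    # Visit detections from most to least confident.
    for img, _score, box in sorted(preds, key=lambda p: p[1], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, truth in enumerate(truths.get(img, [])):
            overlap = iou(box, truth)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        # True positive: overlaps a not-yet-matched ground-truth box by at least 0.5.
        if best_iou >= 0.5 and not matched[img][best_idx]:
            matched[img][best_idx] = True
            hits.append(True)
        else:
            hits.append(False)
    # Precision and recall after each detection in ranked order.
    precisions: List[float] = []
    recalls: List[float] = []
    true_positives = 0
    for rank, hit in enumerate(hits, start=1):
        true_positives += int(hit)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / total_truths)
    # Make precision non-increasing from right to left, then integrate over recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    area, prev_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area
```

Under this definition a score near 1.0 means that almost every object is found with few false alarms ranked above the correct detections, which gives a sense of the gap between the reported values of 0.95 and 0.36.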


