An end to end workflow for synthetic data generation for robust object detection
Authors: Johannes Metzler, Fouad Bahrpeyma, Dirk Reichelt
Published in: 2023 IEEE 21st International Conference on Industrial Informatics (INDIN)
Publication date: 2023-07-18
DOI: https://doi.org/10.1109/INDIN51400.2023.10218035
Citation count: 0
Abstract
Object detection is a computer vision task that involves detecting instances of visual objects of a particular class in digital images. Many computer vision tasks, such as instance segmentation, image captioning, and object tracking, depend heavily on object detection. A major purpose of object detection is to develop computational models that provide inputs crucial to computer vision applications. Convolutional Neural Networks (CNNs) have recently become popular because of their key role in enabling object detection. However, the performance of CNNs depends largely on the quality and quantity of the training data, which are often difficult to obtain in real-world applications. To ensure the robustness of such models, training instances must be provided under varied, randomized conditions. These conditions typically combine several factors, including lighting, object location, the presence of multiple objects in the scene, the variety of backgrounds, and the camera angle. In particular, companies require specialized models for their custom products, depending on the application (such as fault detection, anomaly detection, condition monitoring, or predictive quality), and therefore consistently face difficulties in providing a large number of instances of their objects under randomized conditions. The primary reason is that capturing images of real objects under randomized conditions is usually costly, time-consuming, and challenging in practice. Because of the efficiency gained so far from training such systems on synthetic data, synthetic data has recently attracted considerable attention. This paper presents an end-to-end synthetic data generation method for building a robust object detection model for customized products using NVIDIA Omniverse and CNNs. We demonstrate and evaluate our contribution on the modeling of chess pieces, where a total accuracy of 98.8% was obtained.
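The randomized conditions listed in the abstract (lighting, object location, number of objects, background, and camera angle) can be illustrated with a minimal sketch that samples one scene configuration per synthetic image. This is not the authors' pipeline and does not use the NVIDIA Omniverse API; the class names, background names, value ranges, and the `SceneConfig`, `ObjectPlacement`, `sample_scene`, and `sample_dataset` identifiers are all hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass

# Hypothetical labels and assets, chosen only for illustration.
PIECE_CLASSES = ["pawn", "knight", "bishop", "rook", "queen", "king"]
BACKGROUNDS = ["wood", "marble", "cloth", "concrete"]


@dataclass
class ObjectPlacement:
    label: str
    x: float  # normalized horizontal position in [0, 1)
    y: float  # normalized vertical position in [0, 1)


@dataclass
class SceneConfig:
    light_intensity: float       # relative intensity, assumed range [0.2, 1.0]
    camera_elevation_deg: float  # assumed range [20, 90]
    camera_azimuth_deg: float    # assumed range [0, 360]
    background: str
    objects: list


def sample_scene(rng: random.Random, max_objects: int = 5) -> SceneConfig:
    """Draw one randomized scene description covering the listed factors."""
    n_objects = rng.randint(1, max_objects)  # presence of multiple objects
    objects = [
        ObjectPlacement(rng.choice(PIECE_CLASSES), rng.random(), rng.random())
        for _ in range(n_objects)
    ]
    return SceneConfig(
        light_intensity=rng.uniform(0.2, 1.0),
        camera_elevation_deg=rng.uniform(20.0, 90.0),
        camera_azimuth_deg=rng.uniform(0.0, 360.0),
        background=rng.choice(BACKGROUNDS),
        objects=objects,
    )


def sample_dataset(n_scenes: int, seed: int = 0) -> list:
    """Sample a reproducible list of scene configurations."""
    rng = random.Random(seed)
    return [sample_scene(rng) for _ in range(n_scenes)]
```

In a real workflow, each sampled configuration would be handed to a renderer that produces the image together with its bounding-box labels; seeding the generator keeps the synthetic dataset reproducible.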