Yubin Huangfu, Weiwen Deng, Bingtao Ren, Juan Ding
{"title":"汽车检测中缩小域间隙合成图像的生成方法","authors":"Yubin Huangfu, Weiwen Deng, Bingtao Ren, Juan Ding","doi":"10.1109/CVCI54083.2021.9661221","DOIUrl":null,"url":null,"abstract":"Deep learning has become the main way of the object detection task for autonomous vehicles. Meanwhile, this method typically requires vast amounts of training data to reach their full potential. However, collecting the data from real world and labeling manually is an expensive, time-consuming and error-prone process. Synthetic image has the potential to replace real image for training neural networks, because image creation and labeling annotations are free in this way. For the network trained by synthetic images, the reality gap between real and synthetic images is the main obstacle to use it in the real world. And most previous works are only devoted to generate synthetic images with a good performance on model training, but lack of analysis of the domain gap that affects the performance. This work designs a method of generating the real and synthetic images to analyze and reduce the reality gap between synthetic and real images for car detection. Firstly, this work put one single car with no-background in a random background image to generate real single car and synthetic single car images. In order to further reduce the domain gap in content level, this method keeps the car distribution in synthetic images is similar with the distribution of car in real world. For the purpose of reducing the domain gap in appearance level, the parameters of camera model are same as the camera parameters of image collecting cars and the image is rendered by using the PBRT(Physically Based Ray Tracing) when we generated the synthetic images. Secondly, by training the neural network of instance segmentation with different datasets, the across validation result proves that the reality gap between synthetic and real images is no more than the domain gap between real images. Thirdly, the training results of datasets with different samples diversity show that the diversity of the samples yields better generalization between different datasets for car detection which can effectively reduce the domain gap.","PeriodicalId":419836,"journal":{"name":"2021 5th CAA International Conference on Vehicular Control and Intelligence (CVCI)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"A Generation Method of Synthetic Images with Reduced Domain Gap for Car Detection\",\"authors\":\"Yubin Huangfu, Weiwen Deng, Bingtao Ren, Juan Ding\",\"doi\":\"10.1109/CVCI54083.2021.9661221\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep learning has become the main way of the object detection task for autonomous vehicles. Meanwhile, this method typically requires vast amounts of training data to reach their full potential. However, collecting the data from real world and labeling manually is an expensive, time-consuming and error-prone process. Synthetic image has the potential to replace real image for training neural networks, because image creation and labeling annotations are free in this way. For the network trained by synthetic images, the reality gap between real and synthetic images is the main obstacle to use it in the real world. 
And most previous works are only devoted to generate synthetic images with a good performance on model training, but lack of analysis of the domain gap that affects the performance. This work designs a method of generating the real and synthetic images to analyze and reduce the reality gap between synthetic and real images for car detection. Firstly, this work put one single car with no-background in a random background image to generate real single car and synthetic single car images. In order to further reduce the domain gap in content level, this method keeps the car distribution in synthetic images is similar with the distribution of car in real world. For the purpose of reducing the domain gap in appearance level, the parameters of camera model are same as the camera parameters of image collecting cars and the image is rendered by using the PBRT(Physically Based Ray Tracing) when we generated the synthetic images. Secondly, by training the neural network of instance segmentation with different datasets, the across validation result proves that the reality gap between synthetic and real images is no more than the domain gap between real images. Thirdly, the training results of datasets with different samples diversity show that the diversity of the samples yields better generalization between different datasets for car detection which can effectively reduce the domain gap.\",\"PeriodicalId\":419836,\"journal\":{\"name\":\"2021 5th CAA International Conference on Vehicular Control and Intelligence (CVCI)\",\"volume\":\"47 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-10-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 5th CAA International Conference on Vehicular Control and Intelligence (CVCI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CVCI54083.2021.9661221\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 5th CAA International Conference on Vehicular Control and Intelligence (CVCI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVCI54083.2021.9661221","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Generation Method of Synthetic Images with Reduced Domain Gap for Car Detection
Abstract

Deep learning has become the dominant approach to object detection for autonomous vehicles, but such methods typically require vast amounts of training data to reach their full potential. Collecting data from the real world and labeling it manually is an expensive, time-consuming, and error-prone process. Synthetic images have the potential to replace real images for training neural networks, because image creation and label annotation come essentially for free. For networks trained on synthetic images, however, the reality gap between real and synthetic images is the main obstacle to deployment in the real world, and most previous work is devoted only to generating synthetic images that train models well, without analyzing the domain gap that limits performance. This work designs a method of generating real and synthetic images to analyze and reduce the reality gap between synthetic and real images for car detection. First, a single background-free car is placed onto a random background image to generate both real and synthetic single-car images. To further reduce the domain gap at the content level, the method keeps the distribution of cars in the synthetic images similar to the distribution of cars in the real world. To reduce the domain gap at the appearance level, the virtual camera model uses the same parameters as the cameras on the image-collecting vehicles, and the synthetic images are rendered with PBRT (Physically Based Ray Tracing). Second, by training an instance segmentation network on different datasets, the cross-validation results show that the reality gap between synthetic and real images is no larger than the domain gap between real images from different datasets. Third, training on datasets with different sample diversity shows that greater sample diversity yields better generalization across datasets for car detection, which effectively reduces the domain gap.
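The single-car compositing step described above can be sketched roughly as follows. This is a minimal illustration only, assuming a Pillow-based pipeline, RGBA car cutouts, JPEG backgrounds, and a simple lower-half placement heuristic; the file layout, scale range, and function names are assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch of the single-car compositing step: paste a background-free
# (alpha-matted) car cutout onto a randomly chosen background image.
import random
from pathlib import Path

from PIL import Image  # Pillow


def composite_single_car(car_cutout_path, background_dir, out_path, seed=None):
    """Paste an RGBA car cutout onto a random background image."""
    rng = random.Random(seed)

    car = Image.open(car_cutout_path).convert("RGBA")
    backgrounds = sorted(Path(background_dir).glob("*.jpg"))
    background = Image.open(rng.choice(backgrounds)).convert("RGB")

    # Rescale the cutout to a plausible fraction of the background width
    # (the 0.2-0.5 range is an assumption, not a value from the paper).
    scale = rng.uniform(0.2, 0.5) * background.width / car.width
    car = car.resize((max(1, int(car.width * scale)),
                      max(1, int(car.height * scale))))

    # Place the car in the lower half of the frame, roughly where road-level
    # cars appear; paste() clips anything that falls outside the image.
    x = rng.randint(0, max(0, background.width - car.width))
    y_low = background.height // 2
    y_high = max(y_low, background.height - car.height)
    y = rng.randint(y_low, y_high)

    # The cutout's alpha channel serves both as the paste mask and as the
    # ground-truth instance mask for the generated label.
    background.paste(car, (x, y), mask=car)
    background.save(out_path)
```

The cross-dataset comparison can likewise be sketched as a small helper that, given a detection score for every (train, evaluate) dataset pair, reports how far each cross-domain model falls below the in-domain baseline on the same evaluation set. The dataset names and AP values below are hypothetical, not results from the paper.

```python
# Sketch of the cross-validation comparison used to quantify the gaps.
def domain_gaps(ap):
    """ap maps (train_dataset, eval_dataset) -> detection score (e.g. AP)."""
    return {
        (train_ds, eval_ds): ap[(eval_ds, eval_ds)] - score
        for (train_ds, eval_ds), score in ap.items()
        if train_ds != eval_ds
    }


ap = {
    ("synthetic", "synthetic"): 0.82, ("synthetic", "real_A"): 0.61,
    ("real_A", "real_A"): 0.74, ("real_A", "real_B"): 0.58,
    ("real_B", "real_B"): 0.71, ("real_B", "real_A"): 0.60,
}
# If the synthetic->real gap is comparable to the real->real gap, the reality
# gap is no worse than the domain gap between real datasets.
print(domain_gaps(ap))
```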