Investigating the Sim-to-Real Generalizability of Deep Learning Object Detection Models.

Impact factor 2.7, JCR Q3 (Imaging Science & Photographic Technology)
Joachim Rüter, Umut Durak, Johann C Dauer
Journal of Imaging, volume 10, issue 10. Journal Article. Published 2024-10-18.
DOI: https://doi.org/10.3390/jimaging10100259
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11509078/pdf/
Citations: 0

Abstract

State-of-the-art object detection models need large and diverse datasets for training. Because such datasets are hard to acquire for many practical applications, training images from simulation environments are gaining increasing attention. However, deep learning models trained on simulation images usually generalize poorly to real-world images, which shows up as a sharp performance drop. The definite reasons for this drop, and the factors that influence it, have not yet been identified. While previous work has mostly investigated the influence of the data and the use of domain adaptation, this work offers a novel perspective by investigating the influence of the object detection model itself. First, a corresponding measure called sim-to-real generalizability is defined, which captures the capability of an object detection model to generalize from simulation training images to real-world evaluation images. Second, 12 different deep learning-based object detection models are trained and their sim-to-real generalizability is evaluated. The models are trained with varied hyperparameters, resulting in a total of 144 trained and evaluated versions. The results show a clear influence of the feature extractor and offer further insights and correlations. They open up future research on investigating what influences the sim-to-real generalizability of deep learning-based object detection models and on developing feature extractors with better sim-to-real generalizability.
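
The experiment size quoted in the abstract (12 models, 144 trained versions) can be pictured as a simple grid of training runs. The sketch below is only an illustration of that arithmetic: the abstract names neither the models nor the hyperparameters, so the identifiers `model_01`, `setting_01`, and so on are hypothetical placeholders, and the even split of 12 hyperparameter settings per model is an assumption chosen because 12 × 12 = 144.

```python
from itertools import product

# Hypothetical placeholders: the abstract does not name the models
# or the hyperparameters that were varied.
MODELS = [f"model_{i:02d}" for i in range(1, 13)]  # 12 detection models

# Assumption: every model is trained with the same 12 settings,
# which is one way to arrive at 144 versions in total.
HPARAM_SETTINGS = [f"setting_{j:02d}" for j in range(1, 13)]


def build_runs(models, settings):
    """Return one (model, setting) pair per training run."""
    return list(product(models, settings))


runs = build_runs(MODELS, HPARAM_SETTINGS)
print(len(runs))
```

Each pair in `runs` stands for one model version that would be trained on simulation images and then evaluated on real-world images.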

Source journal
Journal of Imaging
Subject category: Medicine (Radiology, Nuclear Medicine and Imaging)
CiteScore: 5.90
Self-citation rate: 6.20%
Articles published: 303
Review time: 7 weeks