Comprehensive review of recent developments in visual object detection based on deep learning

IF 13.9 | Zone 2, Computer Science | JCR Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Artificial Intelligence Review Pub Date : 2025-06-12 DOI:10.1007/s10462-025-11284-w

Enerst Edozie, Aliyu Nuhu Shuaibu, Ukagwu Kelechi John, Bashir Olaniyi Sadiq

Artificial Intelligence Review, volume 58, issue 9. Article: https://link.springer.com/article/10.1007/s10462-025-11284-w PDF: https://link.springer.com/content/pdf/10.1007/s10462-025-11284-w.pdf

Citations: 0

Abstract

This comprehensive review examines recent developments in visual object detection, focusing on the transformative effect of deep learning (DL). Object detection is a fundamental problem in computer vision: it involves identifying and localizing objects in images and video frames, with notable applications in robotics, autonomous driving, medical imaging, and surveillance. The review presents an integrated analysis of the latest developments in visual object detection, providing both historical context and a state-of-the-art analysis. It categorizes current methods into one-stage and two-stage frameworks and studies their architectural innovations, detection accuracy, computational speed, and deployment readiness. It further examines performance measures, emphasizes the need for large-scale annotated datasets, and provides a curated overview of widely used datasets in the field. Notable features include a discussion of practical applications and current research trends, and a comparative analysis of models based on accuracy, speed, and trade-offs. A distinctive contribution is a comparative analysis table that benchmarks traditional and modern models in terms of mean Average Precision (mAP), frames per second (FPS), advantages, and limitations, and that covers transformer-based models and real-time deployments. The review's holistic approach provides insights for researchers and practitioners seeking to understand, benchmark, or develop object detection systems.
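The abstract names mean Average Precision (mAP) as a benchmark metric without defining it. As a minimal sketch under common conventions, and not taken from the paper itself, the Python below computes intersection over union (IoU) for axis-aligned boxes and a single-class average precision using all-point interpolation. The function names, the `(x1, y1, x2, y2)` box layout, the greedy matching rule, and the 0.5 IoU threshold are all assumptions chosen for illustration; benchmark suites differ in these details.

```python
from typing import Sequence, Tuple

# Assumed box layout: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(
    detections: Sequence[Tuple[float, Box]],
    ground_truth: Sequence[Box],
    iou_threshold: float = 0.5,
) -> float:
    """Single-class AP: area under the interpolated precision-recall curve.

    `detections` holds (confidence, box) pairs. Each detection is greedily
    matched to the best still-unmatched ground-truth box (an assumed rule).
    """
    if not ground_truth:
        return 0.0
    matched = [False] * len(ground_truth)
    tp = fp = 0
    precisions, recalls = [], []
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth):
            if matched[idx]:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx >= 0 and best_iou >= iou_threshold:
            matched[best_idx] = True
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / len(ground_truth))
    # Make precision non-increasing from right to left (interpolation step).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum rectangle areas wherever recall increases.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap: Sequence[float]) -> float:
    """mAP as the unweighted mean of per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap) if per_class_ap else 0.0
```

In this sketch, mAP is simply the mean of the per-class AP values. Frames per second (FPS), the other metric the abstract mentions, is usually reported as the number of processed frames divided by wall-clock inference time on stated hardware.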

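Source journal

Artificial Intelligence Review (Engineering & Technology - Computer Science: Artificial Intelligence)

CiteScore: 22.00

Self-citation rate: 3.30%

Articles published: 194

Review time: 5.3 months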

About the journal: Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.