BiFPN-YOLO: One-stage object detection integrating Bi-Directional Feature Pyramid Networks

IF 7.5; Zone 1: Computer Science; JCR Q1: Computer Science, Artificial Intelligence

Pattern Recognition, Pub Date: 2024-11-18, DOI: 10.1016/j.patcog.2024.111209

John Doherty, Bryan Gardiner, Emmett Kerr, Nazmul Siddique

Pattern Recognition, Volume 160, Article 111209. https://www.sciencedirect.com/science/article/pii/S0031320324009609

Citations: 0

Abstract

Object detection is a key component in computer vision research, allowing a system to determine the location and type of objects within any given scene. YOLOv5 is a modern object detection model, which utilises the advantages of the original YOLO implementation while being built from scratch in Python. In this paper, BiFPN-YOLO is proposed, featuring clear improvements over the existing range of YOLOv5 object detection models. These include replacing the traditional Path-Aggregation Network (PANet) with a higher-performing Bi-Directional Feature Pyramid Network (BiFPN), which requires complex adaptation from its original implementation to function with YOLOv5, and exploring a replacement for the standard Swish activation function by evaluating its performance against a number of other activation functions. The proposed model showcases state-of-the-art performance when benchmarked against well-known datasets such as the German Traffic Sign Detection Benchmark (GTSDB), where Mean Average Precision (mAP) is improved by 3.1%, and the RoboFEI@Home dataset, where mAP is improved by 2% compared to the base YOLOv5 model. Performance was also improved on MSCOCO by 1.1% and on a custom subset of the OpenImagesV6 dataset by 2.4%.
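The abstract names two building blocks without showing them: the Swish activation and the weighted feature fusion used by BiFPN. The sketch below illustrates both in plain Python. It follows the generic "fast normalized fusion" formulation commonly associated with BiFPN, not the authors' YOLOv5 adaptation, which the abstract does not describe. The function names `swish` and `fast_normalized_fusion`, the flat-list representation of feature maps, and the default `eps` value are assumptions made for illustration.

```python
import math


def swish(x: float) -> float:
    """Swish (SiLU) activation: x * sigmoid(x), computed in a numerically stable way."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)  # safe for very negative x; avoids overflow in exp(-x)
    return x * e / (1.0 + e)


def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-sized feature maps with non-negative scalar weights.

    features: list of feature maps, each a flat list of floats (illustrative
              stand-in for tensors that were already resized to one scale).
    weights:  one scalar per feature map; in a real network these would be
              learnable parameters.
    eps:      small constant that keeps the denominator away from zero
              (hypothetical default).
    """
    if not features or len(features) != len(weights):
        raise ValueError("need one weight per feature map")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("feature maps must share the same size")

    # Clamp weights at zero (ReLU) so each input contributes non-negatively.
    clipped = [max(0.0, w) for w in weights]
    total = sum(clipped) + eps

    # Weighted sum per position, normalised by the sum of the weights.
    return [
        sum(w * f[i] for w, f in zip(clipped, features)) / total
        for i in range(length)
    ]


if __name__ == "__main__":
    p_top_down = [1.0, 2.0]   # hypothetical upsampled higher-level feature
    p_lateral = [3.0, 4.0]    # hypothetical same-level backbone feature
    fused = fast_normalized_fusion([p_top_down, p_lateral], [1.0, 1.0])
    print([swish(v) for v in fused])
```

Normalising by the sum of clamped weights keeps each fused value within the range of its inputs without the cost of a softmax, which is the usual motivation given for this form of fusion.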




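Source journal

Pattern Recognition (Engineering Technology: Engineering, Electrical & Electronic)

CiteScore: 14.40

Self-citation rate: 16.20%

Articles published: 683

Review time: 5.6 months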

Journal description: The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.