BiFPN-YOLO: One-stage object detection integrating Bi-Directional Feature Pyramid Networks

Impact factor 7.5 · Zone 1, Computer Science · JCR Q1, Computer Science, Artificial Intelligence
John Doherty, Bryan Gardiner, Emmett Kerr, Nazmul Siddique
Journal: Pattern Recognition, volume 160, Article 111209
Publication date: 2024-11-18
Publication type: Journal Article
DOI: 10.1016/j.patcog.2024.111209
URL: https://www.sciencedirect.com/science/article/pii/S0031320324009609
Open access: no
Citations: 0

Abstract

Object detection is a key component of computer vision research, allowing a system to determine the location and type of objects within a given scene. YOLOv5 is a modern object detection model that retains the advantages of the original YOLO implementation while being built from scratch in Python. This paper proposes BiFPN-YOLO, which improves on the existing range of YOLOv5 object detection models in two ways. First, it replaces the traditional Path-Aggregation Network (PANet) with a higher-performing Bi-Directional Feature Pyramid Network (BiFPN), which required complex adaptation from its original implementation to function with YOLOv5. Second, it explores a replacement for the standard Swish activation function by evaluating performance against a number of other activation functions. The proposed model showcases state-of-the-art performance on well-known benchmarks: it improves Mean Average Precision (mAP) by 3.1 % on the German Traffic Sign Detection Benchmark (GTSDB) and, compared to the base YOLOv5 model, by 2 % on the RoboFEI@Home dataset. Performance also improved by 1.1 % on MSCOCO and by 2.4 % on a custom subset of the OpenImagesV6 dataset.
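
The abstract does not say how BiFPN-YOLO combines features inside the BiFPN, so the following is only a minimal sketch of the "fast normalized fusion" rule from the original BiFPN design (introduced with EfficientDet), together with the Swish activation the abstract names as the YOLOv5 default. The function names, the flat-list representation of a feature map, and the value of `eps` are illustrative assumptions and are not taken from the paper.

```python
import math


def swish(x: float) -> float:
    """Swish (also called SiLU): x * sigmoid(x), written as x / (1 + e^-x)."""
    return x / (1.0 + math.exp(-x))


def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse equally sized feature vectors with one learnable scalar per input.

    Each weight is clamped to be non-negative (ReLU), then the weighted sum
    is divided by the sum of the clamped weights plus eps, so the
    contributions are normalized without a softmax.
    Hypothetical helper: a real model would operate on tensors.
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per input feature")
    clamped = [max(0.0, w) for w in weights]  # ReLU keeps weights >= 0
    total = sum(clamped) + eps                # eps guards against division by zero
    length = len(features[0])
    return [
        sum(w * f[k] for w, f in zip(clamped, features)) / total
        for k in range(length)
    ]


if __name__ == "__main__":
    # Two toy "feature maps" flattened to vectors, fused with equal weights.
    fused = fast_normalized_fusion([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0])
    # Applying the activation after fusion is an illustrative choice here.
    print([swish(v) for v in fused])
```

In a full network the inputs would be feature maps of matching resolution from neighbouring pyramid levels, and the weights would be trained with the rest of the model; swapping `swish` for another function is the kind of activation comparison the abstract describes.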
Source journal
Pattern Recognition (Engineering & Technology: Engineering, Electrical & Electronic)
CiteScore: 14.40
Self-citation rate: 16.20%
Articles published: 683
Review time: 5.6 months
Journal description: The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.