BiFPN-YOLO: One-stage object detection integrating Bi-Directional Feature Pyramid Networks
Authors: John Doherty, Bryan Gardiner, Emmett Kerr, Nazmul Siddique
Journal: Pattern Recognition, volume 160, Article 111209 (Impact Factor 7.5; JCR Q1, Computer Science, Artificial Intelligence)
Published: 2024-11-18
DOI: 10.1016/j.patcog.2024.111209
URL: https://www.sciencedirect.com/science/article/pii/S0031320324009609
Citations: 0
Abstract
Object detection is a key component of computer vision research, allowing a system to determine the location and type of each object within a given scene. YOLOv5 is a modern object detection model that retains the advantages of the original YOLO implementation while being built from scratch in Python. This paper proposes BiFPN-YOLO, which features clear improvements over the existing range of YOLOv5 object detection models. These include replacing the traditional Path-Aggregation Network (PANet) with a higher-performing Bi-Directional Feature Pyramid Network (BiFPN), which required complex adaptation of its original implementation to function with YOLOv5, and exploring a replacement for the standard Swish activation function by evaluating its performance against a number of other activation functions. The proposed model showcases state-of-the-art performance on well-known benchmarks: Mean Average Precision (mAP) improves by 3.1% on the German Traffic Sign Detection Benchmark (GTSDB) and by 2% on the RoboFEI@Home dataset compared to the base YOLOv5 model. Performance also improved by 1.1% on MSCOCO and by 2.4% on a custom subset of the OpenImagesV6 dataset.
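The abstract names two building blocks without giving their formulas, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the "fast normalized fusion" rule from the original BiFPN formulation (each input feature is scaled by a non-negative learnable weight, and the weights are normalised by their sum plus a small epsilon) and the usual definition of Swish, `x * sigmoid(beta * x)`. The function names, the toy feature vectors and the weight values are hypothetical and chosen purely for illustration; a real model would apply the same rule to multi-channel feature maps.

```python
import math


def swish(x, beta=1.0):
    """Swish activation: x * sigmoid(beta * x). With beta = 1 this is SiLU."""
    return x / (1.0 + math.exp(-beta * x))


def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse equally sized feature vectors using normalised non-negative weights.

    Assumed rule (original BiFPN formulation, not taken from the abstract):
        out = sum_i (w_i / (eps + sum_j w_j)) * feature_i,  with w_i >= 0
    """
    # Clamp weights at zero (ReLU) so every normalised weight lies in [0, 1].
    clamped = [max(0.0, w) for w in weights]
    total = sum(clamped) + eps

    fused = [0.0] * len(features[0])
    for w, feature in zip(clamped, features):
        for k, value in enumerate(feature):
            fused[k] += (w / total) * value
    return fused


# Hypothetical toy inputs standing in for two feature maps at the same scale,
# e.g. a backbone feature and an upsampled top-down feature.
p_in = [1.0, 2.0, 3.0]
p_td = [3.0, 2.0, 1.0]

fused = fast_normalized_fusion([p_in, p_td], weights=[1.0, 1.0])
activated = [swish(v) for v in fused]
```

With equal weights the fusion reduces to an (almost exact) average of the inputs; during training the weights would be learned, letting the network decide how much each scale contributes at every fusion node.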
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.