Aref Miri Rekavandi, Shima Rashidi, Farid Boussaid, Stephen Hoefs, Emre Akbas, Mohammed Bennamoun
{"title":"Transformers in Small Object Detection: A Benchmark and Survey of State-of-the-Art","authors":"Aref Miri Rekavandi, Shima Rashidi, Farid Boussaid, Stephen Hoefs, Emre Akbas, Mohammed Bennamoun","doi":"10.1145/3758090","DOIUrl":null,"url":null,"abstract":"Transformers have rapidly gained popularity in computer vision, especially in the field of object detection. Upon examining the outcomes of state-of-the-art object detection methods, we noticed that transformers consistently outperformed well-established CNN-based detectors in almost every video or image dataset. Small objects have been identified as one of the most challenging object types in detection frameworks due to their low visibility. This paper aims to explore the performance benefits offered by such extensive networks and identify potential reasons for their Small Object Detection (SOD) superiority. We aim to investigate potential strategies that could further enhance transformers’ performance in SOD. This survey presents a taxonomy of over 60 research studies on developed transformers for the task of SOD, spanning the years 2020 to 2023. These studies encompass a variety of detection applications, including small object detection in generic images, aerial images, medical images, active millimeter images, underwater images, and videos. We also compile and present a list of 12 large-scale datasets suitable for SOD that were overlooked in previous studies and compare the performance of the reviewed studies using popular metrics such as mean Average Precision (mAP), Frames Per Second (FPS) and number of parameters. Researchers can keep track of newer studies on our web page, which is available at: https://github.com/arekavandi/Transformer-SOD.","PeriodicalId":50926,"journal":{"name":"ACM Computing Surveys","volume":"39 1","pages":""},"PeriodicalIF":28.0000,"publicationDate":"2025-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Computing Surveys","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3758090","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
引用次数: 0
Abstract
Transformers have rapidly gained popularity in computer vision, especially in the field of object detection. Upon examining the outcomes of state-of-the-art object detection methods, we noticed that transformers consistently outperformed well-established CNN-based detectors in almost every video or image dataset. Small objects have been identified as one of the most challenging object types in detection frameworks due to their low visibility. This paper aims to explore the performance benefits offered by such extensive networks and identify potential reasons for their Small Object Detection (SOD) superiority. We aim to investigate potential strategies that could further enhance transformers’ performance in SOD. This survey presents a taxonomy of over 60 research studies on developed transformers for the task of SOD, spanning the years 2020 to 2023. These studies encompass a variety of detection applications, including small object detection in generic images, aerial images, medical images, active millimeter images, underwater images, and videos. We also compile and present a list of 12 large-scale datasets suitable for SOD that were overlooked in previous studies and compare the performance of the reviewed studies using popular metrics such as mean Average Precision (mAP), Frames Per Second (FPS) and number of parameters. Researchers can keep track of newer studies on our web page, which is available at: https://github.com/arekavandi/Transformer-SOD.
期刊介绍:
ACM Computing Surveys is an academic journal that focuses on publishing surveys and tutorials on various areas of computing research and practice. The journal aims to provide comprehensive and easily understandable articles that guide readers through the literature and help them understand topics outside their specialties. In terms of impact, CSUR has a high reputation with a 2022 Impact Factor of 16.6. It is ranked 3rd out of 111 journals in the field of Computer Science Theory & Methods.
ACM Computing Surveys is indexed and abstracted in various services, including AI2 Semantic Scholar, Baidu, Clarivate/ISI: JCR, CNKI, DeepDyve, DTU, EBSCO: EDS/HOST, and IET Inspec, among others.