Zeyu Xu, Tiejun Wang, Andrew K. Skidmore, Richard Lamprey, Shadrack Ngene
Bounding box versus point annotation: The impact on deep learning performance for animal detection in aerial images
ISPRS Journal of Photogrammetry and Remote Sensing, vol. 222, pp. 99-111. Published 2025-02-27. Impact factor: 10.6 (JCR Q1, Geography, Physical).
DOI: 10.1016/j.isprsjprs.2025.02.017
URL: https://www.sciencedirect.com/science/article/pii/S092427162500067X
Citations: 0
Abstract
Bounding box and point annotations are widely used in deep learning-based animal detection from remote sensing imagery, yet their impact on model performance and training efficiency remains insufficiently explored. This study systematically evaluates the influence of these two annotation methods using aerial survey datasets of African elephants and antelopes across three commonly employed deep learning networks: YOLO, CenterNet, and U-Net. In addition, we assess the effect of image spatial resolution and the training efficiency associated with each annotation method. Our findings indicate that when using YOLO, there is no statistically significant difference in model accuracy between bounding box and point annotations. However, for CenterNet and U-Net, bounding box annotations consistently yield significantly higher accuracy compared to point-based annotations, with these trends remaining consistent across different spatial resolution ranges. Furthermore, training efficiency varies depending on the network and annotation method. While YOLO exhibits similar convergence speeds for both annotation types, U-Net models trained with bounding box annotations converge significantly faster, followed by CenterNet, where bounding box-based models also show improved convergence. These findings demonstrate that the choice of annotation method should be guided by the specific deep learning architecture employed. While point-based annotations are more cost-effective, their lower training efficiency in U-Net and CenterNet suggests that bounding box annotations are preferable when maximizing both accuracy and computational efficiency. Therefore, when selecting annotation strategies for animal detection in remote sensing applications, researchers should carefully balance detection accuracy, annotation cost, and training efficiency to optimize performance for specific task requirements.
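As a rough illustration of the two annotation types compared in the abstract, the Python sketch below represents one animal either as a bounding box or as a single point, and converts between them. The class names, the pixel-coordinate field layout, and the fixed-size pseudo-box conversion are all assumptions made for illustration; the abstract does not describe how the authors stored or converted their annotations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoxAnnotation:
    """Axis-aligned bounding box in pixel coordinates (hypothetical layout)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


@dataclass(frozen=True)
class PointAnnotation:
    """Single pixel location marking one animal (hypothetical layout)."""
    x: float
    y: float


def box_to_point(box: BoxAnnotation) -> PointAnnotation:
    """Reduce a box to its center point, discarding the animal's extent."""
    return PointAnnotation(
        (box.x_min + box.x_max) / 2,
        (box.y_min + box.y_max) / 2,
    )


def point_to_pseudo_box(point: PointAnnotation, side: float) -> BoxAnnotation:
    """Expand a point into a fixed-size square box.

    This is an assumed workaround for feeding point labels to a
    box-based detector; it is not taken from the paper.
    """
    half = side / 2
    return BoxAnnotation(
        point.x - half, point.y - half, point.x + half, point.y + half
    )


# Made-up example values, not data from the study.
elephant = BoxAnnotation(100, 40, 160, 80)
center = box_to_point(elephant)
pseudo = point_to_pseudo_box(center, side=32)
print(center)
print(pseudo)
```

The conversion from box to point is lossy: the point keeps the animal's location but drops its width and height, which is one plausible reason the choice of annotation can matter differently for different network architectures.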
About the journal:
The ISPRS Journal of Photogrammetry and Remote Sensing (P&RS) serves as the official journal of the International Society for Photogrammetry and Remote Sensing (ISPRS). It acts as a platform for scientists and professionals worldwide who are involved in various disciplines that utilize photogrammetry, remote sensing, spatial information systems, computer vision, and related fields. The journal aims to facilitate communication and dissemination of advancements in these disciplines, while also acting as a comprehensive source of reference and archive.
P&RS endeavors to publish high-quality, peer-reviewed research papers that are preferably original and have not been published before. These papers can cover scientific/research, technological development, or application/practical aspects. Additionally, the journal welcomes papers that are based on presentations from ISPRS meetings, as long as they are considered significant contributions to the aforementioned fields.
In particular, P&RS encourages the submission of papers that are of broad scientific interest, showcase innovative applications (especially in emerging fields), have an interdisciplinary focus, discuss topics that have received limited attention in P&RS or related journals, or explore new directions in scientific or professional realms. It is preferred that theoretical papers include practical applications, while papers focusing on systems and applications should include a theoretical background.