Bit-STED: A lightweight transformer for accurate agave counting with UAV imagery
Diego Villatoro-Geronimo, Gildardo Sanchez-Ante, Luis E. Falcon-Morales
Computers and Electronics in Agriculture, Vol. 239, Article 111047. Published 2025-10-04. DOI: 10.1016/j.compag.2025.111047
Abstract
This paper presents Bit-STED, a novel and simplified transformer encoder architecture for efficient agave plant detection and accurate counting using unmanned aerial vehicle (UAV) imagery. Addressing the critical need for accessible and cost-efficient solutions in agricultural monitoring, this approach automates a process that is typically time-consuming, labor-intensive, and prone to human error when performed manually. The Bit-STED model features a lightweight transformer design that incorporates innovative techniques for efficient feature extraction, model compression through quantization, and shape-aware object localization using circular bounding boxes suited to the roughly circular shape of agave rosettes. To complement the detection model, a novel counting algorithm was developed to accurately count plants that span multiple image tiles. The experimental results demonstrated that the Bit-STED model outperformed the baseline models in both detection and counting performance. Specifically, the Bit-STED nano model achieved F1 scores of 96.66% on a map with younger plants and 96.43% on a map with larger, highly overlapping plants. These scores surpassed state-of-the-art baselines, such as YOLOv8 Nano (F1 scores of 96.42% and 96.38%, respectively) and DETR (F1 scores of 93.03% and 85.61%, respectively). Furthermore, the Bit-STED nano model was significantly smaller, at less than one-eighth the size of the YOLOv8 nano model (1.4 MB compared to 12.0 MB), had fewer trainable parameters (0.35M compared to 3.01M), and achieved faster average inference times (14.62 ms compared to 18.28 ms).
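The abstract does not give the details of the circular-box representation or the tile-spanning counting algorithm, but the core idea can be sketched. The snippet below is a hypothetical illustration, not the authors' implementation: detections are circles `(cx, cy, r)`, each tile's detections are shifted by the tile's origin into global map coordinates, and circles that overlap above an IoU threshold are treated as the same plant, so a rosette straddling two tiles is counted once. All function names, the greedy deduplication strategy, and the threshold value are assumptions.

```python
import math

def circle_iou(c1, c2):
    """Intersection-over-union of two circles (cx, cy, r),
    using the closed-form area of a circle-circle intersection (lens)."""
    (x1, y1, r1), (x2, y2, r2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)
    if d >= r1 + r2:                       # disjoint circles
        inter = 0.0
    elif d <= abs(r1 - r2):                # one circle inside the other
        inter = math.pi * min(r1, r2) ** 2
    else:                                  # partial overlap: lens area
        a1 = r1 ** 2 * math.acos((d ** 2 + r1 ** 2 - r2 ** 2) / (2 * d * r1))
        a2 = r2 ** 2 * math.acos((d ** 2 + r2 ** 2 - r1 ** 2) / (2 * d * r2))
        a3 = 0.5 * math.sqrt((-d + r1 + r2) * (d + r1 - r2)
                             * (d - r1 + r2) * (d + r1 + r2))
        inter = a1 + a2 - a3
    union = math.pi * (r1 ** 2 + r2 ** 2) - inter
    return inter / union if union > 0 else 0.0

def count_across_tiles(tile_detections, iou_thresh=0.5):
    """Merge circular detections from overlapping image tiles.

    tile_detections: list of (tile_origin_xy, [(cx, cy, r), ...]) pairs,
    with detections in tile-local pixel coordinates.
    Returns the deduplicated detections in global map coordinates.
    """
    # Shift every detection into the global coordinate frame.
    global_dets = [
        (ox + cx, oy + cy, r)
        for (ox, oy), dets in tile_detections
        for (cx, cy, r) in dets
    ]
    # Greedy dedup: a circle overlapping an already-kept circle
    # above iou_thresh is assumed to be the same plant.
    kept = []
    for det in global_dets:
        if all(circle_iou(det, k) < iou_thresh for k in kept):
            kept.append(det)
    return kept

# A plant straddling the boundary of two tiles (tile width 80 px,
# origins at x=0 and x=80) is detected once per tile but counted once.
tiles = [((0, 0), [(90, 50, 20)]), ((80, 0), [(10, 50, 20)])]
print(len(count_across_tiles(tiles)))  # → 1
```

In this sketch the greedy pass is quadratic in the number of detections; a real pipeline covering a full orchard map would likely bucket detections spatially (e.g. by grid cell) so only nearby circles are compared.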
Journal overview:
Computers and Electronics in Agriculture provides international coverage of advancements in computer hardware, software, electronic instrumentation, and control systems applied to agricultural challenges. Encompassing agronomy, horticulture, forestry, aquaculture, and animal farming, the journal publishes original papers, reviews, and applications notes. It explores the use of computers and electronics in plant or animal agricultural production, covering topics like agricultural soils, water, pests, controlled environments, and waste. The scope extends to on-farm post-harvest operations and relevant technologies, including artificial intelligence, sensors, machine vision, robotics, networking, and simulation modeling. Its companion journal, Smart Agricultural Technology, continues the focus on smart applications in production agriculture.