Jiangjian Xie, Shanshan Xie, Baican Li, Yujie Zhong, Chunhe Hu, Junguo Zhang, Björn W. Schuller
A lightweight self-attention metric network for bird species recognition in intelligent bird repellent equipment
Engineering Applications of Artificial Intelligence, Volume 162, Article 112546. Published 2025-10-04. Impact factor: 8.0 (JCR Q1, Automation & Control Systems).
DOI: 10.1016/j.engappai.2025.112546
URL: https://www.sciencedirect.com/science/article/pii/S0952197625025771
Citations: 0
Abstract
Bird damage to power transmission lines poses significant operational risks, and intelligent bird repellent equipment (IBRE) requires accurate species recognition to repel birds effectively over the long term. We propose a novel lightweight self-attention metric network (LSAM-Net) for few-shot bird species recognition in the vicinity of power transmission lines, aiming to enhance the performance of IBRE. LSAM-Net integrates a simple attention mechanism (SimAM) to emphasize critical spatial and channel features, thereby enhancing the extraction of key semantic information from bird images. Additionally, a self-correlation representation (SCR) module is employed to capture local structural patterns, mitigating the impact of pseudo-features and improving the network's capacity to learn discriminative representations. To promote the use of local discriminative information in few-shot classification, LSAM-Net uses the earth mover's distance (EMD) to compute structural similarity between images. For efficient deployment, we apply knowledge distillation to further reduce model complexity. Extensive experiments on Bird-65, CUB200-2011, miniImageNet, and Fewshot-CIFAR100 demonstrate that LSAM-Net outperforms state-of-the-art methods while maintaining a compact architecture. On the Bird-65 and CUB200-2011 datasets, LSAM-Net requires only 4.75 and 1.18 giga floating-point operations (GFLOPs), respectively, and achieves inference speed improvements of 52.9% and 48.9%, respectively, over the self-attention metric network (SAM-Net). Further optimization with TensorRT reduces inference time by an additional 43.6 ms and 53.7 ms, respectively. These improvements support species-specific repellent strategies, thereby enhancing the long-term effectiveness of IBRE systems.
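The abstract does not give LSAM-Net's layers or hyperparameters. As a rough illustration of the SimAM idea it mentions, the sketch below applies the commonly published parameter-free SimAM weighting to a single feature-map channel in plain Python. The function names, the 3x3 input, and the value e_lambda = 1e-4 are assumptions chosen for illustration, not details taken from the paper.

```python
import math


def simam_weights(feature_map, e_lambda=1e-4):
    """Return SimAM-style attention weights for one channel.

    feature_map is a 2-D list of floats (height x width). e_lambda is a
    small regularizer; 1e-4 is an assumed value, not one from the paper.
    """
    values = [v for row in feature_map for v in row]
    n = len(values) - 1  # number of "other" positions in the channel
    if n < 1:
        raise ValueError("feature_map needs at least two elements")

    mean = sum(values) / len(values)
    # Squared deviation of every position from the channel mean.
    sq_dev = [[(v - mean) ** 2 for v in row] for row in feature_map]
    variance = sum(d for row in sq_dev for d in row) / n

    weights = []
    for dev_row in sq_dev:
        weight_row = []
        for d in dev_row:
            # Inverse energy: positions far from the mean score higher.
            inv_energy = d / (4.0 * (variance + e_lambda)) + 0.5
            # Sigmoid squashes the score into a multiplicative weight.
            weight_row.append(1.0 / (1.0 + math.exp(-inv_energy)))
        weights.append(weight_row)
    return weights


def simam_channel(feature_map, e_lambda=1e-4):
    """Scale each activation by its SimAM-style weight."""
    weights = simam_weights(feature_map, e_lambda)
    return [
        [v * w for v, w in zip(row, w_row)]
        for row, w_row in zip(feature_map, weights)
    ]


if __name__ == "__main__":
    # Hypothetical 3x3 channel with one strongly activated position.
    channel = [[1.0, 1.0, 1.0], [1.0, 10.0, 1.0], [1.0, 1.0, 1.0]]
    for row in simam_channel(channel):
        print([round(v, 3) for v in row])
```

The weighting has no learnable parameters: the position that stands out from its channel (the centre value here) receives a larger weight than the uniform background, which is the sense in which such a module emphasizes distinctive spatial and channel features.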
About the journal:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.