Step-by-step to success: Multi-stage learning driven robust audiovisual fusion network for fine-grained bird species classification
Shanshan Xie, Jiangjian Xie, Yang Liu, Lianshuai Sha, Ye Tian, Jiahua Dong, Diwen Liang, Kaijun Pan, Junguo Zhang
Avian Research 16(4), Article 100280. Published 2025-07-24. DOI: 10.1016/j.avrs.2025.100280
https://www.sciencedirect.com/science/article/pii/S2053716625000593
Abstract
Bird monitoring and protection are essential for maintaining biodiversity, and fine-grained bird classification has become a key focus in this field. Audiovisual modalities provide critical cues for this task, but robust feature extraction and efficient fusion remain major challenges. We introduce a multi-stage fine-grained audiovisual fusion network (MSFG-AVFNet) for fine-grained bird species classification, which addresses these challenges through two key components: (1) an audiovisual feature extraction module, which adopts a multi-stage fine-tuning strategy to provide high-quality unimodal features, laying a solid foundation for modality fusion; and (2) an audiovisual feature fusion module, which combines a max pooling aggregation strategy with a novel audiovisual loss function to achieve effective and robust feature fusion. Experiments were conducted on the self-built AVB81 dataset and the publicly available SSW60 dataset, which contain data from 81 and 60 bird species, respectively. On both datasets, our approach outperforms existing state-of-the-art methods. These results highlight its effectiveness in leveraging audiovisual modalities for fine-grained bird classification and its potential to support ecological monitoring and biodiversity research.
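The abstract does not spell out how the max pooling aggregation is applied. As a rough illustration only, one common reading is an element-wise maximum over an audio embedding and a visual embedding of the same dimension. The function name `max_pool_fuse` and the example vectors below are hypothetical and are not taken from the paper; the paper's loss function and network layers are not modeled here.

```python
from typing import List, Sequence


def max_pool_fuse(audio_feat: Sequence[float],
                  visual_feat: Sequence[float]) -> List[float]:
    """Fuse two unimodal feature vectors by element-wise maximum.

    Assumption: both modalities have already been projected to the
    same dimension. This is an illustrative fusion rule, not the
    authors' confirmed implementation.
    """
    if len(audio_feat) != len(visual_feat):
        raise ValueError("audio and visual features must have equal length")
    # For each dimension, keep whichever modality responds more strongly.
    return [max(a, v) for a, v in zip(audio_feat, visual_feat)]


# Hypothetical 4-dimensional embeddings, for illustration only.
audio = [0.2, 0.9, -0.1, 0.4]
visual = [0.5, 0.3, 0.0, 0.4]
fused = max_pool_fuse(audio, visual)
print(fused)  # [0.5, 0.9, 0.0, 0.4]
```

Under this reading, each fused dimension carries the stronger of the two unimodal responses, so a weak or noisy cue in one modality can be overridden by the other.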
About the journal:
Avian Research is an open access, peer-reviewed journal publishing high-quality research and review articles on all aspects of ornithology from all over the world. It aims to report the latest and most significant progress in ornithology and to encourage the exchange of ideas among international ornithologists. As an open access journal, Avian Research provides a unique opportunity to publish high-quality content that is internationally accessible to any reader at no cost.