A trio-based feature extraction framework for bird sounds classification

IF 3.4, Zone 2 (Physics and Astrophysics), Q1 ACOUSTICS

Applied Acoustics Pub Date : 2025-09-17 DOI:10.1016/j.apacoust.2025.111064

Burak Celik, Ayhan Akbal

Applied Acoustics, Volume 242, Article 111064. Full text: https://www.sciencedirect.com/science/article/pii/S0003682X25005365

Citations: 0

Abstract

Bird species identification is crucial for environmental monitoring, ecological studies, and species tracking. Automated bird sound classification systems have been developed to achieve precise species detection. While deep learning models offer high accuracy, their computational complexity poses challenges for resource-limited environments. To address this, we propose a novel lightweight and highly accurate bird sound classification model built on a multilevel feature generation framework named AvisPat, after the Latin term “Avis” (bird), reflecting its focus on avian bioacoustics. The AvisPat model uses a 7-level discrete wavelet transform (DWT) to decompose audio signals and extracts signum, upper ternary, and lower ternary features to capture diverse signal attributes. For feature selection, enhanced Neighborhood Component Analysis (NCA) and ReliefF methods are applied iteratively to select the most discriminative features, generating multiple feature subsets. These features are classified using k-Nearest Neighbor (k-NN) and Support Vector Machine (SVM) classifiers. In addition, the proposed model achieved 96.72% accuracy on a separate Xeno-Canto dataset containing 10 bird species from diverse geographic regions, demonstrating strong generalization capability. The ‘trio’ in AvisPat refers to the combination of signum, upper ternary, and lower ternary features, extracted via the 7-level DWT, which together capture the time, frequency, and amplitude aspects of bird sounds and enhance the model’s ability to distinguish between species with high accuracy.
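The abstract does not give the exact feature definitions, so the following is only a minimal sketch of what a signum / upper ternary / lower ternary extractor over a 7-level DWT might look like. Everything specific here is an assumption rather than the authors' method: the Haar wavelet, the 9-sample overlapping window with a centre comparison (as in common 1D local binary and ternary patterns), the fixed `threshold`, the use of approximation bands only, and the function names `haar_dwt`, `trio_histograms`, and `avispat_like_features`.

```python
import math


def haar_dwt(signal):
    """One level of a Haar DWT; returns (approximation, detail) bands."""
    n = len(signal) - len(signal) % 2  # ignore a trailing odd sample
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, n, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, n, 2)]
    return approx, detail


def trio_histograms(signal, threshold, block=9):
    """Concatenated 256-bin histograms of signum, upper and lower ternary codes.

    Assumed scheme: slide a 9-sample window, compare the 8 neighbours
    with the centre sample, and pack the comparisons into 8-bit codes.
    """
    center = block // 2
    sig_hist, up_hist, low_hist = [0] * 256, [0] * 256, [0] * 256
    for start in range(len(signal) - block + 1):
        window = signal[start:start + block]
        c = window[center]
        neighbours = window[:center] + window[center + 1:]
        sig = up = low = 0
        for bit, value in enumerate(neighbours):
            d = value - c
            if d >= 0:            # signum bit
                sig |= 1 << bit
            if d > threshold:     # upper ternary bit (ternary value +1)
                up |= 1 << bit
            if d < -threshold:    # lower ternary bit (ternary value -1)
                low |= 1 << bit
        sig_hist[sig] += 1
        up_hist[up] += 1
        low_hist[low] += 1
    return sig_hist + up_hist + low_hist


def avispat_like_features(signal, levels=7, threshold=0.1):
    """Features from the raw signal plus each of `levels` approximation bands."""
    band = list(signal)
    features = trio_histograms(band, threshold)
    for _ in range(levels):
        band, _detail = haar_dwt(band)  # assumption: only the low-pass band is kept
        features.extend(trio_histograms(band, threshold))
    return features


if __name__ == "__main__":
    # Synthetic tone standing in for a bird-call recording.
    audio = [math.sin(0.05 * i) for i in range(2048)]
    print(len(avispat_like_features(audio)))  # (7 + 1) bands x 3 x 256 bins
```

Under these assumptions each band contributes 768 histogram bins, so a 7-level decomposition plus the raw signal yields 6144 features, which is the kind of pool that the iterative NCA and ReliefF selection stage would then prune. The input must be long enough that the deepest band still holds at least 9 samples.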


Source journal: Applied Acoustics (Physics, Acoustics)

CiteScore: 7.40

Self-citation rate: 11.80%

Articles published: 618

Review time: 7.5 months

About the journal: Since its launch in 1968, Applied Acoustics has been publishing high-quality research papers providing state-of-the-art coverage of research findings for engineers and scientists involved in applications of acoustics in the widest sense. Applied Acoustics looks not only at recent developments in the understanding of acoustics but also at ways of exploiting that understanding. The Journal aims to encourage the exchange of practical experience through publication and in so doing creates a fund of technological information that can be used for solving related problems. The presentation of information in graphical or tabular form is especially encouraged. If a report of a mathematical development is a necessary part of a paper, it is important to ensure that it is there only as an integral part of a practical solution to a problem and is supported by data. Applied Acoustics encourages the exchange of practical experience in the following ways: • Complete Papers • Short Technical Notes • Review Articles; and thereby provides a wealth of technological information that can be used to solve related problems. Manuscripts that address all fields of applications of acoustics, ranging from medicine and NDT to the environment and buildings, are welcome.