Paul Best, Marcelo Araya-Salas, Axel G Ekström, Bárbara Freitas, Frants H Jensen, Arik Kershenbaum, Adriano R Lameira, Kenna D S Lehmann, Pavel Linhart, Robert C Liu, Malavika Madhavan, Andrew Markham, Marie A Roch, Holly Root-Gutteridge, Martin Šálek, Grace Smith-Vidaurre, Ariana Strandburg-Peshkin, Megan R Warren, Matthew Wijers, Ricard Marxer
Journal: Bioacoustics-The International Journal of Animal Sound and Its Recording, volume 34, issue 4, pages 419-446
Published: 2025-01-01 (Epub 2025-06-02)
DOI: https://doi.org/10.1080/09524622.2025.2500380
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12387860/pdf/
Bioacoustic fundamental frequency estimation: a cross-species dataset and deep learning baseline.
The fundamental frequency (F0) is a key parameter for characterising structures in vertebrate vocalisations, for instance when defining vocal repertoires and their variations at different biological scales (e.g. population dialects, individual signatures). However, annotating F0 manually is too laborious, and automating the task is complex. Despite significant advances in automatic F0 estimation for speech and music, similar progress in bioacoustics has been limited. To address this gap, we compile and publish a benchmark dataset of over 250,000 calls from 14 taxa, each paired with ground truth F0 values. These vocalisations range from infrasound to ultrasound, from high to low harmonicity, and some include non-linear phenomena. Testing different algorithms on these signals, we demonstrate the potential of neural networks for F0 estimation, even for taxa not seen in training, or when trained without labels. In addition, to help assess whether algorithms are applicable to a given set of signals, we propose spectral measurements of F0 quality which correlate well with performance. While current performance is not satisfactory for all studied taxa, the results suggest that deep learning could yield a more generic and reliable bioacoustic F0 tracker, helping the community to analyse vocalisations via their F0 contours.
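For readers unfamiliar with the task, the sketch below illustrates what estimating F0 from a signal means in the simplest case. It is a generic autocorrelation estimator applied to a synthetic harmonic tone, not one of the algorithms or neural networks evaluated in the paper (the abstract does not name them). The function name `estimate_f0_autocorr`, the sampling rate, the search band and the test signal are all assumptions made for illustration.

```python
import math


def estimate_f0_autocorr(signal, sample_rate, f_min, f_max):
    """Estimate F0 (Hz) as sample_rate / lag of the strongest
    normalised autocorrelation peak within [f_min, f_max].

    Hypothetical helper for illustration only; real bioacoustic
    signals usually need framing, voicing detection and smoothing.
    """
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]  # remove DC offset

    # Candidate periods (in samples) that correspond to the search band.
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(n - 1, int(sample_rate / f_min))

    best_lag, best_score = lag_min, -float("inf")
    for lag in range(lag_min, lag_max + 1):
        head, tail = x[: n - lag], x[lag:]
        num = sum(a * b for a, b in zip(head, tail))
        den = math.sqrt(sum(a * a for a in head) * sum(b * b for b in tail))
        score = num / den if den > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score

    return sample_rate / best_lag, best_score


# Synthetic call: 400 Hz fundamental with two weaker harmonics (assumed values).
SAMPLE_RATE = 8000
TRUE_F0 = 400.0
tone = [
    math.sin(2 * math.pi * TRUE_F0 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 2 * TRUE_F0 * t / SAMPLE_RATE)
    + 0.25 * math.sin(2 * math.pi * 3 * TRUE_F0 * t / SAMPLE_RATE)
    for t in range(1024)
]

f0_hz, peak_score = estimate_f0_autocorr(tone, SAMPLE_RATE, f_min=250, f_max=1000)
print(f"estimated F0: {f0_hz:.1f} Hz (peak score {peak_score:.3f})")
```

The search band here is deliberately narrower than one octave below the true F0, because lags at integer multiples of the period score equally well and would otherwise cause octave errors. Noisy, weakly harmonic or non-linear vocalisations break this kind of simple estimator quickly, which is the difficulty the abstract describes.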
About the journal:
Bioacoustics primarily publishes high-quality original research papers and reviews on sound communication in birds, mammals, amphibians, reptiles, fish, insects and other invertebrates, including the following topics:
- Communication and related behaviour
- Sound production
- Hearing
- Ontogeny and learning
- Bioacoustics in taxonomy and systematics
- Impacts of noise
- Bioacoustics in environmental monitoring
- Identification techniques and applications
- Recording and analysis
- Equipment and techniques
- Ultrasound and infrasound
- Underwater sound
- Bioacoustical sound structures, patterns, variation and repertoires