IBERBIRDS: A dataset of flying bird species present in the Iberian Peninsula

Impact factor 1.0, JCR Q3, Multidisciplinary Sciences
Paula Rodríguez, Rubén Parte, Guillermo A. González, Alejandra Gacho, Darío Santos, Rubén Usamentiaga, Oscar D. Pedrayes
Journal: Data in Brief, volume 60, Article 111610
Published: 2025-05-02
DOI: 10.1016/j.dib.2025.111610
URL: https://www.sciencedirect.com/science/article/pii/S2352340925003427
Citations: 0

Abstract

Advancements in computer vision and deep learning have transformed ecological monitoring and species identification, enabling automated and accurate data labelling. Despite these advancements, robust AI-driven solutions for avian species recognition remain limited, primarily due to the scarcity of high-quality annotated datasets. To address this gap, this article introduces IBERBIRDS, a comprehensive and publicly accessible dataset specifically designed to facilitate automatic detection and classification of flying bird species in the Iberian Peninsula under real-world conditions.
The dataset comprises 4000 images representing 10 ecologically significant medium- to large-sized bird species, with each image annotated using bounding box coordinates in the YOLO detection format. Unlike existing datasets that typically feature close-up or ideal-condition imagery, IBERBIRDS focuses on mid- to long-range photographs of birds in flight, providing a more realistic and challenging representation of scenarios commonly encountered in birdwatching, conservation, and ecological monitoring. Images were sourced from publicly available, expert-validated ornithology platforms and underwent rigorous quality control to ensure annotation accuracy and consistency. This process included homogenizing color profiles and formats, as well as manual refinement to ensure that each image contains a single bird specimen. Additionally, detailed provenance and taxonomic metadata for each image have been systematically integrated into the dataset.
The lack of pre-annotated datasets has significantly restricted large-scale ecological analysis and the development of automated techniques in avian research, hindering the progress of AI-driven solutions tailored for bird species recognition. By addressing this gap, this dataset serves as a comprehensive benchmark for avian studies, fostering advancements in various applications such as conservation initiatives, environmental impact assessments, biodiversity preservation strategies, real-time tracking systems, and video-based analysis. Additionally, IBERBIRDS constitutes a resource for computer vision applications, supporting educational programs tailored to ornithologists and birdwatching communities. By openly providing this dataset, IBERBIRDS promotes scientific collaboration and technological advancements, ultimately contributing to the preservation and understanding of avian biodiversity.
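The abstract states that each image is annotated with bounding box coordinates in the YOLO detection format. As a minimal sketch, assuming the labels follow the common YOLO text convention (one line per box: class index, then box centre x, centre y, width, and height, all normalized to the image dimensions), a label line could be converted to pixel coordinates roughly as follows. The function name, the sample label line, and the 640x480 image size are hypothetical and are not taken from the dataset.

```python
from typing import NamedTuple


class PixelBox(NamedTuple):
    """Axis-aligned box in pixel coordinates, plus its class index."""
    class_id: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def yolo_line_to_pixel_box(line: str, img_w: int, img_h: int) -> PixelBox:
    """Convert one YOLO label line to pixel corner coordinates.

    Assumes the conventional layout:
    "<class> <x_center> <y_center> <width> <height>", values in [0, 1].
    """
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_id = int(parts[0])
    xc, yc, w, h = (float(p) for p in parts[1:])
    # Scale the normalized centre/size to pixels and derive the corners.
    return PixelBox(
        class_id,
        (xc - w / 2) * img_w,
        (yc - h / 2) * img_h,
        (xc + w / 2) * img_w,
        (yc + h / 2) * img_h,
    )


# Hypothetical label: class 3, box centred in a 640x480 image,
# covering a quarter of the width and half of the height.
box = yolo_line_to_pixel_box("3 0.5 0.5 0.25 0.5", 640, 480)
print(box)
```

Because the abstract says each image contains a single bird specimen, a label file would presumably hold one such line per image, but that layout is an assumption and should be checked against the dataset's own documentation.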
Source journal
Data in Brief (Multidisciplinary Sciences)
CiteScore: 3.10
Self-citation rate: 0.00%
Articles published: 996
Review time: 70 days
Journal description: Data in Brief provides a way for researchers to easily share and reuse each other's datasets by publishing data articles that:
- Thoroughly describe your data, facilitating reproducibility.
- Make your data, which is often buried in supplementary material, easier to find.
- Increase traffic towards associated research articles and data, leading to more citations.
- Open up doors for new collaborations.
Because you never know what data will be useful to someone else, Data in Brief welcomes submissions that describe data from all research areas.