An annotated image dataset of urban insects for the development of computer vision and deep learning models with detection tasks

Impact factor 1.0, JCR Q3, Multidisciplinary Sciences
Min Hui Lim, Hiang Hao Chan, Song-Quan Ong
Journal: Data in Brief, volume 60, Article 111673
Published: 2025-05-16
DOI: 10.1016/j.dib.2025.111673
URL: https://www.sciencedirect.com/science/article/pii/S2352340925004032

Abstract

This article presents a large image dataset intended for developing insect recognition algorithms such as YOLO. The dataset contains more than 25,000 annotations, each giving the taxonomic order of an urban insect and its location (as a bounding box) on a scanned image. The flying insects were collected using UV light traps placed in food warehouses, manufacturing facilities and grocery stores in urban environments. The traps, equipped with UVA lamps (365 nm), captured a variety of insect species on sticky cards over 7–10 days. The sticky cards with all captured insects were then scanned at high resolution (1200 dpi, 48-bit colour), which preserves fine morphological details of the insects, such as the antennae. Annotation was performed in CVAT, with bounding boxes labelled at the order level by entomology experts. The dataset is intended as a resource for computer scientists and entomologists to compare the performance of deep learning models that could be used to build automatic detection systems for urban insect diversity or pest control studies.
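As an illustration of how order-level bounding boxes like these are typically fed to a YOLO-style detector, the minimal sketch below converts a pixel-coordinate box into the normalized `class cx cy w h` text line that YOLO training tools commonly expect. It is not taken from the article: the `ORDERS` list, the `Box` type and the function names are hypothetical, and the actual label set and export format are defined by the dataset itself. The small helper also shows the arithmetic behind the scan resolution, since 1200 dpi corresponds to roughly 47 pixels per millimetre.

```python
from dataclasses import dataclass

# Hypothetical order-level class list, for illustration only.
# The real label set must be read from the dataset's own files.
ORDERS = ["Diptera", "Coleoptera", "Lepidoptera"]


@dataclass
class Box:
    """A bounding box in pixel coordinates on a scanned sticky card."""
    order: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def pixels_per_mm(dpi: float) -> float:
    """Convert a scan resolution in dots per inch to pixels per millimetre."""
    return dpi / 25.4  # 1 inch = 25.4 mm


def to_yolo_line(box: Box, img_w: int, img_h: int) -> str:
    """Format one box as a normalized YOLO label line: class cx cy w h."""
    class_id = ORDERS.index(box.order)
    cx = (box.xmin + box.xmax) / 2 / img_w   # box centre, fraction of width
    cy = (box.ymin + box.ymax) / 2 / img_h   # box centre, fraction of height
    w = (box.xmax - box.xmin) / img_w        # box width, fraction of width
    h = (box.ymax - box.ymin) / img_h        # box height, fraction of height
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


if __name__ == "__main__":
    example = Box("Coleoptera", xmin=100, ymin=200, xmax=300, ymax=400)
    print(to_yolo_line(example, img_w=1000, img_h=2000))
    print(f"{pixels_per_mm(1200):.2f} px/mm at 1200 dpi")
```

Because scans at this resolution are very large, a practical pipeline would likely tile each image before training and remap the boxes per tile; that step is omitted here.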
Source journal
Data in Brief (Multidisciplinary Sciences)
CiteScore: 3.10
Self-citation rate: 0.00%
Articles published: 996
Review time: 70 days
Journal description: Data in Brief provides a way for researchers to easily share and reuse each other's datasets by publishing data articles that:
- Thoroughly describe your data, facilitating reproducibility.
- Make your data, which is often buried in supplementary material, easier to find.
- Increase traffic towards associated research articles and data, leading to more citations.
- Open up doors for new collaborations.
Because you never know what data will be useful to someone else, Data in Brief welcomes submissions that describe data from all research areas.