Maksim Kukushkin, Martin Bogdan, Simon Goertz, Jan-Ole Callsen, Eric Oldenburg, Matthias Enders, Thomas Schmid
Scientific Data, vol. 12, no. 1, article 1629. Published 2025-10-08. DOI: 10.1038/s41597-025-05979-6 (https://doi.org/10.1038/s41597-025-05979-6)
A bimodal image dataset for seed classification from the visible and near-infrared spectrum.
The success of deep learning in image classification has been largely underpinned by large-scale datasets, such as ImageNet, which have significantly advanced multi-class classification for RGB and grayscale images. However, datasets that capture spectral information beyond the visible spectrum remain scarce, despite their high potential, especially in agriculture, medicine and remote sensing. To address this gap in the agricultural domain, we present a thoroughly curated bimodal seed image dataset comprising paired RGB and hyperspectral images for 10 plant species, making it one of the largest bimodal seed datasets available. We describe the methodology for data collection and preprocessing and benchmark several deep learning models on the dataset to evaluate their multi-class classification performance. By contributing a high-quality dataset, our manuscript offers a valuable resource for studying spectral, spatial and morphological properties of seeds, thereby opening new avenues for research and applications.
About the journal:
Scientific Data is an open-access journal focused on data, publishing descriptions of research datasets and articles on data sharing across natural sciences, medicine, engineering, and social sciences. Its goal is to enhance the sharing and reuse of scientific data, encourage broader data sharing, and acknowledge those who share their data.
The journal primarily publishes Data Descriptors, which offer detailed descriptions of research datasets, including data collection methods and technical analyses validating data quality. These descriptors aim to facilitate data reuse rather than to test hypotheses or present new interpretations, methods, or in-depth analyses.