Irish potato imagery dataset for detection of early and late blight diseases
Authors: Hudson Laizer, Neema Mduma
Journal: Data in Brief, volume 60, Article 111549 (JCR Q3, Multidisciplinary Sciences; impact factor 1.0)
Published: 2025-04-08
DOI: 10.1016/j.dib.2025.111549
URL: https://www.sciencedirect.com/science/article/pii/S2352340925002811
Citations: 0
Abstract
This dataset comprises 58,709 annotated images of Irish potato leaves, categorized into three classes (healthy, early blight and late blight). The data were collected over six months from smallholder farms in the Southern Highlands of Tanzania, using Samsung Galaxy A03 smartphones with an 8-megapixel camera. Researchers, farmers and agricultural extension officers were trained to capture images under varying lighting, angles and backgrounds so that the dataset is diverse and representative. Plant pathologists validated the images to improve the reliability of the labels. Pre-processing steps, including duplicate removal, filtering of irrelevant images, annotation and metadata integration, were applied to improve data quality. The dataset is organized into three folders (healthy, early blight and late blight) and is freely available on the Zenodo repository for researchers working on plant diseases. It can be reused for training machine learning models for crop disease detection, for transfer learning and for data augmentation studies. By enabling early detection and classification of potato diseases, the dataset supports the development of agricultural tools aimed at reducing crop losses and enhancing food security in Sub-Saharan Africa. Its regional specificity makes it a valuable resource for research and innovation in sustainable farming practices.
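As an illustration of how the three-folder layout described above might be consumed, the following is a minimal Python sketch, not the authors' pipeline. It indexes images by class folder, drops byte-identical files, and counts images per class. The folder names in `CLASS_DIRS`, the accepted file extensions, and the use of SHA-256 hashing for duplicate detection are all assumptions; the abstract does not state the exact folder names or how duplicates were removed.

```python
import hashlib
from collections import Counter
from pathlib import Path

# Assumed folder names; the actual names in the Zenodo archive may differ.
CLASS_DIRS = ("healthy", "early_blight", "late_blight")
# Assumed image extensions; adjust to match the files in the archive.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def index_dataset(root, class_dirs=CLASS_DIRS):
    """Return a list of (image_path, label) pairs, one per image file."""
    samples = []
    for label in class_dirs:
        class_path = Path(root) / label
        if not class_path.is_dir():
            continue  # skip classes that are missing from this copy
        for path in sorted(class_path.rglob("*")):
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, label))
    return samples


def drop_exact_duplicates(samples):
    """Keep the first occurrence of each byte-identical file."""
    seen = set()
    unique = []
    for path, label in samples:
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append((path, label))
    return unique


def class_counts(samples):
    """Count images per label, e.g. to check class balance before training."""
    return Counter(label for _, label in samples)
```

Calling `class_counts(drop_exact_duplicates(index_dataset(root)))` on a local copy of the archive would give per-class totals. Hashing file bytes only catches exact copies; near-duplicates, such as the same leaf photographed twice, would need a perceptual comparison instead.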
About the journal:
Data in Brief provides a way for researchers to easily share and reuse each other's datasets by publishing data articles that:
- Thoroughly describe your data, facilitating reproducibility.
- Make your data, which is often buried in supplementary material, easier to find.
- Increase traffic towards associated research articles and data, leading to more citations.
- Open up doors for new collaborations.
Because you never know what data will be useful to someone else, Data in Brief welcomes submissions that describe data from all research areas.