Towards precision agriculture: A dataset for early detection of corn leaf pests
Thierry Tchokogoué, Auguste Vigny Noumsi, Marcellin Atemkeng, Louis Aimé Fono
Data in Brief, volume 59, Article 111394. Published 2025-02-14. DOI: 10.1016/j.dib.2025.111394
Journal impact factor: 1.0. JCR: Q3 (Multidisciplinary Sciences).
https://www.sciencedirect.com/science/article/pii/S235234092500126X
Citations: 0
Abstract
Corn (Zea mays), commonly referred to as Indian wheat, is a widely cultivated tropical annual herbaceous plant of the Poaceae family. It is grown primarily for its starch-rich grains and as a forage crop. In Cameroon, corn is the most consumed cereal, ahead of rice and sorghum, with an estimated annual production of 2.2 million tons. However, corn production is frequently threatened by insect infestations, which hinder crop development, reduce yields, and degrade crop quality. Early detection of insect attacks is essential for farmers, because timely intervention can prevent widespread damage, reduce pesticide use, and improve yields. Insect infestations on corn manifest through symptoms on leaves, stems, and seeds. Among these, foliar attacks are particularly detrimental: they disrupt plant growth and significantly reduce yields. Their symptoms include leaf perforations, yellowing, and white spot deposits, which ultimately alter the leaf texture. Machine learning models offer a promising way to detect foliar attacks early, enabling farmers to take timely and effective action. This paper introduces a dataset focused on three major pests: Spodoptera frugiperda (fall armyworm), Helminthosporium leaf blight, and Zonocerus variegatus (variegated grasshopper), which are among the most frequent and destructive agents affecting corn crops. The dataset comprises images of corn leaves captured in natural environments at various growth stages and field locations. Images were taken with smartphone cameras at different times of day, which provided diverse lighting conditions, and in various fields, which introduced varied background clutter, so that the images realistically represent field conditions.
The dataset comprises eight directories: two containing healthy leaf images (1308 without augmentation and 11,772 with augmentation), two containing healthy leaves with manually segmented backgrounds (1308 without augmentation and 11,772 with augmentation), two containing healthy leaves with backgrounds segmented by the CNDVI algorithm (1308 without augmentation and 11,772 with augmentation), one containing 848 infected images with manually segmented backgrounds and highlighted infected areas, and one containing 7632 augmented versions of the infected images. The dataset is a resource for researchers and students who want to develop machine learning and deep learning models for corn disease detection, classification, and natural image segmentation, and to study model interpretability and explainability. By supporting advances in precision agriculture and automated pest detection, the dataset contributes to sustainable agricultural practices and to the broader field of agroinformatics.
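The directory counts quoted in the abstract can be tallied with a short Python sketch. The dictionary keys are descriptive placeholders chosen for illustration; they are assumptions, not the dataset's actual folder names. The sketch only checks the arithmetic implied by the stated figures and makes no claim about how the augmentation was performed.

```python
# Image counts per directory, as stated in the abstract.
# NOTE: the keys are hypothetical labels, not the real directory names.
counts = {
    "healthy_original": 1308,
    "healthy_augmented": 11772,
    "healthy_manual_background_original": 1308,
    "healthy_manual_background_augmented": 11772,
    "healthy_cndvi_background_original": 1308,
    "healthy_cndvi_background_augmented": 11772,
    "infected_manual_background_original": 848,
    "infected_augmented": 7632,
}


def augmentation_ratio(original: int, augmented: int) -> float:
    """Return how many augmented images exist per original image."""
    return augmented / original


num_directories = len(counts)
total_images = sum(counts.values())

healthy_ratio = augmentation_ratio(
    counts["healthy_original"], counts["healthy_augmented"]
)
infected_ratio = augmentation_ratio(
    counts["infected_manual_background_original"], counts["infected_augmented"]
)

# Share of infected images among the unaugmented healthy and infected images.
infected_share = counts["infected_manual_background_original"] / (
    counts["healthy_original"] + counts["infected_manual_background_original"]
)

print(f"directories: {num_directories}")
print(f"total images across all directories: {total_images}")
print(f"augmented-to-original ratio (healthy): {healthy_ratio}")
print(f"augmented-to-original ratio (infected): {infected_ratio}")
print(f"infected share before augmentation: {infected_share:.3f}")
```

Both augmented sets are nine times the size of their originals. Infected leaves make up a minority of the unaugmented images, which anyone training a classifier on this data may want to account for, for example with class weighting or stratified splits.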
About the journal:
Data in Brief provides a way for researchers to easily share and reuse each other's datasets by publishing data articles that:
- Thoroughly describe your data, facilitating reproducibility.
- Make your data, which is often buried in supplementary material, easier to find.
- Increase traffic towards associated research articles and data, leading to more citations.
- Open up doors for new collaborations.
Because you never know what data will be useful to someone else, Data in Brief welcomes submissions that describe data from all research areas.