A Spatial Model of K-Nearest Neighbors for Classification of Cotton (Gossypium) Varieties based on Image Segmentation
Author: Salman Qadri
Journal: Lahore Garrison University Research Journal of Computer Science and Information Technology
Publication date: 2021-02-09
DOI: 10.54692/lgurjcsit.2021.0501173 (https://doi.org/10.54692/lgurjcsit.2021.0501173)
Citations: 5
Abstract
In this study, we describe a machine learning (ML) technique for classifying four (4) cotton leaf varieties: BS-15, S-32, Z-31, and Z-32. Leaves of each variety were collected from 500 farmers. The images were captured with a cell phone camera in open agricultural fields, and every leaf was photographed from both sides (front and back). Each variety is represented by 300 leaf images (150 of the front side and 150 of the back side), giving 1200 (300 x 4) leaf image samples in total. These samples were processed through image preprocessing and image segmentation. Four non-overlapping regions of interest (ROIs) were taken from each image, for a total of 4800 (1200 x 4) ROIs. Several types of machine learning features were extracted from the ROIs: texture, spectral, binary, and histogram features, together with rotational, scaling, and translational (R-S-T) features. A total of fifty-seven (57) features were computed for each ROI, yielding 273,600 (4800 x 57) feature values. Furthermore, the Correlation-Based Feature Selection (CFS) genetic algorithm technique was employed for feature optimization. It produced 22 optimized features, which were then supplied to four machine learning classifiers: K-Nearest Neighbor (K-NN), K*, Random Forest (RF) Tree, and Naive Bayes (NB) Tree. K-NN achieved an overall accuracy of 98.9167% on (512 x 512) ROIs. The per-variety accuracies of the K-NN classifier for BS-15, S-32, Z-31, and Z-32 were 97.83%, 99.50%, 99%, and 99.33%, respectively.
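The dataset arithmetic and the K-NN decision rule described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the counts and accuracies are taken from the abstract, while the function `knn_predict`, the choice of Euclidean distance, the value `k=3`, and the two-dimensional toy feature vectors are assumptions made only for the example (the study itself uses 22 optimized features per ROI).

```python
import math
from collections import Counter

# Counts stated in the abstract.
VARIETIES = ["BS-15", "S-32", "Z-31", "Z-32"]
IMAGES_PER_VARIETY = 300   # 150 front side + 150 back side
ROIS_PER_IMAGE = 4         # non-overlapping regions of interest
FEATURES_PER_ROI = 57

total_images = IMAGES_PER_VARIETY * len(VARIETIES)
total_rois = total_images * ROIS_PER_IMAGE
total_feature_values = total_rois * FEATURES_PER_ROI

# Per-variety K-NN accuracies (percent) reported in the abstract.
per_variety_accuracy = {"BS-15": 97.83, "S-32": 99.50, "Z-31": 99.00, "Z-32": 99.33}

# Every variety contributes the same number of ROIs, so the overall
# accuracy should be close to the plain mean of the per-variety values.
mean_accuracy = sum(per_variety_accuracy.values()) / len(per_variety_accuracy)


def knn_predict(train_X, train_y, query, k=3):
    """Hypothetical helper: majority vote among the k training vectors
    closest to `query`, using Euclidean distance (an assumption here)."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up 2-D feature vectors, for illustration only (not data from the study).
toy_X = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
toy_y = ["BS-15", "BS-15", "Z-31", "Z-31"]

if __name__ == "__main__":
    print(total_images, total_rois, total_feature_values)
    print(round(mean_accuracy, 3))
    print(knn_predict(toy_X, toy_y, (0.15, 0.15)))
```

The mean of the four per-variety accuracies comes out at about 98.915%, which agrees with the reported overall K-NN accuracy of 98.9167% up to the rounding of the per-variety figures.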