Lei Shi, Yaqian Qin, Juanjuan Zhang, Yan Wang, H. Qiao, Haiping Si
J. Inf. Technol. Res., published 2022-01-01. DOI: 10.4018/jitr.298618 (https://doi.org/10.4018/jitr.298618)
Multi-Class Classification of Agricultural Data Based on Random Forest and Feature Selection
Agricultural production and operations generate large amounts of data that contain valuable hidden knowledge. Data mining can uncover the relationships among the many factors recorded in this data, and classification prediction is one of its most valuable techniques for agriculture. This paper presents a new algorithm that combines machine learning algorithms, a feature ranking method, and an instance filter to enhance the random forest algorithm and better solve agricultural multi-class classification problems. The new algorithm was tested on four standard agricultural multi-class datasets, and the experimental results show that it performed well on all of them. A substantial rise in classification accuracy was observed on the Eucalyptus dataset: the random forest algorithm alone achieved a classification accuracy of 53.4%, and applying the new algorithm (rough set) increased it to 83.7%, a gain of 30.3 percentage points.
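The abstract does not say which feature ranking criterion the authors use, so the following is only a minimal sketch of the general idea, assuming information gain as the ranking score and categorical features. The function names, the toy rows, and the choice of keeping a single top-ranked feature are all hypothetical and are not taken from the paper. In a full pipeline the reduced rows would then be passed to an instance filter and a random forest classifier.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, feature_index):
    """Reduction in label entropy after splitting on one categorical feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature_index], []).append(label)
    # Weighted average entropy of the labels inside each feature value.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_features(rows, labels):
    """Return feature indices sorted from most to least informative."""
    n_features = len(rows[0])
    gains = [information_gain(rows, labels, j) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: gains[j], reverse=True)


# Hypothetical toy data (not from the paper): columns are
# (soil, irrigation, noise), and there are three target classes.
ROWS = [
    ("clay", "high", "a"), ("clay", "low", "b"),
    ("sand", "high", "a"), ("sand", "low", "b"),
    ("loam", "high", "b"), ("loam", "low", "a"),
]
LABELS = ["good", "good", "poor", "poor", "average", "average"]

ranking = rank_features(ROWS, LABELS)

# Keep only the top-ranked feature; the cutoff of 1 is an arbitrary assumption.
top_k = ranking[:1]
reduced_rows = [tuple(row[j] for j in top_k) for row in ROWS]
```

In this toy data the soil column alone determines the class, so it ranks first, while the other two columns carry no information about the label. Real agricultural datasets mix numeric and categorical attributes, so a practical version would need discretization or a different scoring function.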