Data mining for ranking sorghum seed lots

IF 0.9, Zone 4 (Agricultural and Forestry Sciences), Q3 AGRONOMY
Luciana D. Rocha, G. I. Gadotti, Ruan Bernardy, R. D. Pinheiro, R. D. C. M. Monteiro
Journal: Revista Caatinga
Publication date: 2023-06-01
Publication type: Journal Article
DOI: https://doi.org/10.1590/1983-21252023v36n224rc
Citations: 0

Abstract

The ranking of seed lots is a fundamental process for all companies in the seed industry. This work demonstrates data mining methods for ranking sorghum seed lots during seed processing, based on the analysis of quality control data. Germination and cold tests were performed to assess the physiological quality of the lots. Seed samples from each lot were evaluated at two stages: post-cleaning and finished product (ready for marketing). After pre-processing, the data totaled 188 rows with six attributes, comprising 150 lots accepted for marketing, 6 rejected, and 32 intermediate. The classifiers used were J48, Random Forest, Classification Via Regression, Naive Bayes, Multilayer Perceptron, and IBk. The Resample filter was used to adjust the data. Training used the k-fold technique with ten folds. Accuracy, Precision, Recall, F-measure, and ROC Area were used to assess the algorithms, and the results were used to determine the best machine-learning algorithm. IBk and J48 achieved the highest accuracy, and IBk gave the best results overall. The Resample filter was essential for solving the data imbalance problem. Sorghum seed lots can be classified with high accuracy and precision using artificial intelligence and machine learning techniques.
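The evaluation workflow described in the abstract (an imbalanced three-class dataset, resampling, ten-fold validation, and per-class metrics) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic stand-ins that only reproduce the reported class counts (150/32/6) and the six-attribute shape; the attribute values, class centers, and every function name are hypothetical. A 1-nearest-neighbour rule stands in for IBk, which is Weka's k-nearest-neighbours classifier. Resampling is applied inside each training fold to keep duplicated rows out of the test fold; the abstract does not say how the authors applied the Resample filter. ROC Area is omitted for brevity.

```python
import random
from collections import Counter

CLASS_COUNTS = {"accepted": 150, "intermediate": 32, "rejected": 6}

def make_synthetic_lots(seed=0):
    """Hypothetical stand-in data: 188 rows, six numeric attributes, three classes."""
    rng = random.Random(seed)
    centers = {"accepted": 90.0, "intermediate": 80.0, "rejected": 65.0}  # assumed values
    rows = []
    for label, n in CLASS_COUNTS.items():
        for _ in range(n):
            rows.append(([rng.gauss(centers[label], 4.0) for _ in range(6)], label))
    rng.shuffle(rows)
    return rows

def group_by_class(rows):
    groups = {}
    for row in rows:
        groups.setdefault(row[1], []).append(row)
    return groups

def resample_uniform(rows, rng):
    """Sample with replacement so that every class ends up with the same count."""
    groups = group_by_class(rows)
    per_class = len(rows) // len(groups)
    out = []
    for members in groups.values():
        out.extend(rng.choice(members) for _ in range(per_class))
    return out

def stratified_folds(rows, k):
    """Deal the rows of each class round-robin into k folds."""
    folds = [[] for _ in range(k)]
    i = 0
    for members in group_by_class(rows).values():
        for row in members:
            folds[i % k].append(row)
            i += 1
    return folds

def predict_1nn(train, features):
    """Return the label of the closest training row (squared Euclidean distance)."""
    return min(train, key=lambda r: sum((a - b) ** 2 for a, b in zip(r[0], features)))[1]

def cross_validate(rows, k=10, seed=0):
    """Return (true, predicted) label pairs pooled over the k test folds."""
    rng = random.Random(seed)
    folds = stratified_folds(rows, k)
    pairs = []
    for i, test in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        train = resample_uniform(train, rng)  # balance the training rows only
        pairs.extend((label, predict_1nn(train, feats)) for feats, label in test)
    return pairs

def report(pairs):
    """Overall accuracy plus per-class (precision, recall, F-measure)."""
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    per_class = {}
    for c in sorted({t for t, _ in pairs}):
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        per_class[c] = (precision, recall, f1)
    return accuracy, per_class

rows = make_synthetic_lots()
pairs = cross_validate(rows)
accuracy, per_class = report(pairs)
print(f"accuracy={accuracy:.3f}")
for name, (p, r, f) in per_class.items():
    print(f"{name}: precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

Because the data are synthetic, the printed numbers say nothing about the paper's results; the sketch only shows how a 150/32/6 class split can be balanced before a nearest-neighbour classifier is scored fold by fold.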
Source journal
Revista Caatinga (AGRONOMY)
CiteScore: 2.10
Self-citation rate: 11.10%
Articles published: 67
Review time: 6-12 weeks
About the journal: Revista Caatinga is a quarterly scientific journal published since 1976 by the Pró-Reitoria de Pesquisa e Pós-Graduação (Office of the Dean of Research and Graduate Studies) of the Universidade Federal Rural do Semi-Árido (UFERSA). It aims to provide the scientific community with high-level publications in the Agricultural Sciences and Natural Resources, making relevant results of the published research available in full and free of charge.