Prediction and importance of predictors in approaches based on computational intelligence and machine learning

Agronomy Science and Biotechnology Pub Date : 2023-03-08 DOI:10.33158/asb.r179.v9.2023

Antônio Carlos da Silva Júnior, Waldênia Melo Moura, L. L. Bhering, Michele Jorge Silva Siqueira, W. G. Costa, M. Nascimento, C. Cruz



Abstract

Machine learning and computational intelligence are rapidly emerging in plant breeding, allowing the exploration of big data concepts and the prediction of predictor importance. In this context, the main challenges are how to analyze datasets and extract new knowledge at all levels of research. Predicting the importance of variables in genetic improvement programs allows for faster progress, extensive phenotypic evaluation of the germplasm, and the selection and prediction of traits that present low heritability and/or measurement difficulties. Although simultaneous evaluation of traits provides a wide variety of information, identifying which predictor variable is most important is a challenge for the breeder. The traditional approach to variable selection is based on multiple linear regression, which evaluates the relationship between a response variable and two or more independent variables. However, this approach is limited in its ability to analyze high-dimensional data and does not capture complex, multivariate relationships between traits. By contrast, machine learning and computational intelligence approaches allow inferences about complex interactions in plant breeding. Given this, a systematic review that disentangles machine learning and computational intelligence approaches is relevant to breeders and is the subject of this work. We present the main steps for developing each strategy (from data selection to evaluating classification/prediction models and quantifying the best predictor).
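The contrast the abstract draws between multiple linear regression and machine learning approaches can be illustrated with a small sketch. It is not the authors' method: the data are synthetic, the helper names (make_data, ols_slopes, knn_predict, permutation_importance) are hypothetical, and a k-nearest-neighbour model with permutation importance merely stands in for "a machine learning approach". The response depends on the first predictor through a squared term, which a linear slope is not expected to reflect, while a model-agnostic importance score should.

```python
import random

def make_data(n, rng):
    """Synthetic 'traits' (an assumption): y is quadratic in x0, linear in x1; x2 is noise."""
    X = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n)]
    y = [3.0 * r[0] ** 2 + 0.5 * r[1] + rng.gauss(0, 0.05) for r in X]
    return X, y

def ols_slopes(X, y):
    """Multiple linear regression via the normal equations; returns slopes without intercept."""
    A = [[1.0] + row for row in X]               # design matrix with intercept column
    p = len(A[0])
    # Augmented system [A'A | A'y]
    M = [[sum(a[i] * a[j] for a in A) for j in range(p)]
         + [sum(a[i] * t for a, t in zip(A, y))] for i in range(p)]
    for c in range(p):                           # Gauss-Jordan with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(p):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    beta = [M[i][p] / M[i][i] for i in range(p)]
    return beta[1:]

def knn_predict(X_train, y_train, row, k=7):
    """Average the response of the k nearest training rows (a simple non-linear model)."""
    dist = sorted((sum((a - b) ** 2 for a, b in zip(t, row)), t_y)
                  for t, t_y in zip(X_train, y_train))
    return sum(t_y for _, t_y in dist[:k]) / k

def mse(X_train, y_train, X, y):
    """Mean squared error of the k-NN model on (X, y)."""
    return sum((knn_predict(X_train, y_train, r) - t) ** 2 for r, t in zip(X, y)) / len(y)

def permutation_importance(X_train, y_train, X_test, y_test, rng):
    """Increase in test MSE after shuffling one predictor column at a time."""
    base = mse(X_train, y_train, X_test, y_test)
    scores = []
    for j in range(len(X_test[0])):
        col = [r[j] for r in X_test]
        rng.shuffle(col)                         # break the link between predictor j and y
        shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(X_test, col)]
        scores.append(mse(X_train, y_train, shuffled, y_test) - base)
    return scores

rng = random.Random(0)
X_train, y_train = make_data(500, rng)
X_test, y_test = make_data(200, rng)
slopes = ols_slopes(X_train, y_train)
importance = permutation_importance(X_train, y_train, X_test, y_test, rng)
print("OLS slopes:            ", [round(s, 3) for s in slopes])
print("Permutation importance:", [round(s, 3) for s in importance])
```

On data like these, the fitted slope for the first predictor should sit near zero even though it drives most of the response, whereas shuffling it should raise the prediction error the most. Permutation importance is used here because it works with any fitted model; the review itself may quantify predictor importance differently.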
