GEFormer: A genotype-environment interaction-based genomic prediction method that integrates the gating multilayer perceptron and linear attention mechanisms.
Zhou Yao, Mengting Yao, Chuang Wang, Ke Li, Junhao Guo, Yingjie Xiao, Jianbing Yan, Jianxiao Liu
Molecular Plant, pp. 527-549. Published 2025-03-03 (Epub 2025-01-28). DOI: 10.1016/j.molp.2025.01.020 (https://doi.org/10.1016/j.molp.2025.01.020)
Citations: 0
Abstract
Integrating genotypic and environmental data can enhance genomic prediction accuracy for crop field traits. Existing genomic prediction methods fail to consider environmental factors and the real growth environments of crops, resulting in low prediction accuracy. In this work, we developed GEFormer, a genotype-environment interaction genomic prediction method that integrates gating multilayer perceptron (gMLP) and linear attention mechanisms. First, GEFormer uses gMLP to extract local and global features among single-nucleotide polymorphisms (SNPs). Then, Omni-dimensional Dynamic Convolution is used to extract the dynamic and comprehensive features of multiple environmental factors within each day, taking into consideration the real growth pattern of crops. A linear attention mechanism then captures the temporal features of environmental changes. Finally, GEFormer uses a gating mechanism to fuse the genomic and environmental features. We examined the accuracy of GEFormer for predicting important agronomic traits of maize, rice, and wheat under three experimental scenarios: untested genotypes in tested environments, tested genotypes in untested environments, and untested genotypes in untested environments. GEFormer outperformed six cutting-edge statistical learning methods and four machine learning methods, with especially large advantages in the scenario of untested genotypes in untested environments. In addition, we applied GEFormer to three real-world breeding applications: phenotype prediction in unknown environments, hybrid phenotype prediction using an inbred population, and cross-population phenotype prediction. The results showed that GEFormer achieved better prediction performance in actual breeding scenarios and could be used to assist crop breeding.
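The abstract names a linear attention mechanism over daily environmental features and a gating mechanism for fusing genomic and environmental features, but it does not give their formulas. The following is a minimal illustrative sketch, not the authors' implementation. It assumes a common kernel-based form of linear attention with the feature map elu(x) + 1, and a simple sigmoid gate that blends two equal-length vectors. All function names, shapes, and parameters (`feature_map`, `linear_attention`, `gated_fusion`, `weights`, `bias`) are hypothetical.

```python
import math


def feature_map(x):
    """elu(x) + 1, a positive feature map (an assumption, not taken from the paper)."""
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]


def linear_attention(queries, keys, values):
    """Non-causal linear attention over a sequence (e.g. one row per day).

    The key/value summary is built once and reused for every query, so the
    cost grows linearly with sequence length rather than quadratically.
    """
    d_k, d_v = len(keys[0]), len(values[0])
    kv = [[0.0] * d_v for _ in range(d_k)]  # sum over days of phi(k) v^T
    k_sum = [0.0] * d_k                     # sum over days of phi(k)
    for k, v in zip(keys, values):
        fk = feature_map(k)
        for a in range(d_k):
            k_sum[a] += fk[a]
            for b in range(d_v):
                kv[a][b] += fk[a] * v[b]
    out = []
    for q in queries:
        fq = feature_map(q)
        norm = sum(fq[a] * k_sum[a] for a in range(d_k))
        out.append([sum(fq[a] * kv[a][b] for a in range(d_k)) / norm
                    for b in range(d_v)])
    return out


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gated_fusion(geno, env, weights, bias):
    """Blend two equal-length feature vectors with a per-dimension sigmoid gate.

    gate_i = sigmoid(weights[i] . [geno; env] + bias[i])   (hypothetical form)
    fused_i = gate_i * geno_i + (1 - gate_i) * env_i
    """
    joint = geno + env  # concatenation of the two feature vectors
    fused = []
    for i, (g, e) in enumerate(zip(geno, env)):
        gate = sigmoid(sum(w * x for w, x in zip(weights[i], joint)) + bias[i])
        fused.append(gate * g + (1.0 - gate) * e)
    return fused


if __name__ == "__main__":
    # Three days, two made-up environmental features per day.
    daily_env = [[0.2, -1.0], [1.5, 0.3], [-0.4, 0.8]]
    attended = linear_attention(daily_env, daily_env, daily_env)
    # Average over days to obtain one environmental vector (a simplification).
    env_vec = [sum(col) / len(attended) for col in zip(*attended)]
    geno_vec = [0.7, -0.1]  # made-up genomic feature vector
    zero_w = [[0.0] * 4 for _ in range(2)]
    print(gated_fusion(geno_vec, env_vec, zero_w, [0.0, 0.0]))
```

Because the feature map is strictly positive, each attention output is a weighted average of the value rows. In a trained model the gate weights would be learned, letting the network decide per feature how much to rely on genomic versus environmental information.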
About the journal:
Molecular Plant is dedicated to serving the plant science community by publishing novel and exciting findings with high significance in plant biology. The journal focuses broadly on cellular biology, physiology, biochemistry, molecular biology, genetics, development, plant-microbe interaction, genomics, bioinformatics, and molecular evolution.
Molecular Plant publishes original research articles, reviews, Correspondence, and Spotlights on the most important developments in plant biology.