Machine Learning-Based identification of resistance genes associated with sunflower broomrape.

IF 4.7, Zone 2 (Biology), JCR Q1 (Biochemical Research Methods)
Yingxue Che, Congzi Zhang, Jixiang Xing, Qilemuge Xi, Ying Shao, Lingmin Zhao, Shuchun Guo, Yongchun Zuo
Plant Methods 21(1):62. Published 2025-05-16. Journal Article.
DOI: 10.1186/s13007-025-01383-8 (https://doi.org/10.1186/s13007-025-01383-8)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12082884/pdf/
Citations: 0

Abstract

Background: Sunflower (Helianthus annuus L.), a vital oil crop, faces a severe threat from broomrape (Orobanche cumana), a parasitic plant that impairs sunflower growth and development, limits global production, and causes substantial economic losses. Developing resistant sunflower varieties is therefore urgent.

Results: This study identifies resistance genes from a comprehensive transcriptomic profile of 103 sunflower varieties, based on gene expression data, and then constructs predictive models from the key resistance genes. Least absolute shrinkage and selection operator (LASSO) regression and random forest feature-importance ranking were used to identify resistance genes. These genes served as biomarkers for building machine learning models with Support Vector Machine (SVM), K-Nearest Neighbours (KNN), Logistic Regression (LR), and Gaussian Naive Bayes (GaussianNB). The SVM model built on the 24 key genes selected by LASSO achieved a classification accuracy of 0.9514 and an AUC of 0.9865, effectively distinguishing resistant from susceptible varieties based on gene expression data. Furthermore, we found a correlation between the key genes and differential metabolites, particularly jasmonic acid (JA).

Conclusion: Our study offers a novel perspective on screening sunflower varieties for broomrape resistance and is expected to guide future biological research and breeding strategies.
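The abstract describes a two-stage workflow: sparse feature selection (LASSO) followed by a classifier scored on accuracy and AUC. It does not state which software, hyperparameters, or validation scheme the authors used, so the following is only a minimal sketch of that idea in standard-library Python, not the paper's implementation. The data are synthetic, all function names (`lasso_select`, `knn_score`, `auc`, `demo`) are hypothetical, the penalty `lam=0.05` is an arbitrary choice, and a simple k-nearest-neighbours vote stands in for the four classifiers named above.

```python
import math
import random


def standardize(X):
    """Scale each column to zero mean and unit variance (population SD)."""
    n, p = len(X), len(X[0])
    cols = []
    for j in range(p):
        col = [row[j] for row in X]
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n) or 1.0
        cols.append([(v - mean) / sd for v in col])
    return cols  # column-major layout: cols[j][i]


def soft_threshold(value, lam):
    """Shrink value toward zero by lam; anything within [-lam, lam] becomes 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_select(X, y, lam, n_iter=200):
    """Coordinate-descent LASSO; returns (selected_indices, coefficients).

    Minimizes (1 / 2n) * ||y_centered - X_std b||^2 + lam * ||b||_1.
    """
    n = len(y)
    cols = standardize(X)
    y_mean = sum(y) / n
    residual = [v - y_mean for v in y]
    coef = [0.0] * len(cols)
    for _ in range(n_iter):
        for j, col in enumerate(cols):
            # Covariance of feature j with the partial residual, i.e. the
            # residual with feature j's own contribution added back.
            rho = sum(c * (r + coef[j] * c) for c, r in zip(col, residual)) / n
            new = soft_threshold(rho, lam)
            if new != coef[j]:
                delta = new - coef[j]
                residual = [r - delta * c for c, r in zip(col, residual)]
                coef[j] = new
    selected = [j for j, b in enumerate(coef) if b != 0.0]
    return selected, coef


def knn_score(train_X, train_y, x, k=5):
    """Fraction of the k nearest training samples labeled 1 (Euclidean)."""
    nearest = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )[:k]
    return sum(label for _, label in nearest) / k


def auc(scores, labels):
    """Probability that a random positive outscores a random negative."""
    pos = [s for s, label in zip(scores, labels) if label == 1]
    neg = [s for s, label in zip(scores, labels) if label == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def demo(seed=0):
    """Synthetic stand-in for expression data: 100 varieties x 10 genes."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(100)]  # 1 = resistant, 0 = susceptible
    # Assumption: genes 0 and 1 are shifted in resistant varieties; the
    # remaining genes are pure noise.
    X = [[rng.gauss(0, 1) + (1.5 * label if j < 2 else 0.0) for j in range(10)]
         for label in y]
    selected, _ = lasso_select(X, y, lam=0.05)
    reduced = [[row[j] for j in selected] for row in X]
    # Leave-one-out k-NN on the selected genes. Selecting genes on the full
    # data first leaks information; a real analysis would nest the selection
    # step inside the cross-validation loop.
    scores = [knn_score(reduced[:i] + reduced[i + 1:], y[:i] + y[i + 1:], reduced[i])
              for i in range(len(y))]
    accuracy = sum((s > 0.5) == bool(label) for s, label in zip(scores, y)) / len(y)
    return selected, accuracy, auc(scores, y)


if __name__ == "__main__":
    selected, accuracy, area = demo()
    print("selected genes:", selected)
    print("LOO accuracy:", round(accuracy, 4), "AUC:", round(area, 4))
```

The L1 penalty is what makes LASSO a selector: any gene whose covariance with the partial residual stays below `lam` gets a coefficient of exactly zero and drops out. In practice one would tune the penalty by cross-validation rather than fix it by hand.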

Source journal: Plant Methods (Biology: Plant Sciences)
CiteScore: 9.20
Self-citation rate: 3.90%
Articles published: 121
Review time: 2 months
About the journal: Plant Methods is an open access, peer-reviewed, online journal for the plant research community that encompasses all aspects of technological innovation in the plant sciences. There is no doubt that we have entered an exciting new era in plant biology. The completion of the Arabidopsis genome sequence, and the rapid progress being made in other plant genomics projects are providing unparalleled opportunities for progress in all areas of plant science. Nevertheless, enormous challenges lie ahead if we are to understand the function of every gene in the genome, and how the individual parts work together to make the whole organism. Achieving these goals will require an unprecedented collaborative effort, combining high-throughput, system-wide technologies with more focused approaches that integrate traditional disciplines such as cell biology, biochemistry and molecular genetics. Technological innovation is probably the most important catalyst for progress in any scientific discipline. Plant Methods’ goal is to stimulate the development and adoption of new and improved techniques and research tools and, where appropriate, to promote consistency of methodologies for better integration of data from different laboratories.