Construction of SNP feature library for the identification of chicken breeds

IF 4.2; Zone 1, Agricultural and Forestry Sciences; JCR Q1, AGRICULTURE, DAIRY & ANIMAL SCIENCE
Boxuan Zhang, Xiaochang Li, Xinwei Jiang, Conghao Zhong, Ning Yang, Congjiao Sun
Journal: Poultry Science, volume 104, issue 11, Article 105844
Published: 2025-09-13
DOI: 10.1016/j.psj.2025.105844
URL: https://www.sciencedirect.com/science/article/pii/S0032579125010855

Abstract

Breed identification is an important prerequisite for the protection, development, and utilization of animal genetic resources. This study developed an accurate identification strategy for chicken breeds using whole-genome sequencing data from 492 individuals belonging to 14 chicken breeds. These breeds comprise eight local Chinese breeds (Tibetan chicken, Chahua chicken, Daweishan chicken, Liyang chicken, Lindian chicken, Silky chicken, Dongxiang blue-shell egg chicken, and WenChang chicken), three standard breeds (Rhode Island Red, Leghorn, and Light Sussex chicken), two commercial breeds (Cobb broiler and Yellow Plumage Dwarf chicken), and the Red Junglefowl. We compared three ancestry informative marker (AIM) detection methods (Fst, I_n, and PCA-correlated SNPs) and four machine learning classifiers (K-Nearest Neighbors, Support Vector Machine, Random Forest, and XGBoost) to identify the best breed identification model.
A total of 30,831 highly informative single-nucleotide polymorphisms (SNPs) were detected and selected from these breeds using the three AIM detection methods. All three AIM methods performed well, but I_n performed best. Machine learning classifiers were fitted to the important SNP loci, and receiver operating characteristic (ROC) curves were generated to evaluate their performance. The ROC curves and 5-fold cross-validation results indicated that XGBoost was the best classifier, with the largest area under the curve (AUC; macro-AUC = 0.9996). In addition, XGBoost achieved 100% accuracy using only 238 SNPs.
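The abstract reports a macro-AUC but does not say how it was computed. The sketch below assumes the common one-vs-rest convention: a separate AUC is computed for each breed against all others, and the per-breed values are averaged with equal weight. The function names, breed labels, and probabilities are hypothetical and serve only as illustration; they are not the authors' code or data.

```python
def binary_auc(scores, is_positive):
    """AUC as the probability that a positive sample outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_auc(prob_rows, labels, classes):
    """Unweighted mean of one-vs-rest AUCs (assumed definition of macro-AUC).

    prob_rows[i][k] is the predicted probability that sample i belongs to classes[k].
    """
    per_class = []
    for k, cls in enumerate(classes):
        scores = [row[k] for row in prob_rows]
        per_class.append(binary_auc(scores, [lab == cls for lab in labels]))
    return sum(per_class) / len(per_class)


# Hypothetical predictions for four birds from three breeds.
classes = ["Leghorn", "Silky", "Cobb"]
labels = ["Leghorn", "Silky", "Cobb", "Leghorn"]
probs = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.2, 0.7],
    [0.4, 0.5, 0.1],
]
score = macro_auc(probs, labels, classes)
```

Under 5-fold cross-validation, a score like this would typically be computed on each held-out fold and then summarized across folds; the abstract does not specify which summary was used.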
As few as 238 SNPs were sufficient for effective breed identification, and the combination of XGBoost and I_n was the optimal strategy. This study provides a new method for breed identification, which is highly important for the breeding and preservation of animal genetic resources.
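The abstract does not give the formula for the "In" statistic. The sketch below assumes it denotes the standard informativeness for assignment, I_n = sum over alleles j of [ -p_j ln p_j + sum over populations i of (p_ij / K) ln p_ij ], where p_ij is the frequency of allele j in population i, p_j is its mean across the K populations, and 0 ln 0 is taken as 0. The function names, SNP identifiers, and allele frequencies are hypothetical.

```python
import math


def informativeness_for_assignment(alt_freqs):
    """I_n for one biallelic SNP.

    alt_freqs holds the alternate-allele frequency in each of K populations.
    The result is 0 when all populations share the same frequency and
    ln(K) when the allele pattern identifies the population exactly.
    """
    k = len(alt_freqs)
    ref_freqs = [1.0 - p for p in alt_freqs]
    total = 0.0
    for allele_freqs in (alt_freqs, ref_freqs):
        mean_freq = sum(allele_freqs) / k
        if mean_freq > 0:
            total -= mean_freq * math.log(mean_freq)
        for p in allele_freqs:
            if p > 0:  # skip zero frequencies: 0 * ln(0) is treated as 0
                total += (p / k) * math.log(p)
    return total


def rank_snps(freq_table, top_n):
    """Return the top_n SNP ids ordered by decreasing I_n."""
    ranked = sorted(
        freq_table,
        key=lambda snp: informativeness_for_assignment(freq_table[snp]),
        reverse=True,
    )
    return ranked[:top_n]


# Hypothetical alternate-allele frequencies in three breeds.
freq_table = {
    "snp_a": [0.5, 0.5, 0.5],  # identical everywhere: uninformative
    "snp_b": [1.0, 0.0, 0.0],  # fixed in one breed: highly informative
    "snp_c": [0.6, 0.4, 0.5],  # mild differences: weakly informative
}
top = rank_snps(freq_table, top_n=2)
```

A ranking of this kind yields a reduced marker panel that can then be passed to a classifier; how the study narrowed its candidates down to 238 SNPs is not described in the abstract.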
Source journal: Poultry Science (Agricultural and Forestry Sciences: Dairy and Animal Science)
CiteScore: 7.60
Self-citation rate: 15.90%
Articles published: 0
Review time: 94 days
Journal description: First self-published in 1921, Poultry Science is an internationally renowned monthly journal, known as the authoritative source for a broad range of poultry information and high-caliber research. The journal plays a pivotal role in the dissemination of preeminent poultry-related knowledge across all disciplines. Since January 2020, Poultry Science has been an Open Access journal with no subscription charges, meaning authors who publish here can make their research immediately, permanently, and freely accessible worldwide while retaining copyright to their work. Papers submitted for publication after October 1, 2019 are published as Open Access papers. An international journal, Poultry Science publishes original papers, research notes, symposium papers, and reviews of basic science as applied to poultry. This authoritative source of poultry information is consistently ranked by ISI Impact Factor as one of the top 10 agriculture, dairy and animal science journals to deliver high-caliber research. Currently it is the highest-ranked (by Impact Factor and Eigenfactor) journal dedicated to publishing poultry research. Subject areas include breeding, genetics, education, production, management, environment, health, behavior, welfare, immunology, molecular biology, metabolism, nutrition, physiology, reproduction, processing, and products.