Characterization of microbiota signatures in Iberian pig strains using machine learning algorithms.

IF 4.9 Q1 MICROBIOLOGY
Lamiae Azouggagh, Noelia Ibáñez-Escriche, Marina Martínez-Álvaro, Luis Varona, Joaquim Casellas, Sara Negro, Cristina Casto-Rebollo
Animal microbiome 7(1):13. Published 2025-02-03. DOI: 10.1186/s42523-025-00378-z (https://doi.org/10.1186/s42523-025-00378-z). Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11789298/pdf/
Citations: 0

Abstract

Background: There is growing interest in uncovering the factors that shape microbiome composition because of its association with complex phenotypic traits in livestock. Host genetic variation is increasingly recognized as a major factor influencing the microbiome. The Iberian pig breed, known for its high-quality meat products, includes various strains with recognized genetic and phenotypic variability. However, despite the microbiome's known impact on pigs' productive phenotypes such as meat quality traits, comparative analyses of gut microbial composition across Iberian pig strains are lacking. This study explores the gut microbiota of two Iberian pig strains, Entrepelado (n = 74) and Retinto (n = 63), and their reciprocal crosses (n = 100), using machine learning (ML) models to identify key microbial taxa relevant for distinguishing their genetic backgrounds, an approach with potential applications in the pig industry. Nine ML algorithms, including tree-based, kernel-based, probabilistic, and linear algorithms, were used.

Results: Beta diversity analysis of 16S rRNA microbiome data revealed compositional divergence among genetic, age and batch groups. ML models exploring maternal, paternal and heterosis effects showed varying levels of classification performance; the paternal effect scenario performed best, achieving a mean area under the ROC curve (AUROC) of 0.74 with the CatBoost (CB) algorithm. However, the most genetically distant animals, the purebreds, were more easily discriminated by the ML models: classification of the two Iberian strains reached the highest mean AUROC, 0.83, with a Support Vector Machine (SVM) model. The most relevant genera for this classification were Acetitomaculum, Butyricicoccus and Limosilactobacillus, all of which exhibited a relevant differential abundance between purebred animals under a Bayesian linear model.

Conclusions: The study confirms variations in gut microbiota among Iberian pig strains and their crosses, influenced by genetic and non-genetic factors. ML models, particularly CB and Random Forest (RF), as well as SVM in certain scenarios, combined with a feature selection process, effectively classified genetic groups based on microbiome data and identified key microbial taxa. These taxa were linked to short-chain fatty acid production and lipid metabolism, suggesting that differences in microbial composition may contribute to variation in fat-related traits among Iberian genetic groups.
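
The Results above report classification performance as a mean AUROC. As a rough illustration of what that number measures, the sketch below computes AUROC from classifier scores through its rank interpretation (the probability that a randomly chosen positive sample outscores a randomly chosen negative one) and averages it over folds. The function names `auroc` and `mean_auroc`, the 0/1 coding of the strains, and the toy scores are all assumptions made for illustration; this is not the study's pipeline or data.

```python
from statistics import mean


def auroc(labels, scores):
    """AUROC as P(score of a random positive > score of a random negative).

    labels: 0/1 class labels (hypothetical coding, e.g. 1 = Entrepelado,
            0 = Retinto)
    scores: classifier scores, where higher means "more likely class 1"
    Tied positive/negative pairs count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute AUROC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auroc(folds):
    """Average AUROC over (labels, scores) pairs, one pair per test fold."""
    return mean(auroc(labels, scores) for labels, scores in folds)


# Toy scores for two hypothetical test folds (made-up numbers).
toy_folds = [
    ([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]),
    ([1, 0, 1, 0], [0.8, 0.3, 0.7, 0.2]),
]
print(mean_auroc(toy_folds))
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported 0.74 and 0.83 should be read.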

Source journal metrics: CiteScore 7.20; self-citation rate 0.00%; articles published 0; review time 13 weeks.