NGSMHC: A simple bioinformatics tool for comprehensively typing MHC genes in non-human species using next-generation sequencing data.

IF 2.5 · Zone 2 (Agricultural and Forestry Sciences) · Q1 AGRICULTURE, DAIRY & ANIMAL SCIENCE
Mingue Kang, Byeongyong Ahn, Jae Yeol Shin, Jongan Lee, Eun Seok Cho, Chankyu Park
Animal Bioscience. Publication date: 2025-09-30. Publication type: Journal Article. DOI: 10.5713/ab.25.0468 (https://doi.org/10.5713/ab.25.0468)
Citations: 0

NGSMHC: A simple bioinformatics tool for comprehensively typing MHC genes in non-human species using next-generation sequencing data.

Objective: Understanding the individual- and population-level polymorphisms of major histocompatibility complex (MHC) genes is crucial for identifying associations between MHC variations and immune phenotypes. To support this, we developed NGSMHC, a streamlined bioinformatics tool for efficient and accurate MHC genotyping using next-generation sequencing (NGS) data in non-human species.

Methods: NGSMHC constructs phased haplotype contigs of selected MHC genes from BAM-format mapping data and determines the best matching MHC alleles and genotypes via nucleotide BLAST analysis against a user-provided reference set of MHC alleles. We evaluated NGSMHC using short-read whole-genome sequencing (WGS) data from 12 pigs, focusing on swine leukocyte antigen (SLA) genes. The typing results from NGSMHC were compared to those obtained using polymerase chain reaction sequence-based typing (PCR-SBT). In addition, we tested NGSMHC on a publicly available long-read WGS dataset with known SLA genotypes.
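
The allele-matching step described in the Methods can be illustrated with a minimal sketch. It is not NGSMHC's actual code: it only assumes that each phased haplotype contig has been searched with nucleotide BLAST against the reference allele set and that the hits are available in the standard 12-column tabular format (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The contig IDs, allele names, scores, and the tie-breaking rule (bit score first, then percent identity) are all hypothetical.

```python
import csv
import io
from typing import Dict, Iterable, NamedTuple, Tuple


class Hit(NamedTuple):
    allele: str      # sseqid: name of the reference allele
    pident: float    # percent identity of the alignment
    bitscore: float  # BLAST bit score


def best_hits(blast_tsv: str) -> Dict[str, Hit]:
    """Return the top-scoring reference allele for each query contig.

    Assumption: ranking by (bitscore, pident) is an illustrative choice,
    not necessarily the criterion used by NGSMHC.
    """
    best: Dict[str, Hit] = {}
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query = row[0]
        hit = Hit(allele=row[1], pident=float(row[2]), bitscore=float(row[11]))
        current = best.get(query)
        if current is None or (hit.bitscore, hit.pident) > (current.bitscore, current.pident):
            best[query] = hit
    return best


def genotype(best: Dict[str, Hit], hap_ids: Iterable[str]) -> Tuple[str, ...]:
    """Combine the best allele of each haplotype contig into an unordered genotype."""
    return tuple(sorted(best[h].allele for h in hap_ids))


# Hypothetical hits for the two phased contigs of one gene in one animal.
ROWS = [
    ["SLA-3_hap1", "allele_A", "100.0", "900", "0", "0", "1", "900", "1", "900", "0.0", "1663"],
    ["SLA-3_hap1", "allele_B", "99.1", "900", "8", "0", "1", "900", "1", "900", "0.0", "1620"],
    ["SLA-3_hap2", "allele_A", "99.1", "900", "8", "0", "1", "900", "1", "900", "0.0", "1620"],
    ["SLA-3_hap2", "allele_B", "100.0", "900", "0", "0", "1", "900", "1", "900", "0.0", "1663"],
]
EXAMPLE = "\n".join("\t".join(r) for r in ROWS)

if __name__ == "__main__":
    hits = best_hits(EXAMPLE)
    print(genotype(hits, ["SLA-3_hap1", "SLA-3_hap2"]))
```

A real pipeline would likely also filter hits by alignment length or coverage of the typed region before ranking them; that filtering is omitted here because the abstract does not describe it.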

Results: The short-read WGS data showed an average read depth of 20.9× across the SLA region, enabling typing of SLA-2, SLA-3, SLA-DRB1, and SLA-DQB1 using NGSMHC. The concordance rates between NGSMHC and PCR-SBT were 88% for SLA-3, 92% for SLA-DRB1, and 100% for SLA-DQB1. However, SLA-2 typing showed lower concordance (58%), likely due to its high sequence similarity with other SLA class I genes and complex intra-locus polymorphisms. In contrast, NGSMHC accurately identified all tested SLA genotypes, including SLA-1, SLA-2, SLA-3, SLA-DRA, SLA-DRB1, SLA-DQA, and SLA-DQB1, when applied to the long-read WGS data.
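
A concordance rate of this kind can be computed as sketched below. The abstract does not say whether concordance was scored per animal or per allele, so the per-animal, unordered-genotype comparison used here is an assumption, and the sample IDs and allele names are invented for illustration.

```python
from typing import Dict, Tuple

Genotype = Tuple[str, str]  # two alleles per animal; order carries no meaning


def concordance(calls_a: Dict[str, Genotype], calls_b: Dict[str, Genotype]) -> float:
    """Fraction of shared samples whose genotypes agree between two typing methods.

    Assumption: a sample counts as concordant only if both alleles match,
    regardless of the order in which they are listed.
    """
    shared = calls_a.keys() & calls_b.keys()
    if not shared:
        raise ValueError("no samples in common between the two call sets")
    matches = sum(1 for s in shared if sorted(calls_a[s]) == sorted(calls_b[s]))
    return matches / len(shared)


# Hypothetical calls for four animals at one locus.
ngs_calls = {
    "pig01": ("allele_A", "allele_B"),
    "pig02": ("allele_B", "allele_B"),
    "pig03": ("allele_C", "allele_A"),
    "pig04": ("allele_A", "allele_A"),
}
sbt_calls = {
    "pig01": ("allele_B", "allele_A"),  # same genotype, alleles listed in reverse
    "pig02": ("allele_B", "allele_B"),
    "pig03": ("allele_A", "allele_C"),
    "pig04": ("allele_A", "allele_D"),  # one allele differs: discordant
}

if __name__ == "__main__":
    print(f"{concordance(ngs_calls, sbt_calls):.0%}")  # prints 75%
```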

Conclusion: NGSMHC is a simple and effective tool for MHC genotyping using NGS data, particularly for non-human species. Its accuracy is significantly improved by long-read sequencing, underscoring the importance of read length in precise MHC allele determination.

Source journal: Animal Bioscience (AGRICULTURE, DAIRY & ANIMAL SCIENCE)
CiteScore: 5.00
Self-citation rate: 0.00%
Articles published: 223
Review time: 3 months