Template-specific optimization of NGS genotyping pipelines reveals allele-specific variation in MHC gene expression

IF 5.5 | Zone 1 (Biology) | JCR Q1: BIOCHEMISTRY & MOLECULAR BIOLOGY
Artemis Efstratiou, Arnaud Gaigher, Sven Künzel, Ana Teles, Tobias L. Lenz
Molecular Ecology Resources 24(4). Published 2024-02-08. DOI: 10.1111/1755-0998.13935
https://onlinelibrary.wiley.com/doi/10.1111/1755-0998.13935
Citations: 0

Abstract

Using high-throughput sequencing for precise genotyping of multi-locus gene families, such as the major histocompatibility complex (MHC), remains challenging, due to the complexity of the data and difficulties in distinguishing genuine from erroneous variants. Several dedicated genotyping pipelines for data from high-throughput sequencing, such as next-generation sequencing (NGS), have been developed to tackle the ensuing risk of artificially inflated diversity. Here, we thoroughly assess three such multi-locus genotyping pipelines for NGS data, the DOC method, AmpliSAS and ACACIA, using MHC class IIβ data sets of three-spined stickleback gDNA, cDNA and “artificial” plasmid samples with known allelic diversity. We show that genotyping of gDNA and plasmid samples at optimal pipeline parameters was highly accurate and reproducible across methods. However, for cDNA data, the gDNA-optimal parameter configuration yielded decreased overall genotyping precision and consistency between pipelines. Further adjustments of key clustering parameters were required to account for higher error rates and larger variation in sequencing depth per allele, highlighting the importance of template-specific pipeline optimization for reliable genotyping of multi-locus gene families. Through accurate paired gDNA-cDNA typing and MHC-II haplotype inference, we show that MHC-II allele-specific expression levels correlate negatively with allele number across haplotypes. Lastly, sibship-assisted cDNA-typing of MHC-I revealed novel variants linked in haplotype blocks, and a higher-than-previously-reported individual MHC-I allelic diversity. In conclusion, we provide novel genotyping protocols for the three-spined stickleback MHC-I and -II genes, and evaluate the performance of popular NGS-genotyping pipelines. We also show that fine-tuned genotyping of paired gDNA-cDNA samples facilitates amplification bias-corrected MHC allele expression analysis.
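
The abstract does not state how the amplification bias correction is computed. As a rough illustration only, one plausible reading is to divide each allele's share of cDNA reads by its share of gDNA reads from the same individual, so that alleles that simply amplify better do not appear more highly expressed. The function name, the allele labels and the read counts below are hypothetical, and the ratio itself is an assumption rather than the authors' published formula.

```python
def bias_corrected_expression(gdna_reads, cdna_reads):
    """Per-allele cDNA read share divided by gDNA read share.

    Hypothetical sketch: the gDNA share serves as a baseline for
    amplification efficiency; this is not the paper's stated method.
    """
    gdna_total = sum(gdna_reads.values())
    cdna_total = sum(cdna_reads.values())
    if gdna_total == 0 or cdna_total == 0:
        raise ValueError("both samples need at least one read")

    ratios = {}
    for allele, g_count in gdna_reads.items():
        if g_count == 0:
            continue  # no gDNA baseline for this allele, so skip it
        g_share = g_count / gdna_total
        c_share = cdna_reads.get(allele, 0) / cdna_total
        ratios[allele] = c_share / g_share
    return ratios


# Made-up read counts for one individual (illustrative only).
gdna = {"allele_A": 500, "allele_B": 300, "allele_C": 200}
cdna = {"allele_A": 250, "allele_B": 600, "allele_C": 150}
result = bias_corrected_expression(gdna, cdna)
for allele, ratio in sorted(result.items()):
    print(allele, round(ratio, 2))
```

A ratio above 1 would suggest an allele is over-represented in cDNA relative to its gDNA baseline, and a ratio below 1 the opposite. Alleles absent from the gDNA genotype are skipped here because they have no baseline to normalize against.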

Source journal: Molecular Ecology Resources (Biology: Evolutionary Biology)
CiteScore: 15.60
Self-citation rate: 5.20%
Articles published: 170
Review time: 3 months
About the journal: Molecular Ecology Resources promotes the creation of comprehensive resources for the scientific community, encompassing computer programs, statistical and molecular advancements, and a diverse array of molecular tools. Serving as a conduit for disseminating these resources, the journal targets a broad audience of researchers in the fields of evolution, ecology, and conservation. Articles in Molecular Ecology Resources are crafted to support investigations tackling significant questions within these disciplines. In addition to original resource articles, Molecular Ecology Resources features Reviews, Opinions, and Comments relevant to the field. The journal also periodically releases Special Issues focusing on resource development within specific areas.