mtDNA "nomenclutter" and its consequences on the interpretation of genetic data.

Vladimir Bajić, Vanessa Hava Schulmann, Katja Nowick
BMC Ecology and Evolution 24(1):110, published 2024-08-19. Journal impact factor 2.3 (JCR Q2, Ecology).
DOI: https://doi.org/10.1186/s12862-024-02288-1
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11331612/pdf/

Abstract

Population-based studies of human mitochondrial genetic diversity often require classifying mitochondrial DNA (mtDNA) haplotypes into more than 5400 described haplogroups, and then grouping those into hierarchically higher haplogroups. Such secondary haplogroup groupings (e.g., "macro-haplogroups") vary across studies, as they depend on sample quality, technical factors of haplogroup calling, the aims of the study, and the researchers' understanding of the mtDNA haplogroup nomenclature. Retention of historical nomenclature, coupled with a growing number of newly described mtDNA lineages, results in an increasingly complex and inconsistent nomenclature that does not reflect phylogeny well. This "clutter" leaves room for grouping errors and inconsistencies across scientific publications, especially when haplogroup names are used as a proxy for secondary groupings, and is a source of scientific misinterpretation. Here we explore how phylogenetically insensitive secondary mtDNA haplogroup groupings, and the lack of standardized secondary haplogroup groupings, affect downstream analyses and the interpretation of genetic data. We demonstrate that frequency-based analyses produce inconsistent results when different secondary mtDNA groupings are applied, and thus allow for vastly different interpretations of the same genetic data. The lack of guidelines and recommendations on how to choose appropriate secondary haplogroup groupings is a problem for the interpretation of results, as well as for their comparison and reproducibility across studies. To reduce biases originating from arbitrarily defined, nomenclature-based secondary groupings, we suggest that future updates of mtDNA phylogenies intended for use in mtDNA haplogroup nomenclature should also provide well-defined and standardized sets of phylogenetically meaningful, algorithm-based secondary haplogroup groupings such as "macro-haplogroups", "meso-haplogroups", and "micro-haplogroups".
Ideally, each secondary haplogroup grouping level should be informative about different events in human population history. Those phylogenetically informative levels of haplogroup groupings can be easily defined using TreeCluster, and then implemented in haplogroup callers such as HaploGrep3. This would foster reproducibility across studies, provide a grouping standard for population-based studies, and reduce errors associated with haplogroup nomenclatures in future studies.
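The claim that frequency-based results depend on the chosen secondary grouping can be illustrated with a small sketch. The Python example below is not the authors' analysis: the two populations and their haplogroup calls are invented for illustration, and the function names (`by_first_letter`, `by_clade`, `frequencies`) are hypothetical. It compares a name-based grouping (leading letter of the haplogroup name) with a simplified phylogeny-aware grouping that assumes only two well-known relationships: haplogroup K is nested within U, and H and V are both nested within HV.

```python
from collections import Counter

# Invented haplogroup calls for two made-up populations (illustration only).
samples = {
    "pop_A": ["U5a1", "U8a", "K1a", "K1a", "K2b", "H1", "H1", "H3", "V1", "HV0"],
    "pop_B": ["U5a1", "U5b", "U8a", "U2e", "K1a", "H1", "H1", "H3", "H5", "V1"],
}


def by_first_letter(haplogroup):
    """Name-based grouping: treat the leading letter as the macro-haplogroup."""
    return haplogroup[0]


def by_clade(haplogroup):
    """Simplified phylogeny-aware grouping (an assumption of this sketch).

    K is placed in U because it is nested within U; H, V, and HV0 are all
    placed in HV because they descend from it.
    """
    if haplogroup[0] in ("U", "K"):
        return "U"
    if haplogroup[0] in ("H", "V"):
        return "HV"
    return "other"


def frequencies(calls, grouping):
    """Return the relative frequency of each group among the calls."""
    counts = Counter(grouping(h) for h in calls)
    total = len(calls)
    return {group: n / total for group, n in sorted(counts.items())}


for scheme in (by_first_letter, by_clade):
    print(scheme.__name__)
    for population, calls in samples.items():
        print(" ", population, frequencies(calls, scheme))
```

With these invented calls, the name-based scheme reports U at 20% in one population and 40% in the other, and it files HV0 under "H" simply because of its first letter. The phylogeny-aware scheme reports identical U and HV frequencies for both populations. The same data therefore support two different stories depending only on the grouping rule.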
