Large-scale selection of highly informative microhaplotypes for ancestry inference and population specific informativeness

IF 3.2, Zone 2 (Medicine), Q2 GENETICS & HEREDITY
Maria Luisa de Barros Rodrigues, Marcelo Porto Rodrigues, Heather L. Norton, Celso Teixeira Mendes-Junior, Aguinaldo Luiz Simões, Daniel John Lawson
Forensic Science International: Genetics, volume 74, Article 103153. Published 2024-10-05. DOI: 10.1016/j.fsigen.2024.103153
https://www.sciencedirect.com/science/article/pii/S1872497324001492

Abstract

Microhaplotypes (MHs) are sets of physically close genetic markers that are inherited together, and they are gaining prominence because of their efficiency in forensic, clinical, and population studies. They excel in kinship analysis, DNA mixture detection, and ancestry inference, offering advantages in precision over individual SNPs and STRs. In this study, a pipeline was developed to efficiently select highly informative MHs from large-scale genomic datasets. Over 120,000 MHs were identified from almost a million markers, allowing the non-independent information carried by linked markers to be used efficiently for inference. The MHs were compared to SNPs in terms of their informativeness and the performance of their subsets in ancestry inference, and all results consistently favored MHs. A method for ranking markers by population-specific informativeness was also introduced; it improved the accuracy of Native American ancestry estimation, overcoming the challenge posed by the underrepresentation of Native American populations in datasets. In conclusion, this study presents a comprehensive approach to selecting highly informative MHs for accurate ancestry inference. The proposed approach and the subsets selected by population-specific informativeness offer valuable tools for improving ancestry inference accuracy, particularly for admixed populations, as demonstrated for a Brazilian dataset.
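
The abstract does not state which statistic the pipeline uses to score markers. As a minimal sketch of what ranking by informativeness can look like, the Python below uses Rosenberg's informativeness for assignment (I_n), a widely used measure that is assumed here rather than taken from the paper. The function names, the one-vs-rest variant in `population_specific`, and the toy frequencies are all hypothetical and serve only as illustration.

```python
import math


def informativeness(freqs):
    """Rosenberg-style I_n for one marker (assumed statistic, see lead-in).

    freqs: one list per population, each holding the frequencies of the
    marker's alleles (or MH haplotypes) in that population, summing to 1.
    """
    k = len(freqs)
    n_alleles = len(freqs[0])
    total = 0.0
    for j in range(n_alleles):
        # Mean frequency of allele j across the K populations.
        p_bar = sum(pop[j] for pop in freqs) / k
        if p_bar > 0:
            total -= p_bar * math.log(p_bar)
        for pop in freqs:
            if pop[j] > 0:
                total += pop[j] * math.log(pop[j]) / k
    return total


def population_specific(freqs, target):
    """Hypothetical one-vs-rest score: I_n between the target population
    and the unweighted mean of all other populations."""
    rest = [pop for i, pop in enumerate(freqs) if i != target]
    rest_mean = [
        sum(pop[j] for pop in rest) / len(rest) for j in range(len(freqs[0]))
    ]
    return informativeness([freqs[target], rest_mean])


# Toy, made-up frequencies for three populations (not data from the study).
markers = {
    "snp_A": [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]],
    "mh_B": [[0.7, 0.1, 0.1, 0.1], [0.1, 0.7, 0.1, 0.1], [0.1, 0.1, 0.7, 0.1]],
}

# Rank markers from most to least informative overall.
ranked = sorted(markers, key=lambda m: informativeness(markers[m]), reverse=True)

# Rank markers by how well they separate population 0 from the rest.
ranked_for_pop0 = sorted(
    markers, key=lambda m: population_specific(markers[m], 0), reverse=True
)
```

Under this measure a marker scores 0 when every population has the same frequencies and reaches log K when each of the K populations is fixed for a different allele. A multi-allelic MH can therefore score above the log 2 ceiling of a biallelic SNP.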