Uncertainty in the number of contributor estimation methods applied to a Y-STR profile

IF 3.2 | Zone 2, Medicine | JCR Q2, Genetics & Heredity
Journal: Forensic Science International-Genetics
Published: 2024-09-10
DOI: 10.1016/j.fsigen.2024.103145
Full text: https://www.sciencedirect.com/science/article/pii/S1872497324001418
Open access: no
Source platform: Semanticscholar
Citations: 0

Abstract

Maximum allele count (MAC) and total allele count (TAC) methods are widely used in forensic laboratories to estimate the number of contributors (NoC) to autosomal short tandem repeat (STR) profiles. In this study, we applied these NoC estimation methods to mixed Y-STR profiles and evaluated their uncertainty and performance. Because recent Y-STR typing kits include both single- and multi-copy loci, we defined “MAC-single” for use across single-copy loci only and “MAC-multi” for use across multi-copy loci only. We generated, in silico, a dataset of 120,000 Y-STR profiles for one- to six-person mixtures, based on previously reported haplotype frequencies of the 27 Y-STR loci in Yfiler Plus for the U.S. population (reported by NIST) and the Henan Han population. The dataset was randomly split into a training set and a test set. The training set was used to construct a TAC distribution (TAC curve), whereas the test set was used to calculate the performance metrics (accuracy, precision, recall, and F1-score). In addition, we evaluated how the upper limit of NoC considered for estimation affects overall accuracy. With the upper limit of NoC set to six, the overall accuracies of the MAC-single, MAC-multi, and TAC methods were 0.7920, 0.4329, and 0.7877 for the U.S. population and 0.8207, 0.4609, and 0.8385 for the Henan Han population. Our results suggest that the MAC-single and TAC methods can estimate the NoC for mixed Y-STR profiles with high levels of accuracy.
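
The counting rules named in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the profile format (a mapping from locus name to the set of observed alleles), the `MULTI_COPY` locus set, the copy number of two, the maximum-relative-frequency rule for reading the TAC curve, and all function names are assumptions introduced here.

```python
import math
from collections import Counter, defaultdict

# Hypothetical profile format: {locus name: set of observed alleles}.
# Assumed two-copy loci; adjust to the kit actually used.
MULTI_COPY = frozenset({"DYS385", "DYF387S1"})


def mac_single(profile, multi_copy=MULTI_COPY):
    """Largest allele count over single-copy loci (one allele per contributor assumed)."""
    counts = [len(a) for locus, a in profile.items() if locus not in multi_copy]
    return max(counts, default=0)


def mac_multi(profile, multi_copy=MULTI_COPY, copies=2):
    """Largest allele count over multi-copy loci, divided by the assumed copy number."""
    counts = [len(a) for locus, a in profile.items() if locus in multi_copy]
    return math.ceil(max(counts, default=0) / copies)


def total_allele_count(profile):
    """Sum of distinct alleles observed across all loci."""
    return sum(len(a) for a in profile.values())


def fit_tac_curve(training):
    """Build {NoC: Counter of TAC values} from (profile, true NoC) pairs."""
    curve = defaultdict(Counter)
    for profile, noc in training:
        curve[noc][total_allele_count(profile)] += 1
    return curve


def tac_estimate(profile, curve, max_noc=6):
    """Pick the NoC whose training profiles most often showed this TAC (assumed rule)."""
    tac = total_allele_count(profile)

    def rel_freq(noc):
        seen = curve.get(noc, Counter())
        total = sum(seen.values())
        return seen[tac] / total if total else 0.0

    # Ties (including the all-zero case) resolve to the smallest NoC.
    return max(range(1, max_noc + 1), key=rel_freq)


def accuracy(true, pred):
    """Fraction of profiles whose estimated NoC equals the true NoC."""
    return sum(t == p for t, p in zip(true, pred)) / len(true)


def class_metrics(true, pred, label):
    """Precision, recall, and F1-score for one NoC class."""
    pairs = list(zip(true, pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

In this sketch, `max_noc` plays the role of the upper limit of NoC considered for estimation. Lowering it narrows the set of candidate answers for the TAC rule, which is one way the upper limit could influence overall accuracy.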

Source journal
CiteScore: 7.50
Self-citation rate: 32.30%
Articles published: 132
Review time: 11.3 weeks
About the journal: Forensic Science International: Genetics is the premier journal in the field of Forensic Genetics. This branch of Forensic Science can be defined as the application of genetics to human and non-human material (in the sense of a science with the purpose of studying inherited characteristics for the analysis of inter- and intra-specific variations in populations) for the resolution of legal conflicts. The scope of the journal includes: Forensic applications of human polymorphism. Testing of paternity and other family relationships, immigration cases, typing of biological stains and tissues from criminal casework, identification of human remains by DNA testing methodologies. Description of human polymorphisms of forensic interest, with special interest in DNA polymorphisms. Autosomal DNA polymorphisms, mini- and microsatellites (or short tandem repeats, STRs), single nucleotide polymorphisms (SNPs), X and Y chromosome polymorphisms, mtDNA polymorphisms, and any other type of DNA variation with potential forensic applications. Non-human DNA polymorphisms for crime scene investigation. Population genetics of human polymorphisms of forensic interest. Population data, especially from DNA polymorphisms of interest for the solution of forensic problems. DNA typing methodologies and strategies. Biostatistical methods in forensic genetics. Evaluation of DNA evidence in forensic problems (such as paternity or immigration cases, criminal casework, identification), classical and new statistical approaches. Standards in forensic genetics. Recommendations of regulatory bodies concerning methods, markers, interpretation or strategies or proposals for procedural or technical standards. Quality control. Quality control and quality assurance strategies, proficiency testing for DNA typing methodologies. Criminal DNA databases. Technical, legal and statistical issues. General ethical and legal issues related to forensic genetics.