Enhanced SNP genotyping with symmetric multinomial logistic regression

Impact factor 3.2; Zone 2 (Medicine); JCR Q2 (Genetics & Heredity)
Malte B. Nielsen, Poul S. Eriksen, Helle S. Mogensen, Niels Morling, Mikkel M. Andersen
Forensic Science International: Genetics, volume 78, Article 103291. Published 2025-05-17. DOI: 10.1016/j.fsigen.2025.103291. https://www.sciencedirect.com/science/article/pii/S1872497325000717

Abstract

In genotyping, determining single nucleotide polymorphisms (SNPs) is standard practice, but it becomes difficult when analysing small quantities of input DNA, as is often required in forensic applications. Existing SNP genotyping methods, such as the HID SNP Genotyper Plugin (HSG) from Thermo Fisher Scientific, perform well with adequate DNA input levels but often produce erroneously called genotypes when DNA quantities are low. To mitigate these errors, genotype quality can be checked with the HSG. However, enforcing the HSG’s quality checks decreases the call rate by introducing more no-calls, and it does not eliminate all wrong calls. This study presents and validates a symmetric multinomial logistic regression (SMLR) model designed to enhance genotyping accuracy and call rate with small amounts of DNA. Comprehensive bootstrap and cross-validation analyses across a wide range of DNA quantities demonstrate the robustness and efficiency of the SMLR model in maintaining high call rates without compromising accuracy compared to the HSG. For DNA amounts as low as 31.25 pg, the SMLR method reduced the rate of no-calls by 50.0% relative to the HSG while maintaining the same rate of wrong calls, resulting in a call rate of 96.0%. Similarly, SMLR reduced the rate of wrong calls by 55.6% while maintaining the same call rate, achieving an accuracy of 99.775%. The no-call and wrong-call rates were significantly reduced at 62.5–250 pg DNA. The results highlight the SMLR model’s utility in optimising SNP genotyping at suboptimal DNA concentrations, making it a valuable tool for forensic applications where sample quantity and quality may be decreased. This work reinforces the feasibility of statistical approaches in forensic genotyping and provides a framework for implementing the SMLR method in practical forensic settings. The SMLR model applies to genotyping biallelic data with a signal (e.g. reads, counts, or intensity) for each allele. 
The model can also improve the allele balance quality check.
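
The abstract does not give the model's functional form, so the following is only a minimal sketch of how a symmetric three-class logistic genotype caller with a no-call rule could look for biallelic data with one signal per allele. Everything specific here is an assumption made for illustration: the log-ratio feature with a pseudocount of 1, the heterozygote as reference class, the parameter values `intercept` and `slope`, and the probability threshold `min_prob`. It is not the authors' fitted model.

```python
import math

GENOTYPES = ("A1/A1", "A1/A2", "A2/A2")


def genotype_probabilities(signal_1, signal_2, intercept, slope):
    """Return P(A1/A1), P(A1/A2), P(A2/A2) from two allele signals."""
    # Log-ratio of the allele signals; the +1 pseudocount (an assumption)
    # keeps the logarithm finite when one allele has zero signal.
    x = math.log(signal_1 + 1.0) - math.log(signal_2 + 1.0)
    # The heterozygote is the reference class (linear predictor 0).
    # Both homozygote predictors share the same two parameters and differ
    # only in the sign of x, so swapping the alleles swaps their roles.
    etas = (intercept + slope * x, 0.0, intercept - slope * x)
    top = max(etas)  # subtract the maximum for a numerically stable softmax
    weights = [math.exp(eta - top) for eta in etas]
    total = sum(weights)
    return tuple(w / total for w in weights)


def call_genotype(signal_1, signal_2, intercept=-6.0, slope=2.5, min_prob=0.99):
    """Call the most probable genotype, or 'no-call' below the threshold."""
    probs = genotype_probabilities(signal_1, signal_2, intercept, slope)
    best = max(range(3), key=probs.__getitem__)
    if probs[best] < min_prob:
        return "no-call", probs
    return GENOTYPES[best], probs


if __name__ == "__main__":
    # Hypothetical signal pairs, e.g. read counts for allele 1 and allele 2.
    for s1, s2 in [(500, 2), (250, 240), (30, 4), (2, 500)]:
        call, probs = call_genotype(s1, s2)
        print(s1, s2, call, [round(p, 4) for p in probs])
```

In a sketch like this, raising `min_prob` trades call rate for accuracy: more ambiguous signal pairs become no-calls and fewer wrong calls slip through, which is the trade-off the abstract reports at two operating points (same wrong-call rate, or same call rate, as the HSG).
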
Source journal metrics: CiteScore 7.50; self-citation rate 32.30%; articles published 132; review time 11.3 weeks
About the journal: Forensic Science International: Genetics is the premier journal in the field of Forensic Genetics. This branch of Forensic Science can be defined as the application of genetics to human and non-human material (in the sense of a science with the purpose of studying inherited characteristics for the analysis of inter- and intra-specific variations in populations) for the resolution of legal conflicts. The scope of the journal includes: Forensic applications of human polymorphism. Testing of paternity and other family relationships, immigration cases, typing of biological stains and tissues from criminal casework, identification of human remains by DNA testing methodologies. Description of human polymorphisms of forensic interest, with special interest in DNA polymorphisms. Autosomal DNA polymorphisms, mini- and microsatellites (or short tandem repeats, STRs), single nucleotide polymorphisms (SNPs), X and Y chromosome polymorphisms, mtDNA polymorphisms, and any other type of DNA variation with potential forensic applications. Non-human DNA polymorphisms for crime scene investigation. Population genetics of human polymorphisms of forensic interest. Population data, especially from DNA polymorphisms of interest for the solution of forensic problems. DNA typing methodologies and strategies. Biostatistical methods in forensic genetics. Evaluation of DNA evidence in forensic problems (such as paternity or immigration cases, criminal casework, identification), classical and new statistical approaches. Standards in forensic genetics. Recommendations of regulatory bodies concerning methods, markers, interpretation or strategies or proposals for procedural or technical standards. Quality control. Quality control and quality assurance strategies, proficiency testing for DNA typing methodologies. Criminal DNA databases. Technical, legal and statistical issues. General ethical and legal issues related to forensic genetics.