Calculation and implementation of sample-wide stochastic thresholds for forensic genetic analysis of STRs and SNPs for massively parallel sequencing platforms

Kathryn Stephens, June Snedecor, Bruce Budowle
Forensic Science International: Genetics Supplement Series, volume 8, pages 88-90 (published 2022-12-01)
DOI: 10.1016/j.fsigss.2022.09.032
Impact factor: 0.5 (JCR Q4, Genetics & Heredity)
Citations: 0

Abstract

Capillary electrophoresis (CE) analysis of short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) uses a stochastic threshold to account for the possibility of missing alleles (dropouts) or detecting additional alleles (drop-ins). In CE, this threshold may be approximately 200 RFU (relative fluorescence units), and peak heights are assessed relative to it. In next generation sequencing (NGS), also known as massively parallel sequencing (MPS), STRs are identified by their sequence, and specific alleles are identified by their repeat number and intra-allelic variation. Abundance is approximated by the number of sequence reads for each allele. The total number of reads generated for each marker in a sample depends on factors such as the number of samples pooled for sequencing, the number of markers in the assay, the integrity and quantity of the input DNA sample, and the inter-locus balance of the assay. For multiplexes that contain both autosomal and sex-linked markers, the biological sex of the sample also influences total reads per locus. To normalize these variables and better establish a robust stochastic threshold, a sample-wide metric is proposed for estimating the possibility of dropouts or drop-ins based on the variance of the inter-locus balance of the markers across a sample. The intuition is that samples whose allele balance varies widely across the sample as a whole are more likely to have noisier data and therefore require more stringent read count thresholds. This method is robust to sequencing multiplexity, biological sex and manufacturing lot variation.
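
The abstract does not give the formula behind the sample-wide metric, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that inter-locus balance can be summarized as the coefficient of variation of per-locus read fractions, and that a read-count threshold is raised in proportion to that imbalance. The function names, the `base_reads` and `scale` parameters, and the example read counts are all hypothetical.

```python
from statistics import mean, pstdev


def inter_locus_cv(locus_reads):
    """Coefficient of variation of per-locus read fractions in one sample.

    Assumption: this stands in for "variance of the inter-locus balance";
    the abstract does not define the metric precisely.
    """
    total = sum(locus_reads.values())
    if total == 0:
        raise ValueError("sample has no reads")
    # Using fractions of the sample total makes the metric independent of
    # sequencing depth (for example, how many samples were pooled).
    fractions = [reads / total for reads in locus_reads.values()]
    return pstdev(fractions) / mean(fractions)


def sample_threshold(locus_reads, base_reads=30, scale=1.0):
    """Hypothetical rule: scale a base read-count threshold by imbalance.

    base_reads and scale are illustrative values, not taken from the paper.
    """
    return base_reads * (1.0 + scale * inter_locus_cv(locus_reads))


# Made-up read counts per locus for two samples.
balanced = {"D3S1358": 1000, "TH01": 1000, "vWA": 1000, "FGA": 1000}
noisy = {"D3S1358": 2500, "TH01": 300, "vWA": 900, "FGA": 300}

print(sample_threshold(balanced))
print(sample_threshold(noisy))
```

Under these assumptions, a perfectly balanced sample keeps the base threshold, while a sample with uneven read distribution across loci gets a stricter one. Because the metric is computed on read fractions, doubling every locus count leaves it unchanged, which is one simple way a threshold could be made insensitive to how many samples share a sequencing run.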

Source journal
Forensic Science International: Genetics Supplement Series (Medicine: Pathology and Forensic Medicine)
CiteScore: 0.40
Self-citation rate: 0.00%
Articles published: 122
Review time: 25 days
About the journal: The Forensic Science International: Genetics Supplement Series is a publication vehicle for the proceedings of scientific symposia, commissioned thematic issues, and selections of invited articles. It is one of a pair of publications on forensic genetics published by Elsevier on behalf of the International Society for Forensic Genetics.