The (in)dependence of single-cell data inferences on model constructs

IF 3.2 | Zone 2, Medicine | JCR Q2, Genetics & Heredity

Forensic Science International-Genetics | Pub Date: 2025-01-03 | DOI: 10.1016/j.fsigen.2024.103220

Catherine M. Grgicak, Klaas Slooten, Robert G. Cowell, Qhawe Bhembe, Desmond S. Lun

Forensic Science International-Genetics, volume 76, Article 103220. Full text: https://www.sciencedirect.com/science/article/pii/S1872497324002163

Citations: 0

The (in)dependence of single-cell data inferences on model constructs

Recent developments in single-cell analysis have revolutionized basic research and have garnered the attention of the forensic domain. Though single-cell analysis is not new to forensics, the ways in which these data can be generated and interpreted are. Modern interpretation strategies report likelihood ratios that rely on a model of the world that is a simplification of it. It is, therefore, plausible that different reasonable models will assign noticeably different weights of evidence (WoEs) to some of these data, resulting in inconsistent reports and protracted reviews of that evidence, potentially across years. With one goal of research being to identify and understand sources of inconsistencies during early stages, we undertake a study that evaluates WoE at the limit of one single-cell electropherogram (scEPG) across three architecturally distinct probabilistic models. The three are named EESCIt (Evidentiary Evaluation of Single Cells), TD (Top-Down), and DCM (Discrete Cell Model). To do this, we performance-test the three models on a set of 996 individual scEPGs and conduct one H₁-true (i.e., true contributor) test and 201 H₂-true (i.e., false contributor) tests per scEPG. With the 201,192 outcomes per model, we confirm that scEPGs resolve the hypotheses well, regardless of which model is applied. We also observe that WoEs increase, on average, by 1 for every 1000 RFU of total intensity added until a plateau near the logarithm of the inverse of the random match probability is reached at ca. 22,000 RFU. By querying WoE calibration for each model, we determine whether the evidence is over- or under-stated for any one of them. We find that for WoE ≥ -1 hardly any calibration discrepancy is observed. There were rare instances, however, in which WoEs ≤ -1 pointed too strongly in the negative direction even though H₁ was true. This was the result of five scEPGs that not only exhibited extreme signal in stutter positions, but also carried little information in other loci. These findings show that all three models appropriately stated WoEs for scEPGs when reporting positive WoE, and that the two continuous models' WoEs reasonably represented the findings when WoE < -1 for most loci. To explore further, we continued with paired analyses that evaluated the agreement in WoE, per scEPG, across models. Unlike unpaired analyses, this evaluation determines whether well-performing models return equivalent results for the same scEPG. The paired analysis was summarized by way of intraclass correlations, which were at least 0.99997. Further, we found that 762 of 996 WoEs were within a range of 3 orders of magnitude of each other, though many of these were associated with WoEs that were large, i.e., > 9, in the first instance. When we focus more closely on scEPGs giving ranges ≥ 3, but whose WoE ≤ 9 for at least one of the models, we find there are 21 of them. When we perform a locus-by-locus investigation of these 21 and of the five scEPGs returning overly strong negative WoEs for true contributors, we find that extreme stutter is usually the cause of the challenges. To ameliorate differences in predicting rare, though impactful, events, we proffer interpretive adaptations that extend beyond manually addressing the phenomena. With the WoEs being calibrated within their relevant regions across EESCIt, TD and DCM, we categorize each as meeting the pillar of legitimacy for single-cell data within their intended WoE ranges.
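
The counting and the paired comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that WoE means log10 of the likelihood ratio (suggested by the phrase "orders of magnitude", but not stated outright), and the function names and the example WoE values are hypothetical, chosen only for illustration.

```python
import math

# Figures taken from the abstract.
N_SCEPGS = 996            # individual single-cell electropherograms
H1_TESTS_PER_SCEPG = 1    # true-contributor test
H2_TESTS_PER_SCEPG = 201  # false-contributor tests


def outcomes_per_model(n_scepgs=N_SCEPGS):
    """Total number of test outcomes produced by one model."""
    return n_scepgs * (H1_TESTS_PER_SCEPG + H2_TESTS_PER_SCEPG)


def woe(likelihood_ratio):
    """Weight of evidence, ASSUMED here to be log10 of the likelihood ratio."""
    return math.log10(likelihood_ratio)


def woe_range(woes_by_model):
    """Spread (max minus min) of the WoEs that the models assign to one scEPG."""
    values = list(woes_by_model.values())
    return max(values) - min(values)


def needs_closer_look(woes_by_model, range_cutoff=3.0, woe_cutoff=9.0):
    """Flag an scEPG whose models disagree by at least `range_cutoff`
    while at least one model reports a WoE at or below `woe_cutoff`."""
    disagrees = woe_range(woes_by_model) >= range_cutoff
    not_all_large = min(woes_by_model.values()) <= woe_cutoff
    return disagrees and not_all_large


if __name__ == "__main__":
    print(outcomes_per_model())  # 996 * (1 + 201)
    # Hypothetical WoEs for a single scEPG, one value per model.
    example = {"EESCIt": 7.5, "TD": 4.0, "DCM": 8.0}
    print(woe_range(example), needs_closer_look(example))
```

The two cutoffs mirror the thresholds quoted in the abstract (ranges ≥ 3, WoE ≤ 9 for at least one model). They are kept as parameters because the abstract does not say how the authors implemented the filter.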

Source journal

Forensic Science International-Genetics (Biology - Medicine: Legal)

CiteScore: 7.50
Self-citation rate: 32.30%
Articles published: 132
Review time: 11.3 weeks

About the journal: Forensic Science International: Genetics is the premier journal in the field of Forensic Genetics. This branch of Forensic Science can be defined as the application of genetics to human and non-human material (in the sense of a science with the purpose of studying inherited characteristics for the analysis of inter- and intra-specific variations in populations) for the resolution of legal conflicts.

The scope of the journal includes:
Forensic applications of human polymorphism. Testing of paternity and other family relationships, immigration cases, typing of biological stains and tissues from criminal casework, identification of human remains by DNA testing methodologies.
Description of human polymorphisms of forensic interest, with special interest in DNA polymorphisms. Autosomal DNA polymorphisms, mini- and microsatellites (or short tandem repeats, STRs), single nucleotide polymorphisms (SNPs), X and Y chromosome polymorphisms, mtDNA polymorphisms, and any other type of DNA variation with potential forensic applications.
Non-human DNA polymorphisms for crime scene investigation.
Population genetics of human polymorphisms of forensic interest. Population data, especially from DNA polymorphisms of interest for the solution of forensic problems.
DNA typing methodologies and strategies.
Biostatistical methods in forensic genetics. Evaluation of DNA evidence in forensic problems (such as paternity or immigration cases, criminal casework, identification), classical and new statistical approaches.
Standards in forensic genetics. Recommendations of regulatory bodies concerning methods, markers, interpretation or strategies or proposals for procedural or technical standards.
Quality control. Quality control and quality assurance strategies, proficiency testing for DNA typing methodologies.
Criminal DNA databases. Technical, legal and statistical issues.
General ethical and legal issues related to forensic genetics.