Amber C W Vandepoele, Natalie Novotna, Dan Myers, Michael A Marciano
{"title":"Characterizing stutter in single cells and the impact on multi-cell analysis.","authors":"Amber C W Vandepoele, Natalie Novotna, Dan Myers, Michael A Marciano","doi":"10.1016/j.fsigen.2024.103211","DOIUrl":"https://doi.org/10.1016/j.fsigen.2024.103211","url":null,"abstract":"<p><p>Short tandem repeat analysis is a robust and reliable DNA analysis technique that aids in source identification of a biological sample. However, the interpretation, particularly when DNA mixtures are present at low levels, can be complicated by the presence of PCR artifacts most commonly referred to as stutter. The presence of stutter products can increase the difficulty of interpretation in DNA mixtures as well as low-level DNA samples down to a single cell. Stutter product formation is stochastic in nature and although methods exist that can estimate the magnitude of stutter product formation, it still is not well understood. With the increased sensitivity of forensic DNA analyses, it has become possible to obtain interpretable DNA profiles from as low as 6.6 pg of DNA, or a single human diploid cell. However, this presents an interpretational challenge because the stutter in these low-level DNA samples might stray from the expected patterns observed in high-level DNA samples. Therefore, this project focuses on characterizing stutter in single cell samples to help generate a deeper understanding of stutter and provide a guide for detecting and evaluating stutter in low-level samples. Stutter analysis was performed using data generated from 180 single cells isolated with the DEPArray™ NxT, amplified using the PowerPlex Fusion 6C amplification kit at 29 or 30 cycles. Stutter was successfully characterized in single cells and stutter percentages were highly elevated compared to high-level samples where the variance increased as the number of cells being analyzed decreased leading to potential high stutter at low DNA levels. 
Using empirical and simulated (resampled) data, this study also reinforces historically relevant patterns in stutter product formation and demonstrates the relative differences in stutter in n-1, n-2 and n + 1 stutter product formation in simple, complex and compound repeats.</p>","PeriodicalId":94012,"journal":{"name":"Forensic science international. Genetics","volume":"76 ","pages":"103211"},"PeriodicalIF":0.0,"publicationDate":"2024-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142857352","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
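As a rough illustration of the "stutter percentage" referred to in the abstract above, the sketch below uses the conventional definition (stutter peak height divided by parent allele peak height, times 100) and summarizes it per stutter type. The function names and all peak heights are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from statistics import mean, pvariance
from typing import Dict, Iterable, Tuple


def stutter_percentage(stutter_height: float, parent_height: float) -> float:
    """Stutter peak height as a percentage of the parent allele peak height."""
    if parent_height <= 0:
        raise ValueError("parent_height must be positive")
    return 100.0 * stutter_height / parent_height


def summarize_by_type(
    observations: Iterable[Tuple[str, float, float]]
) -> Dict[str, Tuple[float, float]]:
    """Group (stutter_type, stutter_height, parent_height) rows by type.

    Returns {stutter_type: (mean percentage, population variance)}.
    """
    grouped = defaultdict(list)
    for stutter_type, stutter_height, parent_height in observations:
        grouped[stutter_type].append(
            stutter_percentage(stutter_height, parent_height)
        )
    return {k: (mean(v), pvariance(v)) for k, v in grouped.items()}


# Hypothetical peak heights (RFU), for illustration only.
example = [
    ("n-1", 100.0, 1000.0),
    ("n-1", 300.0, 1000.0),
    ("n+1", 20.0, 1000.0),
]
summary = summarize_by_type(example)
```

Reporting the variance alongside the mean reflects the abstract's point that the spread of stutter, not only its average, changes at low template levels.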
Séverine Nozownik, Tacha Hicks, Patrick Basset, Vincent Castella
{"title":"Searching national DNA databases with complex DNA profiles: An empirical study using probabilistic genotyping.","authors":"Séverine Nozownik, Tacha Hicks, Patrick Basset, Vincent Castella","doi":"10.1016/j.fsigen.2024.103208","DOIUrl":"https://doi.org/10.1016/j.fsigen.2024.103208","url":null,"abstract":"<p><p>In most National DNA databases (NDNADB), only single source DNA profiles, and sometimes two-person DNA mixtures, can be searched provided a minimum number of loci (or alleles) is available. DNA profiles that do not meet these criteria (about 14 % of the traces analyzed in Western Switzerland) can be compared locally with candidates upon request from police services, used for one-off search, or remain unused. With the advent of probabilistic genotyping (PG), such complex DNA profiles can be compared to those stored in NDNADB based on likelihood ratios (LRs). In this pilot study, traces of known contributors and casework DNA profiles were used to evaluate the performance of the DBLR™ \"Search database\" tool in conjunction with the Swiss NDNADB. First, 40 DNA mixtures (2-5 contributors) from 15 volunteers were prepared in the wet laboratory. They were deconvoluted with STRmix™ and compared to a database containing the DNA profiles of these 15 volunteers, along with 174,493 person DNA profiles from the Swiss NDNADB (ground-truth experiments). Using LR thresholds of 10<sup>3</sup> and 10<sup>6</sup>, sensitivity and specificity were respectively 90.0 %/57.1 % and 99.9 %/100.0 %. For the lower LR threshold, this resulted in 52 adventitious associations out of more than 24 million pairwise comparisons. Second, 160 DNA mixture profiles from casework (2-4 contributors) that had previously been locally compared were searched with DBLR™ using the same conditions as for phase 1. 
With the 10<sup>3</sup> LR threshold, 380 associations were retrieved: 194 of these corresponded to expected associations, as they were previously made through the local comparisons with known persons, and 186 were new. With the 10<sup>6</sup> LR threshold, 199 associations were recovered of which 180 were expected and 19 new. This demonstrates that even with complex DNA profiles (up to 4 contributors) all expected associations were retrieved with a limited number of candidates per trace. Database searches of complex DNA mixtures allow for the generation of leads early in an investigation for DNA profiles that might otherwise remain underutilized. Next steps for the possible integration of DBLR™ or similar software within an operational context will require discussions on legal, financial, and technical aspects among stakeholders.</p>","PeriodicalId":94012,"journal":{"name":"Forensic science international. Genetics","volume":"76 ","pages":"103208"},"PeriodicalIF":0.0,"publicationDate":"2024-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142824841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
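The sensitivity and specificity figures quoted in the abstract above depend on the chosen likelihood ratio (LR) threshold. A minimal sketch of that calculation follows, assuming an association is reported when the LR is greater than or equal to the threshold (the abstract does not state whether the comparison is inclusive). The LR values are hypothetical and are not data from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    true_contributor_lrs: Sequence[float],
    non_contributor_lrs: Sequence[float],
    threshold: float,
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for one LR threshold.

    Sensitivity: fraction of true contributors with LR >= threshold.
    Specificity: fraction of non-contributors with LR < threshold.
    """
    if not true_contributor_lrs or not non_contributor_lrs:
        raise ValueError("both LR lists must be non-empty")
    true_positives = sum(lr >= threshold for lr in true_contributor_lrs)
    true_negatives = sum(lr < threshold for lr in non_contributor_lrs)
    return (
        true_positives / len(true_contributor_lrs),
        true_negatives / len(non_contributor_lrs),
    )


# Hypothetical LRs, for illustration only.
true_lrs = [5e1, 2e4, 3e7, 1e9]
non_lrs = [1e-5, 2e0, 4e3, 1e-2]

low = sensitivity_specificity(true_lrs, non_lrs, 1e3)
high = sensitivity_specificity(true_lrs, non_lrs, 1e6)
```

With these toy values, raising the threshold from 10^3 to 10^6 lowers sensitivity and raises specificity, which is the same trade-off the abstract reports.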
Peter Resutik, Joëlle Schneider, Simon Aeschbacher, Magnus Dehli Vigeland, Mario Gysi, Corinne Moser, Chiara Barbieri, Paul Widmer, Mathias Currat, Adelgunde Kratzer, Michael Krützen, Cordula Haas, Natasha Arora
{"title":"Uncovering genetic signatures of the Walser migration in the Alps: Patterns of diversity and differentiation.","authors":"Peter Resutik, Joëlle Schneider, Simon Aeschbacher, Magnus Dehli Vigeland, Mario Gysi, Corinne Moser, Chiara Barbieri, Paul Widmer, Mathias Currat, Adelgunde Kratzer, Michael Krützen, Cordula Haas, Natasha Arora","doi":"10.1016/j.fsigen.2024.103206","DOIUrl":"https://doi.org/10.1016/j.fsigen.2024.103206","url":null,"abstract":"<p><p>Since leaving Africa, human populations have gone through a series of range expansions. While the genomic signatures of these expansions are well detectable on a continental scale, the genomic consequences of small-scale expansions over shorter time spans are more challenging to disentangle. The medieval migration of the Walser people from their homeland in southern Switzerland (Upper Valais) into other regions of the Alps is a good example of such a comparatively recent geographic and demographic expansion in humans. While several studies from the 1980s, based on allozyme markers, assessed levels of isolation and inbreeding in individual Walser communities, they mostly did so by focusing on a single community at a time. Here, we provide a comprehensive overview of genetic diversity and differentiation based on samples from multiple Walser, Walser-homeland, and non-Walser Alpine communities, along with an idealized (simulated) Swiss reference population (Ref-Pop). To explore genetic signals of the Walser migration in the genomes of their descendants, we use a set of forensic autosomal STRs as well as uniparental markers. Estimates of pairwise F<sub>ST</sub> based on autosomal STRs reveal that the Walser-homeland and Walser communities show low to moderate genetic differentiation from the non-Walser Alpine communities and the idealized Ref-Pop. 
The geographically more remote and likely more isolated Walser-homeland community of Lötschental and the Walser communities of Vals and Gressoney appear genetically more strongly differentiated than other communities. Analyses of mitochondrial DNA revealed the presence of haplogroup W6 among the Walser communities, a haplogroup that is otherwise rare in central Europe. Our study contributes to the understanding of genetic diversity in the Walser-homeland and Walser people, but also highlights the need for a more comprehensive study of the population genetic structure and evolutionary history of European Alpine populations using genome-wide data.</p>","PeriodicalId":94012,"journal":{"name":"Forensic science international. Genetics","volume":"76 ","pages":"103206"},"PeriodicalIF":0.0,"publicationDate":"2024-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142824844","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alexandre Gouy, Martin Zieger
{"title":"STRAF 2: New features and improvements of the STR population data analysis software.","authors":"Alexandre Gouy, Martin Zieger","doi":"10.1016/j.fsigen.2024.103207","DOIUrl":"https://doi.org/10.1016/j.fsigen.2024.103207","url":null,"abstract":"<p><p>Population data in forensic genetics must be checked for a variety of statistical parameters before it can be employed for casework. Several tools exist to perform such tasks; however, it can become challenging to obtain the right results due to the number of software tools to use and the broad range of input formats. Furthermore, a substantial amount of experience is required to use some of these programs. To overcome these difficulties, we have developed STRAF (STR Analysis for Forensics), a convenient online tool to analyse STR data in forensic genetics. Since its first release in 2017, it has been used in many studies to report allele frequencies, forensic and population genetics parameters, and to explore genetic datasets interactively through a user-friendly interface. Herewith, we introduce the latest version of the STRAF software and the improvements we have implemented over the last years. STRAF 2 includes several new features, such as new statistical methods (multidimensional scaling, comparison to a reference population, haplotype diversities and frequencies) and file conversion utilities. Performance and user experience have also been improved and documentation has been extended. This new version is freely available as an R package (https://github.com/agouy/straf) and a web application (https://straf.fr).</p>","PeriodicalId":94012,"journal":{"name":"Forensic science international. 
Genetics","volume":"76 ","pages":"103207"},"PeriodicalIF":0.0,"publicationDate":"2024-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142815366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Duncan Taylor, Amy Cahill, Roland A H van Oorschot, Luke Volgin, Mariya Goray
{"title":"Using an interaction timeline to investigate factors related to shedder status.","authors":"Duncan Taylor, Amy Cahill, Roland A H van Oorschot, Luke Volgin, Mariya Goray","doi":"10.1016/j.fsigen.2024.103205","DOIUrl":"https://doi.org/10.1016/j.fsigen.2024.103205","url":null,"abstract":"<p><p>A major factor that influences DNA transfer is the propensity of individuals to 'shed' DNA, commonly referred to as their 'shedder status'. In this work we provide a novel method to analyse and interrogate DNA transfer data from a largely uncontrolled study that tracks the movements and actions of a group of individuals over the course of an hour. By setting up a model that provides a simplistic description of the world, parameters within the model that represent properties of interest can be iteratively refined until the model can sufficiently describe a set of final DNA observations. Because the model describing reality can be constructed and parametrised in any desired configuration, aspects that may be difficult to traditionally test together can be investigated. To that end, we use a 60-min timeline of activity between four individuals and use DNA profiling results from objects taken at the conclusion of the hour to investigate factors that may affect shedder status. We simultaneously consider factors of: the amount of DNA transferred per contact, the rate of self-DNA regeneration, the capacity of hands to hold DNA, and the rate of non-self-DNA removal, all of which may ultimately contribute to someone's shedder status.</p>","PeriodicalId":94012,"journal":{"name":"Forensic science international. 
Genetics","volume":"76 ","pages":"103205"},"PeriodicalIF":0.0,"publicationDate":"2024-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142793069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dejan Šorgić, Aleksandra Stefanović, Dušan Keckarević, Mladen Popović
{"title":"XGBoost as a reliable machine learning tool for predicting ancestry using autosomal STR profiles - Proof of method.","authors":"Dejan Šorgić, Aleksandra Stefanović, Dušan Keckarević, Mladen Popović","doi":"10.1016/j.fsigen.2024.103183","DOIUrl":"https://doi.org/10.1016/j.fsigen.2024.103183","url":null,"abstract":"<p><p>The aim of this study was to test the validity of a predictive model of ancestry affiliation based on Short Tandem Repeat (STR) profiles. Frequencies of 29 genetic markers from the Promega website for four distinct population groups (African Americans, Asians, Caucasians, Hispanic Americans) were used to generate 360,000 profiles (90,000 profiles per group), which were later used to train and test a range of machine learning algorithms with the goal of establishing the optimal model for accurate ancestry prediction. The chosen models (Decision Trees, Support Vector Machines, XGBoost, among others) were deployed in Python, and their performance was compared. The XGBoost model outperformed others, displaying significant predictive power with an accuracy rating of 94.24 % for all four classes, and an accuracy rating of 99.06 % on a differentiation task involving Asian, African American, and Caucasian subsamples and an accuracy rating of 98.57 % when differentiating between the African-American, Asian, and the mixed group combining Caucasians and Hispanics. Evaluating the impact of training set size revealed that model accuracy peaked at 94 % with 90,000 profiles per category, but decreased to 83 % as the number of profiles per category was reduced to 500, particularly affecting precision when distinguishing between Caucasian and Hispanic subgroups. The study further investigated the impact of marker quantity on model accuracy, finding that the use of 21 markers, commonly available in commercial amplification kits, resulted in an accuracy of 96.3 % for African Americans, Asians, and Caucasians, and 88.28 % for all four groups combined. 
These findings underscore the potential of STR-based models in forensic analysis and hint at the broader applicability of machine learning in genetic ancestry determination, with implications for enhancing the precision and reliability of forensic investigations, particularly in heterogeneous environments where ancestral background can be a crucial piece of information.</p>","PeriodicalId":94012,"journal":{"name":"Forensic science international. Genetics","volume":"76 ","pages":"103183"},"PeriodicalIF":0.0,"publicationDate":"2024-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142786984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
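The abstract above describes generating synthetic STR profiles from published allele frequencies before training classifiers. One way such a generation step could look is sketched below, assuming the two alleles at each locus are drawn independently (Hardy-Weinberg and linkage equilibrium), which the abstract does not state. The locus names, group labels and frequencies are hypothetical placeholders, not the Promega data.

```python
import random
from typing import Dict, List, Tuple

# {locus: {allele: frequency}}; frequencies within a locus sum to 1.
AlleleFreqs = Dict[str, Dict[str, float]]
Profile = Dict[str, Tuple[str, str]]


def simulate_profile(freqs: AlleleFreqs, rng: random.Random) -> Profile:
    """Draw one diploid genotype per locus from the allele frequencies."""
    profile: Profile = {}
    for locus, table in freqs.items():
        alleles = list(table)
        weights = list(table.values())
        first, second = rng.choices(alleles, weights=weights, k=2)
        # Store the genotype with the shorter allele first.
        low, high = sorted((first, second), key=float)
        profile[locus] = (low, high)
    return profile


def simulate_dataset(
    group_freqs: Dict[str, AlleleFreqs], n_per_group: int, seed: int = 0
) -> List[Tuple[Profile, str]]:
    """Return (profile, group label) pairs, n_per_group for each group."""
    rng = random.Random(seed)
    return [
        (simulate_profile(freqs, rng), label)
        for label, freqs in group_freqs.items()
        for _ in range(n_per_group)
    ]


# Hypothetical frequencies for two made-up groups and loci.
toy_freqs = {
    "group_1": {
        "LOCUS_A": {"12": 0.2, "13": 0.5, "14": 0.3},
        "LOCUS_B": {"9": 0.4, "9.3": 0.6},
    },
    "group_2": {
        "LOCUS_A": {"12": 0.6, "13": 0.1, "14": 0.3},
        "LOCUS_B": {"9": 0.7, "9.3": 0.3},
    },
}
dataset = simulate_dataset(toy_freqs, n_per_group=5, seed=42)
```

The labelled pairs could then be encoded numerically and passed to any classifier; the encoding and model choice are left out here because the abstract does not describe them.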