{"title":"The ancestry of a genome","authors":"Elizabeth Thompson","doi":"10.1101/2024.08.30.610585","DOIUrl":"https://doi.org/10.1101/2024.08.30.610585","url":null,"abstract":"Motivated originally by the increasing number of examples where a lone member of a once-thriving population or even species now survives, we investigate what genomes of that population the survivor may represent. More generally, it is of interest to consider what genomes of our ancestry each of us may represent. We consider only diploid dioecious organisms, and consider primarily the ancestry of a haploid genome, for example the maternal autosomes of the focal individual. Our ancestors are many and, in an unbounded population, increase exponentially in number. Our genetic ancestors are few, bounded by the number of ancestral genome segments, which increases linearly over past generations. First, we show that the major loss of potential ancestral lineages is at 8-11 generations, and that thereafter the number of genetic ancestors increases approximately linearly, but does not approach the upper bound. Over many generations, there remain tightly linked but not contiguous segments that result from the same ancestral lineage. Second, we analyze the process of these \"repeated\" ancestral segments that continue to be formed in distant ancestry, even as others are lost by recombination. Third, we consider the effect of a finite population, with one model of a geographically structured population. Ancestors are many, and soon fill the entire species range even with low migration rates. 
Genetic ancestors are not only few, but remain geographically local.","PeriodicalId":501246,"journal":{"name":"bioRxiv - Genetics","volume":"16 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142179757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mispitools: An R Package for Comprehensive Statistical Methods in Kinship Inference","authors":"Franco Marsico","doi":"10.1101/2024.08.16.608307","DOIUrl":"https://doi.org/10.1101/2024.08.16.608307","url":null,"abstract":"The search for missing persons is a complex process that involves the comparison of data from two entities: unidentified persons (UP), who may be alive or deceased, and missing persons (MP), whose whereabouts are unknown. Although existing tools support DNA-based kinship analyses for the search, they typically do not integrate or statistically evaluate diverse lines of evidence collected throughout the investigative process. Examples of alternative lines of evidence are pigmentation traits, biological sex, and age, among others. The package Mispitools fills this gap by providing comprehensive statistical methods adapted to a holistic investigation workflow. Mispitools systematically assesses the data from each investigative stage, computing the statistical weight of various types of evidence through a likelihood ratio (LR) approach. It also provides models for combining obtained LRs. Furthermore, Mispitools offers customized visualizations and a user-friendly interface, broadening its applicability among forensic practitioners and genealogical researchers.","PeriodicalId":501246,"journal":{"name":"bioRxiv - Genetics","volume":"23 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142179761","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast variance component analysis using large-scale ancestral recombination graphs","authors":"Jiazheng Zhu, Georgios Kalantzis, Ali Pazokitoroudi, Árni Freyr Gunnarsson, Hrushikesh Loya, Han Chen, Sriram Sankararaman, Pier Francesco Palamara","doi":"10.1101/2024.08.31.610262","DOIUrl":"https://doi.org/10.1101/2024.08.31.610262","url":null,"abstract":"Recent algorithmic advancements have enabled the inference of genome-wide ancestral recombination graphs (ARGs) from genomic data in large cohorts. These inferred ARGs provide a detailed representation of genealogical relatedness along the genome and have been shown to complement genotype imputation in complex trait analyses by capturing the effects of unobserved genomic variants. An inferred ARG can be used to construct a genetic relatedness matrix, which can be leveraged within a linear mixed model for the analysis of complex traits. However, these analyses are computationally infeasible for large datasets. We introduce a computationally efficient approach, called ARG-RHE, to estimate narrow-sense heritability and perform region-based association testing using an ARG. ARG-RHE relies on scalable randomized algorithms to estimate variance components and assess their statistical significance, and can be applied to multiple quantitative traits in parallel. We conduct extensive simulations to verify the computational efficiency, statistical power, and robustness of this approach. We then apply it to detect associations between 21,374 genes and 52 blood-related traits, using an ARG inferred from genotype data of 337,464 individuals from the UK Biobank. 
In these analyses, combining ARG-based and imputation-based testing yields 8% more gene-trait associations than using imputation alone, suggesting that inferred genome-wide genealogies may effectively complement genotype imputation in the analysis of complex traits.","PeriodicalId":501246,"journal":{"name":"bioRxiv - Genetics","volume":"83 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142179764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Atlas-scale Single-cell DNA Methylation Profiling with sciMETv3","authors":"Ruth V Nichols, Lauren Rylaarsdam, Brendan L O'Connell, Zohar Shipony, Nika Iremadze, Sonia N Acharya, Andrew Adey","doi":"10.1101/2024.08.29.610369","DOIUrl":"https://doi.org/10.1101/2024.08.29.610369","url":null,"abstract":"Single-cell methods to assess DNA methylation have not yet achieved the same level of cell throughput compared to other modalities. Here, we describe sciMETv3, a combinatorial indexing-based technique that builds on our prior technology, sciMETv2. SciMETv3 achieves nearly a 100-fold improvement in cell throughput by increasing the index space while simultaneously reducing hands-on time and total costs per experiment. To reduce the sequencing burden of the assay, we demonstrate compatibility of sciMETv3 with capture techniques that enrich for regulatory regions, as well as the ability to leverage enzymatic conversion which can yield higher library diversity. We showcase the throughput of sciMETv3 by producing a >140k cell library from human middle frontal gyrus split across four multiplexed individuals using both Illumina and Ultima sequencing instrumentation. This library was prepared over two days by one individual and required no expensive equipment (e.g. a flow sorter, as required by sciMETv2). The same experiment produced an estimated 650k additional cells that were not sequenced, representing the power of sciMETv3 to meet the throughput needs of the most demanding atlas-scale projects. 
Finally, we demonstrate the compatibility of sciMETv3 with multimodal assays by introducing sciMET+ATAC, which will enable high-throughput exploration of the interplay between two layers of epigenetic regulation within the same cell, as well as the ability to directly integrate single-cell methylation datasets with existing single-cell ATAC-seq.","PeriodicalId":501246,"journal":{"name":"bioRxiv - Genetics","volume":"23 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142179763","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"qBiCo: A method to assess global DNA conversion performance in epigenetics via single-copy genes and repetitive elements","authors":"Faidra Karkala, Roy B. Simons, Floor Claessens, Vivian Kalamara, Manfred Kayser, Athina Vidaki","doi":"10.1101/2024.08.29.610354","DOIUrl":"https://doi.org/10.1101/2024.08.29.610354","url":null,"abstract":"Human DNA methylation profiling offers great promise in various biomedical applications, including ageing, cancer and even forensics. So far, most DNA methylation techniques are based on a chemical process called sodium bisulfite conversion, which specifically converts non-methylated cytosines into uracils. However, despite the popularity of this approach, it is known to cause DNA fragmentation and loss affecting standardization, while incomplete conversion may result in potential misinterpretation of methylation-based outcomes. To offer the community a solution, we developed qBiCo, a novel quality-control method to address the quantity and quality of bisulfite-converted DNA. qBiCo is a 5-plex, TaqMan® probe-based, quantitative (q)PCR assay that amplifies single- and multi-copy DNA fragments of converted and non-converted nature. It estimates four parameters: converted DNA concentration, fragmentation, global conversion efficiency, and potential PCR inhibition. We optimized qBiCo using synthetic DNA standards and assessed it using standard developmental validation criteria, showcasing that qBiCo is reliable, robust and sensitive down to picogram level. We also evaluated its performance by testing decreasing DNA amounts using several commercial bisulfite conversion kits. Depending on the starting DNA quantity, bisulfite-converted DNA recoveries ranged from 8.5–100%, conversion efficiencies from 78–99.9%, while certain kits highly fragment DNA, demonstrating large variability in their performance. 
Towards building a prototype tool, we further optimized key functionalities, for example, by replacing the poorest performing single-plex assay and creating a more representative DNA standard. Aiming to scale-up and move towards implementation, we successfully transferred and validated our novel method in six different qPCR platforms from different major manufacturers. Overall, with the present study, we offer researchers in the epigenetic field a novel long-awaited QC tool that for the first time allows them to measure key quality and quantity parameters of the most popular DNA conversion process. The tool also enables standardization to prevent inconsistent data and false outcomes in the future, regardless of the downstream experimental analysis of DNA methylation-based research and applications across different fields of biology and biomedicine.","PeriodicalId":501246,"journal":{"name":"bioRxiv - Genetics","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142179759","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Temperature affects recombination rate plasticity and meiotic success between thermotolerant and cold tolerant yeast species","authors":"Jessica McNeill, Nathan Brandt, Enrique J. Schwarzkopf, Mili Jimenez Gallardo, Caiti Smukowski Heil","doi":"10.1101/2024.08.28.610152","DOIUrl":"https://doi.org/10.1101/2024.08.28.610152","url":null,"abstract":"Meiosis is required for the formation of gametes in all sexually reproducing species and the process is well conserved across the tree of life. However, meiosis is sensitive to a variety of external factors, which can impact chromosome pairing, recombination, and fertility. For example, the optimal temperature for successful meiosis varies between species of plants and animals. This suggests that meiosis is temperature sensitive, and that natural selection may act on variation in meiotic success as organisms adapt to different environmental conditions. To understand how temperature alters the successful completion of meiosis, we utilized two species of the budding yeast <em>Saccharomyces</em> with different temperature preferences: thermotolerant <em>Saccharomyces cerevisiae</em> and cold tolerant <em>Saccharomyces uvarum</em>. We surveyed three metrics of meiosis: sporulation efficiency, spore viability, and recombination rate in multiple strains of each species. As per our predictions, the proportion of cells that complete meiosis and form spores is temperature sensitive, with thermotolerant <em>S. cerevisiae</em> having a higher temperature threshold for successful meiosis than cold tolerant <em>S. uvarum</em>. We confirmed previous observations that <em>S. cerevisiae</em> recombination rate varies between strains and across genomic regions, and add new results that <em>S. uvarum</em> has higher recombination rates than <em>S. cerevisiae</em>. We find that temperature significantly influences recombination rate plasticity in <em>S. cerevisiae</em> and <em>S. 
uvarum</em>, in agreement with studies in animals and plants. Overall, these results suggest that meiotic thermal sensitivity is associated with organismal thermal tolerance, and may even result in temporal reproductive isolation as populations diverge in thermal profiles.","PeriodicalId":501246,"journal":{"name":"bioRxiv - Genetics","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142179760","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Credible set is sensitive to imputation quality and missing variants","authors":"Yanyu Liang, 23andMe Research Team, Adam Auton, Xin Wang","doi":"10.1101/2024.08.28.610135","DOIUrl":"https://doi.org/10.1101/2024.08.28.610135","url":null,"abstract":"Bayesian fine-mapping to obtain credible sets has been widely applied post GWAS to pinpoint causal variants. The calculation of credible sets generally assumes that all variants have been equally well genotyped, which is often not the case when a GWAS has been run on imputed data. In this work, we investigate the behavior of credible sets in imputed datasets utilizing 'held out' genotyped variants to measure accuracy. We show, via simulation, that: i) the coverage of credible sets decreases when using imputed variants in GWAS; ii) rare causal variants often fail to be tagged in credible sets when they are not present in the GWAS variant set. We develop a reweighting approach to take imputation quality into account during fine-mapping that only requires summary statistics, and demonstrate the approach with real data.","PeriodicalId":501246,"journal":{"name":"bioRxiv - Genetics","volume":"21 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142179762","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Decapping activators Edc3 and Scd6 act redundantly with Dhh1 in post-transcriptional repression of starvation-induced pathways","authors":"Rakesh Kumar, Fan Zhang, Shreyas Niphadkar, Chisom Onu, Anil Kumar Vijjamarri, Miriam L Greenberg, Sunil Laxman, Alan G Hinnebusch","doi":"10.1101/2024.08.28.610059","DOIUrl":"https://doi.org/10.1101/2024.08.28.610059","url":null,"abstract":"Degradation of many yeast mRNAs involves decapping by the Dcp1:Dcp2 complex. Previous studies on decapping activators Edc3 and Scd6 suggested their limited roles in mRNA decay. RNA-seq analysis of mutants lacking one or both proteins revealed that Scd6 and Edc3 have largely redundant activities in targeting numerous mRNAs for degradation that are masked in the single mutants. These transcripts also are frequently targeted by decapping activators Dhh1 and Pat1, and the collective evidence suggests that Scd6/Edc3 act interchangeably to recruit Dhh1 to Dcp2. Ribosome profiling shows that redundancy between Scd6 and Edc3 and their functional interactions with Dhh1 and Pat1 extend to translational repression of particular transcripts, including a cohort of poorly translated mRNAs displaying interdependent regulation by all four factors. Scd6/Edc3 also participate with Dhh1/Pat1 in post-transcriptional repression of proteins required for respiration and catabolism of alternative carbon sources, which are normally expressed only in limiting glucose. Simultaneously eliminating Scd6/Edc3 increases mitochondrial membrane potential and elevates metabolites of the tricarboxylic acid and glyoxylate cycles typically observed only during growth in low glucose. 
Thus, Scd6/Edc3 act redundantly, in parallel with Dhh1 and in cooperation with Pat1, to adjust gene expression to nutrient availability by controlling mRNA decapping and decay.","PeriodicalId":501246,"journal":{"name":"bioRxiv - Genetics","volume":"109 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142179654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Y chromosome STR variation reveals traditional occupation based population structure in India","authors":"Jaison Jeevan Sequeira, M Chaitra, Ananya Rai N R, M Sudeepthi, R Shalini, Mohammed S Mustak, Jagriti Khanna, Shivkant Sharma, Rajendra V E Chilukuri, George van Driem, Pankaj Shrivastava","doi":"10.1101/2024.08.28.610024","DOIUrl":"https://doi.org/10.1101/2024.08.28.610024","url":null,"abstract":"Earlier models of grouping Indian populations were based on language families, social stratification and geographical location. Such grouping systems have often resulted in oversimplification of ancestry inferences. Moreover, we do not find many studies focused on studying the variation within these groups and the role of past demographic events in shaping them. We analysed the Y-chromosome Short Tandem Repeat haplotypes from 8153 males from India and Eurasia to explore the impact of Holocene migration on the Indian gene pool. We used haplotype variation and date estimates to understand the characteristics of each haplogroup with respect to the different grouping models. Our findings show that the Neolithic agricultural expansion has had a strong influence in shaping the male gene pool of the Indian subcontinent. Haplogroups F, L and R1a contribute greatly towards stratifying Indian populations as hunter-gatherer-related, farming-related and priestly groups, respectively. Although the caste system enforced endogamy, traditional-occupation-based admixture has existed since Neolithic times. Dispersal of haplogroup L from the Near East played a major role in the formation of an agriculturist population that formed an intermediary between the primitive tribes and the R1a-rich priestly group. 
This study shows that the frequency of R1a in the hunter-gatherer tribes (1.5%) is much lower than previously reported based on other models of population clustering.","PeriodicalId":501246,"journal":{"name":"bioRxiv - Genetics","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142179781","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Early founder effects have determined paternal population structure in the Faroe Islands","authors":"Allison E Mann, Eyðfinn Magnussen, Christopher R Tillquist","doi":"10.1101/2024.08.27.601563","DOIUrl":"https://doi.org/10.1101/2024.08.27.601563","url":null,"abstract":"The Faroe Islands are a small archipelago located in the North Atlantic likely colonized by a small group of founders sometime between 50 and 300 CE. Post colonization, the Faroese people have been largely isolated from admixture with mainland and other island populations in the region. As such, the initial founder effect and subsequent genetic drift are likely major contributors to the modern genetic diversity found among the Faroese. In this study, we assess the utility of Y-chromosomal microsatellites to detect a founder effect in the Faroe Islands through the construction of haplotype networks and a novel empirical method, mutational distance from modal haplotype histograms (MDM), for the visualization and evaluation of population bottlenecks. We compared samples from the Faroe Islands and Iceland to possible regional source populations and documented a loss of diversity associated with founder events. Additionally, within-haplogroup diversity statistics reveal lower haplotype diversity and richness within both the Faroe Islands and Iceland, consistent with a small founder population colonizing both regions. However, in the within-haplogroup networks, the Faroe Islands are found within the larger set of potential source populations while Iceland is consistently found on isolated branches. Moreover, comparisons of within-haplogroup MDM histograms document a clear founder signal in the Faroes and Iceland, but the strength of this signal is haplogroup-dependent, which may be indicative of more recent admixture or other demographic processes. 
The results of the current study and the lack of conformity between Icelandic and Faroese haplotypes imply that the two populations were founded by different paternal gene pools and that there is no detectable post-founder admixture between the two groups.","PeriodicalId":501246,"journal":{"name":"bioRxiv - Genetics","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142227617","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}