Genome research Pub Date: 2025-03-31 DOI: 10.1101/gr.279428.124
Yu-Chi Chen, David LJ Vendrami, Maximilian L Huber, Luisa EY Handel, Christopher R Cooney, Joseph Ivan Hoffman, Toni I Gossmann
{"title":"Diverse evolutionary trajectories of mitocoding DNA in mammalian and avian nuclear genomes","authors":"Yu-Chi Chen, David LJ Vendrami, Maximilian L Huber, Luisa EY Handel, Christopher R Cooney, Joseph Ivan Hoffman, Toni I Gossmann","doi":"10.1101/gr.279428.124","DOIUrl":"https://doi.org/10.1101/gr.279428.124","url":null,"abstract":"Sporadically, genetic material that originates from an organelle genome integrates into the nuclear genome. However, it is unclear what processes maintain such integrations over evolutionary time. Recently, it was shown that nuclear DNA of mitochondrial origin (NUMTs) may harbour genes with intact mitochondrial reading frames despite the fact that they are highly divergent from the host's mitochondrial genome. Two major hypotheses have been put forward to explain the existence of such mitocoding nuclear genes: (i) recent introgression from another species and (ii) long-term selection. To investigate whether these intriguing possibilities play a role, we scanned the genomes of more than 1,000 avian and mammalian species for NUMTs. We show that the subclass of divergent NUMTs harbouring mitogenes with intact reading frames is widespread across mammals and birds. We show that some of these NUMTs appear to have similarity across species. We also demonstrate that many mitochondrial-coding NUMTs exhibit signs of long-term selection. In a subset of these NUMT genes, we detected evolutionary signals consistent with adaptive evolution, including one human NUMT shared among seven ape species. 
These findings suggest that NUMT insertions may occasionally be functional.","PeriodicalId":12678,"journal":{"name":"Genome research","volume":"58 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2025-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143736572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome research Pub Date: 2025-03-26 DOI: 10.1101/gr.279414.124
Wouter Steyaert, Lydia Sagath, German Demidov, Vicente A. Yépez, Anna Esteve-Codina, Julien Gagneur, Kornelia Ellwanger, Ronny Derks, Marjan Weiss, Amber den Ouden, Simone van den Heuvel, Hilde Swinkels, Nick Zomer, Marloes Steehouwer, Luke O'Gorman, Galuh Astuti, Kornelia Neveling, Rebecca Schüle, Jishu Xu, Matthis Synofzik, Danique Beijer, Holger Hengel, Ludger Schöls, Kristl G. Claeys, Jonathan Baets, Liedewei Van de Vondel, Alessandra Ferlini, Rita Selvatici, Heba Morsy, Marwa Saeed Abd Elmaksoud, Volker Straub, Juliane Müller, Veronica Pini, Luke Perry, Anna Sarkozy, Irina Zaharieva, Francesco Muntoni, Enrico Bugiardini, Kiran Polavarapu, Rita Horvath, Evan Reid, Hanns Lochmüller, Marco Spinazzi, Marco Savarese, Solve-RD DITF-ITHACA, Solve-RD DITF-Euro-NMD, Solve-RD DITF-RND, Solve-RD DITF-EpiCARE, Leslie Matalonga, Steven Laurie, Han G. Brunner, Holm Graessner, Sergi Beltran, Stephan Ossowski, Lisenka E.L.M. Vissers, Christian Gilissen, Alexander Hoischen, on behalf of the Solve-RD consortium
{"title":"Unraveling undiagnosed rare disease cases by HiFi long-read genome sequencing","authors":"Wouter Steyaert, Lydia Sagath, German Demidov, Vicente A. Yépez, Anna Esteve-Codina, Julien Gagneur, Kornelia Ellwanger, Ronny Derks, Marjan Weiss, Amber den Ouden, Simone van den Heuvel, Hilde Swinkels, Nick Zomer, Marloes Steehouwer, Luke O'Gorman, Galuh Astuti, Kornelia Neveling, Rebecca Schüle, Jishu Xu, Matthis Synofzik, Danique Beijer, Holger Hengel, Ludger Schöls, Kristl G. Claeys, Jonathan Baets, Liedewei Van de Vondel, Alessandra Ferlini, Rita Selvatici, Heba Morsy, Marwa Saeed Abd Elmaksoud, Volker Straub, Juliane Müller, Veronica Pini, Luke Perry, Anna Sarkozy, Irina Zaharieva, Francesco Muntoni, Enrico Bugiardini, Kiran Polavarapu, Rita Horvath, Evan Reid, Hanns Lochmüller, Marco Spinazzi, Marco Savarese, Solve-RD DITF-ITHACA, Solve-RD DITF-Euro-NMD, Solve-RD DITF-RND, Solve-RD DITF-EpiCARE, Leslie Matalonga, Steven Laurie, Han G. Brunner, Holm Graessner, Sergi Beltran, Stephan Ossowski, Lisenka E.L.M. Vissers, Christian Gilissen, Alexander Hoischen, on behalf of the Solve-RD consortium","doi":"10.1101/gr.279414.124","DOIUrl":"https://doi.org/10.1101/gr.279414.124","url":null,"abstract":"Solve-RD is a pan-European rare disease (RD) research program that aims to identify disease-causing genetic variants in previously undiagnosed RD families. We utilized 10-fold coverage HiFi long-read sequencing (LRS) for detecting causative structural variants (SVs), single-nucleotide variants (SNVs), insertion-deletions (indels), and short tandem repeat (STR) expansions in previously studied RD families without a clear molecular diagnosis. Our cohort includes 293 individuals from 114 genetically undiagnosed RD families selected by European Reference Network (ERN) experts. Of these, 21 families were affected by so-called “unsolvable” syndromes for which genetic causes remain unknown and for which prior testing was not a prerequisite. 
The remaining 93 families had at least one individual affected by a rare neurological, neuromuscular, or epilepsy disorder without a genetic diagnosis despite extensive prior testing. Clinical interpretation and orthogonal validation of variants in known disease genes yielded 12 novel genetic diagnoses due to de novo and rare inherited SNVs, indels, SVs, and STR expansions. In an additional five families, we identified a candidate disease-causing variant, including an <em>MCF2</em>/<em>FGF13</em> fusion and a <em>PSMA3</em> deletion. However, no common genetic cause was identified in any of the “unsolvable” syndromes. Taken together, we found (likely) disease-causing genetic variants in 11.8% of previously unsolved families and additional candidate disease-causing SVs in another 5.4% of these families. In conclusion, our results demonstrate the potential added value of HiFi long-read genome sequencing in undiagnosed rare diseases.","PeriodicalId":12678,"journal":{"name":"Genome research","volume":"35 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143712967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome research Pub Date: 2025-03-24 DOI: 10.1101/gr.279943.124
Inswasti Cahyani, John Tyson, Nadine Holmes, Joshua Quick, Chris Moore, Nicholas James Loman, Matt Loose
{"title":"An optimized toolkit for high molecular weight DNA extraction and ultra-long-read nanopore sequencing using glass beads and hexamminecobalt(III) chloride","authors":"Inswasti Cahyani, John Tyson, Nadine Holmes, Joshua Quick, Chris Moore, Nicholas James Loman, Matt Loose","doi":"10.1101/gr.279943.124","DOIUrl":"https://doi.org/10.1101/gr.279943.124","url":null,"abstract":"Since the advent of long-read sequencing, achieving longer read lengths has been a key goal for many users. Ultra-long read sets (N50 > 100 kb) produced from Oxford Nanopore sequencers have improved genome assemblies in recent years. However, despite progress in extraction protocols and library preparation methods, ultra-long sequencing remains challenging for many sample types. Here we compare various methods and introduce the FindingNemo protocol that: (1) optimizes ultra-high molecular weight (UHMW) DNA extraction and library clean-up by using glass beads and hexamminecobalt(III) chloride (CoHex), (2) can deliver high ultra-long sequencing yield of >20 Gb of reads from a single MinION flow cell or >100 Gb from PromethION devices (R9.4 to R10.4 pore variants), and (3) is scalable to using fewer input cells or lower DNA amounts, with extraction to sequencing possible in a single working day. 
By comparison, we demonstrate that this protocol surpasses previous methods by enabling precise determination of input DNA quantity and quality through cell counting, sample dilution, and homogenization techniques.","PeriodicalId":12678,"journal":{"name":"Genome research","volume":"10 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143695663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome research Pub Date: 2025-03-24 DOI: 10.1101/gr.280092.124
Reuben M. Buckley, Nuket Bilgen, Alexander C. Harris, Peter Savolainen, Cafer Tepeli, Metin Erdoğan, Aitor Serres Armero, Dayna L. Dreger, Frank G. van Steenbeek, Marjo K. Hytönen, Heidi G Parker, Jessica Hale, Hannes Lohi, Bengi Çınar Kul, Adam R. Boyko, Elaine A. Ostrander
{"title":"Analysis of canine gene constraint identifies new variants for orofacial clefts and stature","authors":"Reuben M. Buckley, Nuket Bilgen, Alexander C. Harris, Peter Savolainen, Cafer Tepeli, Metin Erdoğan, Aitor Serres Armero, Dayna L. Dreger, Frank G. van Steenbeek, Marjo K. Hytönen, Heidi G Parker, Jessica Hale, Hannes Lohi, Bengi Çınar Kul, Adam R. Boyko, Elaine A. Ostrander","doi":"10.1101/gr.280092.124","DOIUrl":"https://doi.org/10.1101/gr.280092.124","url":null,"abstract":"Dog breeding promotes within-group homogeneity through conformation to strict breed standards, while simultaneously driving between-group heterogeneity. There are over 350 recognized dog breeds that provide the foundation for investigating the genetic basis of phenotypic diversity. Typically, breed standard phenotypes such as stature, pelage, and craniofacial structure are analyzed through genetic association studies. However, such analyses are limited to assayed phenotypes only, leaving difficult-to-measure phenotypic subtleties easily overlooked. We investigated coding variation from over 2,000 dogs, leading to discoveries of variants related to craniofacial morphology and stature. Breed-enriched variants were prioritized according to gene constraint, which was calculated using a mutation model derived from trinucleotide substitution probabilities. Among the newly found variants was a splice-acceptor variant in <em>PDGFRA</em> associated with bifid nose, a characteristic trait of Çatalburun dogs, implicating the gene's role in midline closure. Two additional <em>LCORL</em> variants, both associated with canine body size, were also discovered: a frameshift that causes a premature stop in large breeds (>25 kg) and an intronic substitution found in small breeds (<10 kg), thus highlighting the importance of allelic heterogeneity in selection for breed traits. 
Most variants prioritized in this analysis were not associated with genomic signatures for breed differentiation, as these regions were enriched for constrained genes intolerant to nonsynonymous variation. This indicates trait selection in dogs is likely a balancing act between preserving essential gene functions and maximizing regulatory variation to drive phenotypic extremes.","PeriodicalId":12678,"journal":{"name":"Genome research","volume":"25 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143695664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome research Pub Date: 2025-03-21 DOI: 10.1101/gr.279392.124
John Beaulaurier, Lynn Ly, J. Andrew Duty, Carly Tyer, Christian Stevens, Chuan-Tien Hung, Akash Sookdeo, Alex W. Drong, Shreyas Kowdle, Axel Solis-Guzman, Domenico Tortorella, Daniel J. Turner, Sissel Juul, Scott Hickey, Benhur Lee
{"title":"De novo antibody identification in human blood from full-length single B cell transcriptomics and matching haplotype-resolved germline assemblies","authors":"John Beaulaurier, Lynn Ly, J. Andrew Duty, Carly Tyer, Christian Stevens, Chuan-Tien Hung, Akash Sookdeo, Alex W. Drong, Shreyas Kowdle, Axel Solis-Guzman, Domenico Tortorella, Daniel J. Turner, Sissel Juul, Scott Hickey, Benhur Lee","doi":"10.1101/gr.279392.124","DOIUrl":"https://doi.org/10.1101/gr.279392.124","url":null,"abstract":"Immunoglobulin (<em>IGH</em>, <em>IGK</em>, <em>IGL</em>) loci in the human genome are highly polymorphic regions that encode the building blocks of the light and heavy chain IG proteins that dimerize to form antibodies. The processes of V(D)J recombination and somatic hypermutation in B cells are responsible for creating an enormous reservoir of highly specific antibodies capable of binding a vast array of possible antigens. However, the antibody repertoire is fundamentally limited by the set of variable (V), diversity (D), and joining (J) alleles present in the germline IG loci. To better understand how the germline IG haplotypes contribute to the expressed antibody repertoire, we combined genome sequencing of the germline IG loci with single-cell transcriptome sequencing of B cells from the same donor. Sequencing and assembly of the germline IG loci captured the <em>IGH</em> locus in a single fully phased contig where the maternal and paternal contributions to the germline V, D, and J repertoire can be fully resolved. The B cells were collected following a measles, mumps, and rubella (MMR) vaccination, resulting in a population of cells that were activated in response to this specific immune challenge. Single-cell, full-length transcriptome sequencing of these B cells results in whole transcriptome characterization of each cell, as well as highly accurate consensus sequences for the somatically rearranged and hypermutated light and heavy chain IG transcripts. 
A subset of antibodies synthesized based on their consensus heavy and light chain transcript sequences demonstrate binding to measles antigens and neutralization of authentic measles virus.","PeriodicalId":12678,"journal":{"name":"Genome research","volume":"14 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2025-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143672337","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome research Pub Date: 2025-03-20 DOI: 10.1101/gr.279240.124
Cristian Groza, Bing Ge, Warren A. Cheung, Tomi Pastinen, Guillaume Bourque
{"title":"Expanded methylome and quantitative trait loci detection by long-read profiling of personal DNA","authors":"Cristian Groza, Bing Ge, Warren A. Cheung, Tomi Pastinen, Guillaume Bourque","doi":"10.1101/gr.279240.124","DOIUrl":"https://doi.org/10.1101/gr.279240.124","url":null,"abstract":"Structural variants (SVs) are omnipresent in human DNA, yet their genotype and methylation statuses are rarely characterized due to previous limitations in genome assembly and detection of modified nucleotides. Also, the extent to which SVs act as methylation quantitative trait loci (SV-mQTLs) is largely unknown. Here, we generated a pangenome graph summarizing SVs in 782 de novo assemblies obtained from Genomic Answers for Kids, capturing 14.6 million CpG dinucleotides that are absent from the CHM13v2 reference (SV-CpGs), thus expanding their number by 43.6%. Using 435 methylomes, we genotyped 4.06 million SV-CpGs, of which 3.93 million (96.8%) are methylated at least once. Nonrepeat sequences contribute 1.59 × 10<sup>6</sup> novel SV-CpGs, followed by centromeric satellites (6.57 × 10<sup>5</sup>), simple repeats (5.40 × 10<sup>5</sup>), <em>Alu</em> elements (5.07 × 10<sup>5</sup>), satellites (2.17 × 10<sup>5</sup>), LINE-1s (1.83 × 10<sup>5</sup>), and SVA (SINE-VNTR-<em>Alu</em>) elements (1.50 × 10<sup>5</sup>). Centromeric satellites, simple repeats, and SVAs are overrepresented in SV-CpGs versus reference CpGs. Similarly, methylation levels in SV-CpGs are more variable than in reference CpGs. To explore if SVs are potentially causal for functional variation, we measured SV-mQTLs. This revealed over 230,464 methylation bins where the methylation is associated with common SVs within 100 kbp. Finally, we identified 65,659 methylation bins (28.5%) where the leading QTL variant is an SV. 
In conclusion, we demonstrate that graph pangenomes provide full SV structures and the associated methylation variation, and reveal tens of thousands of SV-mQTLs, underscoring the importance of assembly-based analyses of human traits.","PeriodicalId":12678,"journal":{"name":"Genome research","volume":"214 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143665975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome research Pub Date: 2025-03-20 DOI: 10.1101/gr.279323.124
Tanner D. Jensen, Bohan Ni, Chloe M. Reuter, John E. Gorzynski, Sarah Fazal, Devon Bonner, Rachel A. Ungar, Pagé C. Goddard, Archana Raja, Euan A. Ashley, Jonathan A. Bernstein, Stephan Zuchner, Undiagnosed Diseases Network, Michael D. Greicius, Stephen B. Montgomery, Michael C. Schatz, Matthew T. Wheeler, Alexis Battle
{"title":"Integration of transcriptomics and long-read genomics prioritizes structural variants in rare disease","authors":"Tanner D. Jensen, Bohan Ni, Chloe M. Reuter, John E. Gorzynski, Sarah Fazal, Devon Bonner, Rachel A. Ungar, Pagé C. Goddard, Archana Raja, Euan A. Ashley, Jonathan A. Bernstein, Stephan Zuchner, Undiagnosed Diseases Network, Michael D. Greicius, Stephen B. Montgomery, Michael C. Schatz, Matthew T. Wheeler, Alexis Battle","doi":"10.1101/gr.279323.124","DOIUrl":"https://doi.org/10.1101/gr.279323.124","url":null,"abstract":"Rare structural variants (SVs)—insertions, deletions, and complex rearrangements—can cause Mendelian disease, yet they remain difficult to accurately detect and interpret. We sequenced and analyzed Oxford Nanopore Technologies long-read genomes of 68 individuals from the undiagnosed disease network (UDN) with no previously identified diagnostic mutations from short-read sequencing. Using our optimized SV detection pipelines and 571 control long-read genomes, we detected 716 long-read rare (MAF < 0.01) SV alleles per genome on average, achieving a 2.4× increase from short reads. To characterize the functional effects of rare SVs, we assessed their relationship with gene expression from blood or fibroblasts from the same individuals and found that rare SVs overlapping enhancers were enriched (LOR = 0.46) near expression outliers. We also evaluated tandem repeat expansions (TREs) and found 14 rare TREs per genome; notably, these TREs were also enriched near overexpression outliers. To prioritize candidate functional SVs, we developed Watershed-SV, a probabilistic model that integrates expression data with SV-specific genomic annotations, which significantly outperforms baseline models that do not incorporate expression data. Watershed-SV identified a median of eight high-confidence functional SVs per UDN genome. 
Notably, this included compound heterozygous deletions in <em>FAM177A1</em> shared by two siblings, which were likely causal for a rare neurodevelopmental disorder. Our observations demonstrate the promise of integrating long-read sequencing with gene expression toward improving the prioritization of functional SVs and TREs in rare disease patients.","PeriodicalId":12678,"journal":{"name":"Genome research","volume":"49 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143665989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome research Pub Date: 2025-03-20 DOI: 10.1101/gr.280041.124
Qiuhui Li, Ayse G. Keskus, Justin Wagner, Michal B. Izydorczyk, Winston Timp, Fritz J. Sedlazeck, Alison P. Klein, Justin M. Zook, Mikhail Kolmogorov, Michael C. Schatz
{"title":"Unraveling the hidden complexity of cancer through long-read sequencing","authors":"Qiuhui Li, Ayse G. Keskus, Justin Wagner, Michal B. Izydorczyk, Winston Timp, Fritz J. Sedlazeck, Alison P. Klein, Justin M. Zook, Mikhail Kolmogorov, Michael C. Schatz","doi":"10.1101/gr.280041.124","DOIUrl":"https://doi.org/10.1101/gr.280041.124","url":null,"abstract":"Cancer is fundamentally a disease of the genome, characterized by extensive genomic, transcriptomic, and epigenomic alterations. Most current studies predominantly use short-read sequencing, gene panels, or microarrays to explore these alterations; however, these technologies can systematically miss or misrepresent certain types of alterations, especially structural variants, complex rearrangements, and alterations within repetitive regions. Long-read sequencing is rapidly emerging as a transformative technology for cancer research by providing a comprehensive view across the genome, transcriptome, and epigenome, including the ability to detect alterations that previous technologies have overlooked. In this review, we explore the current applications of long-read sequencing for both germline and somatic cancer analysis. We provide an overview of the computational methodologies tailored to long-read data and highlight key discoveries and resources within cancer genomics that were previously inaccessible with prior technologies. We also address future opportunities and persistent challenges, including the experimental and computational requirements needed to scale to larger sample sizes, the hurdles in sequencing and analyzing complex cancer genomes, and opportunities for leveraging machine learning and artificial intelligence technologies for cancer informatics. We further discuss how the telomere-to-telomere genome and the emerging human pangenome could enhance the resolution of cancer genome analysis, potentially revolutionizing early detection and disease monitoring in patients. 
Finally, we outline strategies for transitioning long-read sequencing from research applications to routine clinical practice.","PeriodicalId":12678,"journal":{"name":"Genome research","volume":"12 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143666087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome research Pub Date: 2025-03-20 DOI: 10.1101/gr.279930.124
Da-Inn Lee, Sushmita Roy
{"title":"Examining dynamics of three-dimensional genome organization with multitask matrix factorization","authors":"Da-Inn Lee, Sushmita Roy","doi":"10.1101/gr.279930.124","DOIUrl":"https://doi.org/10.1101/gr.279930.124","url":null,"abstract":"Three-dimensional (3D) genome organization, which determines how the DNA is packaged inside the nucleus, has emerged as a key component of the gene regulation machinery. High-throughput chromosome conformation datasets, such as Hi-C, have become available across multiple conditions and timepoints, offering a unique opportunity to examine changes in 3D genome organization and link them to phenotypic changes in normal and disease processes. However, systematic detection of higher-order structural changes across multiple Hi-C datasets remains a major challenge. Existing computational methods either do not model higher-order structural units or cannot model dynamics across more than two conditions of interest. We address these limitations with Tree-Guided Integrated Factorization (TGIF), a generalizable multitask Non-negative Matrix Factorization (NMF) approach that can be applied to time series or hierarchically related biological conditions. TGIF can identify large-scale changes at compartment or subcompartment levels, as well as local changes at boundaries of topologically associated domains (TADs). Based on benchmarking in simulated and real Hi-C data, TGIF boundaries are more accurate and reproducible across differential levels of noise and sources of technical artifacts, and more enriched in CTCF. Application to three multisample mammalian datasets shows TGIF can detect differential regions at compartment, subcompartment, and boundary levels that are associated with significant changes in regulatory signals and gene expression enriched in tissue-specific processes. Finally, we leverage TGIF boundaries to prioritize sequence variants for multiple phenotypes from the NHGRI GWAS catalog. 
Taken together, TGIF is a flexible tool to examine 3D genome organization dynamics across disease and developmental processes.","PeriodicalId":12678,"journal":{"name":"Genome research","volume":"183 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143666090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome research Pub Date: 2025-03-20 DOI: 10.1101/gr.279180.124
Turid Everitt, Tilman Ronneburg, Daniel Elsner, Anna Olsson, Yuanzhen Liu, Tuuli Larva, Judith Korb, Matthew T. Webster
{"title":"Unexpectedly low recombination rates and presence of hotspots in termite genomes","authors":"Turid Everitt, Tilman Ronneburg, Daniel Elsner, Anna Olsson, Yuanzhen Liu, Tuuli Larva, Judith Korb, Matthew T. Webster","doi":"10.1101/gr.279180.124","DOIUrl":"https://doi.org/10.1101/gr.279180.124","url":null,"abstract":"Meiotic recombination is a fundamental evolutionary process that facilitates adaptation and the removal of deleterious genetic variation. Social Hymenoptera exhibit some of the highest recombination rates among metazoans, whereas high recombination rates have not been found among nonsocial species from this insect order. It is unknown whether elevated recombination rates are a ubiquitous feature of all social insects. In many metazoan taxa, recombination is mainly restricted to hotspots a few kilobases in length. However, little is known about the prevalence of recombination hotspots in insect genomes. Here we infer recombination rate and its fine-scale variation across the genomes of two social species from the insect order Blattodea: the termites <em>Macrotermes bellicosus</em> and <em>Cryptotermes secundus</em>. We used linkage-disequilibrium-based methods to infer recombination rate. We infer that recombination rates are close to 1 cM/Mb in both species, similar to the average metazoan rate. We also observed a highly punctate distribution of recombination in both termite genomes, indicative of the presence of recombination hotspots. We infer the presence of full-length <em>PRDM9</em> genes in the genomes of both species, which suggests recombination hotspots in termites might be determined by <em>PRDM9</em>, as they are in mammals. We also find that recombination rates in genes are correlated with inferred levels of germline DNA methylation. The finding of low recombination rates in termites indicates that eusociality is not universally connected to elevated recombination rate. 
We speculate that the elevated recombination rates in social Hymenoptera are instead promoted by intense selection among haploid males.","PeriodicalId":12678,"journal":{"name":"Genome research","volume":"37 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143666088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}