Authors: Ryan Pellow, J. M. Comeron
Journal: bioRxiv
Publication date: 2024-07-16
DOI: https://doi.org/10.1101/2024.07.12.603291
A wavelet-based approach generates quantitative, scale-free and hierarchical descriptions of 3D genome structures and new biological insights
Eukaryotes fold their genomes within nuclei in three-dimensional space, forming coordinated multiscale structures that include loops, topologically associating domains (TADs), and higher-order chromosome territories. This 3D organization plays essential roles in gene regulation and development, responses to physiological stress, and disease. However, current methodologies for inferring these 3D structures from genomic data have limitations: outcomes that vary with the resolution of the analysis and the sequencing depth, qualitative results that hinder statistical comparisons, lack of insight into the frequency of the structures in samples with many genomes, and no direct inference of hierarchical structures. These shortcomings can make it difficult to rigorously compare 3D properties across genomes, experimental conditions, or species. To address these challenges, we developed a wavelet transform-based method (WaveTAD) that describes 3D nuclear organization in a resolution-free, probabilistic, and hierarchical manner. WaveTAD generates probabilities that capture the variable frequency of structures within samples and shows increased accuracy and sensitivity compared to current approaches. We applied WaveTAD to multiple datasets from Drosophila, mouse, and human to illustrate the new biological insights that this more sensitive and quantitative approach provides, such as the widespread presence of embryonic 3D organization before zygotic genome activation, the effect of multiple CTCF units on the stability of loops and TADs, and the association between gene expression and TAD structures in COVID-19 patients or sex-specific transcription in Drosophila.
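To make the multiscale idea concrete, here is a minimal sketch. It is not WaveTAD: the abstract does not specify the wavelet family, the input signal, or how probabilities are derived, so everything below is an assumption for illustration. The sketch applies a Haar-like step filter at several window sizes to a made-up 1D signal and checks whether a sharp transition is located at the same position at every scale. The names `haar_response` and `strongest_boundary`, the toy signal, and the chosen scales are all hypothetical.

```python
def haar_response(signal, scale):
    """Haar-like response at every gap between bins.

    Position i is the gap just before signal[i]. The response is the mean of
    the `scale` values to the right minus the mean of the `scale` values to
    the left (normalized by window length, not the usual 1/sqrt(scale)).
    Positions without two full windows get 0.0.
    """
    out = []
    for i in range(len(signal) + 1):
        left = signal[max(0, i - scale):i]
        right = signal[i:i + scale]
        if len(left) < scale or len(right) < scale:
            out.append(0.0)  # not enough data for a full window on both sides
        else:
            out.append(sum(right) / scale - sum(left) / scale)
    return out


def strongest_boundary(signal, scale):
    """Index of the position with the largest absolute response."""
    response = haar_response(signal, scale)
    return max(range(len(response)), key=lambda i: abs(response[i]))


# Toy signal (made up): a low plateau followed by a high plateau.
toy_signal = [1.0] * 8 + [5.0] * 8

# Hypothetical scales; the same transition should stand out at each one.
boundaries = {scale: strongest_boundary(toy_signal, scale) for scale in (2, 4, 8)}
print(boundaries)  # {2: 8, 4: 8, 8: 8}
```

In this toy case the transition at position 8 is recovered at all three window sizes, which is the sense in which a multiscale filter can describe a feature without committing to a single analysis resolution. A real analysis would start from contact-frequency data and would need a statistical model to turn responses into probabilities, neither of which is attempted here.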