Motif distribution in genomes gives insights into gene clustering and co-regulation

IF 13.1 · Zone 2, Biology · JCR Q1, Biochemistry & Molecular Biology

Nucleic Acids Research · Pub Date: 2024-12-11 · DOI: 10.1093/nar/gkae1178

Atreyi Chakraborty, Sumant Chopde, Mallur Srivatsan Madhusudhan


Citations: 0


Motif distribution in genomes gives insights into gene clustering and co-regulation

We read the genome as proteins in the cell would – by studying the distributions of 5–6 base motifs of DNA in the whole genome or smaller stretches such as parts of, or whole chromosomes. This led us to some interesting findings about motif clustering and chromosome organization. It is quite clear that the motif distribution in genomes is not random at the length scales we examined: 1 kb to entire chromosomes. The observed-to-expected (OE) ratios of motif distributions show strong correlations in pairs of chromosomes that are susceptible to translocations. With the aid of examples, we suggest that similarity in motif distributions in promoter regions of genes could imply co-regulation. A simple extension of this idea empowers us with the ability to construct gene regulatory networks. Further, we could make inferences about the spatial proximity of genomic fragments using these motif distributions. Spatially proximal regions, as deduced by Hi-C or pcHi-C, were ∼3.5 times more likely to have their motif distributions correlated than non-proximal regions. These correlations had strong contributions from the CTCF protein recognizing motifs which are known markers of topologically associated domains. In general, correlating genomic regions by motif distribution comparisons alone is rife with functional information.
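The observed-to-expected (OE) comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the expected counts are derived, so this example assumes a simple zero-order background (expected count of a motif = product of its single-base frequencies times the number of windows) and a Pearson correlation between regions; the function names `kmer_counts`, `oe_ratios`, `pearson`, and `oe_correlation` are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product
from math import sqrt

BASES = "ACGT"


def kmer_counts(seq, k):
    """Count overlapping k-mers, skipping windows with non-ACGT characters."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if all(b in BASES for b in kmer):
            counts[kmer] += 1
    return counts


def oe_ratios(seq, k=5):
    """Observed-to-expected ratio for every possible k-mer.

    ASSUMPTION: the expected count uses independent single-base
    frequencies of the same sequence (a zero-order background model).
    """
    seq = seq.upper()
    counts = kmer_counts(seq, k)
    base_counts = Counter(b for b in seq if b in BASES)
    total = sum(base_counts.values())
    n_windows = sum(counts.values())
    ratios = {}
    for kmer in map("".join, product(BASES, repeat=k)):
        if total == 0:
            ratios[kmer] = 0.0
            continue
        p = 1.0
        for b in kmer:
            p *= base_counts[b] / total
        expected = p * n_windows
        ratios[kmer] = counts[kmer] / expected if expected > 0 else 0.0
    return ratios


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def oe_correlation(seq_a, seq_b, k=5):
    """Correlate the OE profiles of two regions over all k-mers."""
    ra, rb = oe_ratios(seq_a, k), oe_ratios(seq_b, k)
    kmers = sorted(ra)
    return pearson([ra[m] for m in kmers], [rb[m] for m in kmers])
```

Under these assumptions, two promoter regions or Hi-C fragments would each be passed as a sequence string to `oe_correlation`, and a high coefficient would flag them as candidates for the kind of co-regulation or spatial proximity the abstract discusses. A different background model (for example, one based on shorter k-mer frequencies) could be substituted in `oe_ratios` without changing the rest of the sketch.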


Source journal

Nucleic Acids Research (Biology: Biochemistry & Molecular Biology)

CiteScore: 27.10
Self-citation rate: 4.70%
Articles published: 1057
Review time: 2 months

About the journal: Nucleic Acids Research (NAR) is a scientific journal that publishes research on various aspects of nucleic acids and proteins involved in nucleic acid metabolism and interactions. It covers areas such as chemistry and synthetic biology, computational biology, gene regulation, chromatin and epigenetics, genome integrity, repair and replication, genomics, molecular biology, nucleic acid enzymes, RNA, and structural biology. The journal also includes a Survey and Summary section for brief reviews. Additionally, each year, the first issue is dedicated to biological databases, and an issue in July focuses on web-based software resources for the biological community. Nucleic Acids Research is indexed by several services including Abstracts on Hygiene and Communicable Diseases, Animal Breeding Abstracts, Agricultural Engineering Abstracts, Agbiotech News and Information, BIOSIS Previews, CAB Abstracts, and EMBASE.