Chun-Ping Yu, Zhi Thong Soh, Maloyjo Joyraj Bhattacharjee, Wen-Hsiung Li
PLoS ONE 20(7): e0329226. Published 2025-07-30. DOI: 10.1371/journal.pone.0329226 (https://doi.org/10.1371/journal.pone.0329226). Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12310040/pdf/
Positional distribution of transcription factor binding sites in the human genome.
As transcription factors (TFs) play a major role in gene regulation, we studied their binding motifs (positional weight matrices, PWMs) and binding sites (TFBSs) in the human genome, and how TFs bind DNA motifs, including the involvement of binding co-factors. Using the chromatin immunoprecipitation sequencing data recently released by ENCODE (Encyclopedia of DNA Elements), we obtained new PWMs for 196 TFs and revised PWMs for 119 TFs. From these and the PWMs previously obtained for 235 TFs, we inferred the canonical PWMs for 500 TFs, including 243 new PWMs. Analysis revealed that most TFBSs are in introns (42.6%) and intergenic regions (31.6%), with only 11.3% in promoters. However, the TFBS density is considerably higher in promoters, showing a bell-shaped distribution of TFBSs with a peak at the transcription start site. Many TFBSs lie close to CTCF (CCCTC-binding factor) binding sites. Tethered binding is far more frequent than co-binding, with the latter often requiring co-factors.
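The abstract describes binding motifs as positional weight matrices. As a rough illustration of how a PWM is used to call candidate binding sites, the sketch below scores each window of a sequence by its log-odds against a background. The matrix `TOY_PWM`, the uniform background, the threshold, and the function names are all hypothetical and chosen for illustration; they are not the PWMs or the pipeline from the paper.

```python
import math

# Toy motif (hypothetical, NOT from the paper): one dict of base
# probabilities per motif position. Its consensus is "ACGT".
TOY_PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumed uniform base composition


def log_odds_score(site, pwm=TOY_PWM, background=BACKGROUND):
    """Sum of per-position log2(probability / background) for one site.

    Only the bases A, C, G and T are handled; any other symbol raises KeyError.
    """
    if len(site) != len(pwm):
        raise ValueError("site length must equal motif length")
    return sum(math.log2(col[base] / background) for col, base in zip(pwm, site))


def scan(sequence, pwm=TOY_PWM, threshold=0.0):
    """Return (offset, score) for every window scoring above the threshold."""
    width = len(pwm)
    hits = []
    for i in range(len(sequence) - width + 1):
        score = log_odds_score(sequence[i:i + width], pwm)
        if score > threshold:
            hits.append((i, score))
    return hits


if __name__ == "__main__":
    # Only the window starting at offset 2 ("ACGT") matches the toy consensus.
    print(scan("TTACGTTT"))
```

A real analysis would also scan the reverse strand and choose the threshold from a score distribution rather than a fixed cutoff; both are omitted here to keep the sketch short.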
About the journal:
PLOS ONE is an international, peer-reviewed, open-access, online publication. PLOS ONE welcomes reports on primary research from any scientific discipline. It provides:
* Open-access—freely accessible online, authors retain copyright
* Fast publication times
* Peer review by expert, practicing researchers
* Post-publication tools to indicate quality and impact
* Community-based dialogue on articles
* Worldwide media coverage