A Transcriptional Approach to Gene Clustering
I. Tagkopoulos
2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology
DOI: https://doi.org/10.1109/CIBCB.2005.1594921

We present an integrative method for clustering coregulated genes and elucidating their underlying regulatory mechanisms. We use multi-state partition functions and thermodynamic models to derive six distinct correlation classes that correspond to various protein-protein and protein-DNA interactions. We then introduce a biclustering algorithm that groups genes based on the correlations exhibited in their expression profiles. We evaluate the functional enrichment and statistical significance of the resulting clusters using precision-recall curves. Our results show that classification performance can be optimized by selecting the corresponding correlation class. Additionally, there is a significant improvement over single-class biclustering when we use multi-class support vector machines with biclustering scores as features. Furthermore, analysis of the upstream regions of all genes in each cluster shows that the derived correlation classes capture the expression of genes with shared regulation. We identify over a hundred highly conserved sequences, twenty-one of which match well-known regulatory motifs. Further analysis of the identified conserved sequences not only explains the classification performance but also serves as an indicator of the regulatory correlation for various groups.
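
To make the evaluation step concrete, the following is a minimal sketch of how a precision-recall curve might be computed for a correlation-based gene grouping. It is not the paper's biclustering algorithm: it assumes a single seed expression profile, plain Pearson correlation as the ranking score, and a known set of functionally annotated genes as ground truth. The function names, gene names, and expression values are all hypothetical and made up for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def precision_recall_curve(profiles, seed, positives):
    """Rank genes by correlation with the seed profile, then record
    (precision, recall) after each successive gene joins the cluster."""
    ranked = sorted(profiles, key=lambda g: pearson(profiles[g], seed),
                    reverse=True)
    curve, hits = [], 0
    for k, gene in enumerate(ranked, start=1):
        if gene in positives:
            hits += 1
        curve.append((hits / k, hits / len(positives)))
    return ranked, curve


# Toy data (assumed, not from the paper): four genes, five conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.0, 4.1, 6.2, 7.9, 10.0],
    "geneC": [5.0, 4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 5.0, 4.0],
}
seed = [1.0, 2.0, 3.0, 4.0, 5.0]
positives = {"geneA", "geneB"}  # hypothetical shared functional annotation

ranked, curve = precision_recall_curve(profiles, seed, positives)
for gene, (p, r) in zip(ranked, curve):
    print(f"{gene}: precision={p:.2f} recall={r:.2f}")
```

In this toy case the two annotated genes rank first, so precision stays at 1.0 until recall reaches 1.0 and then falls as unannotated genes are added. Comparing such curves across different correlation measures is one way to see which measure best recovers a given functional group.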