A supervised learning approach to the unsupervised clustering of genes

2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). Pub Date: 2010-12-01. DOI: 10.1109/BIBM.2010.5706585

Andrew K. Rider, Geoffrey H. Siwo, S. Emrich, M. Ferdig, N. Chawla


Citation count: 0

Abstract

Clustering is a common step in the analysis of microarray data. Microarrays enable simultaneous, high-throughput measurement of gene expression levels. These data can be used to explore relationships between genes and can guide drug development and further research. A typical first step in the analysis is to apply an agglomerative hierarchical clustering algorithm to the correlation between all gene pairs. While this simple approach has been successful, it fails to identify many genetic interactions that may be important for drug design and other applications. We present an approach to clustering expression data that uses known gene-gene interaction data to improve the results of commonly used clustering techniques. The approach creates an ensemble similarity measure that can be used as input to common clustering techniques, and it provides results with increased biological significance without altering the clustering approach itself.
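The abstract does not say how the ensemble similarity measure is built, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes two per-pair features (Pearson and a rank-based correlation), a small logistic regression trained on gene pairs labelled as interacting or not, and average-linkage hierarchical clustering from SciPy. The function names, the feature choice, and the classifier are all hypothetical; the only idea taken from the abstract is that a learned pairwise similarity replaces plain correlation as input to an unchanged clustering step.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def pairwise_features(expr):
    """Stack per-pair similarity features into an (n_genes, n_genes, 2) array.

    The two features are an assumption made for illustration.
    """
    pearson = np.corrcoef(expr)
    # Rank-transform each gene's profile so np.corrcoef gives a Spearman-style
    # correlation (ties are not averaged, which is fine for continuous values).
    ranks = expr.argsort(axis=1).argsort(axis=1).astype(float)
    spearman = np.corrcoef(ranks)
    return np.stack([pearson, spearman], axis=-1)


def _sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, y, lr=0.5, steps=500):
    """Plain gradient-descent logistic regression; the last weight is the bias."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        w -= lr * Xb.T @ (_sigmoid(Xb @ w) - y) / len(y)
    return w


def ensemble_similarity(expr, known_pairs, known_labels):
    """Learn a pairwise similarity from known interactions (hypothetical helper).

    known_pairs: list of (i, j) gene index pairs with known status.
    known_labels: 1 if the pair is a known interaction, else 0.
    """
    feats = pairwise_features(expr)
    X = np.array([feats[i, j] for i, j in known_pairs])
    w = fit_logistic(X, np.asarray(known_labels, dtype=float))
    n = expr.shape[0]
    flat = feats.reshape(n * n, -1)
    # Predicted interaction probability for every gene pair is the similarity.
    sim = _sigmoid(flat @ w[:-1] + w[-1]).reshape(n, n)
    sim = (sim + sim.T) / 2.0
    np.fill_diagonal(sim, 1.0)
    return sim


def cluster_from_similarity(sim, n_clusters):
    """Unchanged clustering step: average-linkage agglomerative clustering."""
    dist = 1.0 - sim
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")
```

Here `expr` is assumed to be a genes-by-samples array. Passing `np.corrcoef(expr)` rescaled to [0, 1] into `cluster_from_similarity` instead of the learned matrix would give the correlation-only baseline the abstract describes, with the clustering code itself left untouched.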
