An adaptive meta-clustering approach: combining the information from different clustering results
Yujing Zeng, Jianshan Tang, J. Garcia-Frías, G. Gao
Proceedings. IEEE Computer Society Bioinformatics Conference, pp. 276-287. Published 2002-08-14.
DOI: 10.1109/CSB.2002.1039350 (https://doi.org/10.1109/CSB.2002.1039350)
Citations: 67
Abstract
With the development of microarray techniques, there is an increasing need for information processing methods that can analyze high-throughput data. Clustering is one of the most promising candidates because of its simplicity, flexibility, and robustness. However, there is no "perfect" clustering approach that outperforms all of its counterparts, and it is hard to evaluate and combine the results of different techniques, especially in a field with little prior knowledge, such as bioinformatics. This paper proposes a meta-clustering approach that extracts information from the results of different clustering techniques, so that a better interpretation of the data distribution can be obtained. A special distance measure is defined to represent the statistical "signal" of each cluster produced by the various clustering techniques. The algorithm is applied to both artificial and real data. Simulations show that the proposed approach is able to extract information efficiently and accurately from the input clustering structure.
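The general idea of clustering the clusters themselves can be illustrated with a minimal sketch. The abstract does not specify the paper's distance measure or its adaptive procedure, so the code below substitutes a plain Jaccard distance between cluster member sets and a fixed-threshold single-linkage grouping. The function names, the `threshold` parameter, and the toy label vectors are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def clusters_from_labels(labels):
    """Turn one label vector into a list of clusters (frozensets of item indices)."""
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, set()).add(idx)
    return [frozenset(members) for members in groups.values()]


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|. A stand-in, NOT the paper's distance measure."""
    return 1.0 - len(a & b) / len(a | b)


def meta_cluster(labelings, threshold=0.5):
    """Group clusters from all labelings whose Jaccard distance is <= threshold.

    Uses single linkage implemented with union-find. Returns a list of
    meta-clusters, each a list of frozensets of item indices.
    """
    pool = [c for labels in labelings for c in clusters_from_labels(labels)]
    parent = list(range(len(pool)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(pool)), 2):
        if jaccard_distance(pool[i], pool[j]) <= threshold:
            parent[find(i)] = find(j)

    merged = {}
    for i, cluster in enumerate(pool):
        merged.setdefault(find(i), []).append(cluster)
    return list(merged.values())


# Hypothetical label vectors from three clustering methods over six items.
labelings = [
    [0, 0, 0, 1, 1, 1],
    [5, 5, 7, 7, 7, 7],
    [1, 1, 1, 2, 2, 2],
]
metas = meta_cluster(labelings)
for group in metas:
    print([sorted(c) for c in group])
```

In this toy input the three methods disagree only on where item 2 belongs, so the six input clusters fall into two groups. A fixed threshold is the simplest possible choice; an adaptive scheme such as the one the title refers to would presumably choose the grouping from the data rather than from a constant.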