CVis — Towards a novel visualization tool to explore the relationship between input and output partitions in multi-objective clustering ensembles
Authors: Katti Faceli, T. Sakata, J. Handl
Published in: 2017 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), 2017-08-01
DOI: 10.1109/CIBCB.2017.8058567 (https://doi.org/10.1109/CIBCB.2017.8058567)
Citations: 0
Abstract
Ensemble methods for clustering take a collection of input partitions, produced for the same data set, and generate an ensemble partition that tries to preserve the information carried by this collection. Acceptance of the resulting partition(s) by decision makers can be a problem, due to the inherent complexity of ensemble techniques and the associated lack of intuition about how a consensus has been derived from the original set of input partitions. This problem is exacerbated in multi-objective ensemble techniques, which generate a set of non-dominated consensus partitions. In this context, the selection of a final candidate clustering may require additional insight into the relationships between non-dominated output partitions. In this manuscript, we describe the first prototype of a novel visualization tool, CVis, which has been developed as a general tool to provide insight into the relationships among any set of partitions of a given data set. We proceed to demonstrate the specific use of this tool in understanding the relationships within the set of input partitions, within the set of output partitions, and between inputs and outputs for the multi-objective ensemble technique MOCLE. We discuss how the interlinked analysis of such sets of partitions can shed light on the functioning, strengths, and limitations of a particular ensemble technique. In particular, the tool facilitates the visual analysis of the level of support identified for individual consensus clusters, which is helpful in explaining final solutions to a decision maker.
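The abstract does not say how CVis quantifies the "level of support" for a consensus cluster. As a purely illustrative sketch, and not the authors' method, one plausible measure is the best Jaccard overlap between a consensus cluster and the clusters of each input partition. The function names (`clusters_of`, `jaccard`, `cluster_support`) and the toy data are assumptions introduced here for illustration only.

```python
from typing import Dict, List, Set


def clusters_of(labels: List[int]) -> Dict[int, Set[int]]:
    """Group object indices by their cluster label."""
    groups: Dict[int, Set[int]] = {}
    for index, label in enumerate(labels):
        groups.setdefault(label, set()).add(index)
    return groups


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard similarity of two index sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_support(consensus_cluster: Set[int],
                    input_partitions: List[List[int]]) -> List[float]:
    """Hypothetical support measure: for each input partition, the highest
    Jaccard overlap between the consensus cluster and any of its clusters.
    Assumes every input partition is a non-empty label list."""
    return [
        max(jaccard(consensus_cluster, members)
            for members in clusters_of(labels).values())
        for labels in input_partitions
    ]


# Toy data (assumed): three input partitions of six objects,
# each given as one cluster label per object.
input_partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 1],
]
consensus_cluster = {0, 1, 2}  # one cluster from a hypothetical consensus partition

support = cluster_support(consensus_cluster, input_partitions)
for number, value in enumerate(support, start=1):
    print(f"input partition {number}: support {value:.2f}")
```

A value of 1.0 means some input partition contains exactly that cluster, while lower values indicate that the consensus cluster is only partially backed by that input. Other agreement measures could be substituted without changing the overall structure.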