T. Lončar-Turukalo, I. Lazić, Nina Maljkovic, S. Brdar
Title: Clustering of Microbiome Data: Evaluation of Ensemble Design Approaches
Venue: IEEE EUROCON 2019 - 18th International Conference on Smart Technologies
Published: 2019-07-01
DOI: 10.1109/EUROCON.2019.8861929
Citations: 4
Abstract
Research on the human microbiome is moving towards uncovering its association with overall wellbeing and applying this knowledge in personalized medicine and connected health. Driven by more affordable high-throughput sequencing, the rate of microbiome data generation has increased, enabling efficient implementation of data-driven algorithms. This study evaluates the possibility of identifying clusters in human microbiome data based on taxonomic profiles, relying on 24 different $\beta$-diversity measures and on individual and ensemble clustering approaches. The influence of ensemble creation techniques and parameter selection on the robustness and quality of the consensus partition was explored. Furthermore, we evaluated changes in clustering performance after dimensionality reduction. The results indicate that careful selection of algorithm parameters and ensemble design is needed to ensure a stable consensus partition. Reducing the number of input features using kernel principal component analysis is accompanied by a loss of discrimination potential.
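The ensemble idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and every specific in it is an assumption made for illustration: three dissimilarity measures (Bray-Curtis, Jaccard on presence/absence, Euclidean) stand in for the 24 $\beta$-diversity measures, average-linkage agglomerative clustering serves as the base algorithm, a co-association matrix serves as the consensus function, and the taxon counts are invented toy data. The function names are hypothetical.

```python
import math


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    return sum(abs(a - b) for a, b in zip(u, v)) / sum(a + b for a, b in zip(u, v))


def jaccard(u, v):
    """Jaccard dissimilarity on presence/absence of taxa."""
    present_u = {i for i, x in enumerate(u) if x > 0}
    present_v = {i for i, x in enumerate(v) if x > 0}
    union = present_u | present_v
    return 1.0 - len(present_u & present_v) / len(union) if union else 0.0


def distance_matrix(samples, measure):
    """Full pairwise dissimilarity matrix for one measure."""
    return [[measure(u, v) for v in samples] for u in samples]


def average_linkage(dist, k):
    """Naive agglomerative clustering (average linkage) down to k clusters."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)  # merge the closest pair
    labels = [0] * len(dist)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def co_association(partitions):
    """Fraction of base partitions in which each pair of samples co-clusters."""
    n = len(partitions[0])
    return [
        [sum(p[i] == p[j] for p in partitions) / len(partitions) for j in range(n)]
        for i in range(n)
    ]


# Invented toy counts: 6 samples x 4 taxa, two visibly different groups.
samples = [
    [10, 8, 0, 0],
    [9, 7, 1, 0],
    [11, 9, 0, 1],
    [0, 1, 9, 10],
    [1, 0, 8, 12],
    [0, 0, 10, 9],
]

measures = [bray_curtis, jaccard, math.dist]  # math.dist is Euclidean distance

# One base partition per dissimilarity measure.
partitions = [average_linkage(distance_matrix(samples, m), k=2) for m in measures]

# Consensus: cluster again on 1 - co-association.
co = co_association(partitions)
consensus = average_linkage([[1.0 - c for c in row] for row in co], k=2)
print(consensus)
```

In this sketch the ensemble is diversified only by the choice of dissimilarity measure. Varying the number of clusters or the base algorithm across ensemble members would be another way to build the ensemble, and it is the kind of design choice whose effect on the consensus partition the study evaluates.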