A Latent Feature Model Approach to Biclustering
J. Caldas, Samuel Kaski
Int. J. Knowl. Discov. Bioinform., published 2016-06-01
DOI: 10.4018/IJKDB.2016070102 (https://doi.org/10.4018/IJKDB.2016070102)
Citation count: 0
Abstract
Biclustering is the unsupervised learning task of mining a data matrix for useful submatrices, for instance groups of genes that are co-expressed under particular biological conditions. Because these submatrices are expected to partly overlap, a significant challenge in biclustering is developing methods that can detect overlapping biclusters. The authors propose a probabilistic mixture modelling framework for biclustering biological data that accommodates various data types and allows biclusters to overlap. Their framework is akin to the latent feature and mixture-of-experts model families, with inference and parameter estimation performed via a variational expectation-maximization algorithm. The model compares favorably with competing approaches on both a binary DNA copy number variation data set and a miRNA expression data set, indicating that it may serve as a general problem-solving tool in biclustering.
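To make the idea of overlapping biclusters concrete, the following is a minimal sketch of how a latent-feature-style representation can encode them. It is not the authors' model or inference algorithm: the function name `bicluster_coverage`, the binary membership matrices, and the toy 4 x 4 example are all assumptions introduced here for illustration. Each row (e.g. a gene) and each column (e.g. a condition) carries one binary indicator per bicluster, and a cell belongs to a bicluster when both its row and its column are members.

```python
from typing import List, Tuple


def bicluster_coverage(
    row_member: List[List[int]], col_member: List[List[int]]
) -> List[List[int]]:
    """Count, for every cell (i, j), how many biclusters contain it.

    row_member[i][k] is 1 if row i belongs to bicluster k, else 0.
    col_member[j][k] is 1 if column j belongs to bicluster k, else 0.
    """
    n_biclusters = len(row_member[0])
    return [
        [sum(r[k] * c[k] for k in range(n_biclusters)) for c in col_member]
        for r in row_member
    ]


# Hypothetical toy data: 4 rows (genes), 4 columns (conditions), 2 biclusters.
row_member = [[1, 0], [1, 1], [0, 1], [0, 0]]
col_member = [[1, 0], [1, 1], [0, 1], [0, 1]]

coverage = bicluster_coverage(row_member, col_member)

# Cells covered by more than one bicluster are where the biclusters overlap.
overlapping: List[Tuple[int, int]] = [
    (i, j)
    for i, row in enumerate(coverage)
    for j, count in enumerate(row)
    if count > 1
]

for row in coverage:
    print(row)
print("overlapping cells:", overlapping)
```

In this toy setup, a hard partition of rows and columns could not represent the cell shared by both biclusters, which is the situation the abstract identifies as the main challenge. A probabilistic model would additionally attach a data-generating distribution to each bicluster and infer the memberships from data; that part is omitted here.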