A graph-based elastic net for variable selection and module identification for genomic data analysis
Authors: Zheng Xia, Xiao-feng Zhou, Wei Chen, Chunqi Chang
Published in: 2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2010-12-01
DOI: https://doi.org/10.1109/BIBM.2010.5706591
Citations: 3
Abstract
Recently, a network-constrained regression model [1] was proposed to incorporate prior biological knowledge into regression and variable selection. In that method, an l1-norm penalty on the coefficients imposes sparsity, while a Laplacian penalty on the biological graph encourages smoothness of the coefficients along the network. However, the grouping effect of this Laplacian smoothness penalty exists only when the two connected genes both have positive or both have negative effects on the response. To overcome this problem, we propose applying the Laplacian operation to the absolute values of the coefficients, so that both positive and negative effects are accounted for. We call the method graph-based elastic net (GENet) because it has a grouping effect similar to that of the elastic net (ENet) [2], except that in GENet the smoothness between two coefficients is specified by the network. Further, an efficient algorithm in the same spirit as LARS [3] is developed to solve the optimization problem. Simulation studies showed that the proposed method performs better than network-constrained regularization without absolute values. Application to an Alzheimer's disease (AD) microarray gene-expression dataset identified several subnetworks on Kyoto Encyclopedia of Genes and Genomes (KEGG) transcriptional pathways that are related to the progression of AD. Many of these findings are confirmed by the published literature.
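The sign problem described in the abstract can be illustrated with a small numerical sketch. The abstract does not give the exact penalty formulas, so the code below assumes an unnormalized graph Laplacian L = D - A and the quadratic forms b^T L b (without absolute values) and |b|^T L |b| (with absolute values). The function names are hypothetical, and the paper may use a degree-normalized Laplacian instead.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Unnormalized Laplacian L = D - A (an assumption; not taken from the paper)."""
    adjacency = np.asarray(adjacency, dtype=float)
    return np.diag(adjacency.sum(axis=1)) - adjacency


def laplacian_penalty(beta, laplacian):
    """Smoothness penalty on the raw coefficients: beta^T L beta."""
    beta = np.asarray(beta, dtype=float)
    return float(beta @ laplacian @ beta)


def abs_laplacian_penalty(beta, laplacian):
    """Smoothness penalty on the absolute values: |beta|^T L |beta|."""
    return laplacian_penalty(np.abs(beta), laplacian)


# Two genes joined by a single edge in the network.
A = np.array([[0, 1],
              [1, 0]])
L = graph_laplacian(A)

same_sign = np.array([1.0, 1.0])       # both genes have a positive effect
opposite_sign = np.array([1.0, -1.0])  # equal magnitude, opposite effects

print(laplacian_penalty(same_sign, L))          # 0.0
print(laplacian_penalty(opposite_sign, L))      # 4.0
print(abs_laplacian_penalty(opposite_sign, L))  # 0.0
```

Under these assumptions, the penalty without absolute values treats two connected genes with equal-magnitude but opposite-sign effects as maximally non-smooth, whereas the absolute-value version treats them the same as a same-sign pair.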