Breast and prostate cancer expression similarity analysis by iterative SVM based ensemble gene selection
Authors: Darius Coelho, Lee Sael
Published in: Data and Text Mining in Bioinformatics, 2013-11-01
DOI: 10.1145/2512089.2512099 (https://doi.org/10.1145/2512089.2512099)
Citations: 3
Abstract
Epidemiologic and phenotypic evidence indicates that breast and prostate cancers share strong pathological similarities. Analyzing pathological similarities between cancers can be beneficial in several respects, such as enabling knowledge transfer between studies of the individual cancers. To gain knowledge of the similarity between breast and prostate cancer pathology, genes that are commonly affected by the two carcinomas are investigated. Gene expression data extracted from RNA-seq experiments, provided through the TCGA consortium, are used for gene selection. Gene selection was performed using an iterative SVM-based ensemble feature selection approach. Iterative SVM-based gene selection methods allow correlated gene expressions to be considered simultaneously, and the ensemble approach stabilizes the selection. As a result of the analysis, two genes, transglutaminase 4 (TGM4) and complement component 4A (C4A), were selected as commonly altered genes. Direct relationships of the two genes to the two cancers have not been confirmed. However, TGM4 is known to be associated with adenocarcinomas and C4A with ovarian cancer. This provides evidence that they may be pathologically important genes for the two cancers.
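
The abstract does not spell out the elimination schedule, the SVM settings, or how the ensemble is combined, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a linear SVM trained by Pegasos-style subgradient descent, recursive elimination of the lowest-weight features, and bootstrap voting to stabilize the result. All function names, parameters (`drop_frac`, `min_vote_frac`, `lam`), and the synthetic data are hypothetical; no TCGA data is used.

```python
import random
from collections import Counter


def train_linear_svm(X, y, features, rng, epochs=30, lam=0.1):
    """Fit a linear SVM (hinge loss, L2 penalty) on a feature subset using
    Pegasos-style stochastic subgradient descent. Returns {feature: weight}."""
    w = {f: 0.0 for f in features}
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(w[f] * X[i][f] for f in features)
            for f in features:
                w[f] *= 1.0 - eta * lam           # shrink (L2 penalty)
                if margin < 1.0:                  # margin violated: hinge step
                    w[f] += eta * y[i] * X[i][f]
    return w


def svm_rfe(X, y, n_keep, rng, drop_frac=0.5):
    """Iteratively retrain the SVM and drop the features with the smallest
    absolute weights until only n_keep remain."""
    features = list(range(len(X[0])))
    while len(features) > n_keep:
        w = train_linear_svm(X, y, features, rng)
        features.sort(key=lambda f: abs(w[f]), reverse=True)
        n_next = int(len(features) * (1.0 - drop_frac))
        # Always remove at least one feature, never go below n_keep.
        n_next = max(n_keep, min(len(features) - 1, n_next))
        features = features[:n_next]
    return features


def ensemble_select(X, y, n_keep=2, n_rounds=10, min_vote_frac=0.6, seed=0):
    """Run SVM-based elimination on bootstrap resamples and keep the
    features selected in at least min_vote_frac of the rounds."""
    rng = random.Random(seed)
    votes = Counter()
    n = len(X)
    for _ in range(n_rounds):
        rows = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        Xb = [X[i] for i in rows]
        yb = [y[i] for i in rows]
        votes.update(svm_rfe(Xb, yb, n_keep, rng))
    return sorted(f for f, v in votes.items() if v >= min_vote_frac * n_rounds)


def make_toy_data(n_samples=40, n_features=8, seed=1):
    """Synthetic stand-in for expression data: features 0 and 1 track the
    class label (+1 / -1), the remaining features are pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = 1 if i % 2 == 0 else -1
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] = 3.0 * label + rng.gauss(0.0, 0.3)
        row[1] = 3.0 * label + rng.gauss(0.0, 0.3)
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
selected = ensemble_select(X, y)
print("stably selected feature indices:", selected)
```

Because each round retrains the SVM on all surviving features at once, correlated features are weighed jointly rather than scored one at a time, and the vote threshold discards features that only survive on particular resamples. How the selections for the two cancers are combined into a common gene set is not described in the abstract and is not modeled here.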