Identifying the most significant genes from gene expression profiles for sample classification
H. Al-Mubaid, Noushin Ghaffari
2006 IEEE International Conference on Granular Computing, 2006-05-10
DOI: 10.1109/GRC.2006.1635887 (https://doi.org/10.1109/GRC.2006.1635887)
Microarray technology measures the expression of thousands of genes simultaneously, producing large amounts of biomedical data in the form of gene expression profiles. These data capture complex variations in the expression levels of thousands of genes across the classes of samples. These gene-level variations allow samples to be classified and clustered using only a small subset of genes. In this work, we aim to identify the most significant genes, that is, those with the highest capability to discriminate between the classes of samples. We present a new gene selection technique for extracting the most significant genes from the huge gene/feature space of a given gene expression dataset. Our method computes the discriminating capability of each gene and classifies the data using only the most significant genes, those with the highest discriminating capabilities. For comparison, we also adapted five feature selection techniques from text categorization and information retrieval to the gene selection task. We evaluated the method on four well-known gene expression datasets. The experimental results showed that our method achieves competitive classification performance with few selected genes compared with the existing techniques.
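
The abstract does not state how the discriminating capability of a gene is computed, so the following is only a minimal sketch of the general score-rank-classify workflow, not the authors' method. It assumes two classes labelled 0 and 1, substitutes a simple signal-to-noise style score (absolute difference of class means divided by the sum of class standard deviations) for the unspecified measure, and uses a nearest-centroid classifier as a placeholder. The function names `discriminating_scores`, `select_top_genes`, and `nearest_centroid_predict`, and the synthetic data, are hypothetical.

```python
import numpy as np


def discriminating_scores(X, y):
    """Score every gene (column of X) for a two-class problem.

    Assumed stand-in score: |mean_0 - mean_1| / (std_0 + std_1).
    The paper's actual measure is not given in the abstract.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    a, b = X[y == 0], X[y == 1]
    eps = 1e-12  # avoids division by zero for constant genes
    diff = np.abs(a.mean(axis=0) - b.mean(axis=0))
    return diff / (a.std(axis=0) + b.std(axis=0) + eps)


def select_top_genes(X, y, k):
    """Return the column indices of the k highest-scoring genes."""
    scores = discriminating_scores(X, y)
    return np.argsort(scores)[::-1][:k]


def nearest_centroid_predict(X_train, y_train, X_test, genes):
    """Classify test samples using only the selected genes.

    Each sample is assigned to the class whose mean profile
    (centroid) over the selected genes is closest in squared
    Euclidean distance. This classifier is a placeholder choice.
    """
    y_train = np.asarray(y_train)
    Xtr = np.asarray(X_train, dtype=float)[:, genes]
    Xte = np.asarray(X_test, dtype=float)[:, genes]
    classes = np.unique(y_train)
    centroids = np.stack([Xtr[y_train == c].mean(axis=0) for c in classes])
    dists = ((Xte[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dists.argmin(axis=1)]


# Synthetic illustration: 40 samples, 200 genes, of which only the
# first three differ between the two classes.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 200))
y_demo = np.repeat([0, 1], 20)
X_demo[y_demo == 1, :3] += 5.0

top_genes = select_top_genes(X_demo, y_demo, k=3)
predictions = nearest_centroid_predict(X_demo, y_demo, X_demo, top_genes)
print("selected genes:", sorted(top_genes.tolist()))
print("training accuracy:", (predictions == y_demo).mean())
```

In a real evaluation the genes would be selected on training samples only and accuracy measured on held-out samples (for example with cross-validation), since selecting genes on the full dataset inflates the reported performance.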