J. Rosa, A. Magpantay, A. Gonzaga, Geoffrey A. Solano
IISA 2014, The 5th International Conference on Information, Intelligence, Systems and Applications. Published 2014-07-07. DOI: 10.1109/IISA.2014.6878769 (https://doi.org/10.1109/IISA.2014.6878769)
Cluster center genes as candidate biomarkers for the classification of Leukemia
Modern technologies such as DNA microarrays have been developed to study the transcriptome of cancer cells. They have been used in many studies for tumor classification and for the identification of marker genes associated with cancer. However, this technique often suffers from the 'curse of dimensionality': far more genes are measured than samples are available. A general approach to overcoming this problem is to perform feature selection prior to classification. Biomarkers have long been used for the prognosis and diagnosis of different types of diseases, and there is a need for new and more specific biomarkers for leukemia. In this study, gene selection began with gene filtering: the inter-quartile range (IQR) of each gene's expression was determined, and the Kruskal-Wallis test, a rank-based one-way analysis of variance (ANOVA), was used to determine whether each gene is differentially expressed across the sample types. The filtered genes were then subjected to the k-means clustering algorithm to identify candidate genes that can discriminate the four main types of leukemia (ALL, AML, CLL, CML) and non-leukemia (NoL) bone marrow samples. The selected genes were then used to build classification models with the Support Vector Machine (SVM) and Artificial Neural Network (ANN) learning algorithms. Forty samples were used to build the models and 20 samples to assess the models' performance. A minimum of 6 genes was found to be needed to correctly classify all samples in the training dataset into the five categories and to classify the samples in the validation dataset with high accuracy.
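The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' implementation. The IQR cutoff (`iqr_min`), the significance level (`alpha`), the rule of picking the gene nearest each k-means centroid, the linear SVM kernel, and the function name `select_cluster_center_genes` are all assumptions introduced here. The abstract does not state these details, and the ANN model is omitted for brevity.

```python
import numpy as np
from scipy.stats import kruskal
from sklearn.cluster import KMeans
from sklearn.svm import SVC


def select_cluster_center_genes(X, y, iqr_min=0.5, alpha=0.01, n_genes=6, seed=0):
    """Return column indices of candidate genes from X (samples x genes).

    All thresholds are hypothetical defaults chosen for illustration.
    """
    # Step 1: keep genes whose expression inter-quartile range exceeds a cutoff.
    q75, q25 = np.percentile(X, [75, 25], axis=0)
    keep = np.flatnonzero((q75 - q25) > iqr_min)

    # Step 2: keep genes that differ across sample types (Kruskal-Wallis test).
    classes = np.unique(y)
    pvals = np.array(
        [kruskal(*[X[y == c, g] for c in classes]).pvalue for g in keep]
    )
    keep = keep[pvals < alpha]
    if keep.size < n_genes:
        raise ValueError("fewer genes survive filtering than clusters requested")

    # Step 3: cluster the surviving genes; each gene is a point described
    # by its expression profile across the samples.
    profiles = X[:, keep].T
    km = KMeans(n_clusters=n_genes, n_init=10, random_state=seed).fit(profiles)

    # Step 4 (assumed rule): within each cluster, take the member gene
    # closest to the cluster centroid as that cluster's representative.
    dist = km.transform(profiles)  # shape: (n_filtered_genes, n_clusters)
    selected = []
    for k in range(n_genes):
        members = np.flatnonzero(km.labels_ == k)
        selected.append(keep[members[np.argmin(dist[members, k])]])
    return np.array(selected)


# Synthetic stand-in for the expression matrix: 60 samples, 5 classes
# (12 samples each), 200 genes, of which the first 30 are class-dependent.
rng = np.random.default_rng(0)
y = np.repeat(np.arange(5), 12)
X = rng.normal(0.0, 0.1, size=(60, 200))           # low-variance background genes
class_means = rng.normal(0.0, 2.0, size=(5, 30))   # per-class mean expression
X[:, :30] = class_means[y] + rng.normal(0.0, 0.3, size=(60, 30))

# 40 samples (8 per class) for training, 20 (4 per class) for validation.
train = np.tile(np.arange(12) < 8, 5)

# Select genes on the training samples only, then fit and score an SVM.
genes = select_cluster_center_genes(X[train], y[train])
clf = SVC(kernel="linear").fit(X[train][:, genes], y[train])
accuracy = clf.score(X[~train][:, genes], y[~train])
print("selected genes:", genes, "validation accuracy:", accuracy)
```

Gene selection is run on the training samples alone so that the 20 validation samples play no part in choosing the features. That is a design choice of this sketch; the abstract does not say how the authors handled it.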