Microarray data analysis for cancer classification
A. Osareh, B. Shadgar
2010 5th International Symposium on Health Informatics and Bioinformatics, 2010-04-20
DOI: 10.1109/HIBIT.2010.5478893 (https://doi.org/10.1109/HIBIT.2010.5478893)
Citations: 32
Abstract
Cancer diagnosis is one of the most important emerging clinical applications of gene expression microarray data. In this work, we aim to develop an automated system for robust and reliable cancer diagnosis based on gene microarray data. Support vector machine classifiers outperform other popular classifiers, such as K nearest neighbours, naive Bayes, neural networks and decision trees, often to a remarkable degree. We choose a set of 9 publicly available benchmark microarray datasets that encompass both binary and multi-class cancer problems. Results of comparative studies are provided, demonstrating that effective feature selection is essential to the development of classifiers intended for use in gene-based cancer classification. In particular, amongst the various systematic experiments carried out, the best classification model is achieved by a support vector machine classifier trained on a subset of features chosen via information gain feature ranking.
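The abstract names information gain feature ranking but gives no implementation details, so the following is only a minimal sketch of that one step. It assumes each gene's expression values are discretised by a binary split at the median, which the source does not state. The function names (`entropy`, `information_gain`, `rank_features`) and the toy data are hypothetical and chosen for illustration. The classifier step is omitted; the selected gene indices would then be passed to a classifier such as a support vector machine.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of one feature after a binary split at its median.

    The median split is an assumption made for this sketch; the source does
    not say how expression values were discretised.
    """
    ordered = sorted(values)
    mid = len(ordered) // 2
    median = ordered[mid] if len(ordered) % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    low = [y for x, y in zip(values, labels) if x <= median]
    high = [y for x, y in zip(values, labels) if x > median]
    n = len(labels)
    # Weighted entropy of the class labels within each non-empty partition.
    conditional = sum(len(part) / n * entropy(part) for part in (low, high) if part)
    return entropy(labels) - conditional


def rank_features(samples, labels, top_k):
    """Return the indices of the top_k features, ranked by information gain."""
    n_features = len(samples[0])
    gains = [
        information_gain([row[j] for row in samples], labels)
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: gains[j], reverse=True)[:top_k]


# Toy data, made up for illustration: 6 samples, 3 "genes", 2 classes.
samples = [
    [0.1, 5.0, 2.0],
    [0.2, 1.0, 2.1],
    [0.3, 4.0, 1.9],
    [0.9, 2.0, 2.0],
    [1.0, 6.0, 2.2],
    [1.1, 3.0, 1.8],
]
labels = ["normal", "normal", "normal", "tumour", "tumour", "tumour"]

# Keep only the single most informative gene.
selected = rank_features(samples, labels, top_k=1)
```

In this toy example the first gene separates the two classes perfectly at its median, so it receives the highest gain and is the one retained. On real microarray data, with thousands of genes and few samples, the ranking should be computed on training folds only, so that the selection step does not leak information from the test samples.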