Meta-analysis of computational methods for breast cancer classification
Tri-Cong Pham, C. Luong, A. Doucet, Van-Dung Hoang, Diem-Phuc Tran, Duc-Hau Le
International Journal of Intelligent Information and Database Systems, vol. 18 1, pp. 89-111. Published 2020-06-26. DOI: 10.1504/ijiids.2020.10030219 (https://doi.org/10.1504/ijiids.2020.10030219)
Citation count: 4
Abstract
Breast cancer affects millions of women and places a heavy burden on patients and on the global economy. Yet treatment is often generic and does not account for a patient's individual health and genetic profile. Artificial intelligence appears to be a robust approach to breast cancer subtyping. Most previous studies have addressed binary classification on a limited number of samples; multi-class classification is much more difficult, especially on a large number of samples. This study aims to use machine learning to improve breast cancer subtyping and to identify new disease-causative genes, which could facilitate more personalised treatment with limited side effects in the future. It compares the accuracy of three classification methods in combination with eight feature selection methods on a dataset of 2,682 samples. The highest accuracy, 83.96%, was obtained with the SVM-C005 classifier and percentile feature selection (800 genes). Additionally, our method predicts causative genes of breast cancer: four of the predicted genes are known to be associated with breast cancer, and 29 are promising candidates with supporting evidence from the literature. These results demonstrate the effectiveness of the approach.
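The combination of percentile feature selection with an SVM classifier can be illustrated with a minimal scikit-learn sketch. This is not the authors' code. It makes several assumptions: "SVM-C005" is read as a linear SVM with regularisation parameter C = 0.05, the feature-scoring function is the ANOVA F-test (`f_classif`), the percentile is set to 10, and the data are synthetic stand-ins produced by `make_classification` rather than the 2,682-sample dataset used in the study.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectPercentile, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic stand-in for a gene-expression matrix (NOT the study's data):
# 300 samples, 200 "genes", 4 subtypes.
X, y = make_classification(
    n_samples=300,
    n_features=200,
    n_informative=20,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)

# Assumed reading of the abstract: keep the top-scoring percentile of
# features, then classify with a linear SVM (C=0.05 is a guess at "C005").
pipeline = make_pipeline(
    StandardScaler(),
    SelectPercentile(score_func=f_classif, percentile=10),
    LinearSVC(C=0.05, max_iter=10000),
)

# Feature selection sits inside the pipeline, so it is refit on each
# training fold and never sees the held-out fold.
scores = cross_val_score(pipeline, X, y, cv=5)
mean_accuracy = scores.mean()

# Refit on all data to inspect which feature indices were kept.
pipeline.fit(X, y)
selected = pipeline.named_steps["selectpercentile"].get_support(indices=True)

print(f"mean 5-fold accuracy: {mean_accuracy:.3f}")
print(f"selected features: {len(selected)} of {X.shape[1]}")
```

Placing the selector inside the pipeline is a deliberate choice: selecting features on the full dataset before cross-validation would leak label information into the held-out folds and inflate the reported accuracy.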
About the journal:
Intelligent information systems and intelligent database systems are rapidly developing fields in computer science. IJIIDS provides a medium for exchanging scientific research and technological achievements from the international community. It focuses on applications of advanced intelligent technologies to data storage and processing in a wide range of contexts. The issues addressed by IJIIDS involve solutions to real-life problems in which intelligent technologies are needed to achieve effective results. The emphasis is on new and original research and technological developments rather than on reports applying existing technology to different sets of data.