Ji Chen, Yang Hao, Tianjun Wang, Daiyun Huang, Xin Liu
Discovery of Stomach Adenocarcinoma Biomarkers by Consensus Scoring of Random Sampling and Machine Learning Modeling
2022 10th International Conference on Bioinformatics and Computational Biology (ICBCB), published 2022-05-13
DOI: https://doi.org/10.1109/icbcb55259.2022.9802469
Abstract
Stomach adenocarcinoma (STAD) is a subtype of gastric cancer with high incidence and mortality. Because it is rarely detected early, prognosis is poor and patient survival is low. In this study, a machine learning method, support vector machine-based recursive feature elimination (SVM-RFE), was applied to discover potential STAD biomarkers using data from The Cancer Genome Atlas (TCGA). After the optimal parameter set was determined, random sampling was conducted to reduce the limitation imposed by the small sample size (64 paired tumor and adjacent non-tumor samples). As a result, five genes (COL10A1, CST1, ESM1, HOXC11, and HOXC9) were identified as essential to the predictive model built by SVM-RFE. Three other genes (GAD1, HOXA11, and PRKCG) were less important but could still be potential biomarkers of STAD.
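
The combination of SVM-RFE with repeated random sampling can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not state the sampling scheme, the SVM parameters, or the scoring rule, so the function name `consensus_rfe`, the 80% stratified subsample, the 20 rounds, the linear kernel with `C=1.0`, and the synthetic data standing in for TCGA expression values are all assumptions made for illustration. The sketch also ignores the paired tumor/non-tumor design.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.svm import SVC


def consensus_rfe(X, y, n_select=5, n_rounds=20, sample_frac=0.8, seed=0):
    """Return, per feature, the fraction of random subsamples in which
    SVM-RFE kept that feature (hypothetical consensus score)."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(X.shape[1], dtype=int)
    for _ in range(n_rounds):
        # Draw a stratified subsample without replacement (assumed scheme).
        parts = []
        for label in np.unique(y):
            members = np.flatnonzero(y == label)
            size = int(sample_frac * members.size)
            parts.append(rng.choice(members, size=size, replace=False))
        idx = np.concatenate(parts)

        # Linear SVM so that RFE can rank features by coefficient weight;
        # step=0.1 drops 10% of the original features per iteration.
        selector = RFE(SVC(kernel="linear", C=1.0),
                       n_features_to_select=n_select, step=0.1)
        selector.fit(X[idx], y[idx])
        counts[selector.support_] += 1
    return counts / n_rounds


# Synthetic stand-in data: 128 samples (64 per class), 50 features,
# of which the first 5 columns are informative (shuffle=False).
X, y = make_classification(n_samples=128, n_features=50, n_informative=5,
                           n_redundant=0, flip_y=0.0, shuffle=False,
                           random_state=0)
scores = consensus_rfe(X, y)

# Features ranked by how often they were selected across subsamples.
top = np.argsort(scores)[::-1][:5]
print("top features:", top, "selection frequency:", scores[top])
```

Under this scheme, a feature selected in nearly every round would correspond to an "essential" gene, while one selected in only some rounds would correspond to a candidate of lesser importance; the thresholds separating the two are not given in the abstract.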