Ji Chen, Yang Hao, Tianjun Wang, Daiyun Huang, Xin Liu
2022 10th International Conference on Bioinformatics and Computational Biology (ICBCB), published 2022-05-13. DOI: 10.1109/icbcb55259.2022.9802469 (https://doi.org/10.1109/icbcb55259.2022.9802469)
Discovery of Stomach Adenocarcinoma Biomarkers by Consensus Scoring of Random Sampling and Machine Learning Modeling
Stomach adenocarcinoma (STAD) is a subtype of gastric cancer with high incidence and mortality. Because the disease is rarely detected early, prognosis is poor and patient survival is low. In this study, a machine learning method, support vector machine (SVM) based recursive feature elimination (SVM-RFE), was applied to discover potential biomarkers of STAD using data from The Cancer Genome Atlas (TCGA). After the optimal parameter set was determined, random sampling was conducted to mitigate the limitation imposed by the small sample size (64 paired tumor and adjacent non-tumor samples). As a result, five genes (COL10A1, CST1, ESM1, HOXC11 and HOXC9) were identified as essential to the predictive model built by SVM-RFE. Three other genes (GAD1, HOXA11 and PRKCG) were less important but could still be potential biomarkers of STAD.
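The abstract does not spell out how the consensus score is computed, so the following is only a minimal sketch of one plausible reading: run SVM-RFE on many random subsamples and score each feature by how often it survives. Everything here is an assumption for illustration. The function names (`train_linear_svm`, `svm_rfe`, `consensus_scores`, `make_toy_data`) are hypothetical, the hand-rolled linear SVM is a stand-in for whatever SVM implementation and tuned parameters the authors used, and the data are synthetic and unpaired rather than TCGA expression profiles.

```python
import random
from collections import Counter


def train_linear_svm(X, y, lam=0.01, step=0.1, iters=100):
    """Full-batch subgradient descent on the L2-regularised hinge loss.

    Labels must be +1/-1. No bias term is fitted (the toy data below is
    symmetric around zero). Returns the weight vector.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        grad = [lam * wj for wj in w]  # gradient of the regulariser
        for xi, yi in zip(X, y):
            # Only samples inside the margin contribute to the hinge term.
            if yi * sum(wj * xj for wj, xj in zip(w, xi)) < 1.0:
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
        w = [wj - step * gj for wj, gj in zip(w, grad)]
    return w


def svm_rfe(X, y, n_keep):
    """Drop the feature with the smallest squared weight until n_keep remain."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        X_active = [[row[j] for j in active] for row in X]
        w = train_linear_svm(X_active, y)
        worst = min(range(len(active)), key=lambda k: w[k] ** 2)
        del active[worst]
    return active


def consensus_scores(X, y, n_keep, n_rounds=10, frac=0.8, seed=0):
    """Fraction of random subsamples in which each feature survives SVM-RFE."""
    rng = random.Random(seed)
    counts = Counter()
    n = len(X)
    for _ in range(n_rounds):
        rows = rng.sample(range(n), int(frac * n))  # sample without replacement
        counts.update(svm_rfe([X[i] for i in rows], [y[i] for i in rows], n_keep))
    return {j: counts[j] / n_rounds for j in range(len(X[0]))}


def make_toy_data(n_per_class=20, n_features=8, n_informative=2, shift=3.0, seed=0):
    """Synthetic stand-in for expression data: only the first columns carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (1, -1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            for j in range(n_informative):
                row[j] += label * shift
            X.append(row)
            y.append(label)
    return X, y


X, y = make_toy_data()
scores = consensus_scores(X, y, n_keep=2)
for feature, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"feature {feature}: selected in {score:.0%} of rounds")
```

In this sketch, a feature that survives in nearly every subsample would correspond to the "essential" genes in the abstract, and one that survives only some of the time to the "less important" candidates. The cut-offs between those groups are not given in the abstract and are not modelled here.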