A hybrid of data mining and ensemble learning forecasting for recurrent ovarian cancer

Y. Lu, Chi-Jie Lu, Chi-Chang Chang, Yu-Wen Lin
Published in: 2017 International Conference on Intelligent Informatics and Biomedical Sciences (ICIIBMS), 2017-11-01
DOI: 10.1109/ICIIBMS.2017.8279723 (https://doi.org/10.1109/ICIIBMS.2017.8279723)
Citations: 3

Abstract

This study applies machine learning techniques combined with ensemble learning, widely considered the most successful method for producing objective inferences, to the problem of predicting recurrent ovarian cancer. Five machine learning approaches were considered for identifying important risk factors and predicting recurrence-proneness in ovarian cancer: support vector machine (SVM), C5.0, extreme learning machine (ELM), multivariate adaptive regression splines (MARS), and random forests (RF). Ensemble learning is used to improve on the classification accuracy of the individual machine learning approaches in a two-stage hybrid scheme: first, important risk factors are selected by ensemble learning; then, the five approaches are applied again using the selected factors. Medical records and pathology data were obtained from the Chung Shan Medical University Hospital Tumor Registry. Based on the existing literature on recurrent ovarian cancer, the candidate factors are Age, Histology, Grade, Pathologic T, Pathologic N, Pathologic M, Pathologic Stage, International Federation of Gynecology and Obstetrics (FIGO) stage, Surgical Margins, Performance Status, CA125, Operation Optimal Debulking, and Chemotherapy Guideline. The data set contains 987 patients. In this study, C5.0 was the best-performing approach for predicting recurrence of ovarian cancer. Moreover, the classification accuracy of C5.0, MARS, RF, and SVM increased after ensemble learning was applied. In particular, the classification accuracy of C5.0 improved markedly under the hybrid ensemble learning scheme.
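The abstract does not say how the ensemble step combines the five models when selecting risk factors, so the following is only a minimal sketch under one possible assumption: each model nominates the factors it finds important, and a factor is kept if enough models vote for it. The function names `vote_features` and `restrict`, the `min_votes` threshold, and the per-model selections and patient record are all hypothetical and are not results from the study.

```python
from collections import Counter


def vote_features(selections, min_votes):
    """Stage 1 (assumed): keep factors nominated by at least `min_votes` models.

    `selections` maps a model name to the list of factors that model
    considers important. Returns the kept factors in alphabetical order.
    """
    counts = Counter(f for picked in selections.values() for f in set(picked))
    return sorted(f for f, n in counts.items() if n >= min_votes)


def restrict(record, features):
    """Stage 2 input: project one patient record (a dict) onto the kept factors."""
    return {name: record[name] for name in features}


# Hypothetical nominations, for illustration only (NOT findings of the paper).
selections = {
    "SVM": ["Age", "CA125", "Pathologic Stage"],
    "C5.0": ["CA125", "Pathologic Stage", "Grade"],
    "ELM": ["Age", "CA125"],
    "MARS": ["CA125", "Pathologic Stage", "Histology"],
    "RF": ["Pathologic Stage", "CA125", "Age"],
}

selected = vote_features(selections, min_votes=3)
print(selected)  # ['Age', 'CA125', 'Pathologic Stage']

# Hypothetical patient record reduced to the selected factors.
record = {"Age": 52, "Grade": 2, "CA125": 35.0, "Pathologic Stage": "II"}
print(restrict(record, selected))
```

In the second stage, each of the five classifiers would be retrained on records reduced in this way and its accuracy compared with the run on all thirteen factors. A stricter `min_votes` keeps fewer factors; the right threshold is a modelling choice the abstract does not specify.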