A new hybrid global optimization approach for selecting clinical and biological features that are relevant to the effective diagnosis of ovarian cancer

Abeer Alzubaidi, David J. Brown, G. Cosma, A. Pockley
Published in: 2016 IEEE Symposium Series on Computational Intelligence (SSCI), 2016-12-01
DOI: 10.1109/SSCI.2016.7849954 (https://doi.org/10.1109/SSCI.2016.7849954)
Citations: 4

Abstract

Reducing the number of features whilst maintaining an acceptable classification accuracy is a fundamental step in constructing cancer predictive models. In this work, we introduce a novel hybrid (MI-LDA) feature selection approach for the diagnosis of ovarian cancer. This hybrid approach is embedded within a global optimization framework and offers a promising improvement in both feature selection and classification accuracy. Global Mutual Information (MI) based feature selection optimizes the search for the best feature subsets in order to select the highly correlated predictors for ovarian cancer diagnosis. The maximally discriminative cancer predictors are then passed to a Linear Discriminant Analysis (LDA) classifier, and a Genetic Algorithm (GA) is applied to optimize the search process with respect to the estimated error rate of the LDA classifier (MI-LDA). Experiments were performed using an ovarian cancer dataset obtained from the FDA-NCI Clinical Proteomics Program Databank. The performance of the hybrid feature selection approach was evaluated using the Support Vector Machine (SVM) classifier and the LDA classifier. A comparison of the results revealed that the proposed (MI-LDA)-LDA model outperformed the (MI-LDA)-SVM model in selecting the maximally discriminative feature subset and achieved the highest predictive accuracy. The proposed system can therefore be used as an efficient tool for finding predictors and patterns in serum (blood)-derived proteomic data for the detection of ovarian cancer.
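
The two stages described in the abstract (an MI-based filter, followed by a GA search scored by the estimated error rate of an LDA classifier) can be illustrated with a small sketch. This is not the authors' implementation. The data are synthetic rather than the FDA-NCI proteomic dataset, and the function names (`mutual_information`, `lda_fit`, `cv_error`, `ga_select`), the equal-width binning, the equal-prior two-class LDA with a small ridge term, the 5-fold error estimate, the tie-break on subset size, and all GA settings are assumptions chosen for illustration only.

```python
import math
import random

def mutual_information(xs, ys, bins=4):
    """MI in nats between an equal-width-binned feature and the class labels."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / bins or 1.0
    cells = [min(int((x - lo) / width), bins - 1) for x in xs]
    n, joint, px, py = len(xs), {}, {}, {}
    for c, y in zip(cells, ys):
        joint[(c, y)] = joint.get((c, y), 0) + 1
        px[c] = px.get(c, 0) + 1
        py[y] = py.get(y, 0) + 1
    return sum(k / n * math.log(k * n / (px[c] * py[y])) for (c, y), k in joint.items())

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def lda_fit(rows, labels, ridge=1e-3):
    """Two-class LDA assuming equal priors; returns (weights, threshold)."""
    d = len(rows[0])
    groups = {k: [r for r, y in zip(rows, labels) if y == k] for k in (0, 1)}
    means = {k: [sum(col) / len(g) for col in zip(*g)] for k, g in groups.items()}
    # Pooled within-class scatter, with a small ridge (assumption) for stability.
    sw = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
    for k, g in groups.items():
        for r in g:
            dev = [v - mu for v, mu in zip(r, means[k])]
            for i in range(d):
                for j in range(d):
                    sw[i][j] += dev[i] * dev[j] / len(rows)
    w = solve(sw, [a - b for a, b in zip(means[1], means[0])])
    mid = [(a + b) / 2 for a, b in zip(means[1], means[0])]
    return w, sum(wi * mi for wi, mi in zip(w, mid))

def cv_error(X, y, cols, folds=5):
    """Cross-validated LDA error rate using only the feature indices in cols."""
    if not cols:
        return 1.0
    sub = [[row[c] for c in cols] for row in X]
    wrong = 0
    for f in range(folds):
        train = [i for i in range(len(y)) if i % folds != f]
        test = [i for i in range(len(y)) if i % folds == f]
        w, t = lda_fit([sub[i] for i in train], [y[i] for i in train])
        wrong += sum((sum(a * b for a, b in zip(w, sub[i])) > t) != (y[i] == 1) for i in test)
    return wrong / len(y)

def ga_select(X, y, candidates, pop=12, gens=10, p_mut=0.1, seed=0):
    """GA over bit masks of the candidate features, minimising LDA CV error."""
    rng = random.Random(seed)
    cache = {}
    def fitness(mask):
        key = tuple(mask)
        if key not in cache:
            cols = [c for c, bit in zip(candidates, mask) if bit]
            # Error first; subset size only breaks ties (an assumption).
            cache[key] = (cv_error(X, y, cols), len(cols))
        return cache[key]
    population = [[True] * len(candidates)]  # start from the full filtered set
    population += [[rng.random() < 0.5 for _ in candidates] for _ in range(pop - 1)]
    best = min(population, key=fitness)
    for _ in range(gens):
        children = [best]  # elitism: the best mask always survives
        while len(children) < pop:
            p1 = min(rng.sample(population, 3), key=fitness)  # tournament selection
            p2 = min(rng.sample(population, 3), key=fitness)
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            children.append([not g if rng.random() < p_mut else g for g in child])
        population = children
        best = min(population, key=fitness)
    return [c for c, bit in zip(candidates, best) if bit]

# Synthetic stand-in data: 80 samples, 20 features, only features 0-2 shifted by class.
data_rng = random.Random(0)
y = [i % 2 for i in range(80)]
X = [[data_rng.gauss(1.5 * label if j < 3 else 0.0, 1.0) for j in range(20)] for label in y]

# Stage 1: keep the 6 features with the highest MI. Stage 2: GA search scored by LDA error.
mi = [mutual_information([row[j] for row in X], y) for j in range(20)]
candidates = sorted(range(20), key=lambda j: mi[j], reverse=True)[:6]
selected = ga_select(X, y, candidates)
print("MI-filtered:", candidates, "GA-selected:", selected, "CV error:", cv_error(X, y, selected))
```

Because the initial population contains the full MI-filtered set and the best mask is carried over each generation, the subset returned by this sketch never has a higher estimated error than the filtered set it started from. Evaluating the selected subset with a second classifier, as the abstract does with SVM, would need a separate held-out estimate, which the sketch leaves out.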