Application of support vector machine algorithm for early differential diagnosis of prostate cancer

Boluwaji A. Akinnuwesi, Kehinde A. Olayanju, Benjamin S. Aribisala, Stephen G. Fashoto, Elliot Mbunge, Moses Okpeku, Patrick Owate
Journal: Data Science and Management (Journal Article)
Publication date: 2023-03-01
DOI: 10.1016/j.dsm.2022.10.001
Source: https://www.sciencedirect.com/science/article/pii/S2666764922000443
Citations: 8

Abstract


Prostate cancer (PCa) symptoms are commonly confused with those of benign prostatic hyperplasia (BPH), particularly in the early stages, because the symptoms are similar and, in some instances, because of underdiagnosis. Clinical methods have been used to diagnose PCa; however, at the full-blown stage, clinical methods usually carry a high risk of complicated side effects. We therefore propose a support vector machine (SVM) model for early differential diagnosis of PCa (SVM-PCa-EDD), in which the SVM classifies persons as having or not having PCa. We used the PCa dataset from the Kaggle Healthcare repository to develop and validate the SVM classification model. The PCa dataset consisted of 250 features and one class of features. The attributes considered in this study were age, body mass index (BMI), race, family history, obesity, trouble urinating, urine stream force, blood in semen, bone pain, and erectile dysfunction. SVM-PCa-EDD was also used to preprocess the PCa dataset, specifically to deal with class imbalance and to reduce dimensionality. After class imbalance was eliminated, the area under the receiver operating characteristic curve (AUC-ROC) of a logistic regression (LR) model trained on the downsampled dataset was 58.4%, compared with 54.3% for LR trained on the class-imbalanced dataset. SVM-PCa-EDD achieved 90% accuracy, 80% sensitivity, and 80% specificity. Validation of SVM-PCa-EDD against random forest and LR showed that SVM-PCa-EDD performed better in early differential diagnosis of PCa. The proposed model can assist medical experts in the early diagnosis of PCa, particularly in resource-constrained healthcare settings, and in making further recommendations for PCa testing and treatment.
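
The abstract names two steps without showing how they are carried out: downsampling to remove class imbalance, and reporting accuracy, sensitivity, and specificity. The following is a minimal Python sketch of both, using only the standard library. It is not the authors' implementation; the function names `downsample` and `binary_metrics`, the fixed seed, and the toy labels are all illustrative assumptions and do not come from the paper or the Kaggle dataset.

```python
import random
from collections import Counter


def downsample(rows, labels, seed=0):
    """Randomly drop rows from larger classes until every class
    has as many rows as the smallest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    n_min = min(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        for row in rng.sample(members, n_min):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity (true positive rate), and specificity
    (true negative rate) from paired true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Toy data (assumed, not from the study): 8 negatives and 2 positives.
toy_rows = list(range(10))
toy_labels = [0] * 8 + [1] * 2
balanced_rows, balanced_labels = downsample(toy_rows, toy_labels)
print(Counter(balanced_labels))

# Toy predictions (assumed): one false negative and one false positive.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
print(binary_metrics(y_true, y_pred))
```

In practice the balanced rows would be passed to a classifier such as an SVM, and the metrics would be computed on a held-out test split rather than on the training data.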
