Construction of an automated screening system to predict breast cancer diagnosis and prognosis

Sou-Young Jin, Jae-Kyung Won, Hojin Lee, Ho-Jin Choi
Journal: Basic and applied pathology, 5 (1), pages 15-18
Published: 2012-03-15
Publication type: Journal Article
DOI: 10.1111/j.1755-9294.2012.01124.x
URL: https://onlinelibrary.wiley.com/doi/10.1111/j.1755-9294.2012.01124.x
Citations: 3

Abstract

Background and aim: Machine learning methods can support clinical decision processes such as pathological diagnosis when datasets of microscopic features are available. Using the Breast Cancer Wisconsin dataset, this study compared several classification algorithms to devise a classifier that predicts both diagnosis (benign vs malignant) and prognosis (recurrence vs non-recurrence).

Methods: The performance of a two-step classifier, which decides diagnosis and then prognosis in sequence, was compared with that of a one-step multi-class classifier, which separates all classes simultaneously.

Results: In the two-step classifier, the functional trees (FT) algorithm performed best in the first step and Naïve Bayes performed best in the second step. The one-step classifier achieved higher overall accuracy and better prediction of benign and non-recurring cases than the two-step classifier, but it was less accurate in predicting recurring cases, which lowered its sensitivity.

Conclusions: We conclude that the two-step classifier with FT and Naïve Bayes is better than the one-step classifier. This work should help in setting up an automated screening system in real clinics, and it highlights ways to improve accuracy by refining the data and the choice of algorithm in data mining or machine learning processes.
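
The two designs compared in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the three-label encoding ("benign", "non-recur", "recur"), the names GaussianNaiveBayes, TwoStepClassifier and sensitivity, and the synthetic two-feature data are all hypothetical. A small hand-written Gaussian Naive Bayes stands in for both steps, since the functional trees (FT) algorithm used in the first step of the study is not reproduced here.

```python
from collections import defaultdict
import math


class GaussianNaiveBayes:
    """Minimal Gaussian Naive Bayes for numeric features (illustrative stand-in)."""

    def fit(self, X, y):
        grouped = defaultdict(list)
        for row, label in zip(X, y):
            grouped[label].append(row)
        self.stats = {}
        for label, rows in grouped.items():
            means = [sum(col) / len(rows) for col in zip(*rows)]
            # A small constant keeps every variance strictly positive.
            variances = [
                sum((v - m) ** 2 for v in col) / len(rows) + 1e-6
                for col, m in zip(zip(*rows), means)
            ]
            self.stats[label] = (math.log(len(rows) / len(X)), means, variances)
        return self

    def predict(self, X):
        return [self._predict_one(row) for row in X]

    def _predict_one(self, row):
        def log_posterior(label):
            log_prior, means, variances = self.stats[label]
            return log_prior + sum(
                -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
                for v, m, var in zip(row, means, variances)
            )
        return max(self.stats, key=log_posterior)


class TwoStepClassifier:
    """Step 1 decides benign vs malignant; step 2 decides recur vs non-recur
    for the cases that step 1 calls malignant (assumed label scheme)."""

    def __init__(self, diagnosis_model, prognosis_model):
        self.diagnosis_model = diagnosis_model
        self.prognosis_model = prognosis_model

    def fit(self, X, y):
        diagnosis = ["benign" if label == "benign" else "malignant" for label in y]
        self.diagnosis_model.fit(X, diagnosis)
        # The prognosis model is trained on malignant cases only.
        malignant = [(row, label) for row, label in zip(X, y) if label != "benign"]
        self.prognosis_model.fit(
            [row for row, _ in malignant], [label for _, label in malignant]
        )
        return self

    def predict(self, X):
        results = []
        for row, diagnosis in zip(X, self.diagnosis_model.predict(X)):
            if diagnosis == "benign":
                results.append("benign")
            else:
                results.append(self.prognosis_model.predict([row])[0])
        return results


def sensitivity(y_true, y_pred, positive="recur"):
    """Fraction of truly positive cases that were predicted positive."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if tp + fn else 0.0


# Synthetic, well-separated toy data (NOT the Breast Cancer Wisconsin dataset).
X = [
    [1.0, 1.0], [1.2, 0.8], [0.8, 1.1], [1.1, 1.3],   # benign
    [5.0, 2.0], [5.2, 1.8], [4.8, 2.2], [5.1, 2.1],   # malignant, non-recur
    [5.0, 6.0], [5.3, 5.8], [4.9, 6.2], [5.2, 6.1],   # malignant, recur
]
y = ["benign"] * 4 + ["non-recur"] * 4 + ["recur"] * 4

# Two-step design: diagnosis first, then prognosis.
two_step = TwoStepClassifier(GaussianNaiveBayes(), GaussianNaiveBayes()).fit(X, y)
# One-step design: a single multi-class model over all three labels.
one_step = GaussianNaiveBayes().fit(X, y)

samples = [[1.0, 1.0], [5.0, 2.0], [5.0, 6.0]]
print(two_step.predict(samples))
print(one_step.predict(samples))
```

Any object with fit and predict methods can be passed as either step, so a different model could be swapped into the first step without changing the cascade. The sensitivity helper treats "recur" as the positive class, which matches the abstract's concern that missed recurring cases lower sensitivity even when overall accuracy is higher.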
