Feature gene screening and diagnosis of breast cancer based on the minimum classification error rate criterion

IF 5.2; Zone 2 (Engineering & Technology); Q1 ENGINEERING, MULTIDISCIPLINARY
Jingze Song, Chang Cai, Kedong Zhang, Tongshun Liu
Journal: Measurement, Volume 253, Article 117787
Published: 2025-05-08
Type: Journal Article
DOI: 10.1016/j.measurement.2025.117787
URL: https://www.sciencedirect.com/science/article/pii/S0263224125011467
Citations: 0

Abstract

Breast cancer is currently the most common malignant cancer in women worldwide, accounting for 31% of cancers in women, and both its morbidity and mortality have been rising. Feature gene screening is essential for diagnosis, prognosis, and timely treatment. This study proposes a breast cancer feature gene screening method based on the minimum classification error rate criterion for accurate breast cancer diagnosis. First, the overlapping area between the distribution curves of cancer and normal gene expression data, which is the statistically minimum classification error rate, is calculated, and breast cancer feature genes are pre-screened from The Cancer Genome Atlas (TCGA) data using this criterion. Second, the feature genes are further screened using Weighted Gene Co-expression Network Analysis (WGCNA) and Protein-Protein Interaction (PPI) network analysis, and a Bayesian network for diagnosing breast cancer is constructed from the screened genes. Finally, the effectiveness of the gene screening method is validated using TCGA data within the Bayesian network diagnostic model. Experimental results show that the proposed method achieves an accuracy of 96.67%, a precision of 100%, a recall of 93.1%, and an F1 score of 0.9643, improvements of 5%, 7.14%, 3.44%, and 5.7%, respectively, over conventional cancer gene screening based on differential expression analysis.
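The pre-screening criterion in the abstract, the overlap area between the cancer and normal expression distributions, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each class is modeled by a Gaussian fitted to its samples, that the two classes have equal priors by default, and that the overlap is integrated numerically on a fixed grid; the abstract specifies none of these choices. The function names and the gene names in the usage note are hypothetical.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution evaluated on x."""
    z = (x - mean) / std
    return np.exp(-0.5 * z * z) / (std * np.sqrt(2.0 * np.pi))


def overlap_error(mean_c, std_c, mean_n, std_n, prior_c=0.5, grid_size=4001):
    """Area under min(prior-weighted cancer pdf, prior-weighted normal pdf).

    For two known class densities this area is the lowest error rate any
    classifier can reach using this single feature (the Bayes error).
    Assumption: both classes are Gaussian; std values must be positive.
    """
    # Integrate over a range that covers both densities out to 8 std.
    lo = min(mean_c - 8.0 * std_c, mean_n - 8.0 * std_n)
    hi = max(mean_c + 8.0 * std_c, mean_n + 8.0 * std_n)
    x = np.linspace(lo, hi, grid_size)
    weighted_c = prior_c * gaussian_pdf(x, mean_c, std_c)
    weighted_n = (1.0 - prior_c) * gaussian_pdf(x, mean_n, std_n)
    overlap = np.minimum(weighted_c, weighted_n)
    # Trapezoidal rule written out by hand.
    dx = x[1] - x[0]
    return float(np.sum(0.5 * (overlap[:-1] + overlap[1:])) * dx)


def rank_genes(expr_cancer, expr_normal, gene_names):
    """Sort genes by estimated minimum error rate, most separable first.

    expr_cancer and expr_normal are arrays of shape (samples, genes).
    Genes with constant expression (zero std) should be filtered out first.
    """
    scores = []
    for j, name in enumerate(gene_names):
        c, n = expr_cancer[:, j], expr_normal[:, j]
        err = overlap_error(c.mean(), c.std(ddof=1), n.mean(), n.std(ddof=1))
        scores.append((name, err))
    return sorted(scores, key=lambda item: item[1])
```

As a sanity check, two unit-variance Gaussians with means 0 and 2 and equal priors have a Bayes error of Φ(−1) ≈ 0.1587, and `overlap_error(0.0, 1.0, 2.0, 1.0)` should land close to that value. Calling `rank_genes(cancer_matrix, normal_matrix, ["GENE_A", "GENE_B"])` returns the genes ordered from smallest to largest overlap, so a pre-screen could keep the genes whose error falls below a chosen threshold.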
Source journal: Measurement (Engineering & Technology; Engineering: Multidisciplinary)
CiteScore: 10.20
Self-citation rate: 12.50%
Articles published: 1589
Review time: 12.1 months
Journal description: Contributions are invited on novel achievements in all fields of measurement and instrumentation science and technology. Authors are encouraged to submit novel material, whose ultimate goal is an advancement in the state of the art of: measurement and metrology fundamentals, sensors, measurement instruments, measurement and estimation techniques, measurement data processing and fusion algorithms, evaluation procedures and methodologies for plants and industrial processes, performance analysis of systems, processes and algorithms, mathematical models for measurement-oriented purposes, distributed measurement systems in a connected world.