Big data-driven optimal weighted fused features-based ensemble learning classifier for thyroid prediction with heuristic algorithm

IF 0.9 | Zone 4 (Mathematics) | JCR Q4 | Computer Science, Interdisciplinary Applications
K. Hema Priya, K. Valarmathi
Journal of Combinatorial Optimization. Published 2025-05-15. DOI: 10.1007/s10878-025-01304-4 (https://doi.org/10.1007/s10878-025-01304-4)
Citations: 0

Abstract

Diagnosing thyroid disease is an important and complex problem in medical research. Thyroid hormone secretion plays a major role in regulating metabolism, so predicting thyroid disease at an early stage helps prevent more serious health complications due to thyroid cancer. Machine learning-based approaches offer higher diagnostic accuracy, but they require large amounts of data. Conventional approaches also need a long time for prediction, and because feature engineering is under-investigated in conventional models, their prediction error is high. This work therefore designs a machine learning-aided thyroid disease prediction technique intended to provide higher prediction accuracy and reliability. First, thyroid data are gathered from standard benchmark resources. Next, the data are transformed to make them usable for analysis and visualization. Features are then extracted using Principal Component Analysis (PCA) and a One-Dimensional Convolutional Neural Network (1DCNN), and statistical features are also extracted to capture more relevant information from the data. The three feature sets (PCA-based, 1DCNN-based, and statistical) are concatenated and fed to an "optimal weighted feature selection" process, in which the selected features and their weights are tuned by an Improved Archimedes Optimization Algorithm (IAOA). The selected, optimally fused features are then given to an Ensemble Learning (EL) model for predicting thyroid disease; the EL model combines a stacking classifier, XGBoost, and a multivariate regression classifier. Ensembling three different classifiers provides higher prediction accuracy, and the ensemble decides between the normal and abnormal classes.
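The fusion and decision steps described above can be illustrated with a small sketch. It is not the authors' implementation: the PCA and 1DCNN features are hard-coded placeholder values, the weight vector is fixed by hand instead of being tuned by IAOA, and majority voting is only an assumed way of combining the three classifiers, since the abstract does not state the combination rule. The names `statistical_features`, `fuse_and_select`, `majority_vote`, and the 0.5 threshold are hypothetical.

```python
from statistics import mean, pstdev


def statistical_features(row):
    """Summary statistics of one raw record (hypothetical choice of statistics)."""
    return [mean(row), pstdev(row), min(row), max(row)]


def fuse_and_select(feature_groups, weights, threshold=0.5):
    """Concatenate feature groups, then keep and scale features by weight.

    A feature is kept when its weight is at least `threshold`; kept features
    are multiplied by their weight. In the paper the weights are tuned by
    IAOA; here they are simply passed in.
    """
    fused = [value for group in feature_groups for value in group]
    if len(fused) != len(weights):
        raise ValueError("one weight is required per fused feature")
    return [w * x for x, w in zip(fused, weights) if w >= threshold]


def majority_vote(predictions):
    """Return the label (0 = normal, 1 = abnormal) chosen by most classifiers."""
    return 1 if sum(predictions) * 2 > len(predictions) else 0


# Placeholder inputs: a raw record plus made-up PCA and 1DCNN feature vectors.
raw_record = [1.0, 2.0, 3.0, 4.0]
pca_features = [0.5, -1.0]
cnn_features = [2.0, 0.0, 1.0]
stat_features = statistical_features(raw_record)

# One hand-picked weight per fused feature (2 + 3 + 4 = 9 features).
weights = [1.0, 0.2, 0.5, 0.9, 0.1, 0.8, 0.0, 0.4, 1.0]
selected = fuse_and_select([pca_features, cnn_features, stat_features], weights)

# Votes from three hypothetical base classifiers.
label = majority_vote([1, 0, 1])
```

In this toy run, five of the nine fused features survive the threshold, and two of the three votes mark the record as abnormal. An optimizer such as IAOA would search over `weights` to maximize a validation score instead of using fixed values.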
The same IAOA is also used to optimize the parameters of every classifier. The experimental outcomes show that the proposed ensemble classifier outperforms the compared methods: the developed EL approach reaches a thyroid prediction accuracy of 96.30%, a precision of 99.67%, and an F1-score of 97.93%, which the authors report as higher than state-of-the-art approaches.
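Because the F1-score is the harmonic mean of precision and recall, the two reported figures together determine the recall. As a quick consistency check, the sketch below solves F1 = 2PR / (P + R) for R. The function name `recall_from_precision_f1` is an assumption, and the resulting recall is derived here, not reported in the abstract.

```python
def recall_from_precision_f1(precision, f1):
    """Solve F1 = 2PR / (P + R) for recall R, given precision P."""
    denominator = 2 * precision - f1
    if denominator <= 0:
        raise ValueError("no valid recall exists for these values")
    return f1 * precision / denominator


# Figures reported in the abstract, expressed as fractions.
precision = 0.9967
f1 = 0.9793
recall = recall_from_precision_f1(precision, f1)
print(f"implied recall: {recall:.4f}")
```

With the reported values the implied recall is about 96.25%, so the ensemble's errors would be mostly missed abnormal cases, not false alarms.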

Source journal
Journal of Combinatorial Optimization (Mathematics / Computer Science, Interdisciplinary Applications)
CiteScore: 2.00
Self-citation rate: 10.00%
Articles published: 83
Review time: 6 months
About the journal: The objective of Journal of Combinatorial Optimization is to advance and promote the theory and applications of combinatorial optimization, which is an area of research at the intersection of applied mathematics, computer science, and operations research and which overlaps with many other areas such as computation complexity, computational biology, VLSI design, communication networks, and management science. It includes complexity analysis and algorithm design for combinatorial optimization problems, numerical experiments and problem discovery with applications in science and engineering. The Journal of Combinatorial Optimization publishes refereed papers dealing with all theoretical, computational and applied aspects of combinatorial optimization. It also publishes reviews of appropriate books and special issues of journals.