Effects of hybrid non-linear feature extraction method on different data sampling techniques for liver disease prediction

Rubia Yasmin, Ruhul Amin, Md. Shamim Reza
Journal of Future Sustainability
DOI: 10.5267/j.jfs.2022.9.005 (https://doi.org/10.5267/j.jfs.2022.9.005)
Citations: 1

Abstract

Liver disease covers inflammatory conditions of the liver, cirrhosis, cancer, and damage from an overload of toxic substances. For a patient with severe liver disease, a liver transplant may restore and extend life. In the last few years, machine learning (ML) based diagnosis systems have played a vital role in assessing liver patients, which leads to proper treatment and saves lives. In this study, we predict liver disease in patients by adopting a hybrid feature extraction method that enhances the performance of ML algorithms. Medical data frequently exhibit non-linear patterns and class imbalance, both of which degrade the performance of most ML algorithms. To address this, we present a hybrid feature space that combines t-SNE and Isomap non-linear features with kernel principal components that can explain 90% of the variation in the data. Before the data are fed to the ML model, preprocessing is applied: class balancing, outlier identification, and missing-value imputation. A simulation study and ensemble learning are also conducted to validate the prediction performance of the proposed method. The proposed hybrid non-linear features improve on existing studies by 2-20%, and the ensemble classifier achieved an accuracy of 91.33%.
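The two mechanical steps implied by the abstract, keeping enough kernel principal components to reach 90% of the variation and joining the per-method features into one hybrid feature space, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `components_for_variance` and `concat_feature_blocks`, the eigenvalues, and the toy feature blocks are all hypothetical. In practice the eigenvalues would come from a fitted kernel PCA and the blocks from the fitted t-SNE, Isomap, and kernel PCA embeddings.

```python
from itertools import accumulate


def components_for_variance(eigenvalues, threshold=0.90):
    """Return the smallest k whose top-k eigenvalues reach `threshold` of the total."""
    # Keep only positive eigenvalues, largest first.
    vals = sorted((v for v in eigenvalues if v > 0), reverse=True)
    if not vals:
        raise ValueError("need at least one positive eigenvalue")
    total = sum(vals)
    for k, running in enumerate(accumulate(vals), start=1):
        if running / total >= threshold:
            return k
    return len(vals)


def concat_feature_blocks(*blocks):
    """Join per-sample feature rows from several extractors into one hybrid row."""
    n = len(blocks[0])
    if any(len(b) != n for b in blocks):
        raise ValueError("all blocks must have the same number of samples")
    # zip(*blocks) yields, for each sample, one row from every block.
    return [[x for row in rows for x in row] for rows in zip(*blocks)]


# Hypothetical eigenvalues of a centred kernel matrix (illustrative numbers only).
eigs = [5.0, 3.0, 1.5, 0.3, 0.2]
k = components_for_variance(eigs)  # cumulative shares: 0.50, 0.80, 0.95, ...

# Hypothetical 2-sample blocks standing in for t-SNE, Isomap and kernel PCA outputs.
tsne = [[0.1, 0.2], [0.3, 0.4]]
isomap = [[1.0, 1.1], [1.2, 1.3]]
kpca = [[2.0, 2.1, 2.2], [2.3, 2.4, 2.5]]
hybrid = concat_feature_blocks(tsne, isomap, kpca)
```

Here `k` is the number of kernel principal components retained, and each row of `hybrid` holds the concatenated features for one patient record. The abstract does not say how unseen samples are embedded. This matters for t-SNE in particular, since scikit-learn's `TSNE` offers no `transform` method for new data, so that step is left out of the sketch.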