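Logistic Regression and Statistical Regularization Techniques for Risk Classification of Coronary Artery Disease Using Cytokines Transported by High-Density Lipoproteins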

Seema Singh Saharan, Pankaj Nagar, Kate Townsend Creasy, Eveline O Stock, James Feng, Mary J Malloy, John P Kane
Proceedings. International Conference on Computational Science and Computational Intelligence, vol. 2023, pp. 652-660. Published 2023-12-01 (Epub 2024-07-19). DOI: 10.1109/csci62032.2023.00114 (https://doi.org/10.1109/csci62032.2023.00114). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11527457/pdf/

Abstract


Coronary artery disease (CAD) is a leading cause of mortality worldwide. It is important to assess the risk of the disease proactively, using novel biomarkers such as cytokines, which are indicators of inflammation, in addition to traditional predictors of risk. Atherosclerosis, the primary cause of CAD, is an inflammatory disease involving cytokines. Identifying which cytokines are specifically altered can advance diagnosis and personalized treatment. Emerging research demonstrates that cytokines are transported on high-density lipoproteins (HDL). Therefore, it is important to explore the roles of HDL-associated cytokines in vascular inflammation. Machine learning (ML) algorithms are enhancing pioneering research from the standpoint of precision medicine, and this technology can materially enable the translation of scientific research to clinical practice. In this study we implemented logistic regression and its derived regularized techniques, using age and multidimensional cytokine biomarkers, with the objective of identifying individuals "At Risk" for CAD. These techniques were further strengthened by k-fold cross-validation and hyperparameter tuning. Of the numerous algorithms investigated, the three most prominent, assessed by area under the receiver operating characteristic curve (AUROC), were logistic regression, least absolute shrinkage and selection operator (LASSO) regression with feature selection, and ridge regression with feature selection. Logistic regression achieved an AUROC of .85 with a 95% confidence interval (CI) of (.804, .897); LASSO regression achieved a higher AUROC of .875 with a 95% CI of (.832, .917); and ridge regression with feature selection achieved the highest AUROC of .878 with a 95% CI of (.837, .92). A 2-sample independent t test indicated that the three techniques were statistically significantly different from each other. For the best classification, achieved by ridge regression with feature selection, the most prominent biomarkers identified, in order of importance, were: Age, IL-7, RANTES, IFN-gamma, IL-3, GM-CSF, IL-15, IP-10, GCSF, IL-12. The identification and quantification of cytokines transported by HDL provide novel mechanistic insights that can inform risk assessment and therapeutic intervention in CAD.
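The workflow the abstract describes (a penalized logistic model, k-fold cross-validation, a hyperparameter search, and AUROC as the score) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, only the ridge (L2) penalty is shown, and the function names (`auroc`, `fit_ridge_logistic`, `cv_auroc`, `tune`, `make_data`), the penalty grid, the learning rate, and the plain gradient-descent fit are all assumptions made for illustration. LASSO and the feature-selection step are omitted.

```python
import math
import random


def auroc(labels, scores):
    """AUROC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_ridge_logistic(X, y, lam, epochs=100, lr=0.1):
    """Batch gradient descent on mean log-loss + (lam / 2) * ||w||^2.

    The intercept b is left unpenalized (an assumption of this sketch).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        w = [wj - lr * (gj / n + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    """Predicted probability of the positive class for each row of X."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def cv_auroc(X, y, lam, k=5):
    """Mean held-out AUROC over k interleaved folds."""
    idx = list(range(len(X)))
    scores = []
    for fold in (idx[i::k] for i in range(k)):
        held = set(fold)
        X_train = [X[i] for i in idx if i not in held]
        y_train = [y[i] for i in idx if i not in held]
        w, b = fit_ridge_logistic(X_train, y_train, lam)
        y_test = [y[i] for i in fold]
        scores.append(auroc(y_test, predict([X[i] for i in fold], w, b)))
    return sum(scores) / k


def tune(X, y, grid=(0.0, 0.01, 0.1, 1.0)):
    """Cross-validated AUROC for each penalty strength in a hypothetical grid."""
    return {lam: cv_auroc(X, y, lam) for lam in grid}


def make_data(n, d, seed):
    """Synthetic stand-in for biomarker data: only the first two features carry signal."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
    y = [1 if x[0] - x[1] + 0.5 * rng.gauss(0, 1) > 0 else 0 for x in X]
    return X, y


if __name__ == "__main__":
    X, y = make_data(150, 5, seed=0)
    results = tune(X, y)
    best = max(results, key=results.get)
    print("penalty with highest cross-validated AUROC:", best)
```

In practice a library such as scikit-learn would replace the hand-written fit and scoring; the pure-Python version is used here only so the sketch has no dependencies. Confidence intervals for AUROC, as reported in the abstract, would need an additional step (for example bootstrapping) that is not shown.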
