High-throughput DeepPRM-Stellar proteomics coupled with machine learning enables precise quantification of atherosclerosis-stroke progression biomarkers and risk prediction

IF 3.3, Zone 3 (Chemistry), JCR Q2, CHEMISTRY, ANALYTICAL
Analyst Pub Date : 2025-07-12 DOI:10.1039/d5an00396b
Ye Liu, Ouyang Hu, Jingyi Wang, Yijie Qiu, Jin Xiao, Xin Cheng, Pengyuan Yang, Ning-Shao Xia, Yueting Xiong, Quan Yuan

Abstract

Predicting the progression of asymptomatic large-artery atherosclerosis (LAA) to acute ischemic stroke (AIS) remains a significant challenge when relying solely on anatomical stenosis. To address this clinical gap, we integrated discovery-phase serum proteomics with machine-learning techniques to identify circulating biomarkers capable of predicting atherosclerotic progression. Utilizing a dual-cohort design (Cohort I: discovery stage, n = 43; Cohort II: validation stage, n = 39), we established a Serum Protein Candidate Biomarker Bank (SPCBB) encompassing 1,484 proteins by harmonizing literature-derived evidence (1,369 proteins) with 222 differentially expressed proteins (DEPs) identified through mass spectrometry analysis. Global proteomics revealed that LAA-associated proteins were enriched in cholesterol metabolism, whereas AIS was characterized by the activation of complement/coagulation cascades. We performed targeted validation of 171 peptides (corresponding to 156 proteins) using DeepPRM on the Stellar platform, thereby facilitating machine learning-based optimization of the biomarker panel. The XGBoost algorithm identified two diagnostic signatures: a three-protein panel (RNASE4, HBA1, ATF6B) that differentiates AIS from LAA, with an area under the curve (AUC) of 0.917 and specificity of 80.0%; and a six-protein panel (MRC1, HBA1, GUCA2A, HBD, CLEC3B, FLNA) that distinguishes AIS/LAA from healthy controls, achieving an AUC of 0.971 and specificity of 86.0%. To further validate key candidates, we performed ELISA assays for GUCA2A and FLNA, which confirmed their significant elevation in patients with AIS and LAA (p < 0.01), consistent with the proteomics findings. Both internal and external validations confirmed robust performance across cohorts. These validated biomarker panels establish a proteomics-driven framework for serum-based, dynamic monitoring of LAA progression, thereby supporting clinical decision-making aimed at optimizing early stroke prevention in asymptomatic individuals.
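The abstract reports panel performance as AUC and specificity. As a reminder of what those two figures measure, here is a minimal sketch in plain Python. It is not the authors' pipeline and does not use XGBoost: the function names `roc_auc` and `specificity`, the 0.5 threshold, and the labels and scores are all invented for illustration (1 could stand for AIS and 0 for LAA), and none of the values are study data.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def specificity(labels, scores, threshold):
    """True-negative rate: fraction of negatives scored below the threshold."""
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not neg:
        raise ValueError("need at least one negative sample")
    true_negatives = sum(1 for s in neg if s < threshold)
    return true_negatives / len(neg)


# Synthetic labels and classifier scores (assumed, not from the paper).
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

print(f"AUC = {roc_auc(labels, scores):.3f}")
print(f"Specificity at 0.5 = {specificity(labels, scores, 0.5):.3f}")
```

AUC does not depend on a decision threshold, whereas specificity does, so a reported specificity such as 80.0% always corresponds to one particular operating point on the ROC curve.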