Multilayer Exponential Family Factor models for integrative analysis and learning disease progression.

IF 1.8 | Zone 3, Mathematics | Q3, Mathematical & Computational Biology
Qinxia Wang, Yuanjia Wang
Biostatistics, pp. 203-219. Journal Article. Published 2023-12-15.
DOI: https://doi.org/10.1093/biostatistics/kxac042
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10939400/pdf/

Abstract

Current diagnosis of neurological disorders often relies on late-stage clinical symptoms, which poses barriers to developing effective interventions at the premanifest stage. Recent research suggests that biomarkers and subtle changes in clinical markers may occur in a time-ordered fashion and can be used as indicators of early disease. In this article, we tackle the challenge of leveraging multidomain markers to learn the early progression of neurological disorders. We propose to integrate heterogeneous types of measures from multiple domains (e.g., discrete clinical symptoms, ordinal cognitive markers, continuous neuroimaging, and blood biomarkers) using a hierarchical Multilayer Exponential Family Factor (MEFF) model, where the observations follow exponential family distributions with lower-dimensional latent factors. The latent factors are decomposed into factors shared across multiple domains and domain-specific factors. The shared factors provide robust information for extensive phenotyping and for partitioning patients into clinically meaningful and biologically homogeneous subgroups, while the domain-specific factors capture the remaining variation unique to each domain. The MEFF model also captures the nonlinear trajectory of disease progression and orders the critical events of neurodegeneration measured by each marker. To overcome computational challenges, we fit the model using approximate inference techniques suited to large-scale data. We apply the method to data from the Parkinson's Progression Markers Initiative to integrate biological, clinical, and cognitive markers arising from heterogeneous distributions. The model learns lower-dimensional representations of Parkinson's disease (PD) and the temporal ordering of neurodegeneration in PD.
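
To make the factor decomposition described above concrete, the following is a minimal generative sketch, not the authors' MEFF implementation. It assumes only two domains (one binary, one continuous), Gaussian latent factors, and randomly drawn loadings. It leaves out the multilayer hierarchy, ordinal markers, the nonlinear progression trajectory, and the approximate inference step. The function name, parameters, and dimensions are all hypothetical.

```python
import numpy as np

def simulate_two_domain_factors(n=200, k_shared=2, k_specific=1,
                                p_binary=5, p_cont=4, seed=0):
    """Toy simulation: shared and domain-specific latent factors drive
    exponential-family observations (Bernoulli and Gaussian).

    Illustrative assumption only; not the model fitted in the paper.
    """
    rng = np.random.default_rng(seed)

    # Latent factors: one block shared by both domains, one block per domain.
    shared = rng.normal(size=(n, k_shared))
    spec_bin = rng.normal(size=(n, k_specific))
    spec_cont = rng.normal(size=(n, k_specific))

    # Hypothetical loadings map factors to each domain's natural parameters.
    w_bin = rng.normal(size=(k_shared + k_specific, p_binary))
    w_cont = rng.normal(size=(k_shared + k_specific, p_cont))

    eta_bin = np.hstack([shared, spec_bin]) @ w_bin
    eta_cont = np.hstack([shared, spec_cont]) @ w_cont

    # Binary domain: Bernoulli outcomes through the logistic link.
    prob = 1.0 / (1.0 + np.exp(-eta_bin))
    y_binary = rng.binomial(1, prob)

    # Continuous domain: Gaussian outcomes with identity link, unit variance.
    y_continuous = eta_cont + rng.normal(size=eta_cont.shape)

    return {"shared": shared, "y_binary": y_binary,
            "y_continuous": y_continuous}

sim = simulate_two_domain_factors(n=50, seed=1)
```

In this sketch both domains depend on the same `shared` block, which is why shared factors can carry information for subgrouping patients across domains, while each domain's specific block absorbs variation that the other domain does not see.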

Source journal
Biostatistics (Biology: Mathematical & Computational Biology)
CiteScore: 5.10
Self-citation rate: 4.80%
Articles published: 45
Review time: 6-12 weeks
About the journal: Among the important scientific developments of the 20th century is the explosive growth in statistical reasoning and methods for application to studies of human health. Examples include developments in likelihood methods for inference, epidemiologic statistics, clinical trials, survival analysis, and statistical genetics. Substantive problems in public health and biomedical research have fueled the development of statistical methods, which in turn have improved our ability to draw valid inferences from data. The objective of Biostatistics is to advance statistical science and its application to problems of human health and disease, with the ultimate goal of advancing the public's health.