Multivariate functional partial least squares for classification using longitudinal data.

Impact factor: 0.8 · Zone 4 (Biology) · JCR Q4 (Biology)
Sonia Dembowska, Alejandro F Frangi, Jeanine Houwing-Duistermaat, Haiyan Liu
Theoretical Biology Forum, volume 114 1-2 1, pages 75-88. Published 2021-01-01. Journal Article.
DOI: 10.19272/202111402007 (https://doi.org/10.19272/202111402007)
Citations: 1

Abstract

Statistical methods that predict outcomes from high-dimensional datasets are becoming increasingly popular in medicine for forecasting and monitoring patient health. Our work is motivated by a longitudinal dataset containing 1H NMR spectra of metabolites from 18 patients undergoing a kidney transplant, together with their graft outcomes, which fall into one of three categories: acute rejection, delayed graft function and primary function. We propose a functional partial least squares (FPLS) model that extends existing PLS methods for the analysis of longitudinally measured scalar omics datasets to the case of longitudinally measured functional datasets. We designed an iterative algorithm to link multiple time points and applied the proposed method to the data from the kidney transplant patients. We then compared the AUC of our method with the AUC of univariate methods that use information from only one time point. Our method appeared to outperform the existing methods. A simulation study mimicking the kidney transplant dataset, but with a larger sample size and several scenarios, was performed to evaluate the performance of the new method in larger datasets. The scenarios vary in how difficult it is to distinguish the two groups. The three-time-point model appeared to perform better than any of the individual time-point models, with average AUCs of 0.909 and 0.811, respectively.
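
The comparison described in the abstract (one time point versus several time points, judged by AUC) can be illustrated with a small simulation. The sketch below is not the authors' FPLS model or their iterative linking algorithm: as a stand-in it uses a single-component PLS direction on curves sampled on a grid, and it links time points by simple concatenation. The data generator, the effect size, the noise level, and all function names are hypothetical choices made only for this illustration.

```python
import math
import random


def auc(scores, labels):
    """Area under the ROC curve via Mann-Whitney pair counting (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def simulate_subject(rng, label, n_times=3, n_grid=30, effect=0.4, noise=1.0):
    """Hypothetical subject: one noisy curve per time point; group 1 has a sine bump."""
    grid = [j / (n_grid - 1) for j in range(n_grid)]
    return [
        [label * effect * math.sin(math.pi * u) + rng.gauss(0.0, noise) for u in grid]
        for _ in range(n_times)
    ]


def pls1_weights(X, y):
    """First PLS weight vector (proportional to cov(X, y)) and the column means of X."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    w = [sum((X[i][j] - x_mean[j]) * (y[i] - y_mean) for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w)) or 1.0
    return [v / norm for v in w], x_mean


def pls1_scores(X, w, x_mean):
    """Project centred rows of X onto the weight vector to get one score per subject."""
    return [sum((row[j] - x_mean[j]) * w[j] for j in range(len(w))) for row in X]


def evaluate(X_train, y_train, X_test, y_test):
    """Fit the one-component direction on training data and return the test AUC."""
    w, mu = pls1_weights(X_train, y_train)
    return auc(pls1_scores(X_test, w, mu), y_test)


def run(seed=0, n_train=200, n_test=200):
    """Compare each single time point with the concatenation of all time points."""
    rng = random.Random(seed)

    def sample(n):
        labels = [i % 2 for i in range(n)]
        return [simulate_subject(rng, y) for y in labels], labels

    def concat(data):
        return [[v for curve in subject for v in curve] for subject in data]

    train, y_train = sample(n_train)
    test, y_test = sample(n_test)
    results = {}
    for t in range(3):
        results[f"time point {t + 1}"] = evaluate(
            [s[t] for s in train], y_train, [s[t] for s in test], y_test
        )
    results["all time points"] = evaluate(concat(train), y_train, concat(test), y_test)
    return results


if __name__ == "__main__":
    for name, value in run().items():
        print(f"{name}: AUC = {value:.3f}")
```

In this setup the group difference is present at every time point, so one would typically expect the combined model to reach a higher AUC than any single time point, although this depends on the seed and on the assumed effect size. The numbers it prints are from simulated data and are not comparable to the AUCs reported in the abstract.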

Source journal: Theoretical Biology Forum (Agricultural and Biological Sciences: General Agricultural and Biological Sciences)
CiteScore: 1.10 · Self-citation rate: 0.00% · Articles published: 0