MULTIPAR: Supervised Irregular Tensor Factorization with Multi-task Learning for Computational Phenotyping.

Proceedings of machine learning research. Pub Date: 2023-12-01

Yifei Ren, Jian Lou, Li Xiong, Joyce C Ho, Xiaoqian Jiang, Sivasubramanium Venkatraman Bhavani

Proceedings of machine learning research, volume 225, pages 498-511. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11611252/pdf/


Abstract

Tensor factorization has received increasing interest due to its intrinsic ability to capture latent factors in multi-dimensional data, with many applications including Electronic Health Records (EHR) mining. PARAFAC2 and its variants have been proposed to address irregular tensors, where one of the tensor modes is not aligned; for example, different patients in EHRs may have records of different lengths. PARAFAC2 has been successfully applied to EHRs for extracting meaningful medical concepts (phenotypes). Despite recent advancements, the predictability and interpretability of current models are not satisfactory, which limits their utility for downstream analysis. In this paper, we propose MULTIPAR: a supervised irregular tensor factorization with multi-task learning for computational phenotyping. MULTIPAR can flexibly incorporate both static (e.g., in-hospital mortality prediction) and continuous or dynamic (e.g., the need for ventilation) tasks. By supervising the tensor factorization with downstream prediction tasks and leveraging information from multiple related predictive tasks, MULTIPAR can yield not only more meaningful phenotypes but also better predictive performance for downstream tasks. We conduct extensive experiments on two real-world temporal EHR datasets to demonstrate that MULTIPAR is scalable and achieves better tensor fit, more meaningful subgroups, and stronger predictive performance than existing state-of-the-art methods. The implementation of MULTIPAR is available.
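To make the notion of an irregular tensor concrete, the following is a minimal sketch that assumes the standard PARAFAC2 form X_k ≈ Q_k H diag(s_k) V^T, where each patient k has its own number of visits and Q_k has orthonormal columns. The joint objective that adds a logistic loss on the patient factors s_k is only an illustrative assumption about how supervision could be attached; it is not the MULTIPAR objective from the paper, and all function names, the `weight` parameter, and the toy labels are hypothetical.

```python
import numpy as np


def make_irregular_tensor(visit_counts, n_features, rank, seed=0):
    """Build synthetic slices X_k = Q_k H diag(s_k) V^T (standard PARAFAC2 form).

    Each slice has visit_counts[k] rows, so the first mode is not aligned
    across patients. Requires every visit count to be >= rank.
    """
    rng = np.random.default_rng(seed)
    H = rng.standard_normal((rank, rank))        # shared across patients
    V = rng.standard_normal((n_features, rank))  # shared feature factors
    slices, factors = [], []
    for n_visits in visit_counts:
        # Q_k has orthonormal columns: Q_k^T Q_k = I
        Q_k, _ = np.linalg.qr(rng.standard_normal((n_visits, rank)))
        s_k = rng.uniform(0.5, 1.5, size=rank)   # patient-specific weights
        slices.append(Q_k @ H @ np.diag(s_k) @ V.T)
        factors.append((Q_k, s_k))
    return slices, H, V, factors


def reconstruction_error(slices, H, V, factors):
    """Sum of squared Frobenius errors over all patient slices."""
    return sum(
        np.linalg.norm(X_k - Q_k @ H @ np.diag(s_k) @ V.T) ** 2
        for X_k, (Q_k, s_k) in zip(slices, factors)
    )


def logistic_loss(S, w, y):
    """Mean logistic loss for labels y in {-1, +1} (hypothetical static task)."""
    return np.mean(np.logaddexp(0.0, -y * (S @ w)))


def joint_objective(slices, H, V, factors, w, y, weight=1.0):
    """Illustrative objective: tensor fit plus a weighted prediction loss.

    This is an assumed form for illustration, not the paper's formulation.
    """
    S = np.stack([s_k for _, s_k in factors])    # one row per patient
    return reconstruction_error(slices, H, V, factors) + weight * logistic_loss(S, w, y)


# Three patients with 5, 8, and 3 visits over 4 features.
visit_counts = [5, 8, 3]
slices, H, V, factors = make_irregular_tensor(visit_counts, n_features=4, rank=2)
y = np.array([1.0, -1.0, 1.0])  # toy labels, e.g. a binary outcome per patient
w = np.zeros(2)                 # untrained classifier weights
loss = joint_objective(slices, H, V, factors, w, y, weight=0.5)
```

In this sketch the supervision term acts only on the per-patient weights s_k, so minimizing the joint objective would pull the factorization toward components that are also predictive; a multi-task variant would sum several such loss terms, one per task.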
