Tensor-Factorization-Based Phenotyping using Group Information: Case Study on the Efficacy of Statins

Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics. Pub Date: 2017-08-20. DOI: 10.1145/3107411.3107423

Jingyun Choi, Yejin Kim, Hun‐Sung Kim, I. Choi, Hwanjo Yu

Citations: 4

Abstract

To automatically extract medical concepts from raw electronic health records (EHRs), several applications based on machine learning techniques have been proposed. Among the various techniques, tensor factorization methods have attracted considerable attention because tensor representations can capture interactions among high-dimensional EHRs. Most of the existing tensor factorization methods for computational phenotyping are only designed to derive individual phenotypes that approximate the original data. However, deriving grouped phenotypes is desirable because patients form natural groups of interest (e.g., efficacy of treatment and disease categories). In this paper, we propose Supervised Non-negative Tensor Factorization with Multinomial Logistic Regression (SNTFL) to derive grouped phenotypes that are discriminative. We define a discriminative constraint to derive grouped phenotypes and jointly optimize a multinomial logistic regression during the tensor factorization process. Our case study on a hyperlipidemia dataset demonstrates that our proposed method obtains better discrimination on patient groups compared to the baselines and successfully discovers meaningful patient subgroups.
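
As a rough illustration of what jointly optimizing a factorization and a classifier can look like, the sketch below evaluates a generic combined objective: a squared reconstruction error for a non-negative CP model plus a multinomial logistic (softmax) loss on the patient factor. It is not the SNTFL formulation from the paper, whose discriminative constraint and update rules are not given in the abstract. The tensor modes (patient × diagnosis × medication), the weight `lam`, and all function names are assumptions made for this example.

```python
import numpy as np

def cp_reconstruct(A, B, C):
    """Rebuild a 3-way tensor from CP factors: X[i,j,k] = sum_r A[i,r] * B[j,r] * C[k,r]."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def softmax(Z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def joint_loss(X, A, B, C, W, y, lam=1.0):
    """Hypothetical joint objective: reconstruction error + lam * softmax cross-entropy.

    A is the patient factor matrix; its rows act as features for predicting
    the group label y of each patient through the weight matrix W.
    """
    recon = np.sum((X - cp_reconstruct(A, B, C)) ** 2)
    P = softmax(A @ W)
    nll = -np.mean(np.log(P[np.arange(len(y)), y] + 1e-12))
    return recon + lam * nll

# Toy data (assumed sizes): 20 patients, 6 diagnoses, 5 medications, rank 3, 2 groups.
rng = np.random.default_rng(0)
n_patients, n_dx, n_rx, rank, n_groups = 20, 6, 5, 3, 2
A = rng.random((n_patients, rank))   # non-negative patient factors
B = rng.random((n_dx, rank))         # non-negative diagnosis factors
C = rng.random((n_rx, rank))         # non-negative medication factors
X = cp_reconstruct(A, B, C)          # tensor that the factors fit exactly
y = rng.integers(0, n_groups, size=n_patients)
W = np.zeros((rank, n_groups))       # untrained classifier: uniform predictions

loss = joint_loss(X, A, B, C, W, y)
print(loss)
```

Because the toy tensor is built from the factors themselves, the reconstruction term is zero and the remaining loss comes entirely from the untrained classifier. In an actual method, the factors and `W` would be updated together (for example by projected gradient steps that keep the factors non-negative), so that the learned phenotypes both approximate the data and separate the patient groups.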
