Bayesian multitask learning for medicine recommendation based on online patient reviews.

IF 5.4 · Zone 3 (Biology) · JCR Q1 (Biochemical Research Methods)

Bioinformatics · Pub Date: 2023-08-01 · DOI: 10.1093/bioinformatics/btad491

Yichen Cheng, Yusen Xia, Xinlei Wang

Bioinformatics, volume 39, issue 8. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10425196/pdf/

Citations: 0

Abstract

Motivation: We propose a drug recommendation model that integrates information from both structured data (patient demographic information) and unstructured text (patient reviews). It uses multitask learning to predict review ratings on several satisfaction-related measures for a given medicine, so that related tasks can learn from each other. The learned models can then be applied to new patients for drug recommendation. This is fundamentally different from most recommender systems in e-commerce, which do not work well for new customers (the cold-start problem). To extract information from review texts, we employ both topic modeling and sentiment analysis. We further incorporate variable selection into the model via the Bayesian LASSO, which aims to filter out irrelevant features. To the best of our knowledge, this is the first Bayesian multitask learning method for ordinal responses. We are also the first to apply multitask learning to medicine recommendation. The sample code and data are available on GitHub: https://github.com/thrushcyc-github/BMull.
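The modeling idea in the Motivation paragraph can be illustrated with a minimal sketch. It is not the authors' implementation (see the BMull repository for that). It assumes a single rating task, a cumulative-logit link for the ordinal response, and independent Laplace priors on the coefficients, whose posterior mode corresponds to a LASSO-penalized fit. The names `ordinal_probs`, `laplace_log_prior`, `log_posterior`, and the parameter `lam` are hypothetical, and the cross-task sharing that makes the model multitask is not shown.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def ordinal_probs(x, beta, cutpoints):
    """Return P(rating = k) for k = 1..K under a cumulative-logit model.

    cutpoints must be a strictly increasing list of K-1 thresholds
    (an assumption of this sketch, not a detail taken from the paper).
    """
    eta = sum(b * xi for b, xi in zip(beta, x))  # linear predictor
    # Cumulative probabilities P(rating <= k); the last one is 1 by definition.
    cdf = [sigmoid(c - eta) for c in cutpoints] + [1.0]
    probs, prev = [], 0.0
    for c in cdf:
        probs.append(c - prev)  # difference of adjacent cumulative probabilities
        prev = c
    return probs


def laplace_log_prior(beta, lam):
    """Log density of independent Laplace priors with rate lam on each coefficient."""
    return sum(math.log(lam / 2.0) - lam * abs(b) for b in beta)


def log_posterior(X, y, beta, cutpoints, lam):
    """Unnormalized log posterior: ordinal log-likelihood plus Laplace log prior.

    Ratings in y are coded 1..K.
    """
    loglik = sum(
        math.log(ordinal_probs(x, beta, cutpoints)[yi - 1]) for x, yi in zip(X, y)
    )
    return loglik + laplace_log_prior(beta, lam)


if __name__ == "__main__":
    # Toy input: one covariate, three rating levels, two cutpoints.
    print(ordinal_probs([1.0], [0.5], [-1.0, 1.0]))
```

A larger `lam` pulls coefficients toward zero more strongly, which is the mechanism by which a Laplace prior can downweight irrelevant features.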

Results: We evaluate the proposed method on two sets of drug reviews involving 17 drugs for depression or high blood pressure. Overall, our method performs better than existing benchmark methods in terms of accuracy and AUC (area under the receiver operating characteristic curve). It is effective even with a small sample size and only a few available features, and it is more robust to possibly noninformative covariates. Because the model is explainable, the insights it generates may serve as a useful reference for doctors. In practice, however, a final decision should be made carefully by combining the information from the proposed recommender with doctors' domain knowledge and past experience.
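For readers unfamiliar with the two metrics named in the Results paragraph, the following sketch shows how accuracy and AUC can be computed. It is an illustration only: the abstract does not say how the ordinal ratings were turned into the binary labels that AUC requires, so the 0/1 labels here are an assumption, and the function names `accuracy` and `auc` are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    labels are 0/1 (assumed binarization of the ratings); scores are
    predicted scores where higher means more likely to be class 1.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    print(accuracy([1, 2, 3, 3], [1, 2, 2, 3]))
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise form is quadratic in the number of examples, which is acceptable for small review sets; a rank-based computation would be preferable for large ones.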

Availability and implementation: The sample code and data are publicly available at GitHub: https://github.com/thrushcyc-github/BMull.

Source journal

Bioinformatics (Biology: Biochemical Research Methods)

CiteScore: 11.20

Self-citation rate: 5.20%

Articles published: 753

Review time: 2.1 months

About the journal: The leading journal in its field, Bioinformatics publishes the highest quality scientific papers and review articles of interest to academic and industrial researchers. Its main focus is on new developments in genome bioinformatics and computational biology. Two distinct sections within the journal, Discovery Notes and Application Notes, focus on shorter papers; the former reports biologically interesting discoveries made using computational methods, and the latter explores the applications used for experiments.