Journal of Biomedical Informatics: Latest Articles

Computational frameworks integrating deep learning and statistical models in mining multimodal omics data
IF 4.5, Zone 2 (Medicine)
Journal of Biomedical Informatics Pub Date : 2024-04-01 DOI: 10.1016/j.jbi.2024.104629
Leann Lac, Carson K. Leung, Pingzhao Hu
Abstract:
Background: In health research, multimodal omics data analysis is widely used to address important clinical and biological questions. Traditional statistical methods rely on strong distributional assumptions. Statistical methods such as testing and differential expression are commonly used in omics analysis. Deep learning, on the other hand, is an advanced computer science technique that is powerful in mining high-dimensional omics data for prediction tasks. Recently, integrative frameworks or methods have been developed for omics studies that combine statistical models and deep learning algorithms.
Methods and results: The aim of these integrative frameworks is to combine the strengths of both statistical methods and deep learning algorithms to improve prediction accuracy while also providing interpretability and explainability. This review report discusses the current state-of-the-art integrative frameworks, their limitations, and potential future directions in survival and time-to-event longitudinal analysis, dimension reduction and clustering, regression and classification, feature selection, and causal and transfer learning.
Citations: 0
Clinical trial recommendations using Semantics-Based inductive inference and knowledge graph embeddings
IF 4.5, Zone 2 (Medicine)
Journal of Biomedical Informatics Pub Date : 2024-03-30 DOI: 10.1016/j.jbi.2024.104627
Murthy V. Devarakonda, Smita Mohanty, Raja Rao Sunkishala, Nag Mallampalli, Xiong Liu
Abstract:
Objective: Designing a new clinical trial entails many decisions, such as defining a cohort and setting the study objectives, to name a few, and can therefore benefit from recommendations based on exhaustive mining of past clinical trial records. This study proposes an approach based on knowledge graph embeddings and semantics-driven inductive inference for generating such recommendations.
Method: The proposed recommendation methodology is based on neural embeddings trained on a first-of-its-kind knowledge graph constructed from clinical trials data. The methodology includes design of a knowledge graph for clinical trial data, evaluation of various knowledge graph embedding techniques for it, application of a novel inductive inference method using these embeddings, and generation of recommendations for clinical trial design. The study uses freely available data from clinicaltrials.gov and related sources.
Results: The proposed approach for recommendations obtained relevance scores ranging from 70% to 83%. These scores were determined by evaluating the text similarity of recommended elements to actual elements used in clinical trials that are in progress. Furthermore, the most pertinent recommendations were consistently located towards the top of the list, indicating the effectiveness of our method.
Conclusion: Our study suggests that inductive inference using node semantics is a viable approach for generating recommendations using graph neural embeddings, and that there is potential for improvement in training graph embeddings using node semantics.
Citations: 0
Participant flow diagrams for health equity in AI
IF 4.5, Zone 2 (Medicine)
Journal of Biomedical Informatics Pub Date : 2024-03-27 DOI: 10.1016/j.jbi.2024.104631
Jacob G. Ellen, João Matos, Martin Viola, Jack Gallifant, Justin Quion, Leo Anthony Celi, Nebal S. Abu Hussein
Abstract: Selection bias can arise through many aspects of a study, including recruitment, inclusion/exclusion criteria, input-level exclusion and outcome-level exclusion, and often reflects the underrepresentation of populations historically disadvantaged in medical research. The effects of selection bias can be further amplified when non-representative samples are used in artificial intelligence (AI) and machine learning (ML) applications to construct clinical algorithms. Building on the “Data Cards” initiative for transparency in AI research, we advocate for the addition of a participant flow diagram for AI studies detailing relevant sociodemographic and/or clinical characteristics of excluded participants across study phases, with the goal of identifying potential algorithmic biases before their clinical implementation. We include both a model for this flow diagram as well as a brief case study explaining how it could be implemented in practice. Through standardized reporting of participant flow diagrams, we aim to better identify potential inequities embedded in AI applications, facilitating more reliable and equitable clinical algorithms.
Citations: 0
Model tuning or prompt tuning? A study of large language models for clinical concept and relation extraction
IF 4.5, Zone 2 (Medicine)
Journal of Biomedical Informatics Pub Date : 2024-03-26 DOI: 10.1016/j.jbi.2024.104630
Cheng Peng, Xi Yang, Kaleb E Smith, Zehao Yu, Aokun Chen, Jiang Bian, Yonghui Wu
Abstract:
Objective: To develop a soft prompt-based learning architecture for large language models (LLMs), examine prompt-tuning using frozen/unfrozen LLMs, and assess their abilities in transfer learning and few-shot learning.
Methods: We developed a soft prompt-based learning architecture and compared 4 strategies including (1) fine-tuning without prompts; (2) hard-prompting with unfrozen LLMs; (3) soft-prompting with unfrozen LLMs; and (4) soft-prompting with frozen LLMs. We evaluated GatorTron, a clinical LLM with up to 8.9 billion parameters, and compared GatorTron with 4 existing transformer models for clinical concept and relation extraction on 2 benchmark datasets for adverse drug events and social determinants of health (SDoH). We evaluated the few-shot learning ability and generalizability for cross-institution applications.
Results and Conclusion: When LLMs are unfrozen, GatorTron-3.9B with soft prompting achieves the best strict F1-scores of 0.9118 and 0.8604 for concept extraction, outperforming the traditional fine-tuning and hard prompt-based models by 0.6% to 3.1% and 1.2% to 2.9%, respectively; GatorTron-345M with soft prompting achieves the best F1-scores of 0.8332 and 0.7488 for end-to-end relation extraction, outperforming the other two models by 0.2% to 2% and 0.6% to 11.7%, respectively. When LLMs are frozen, small LLMs have a large gap to close to be competitive with unfrozen models; scaling LLMs up to billions of parameters makes frozen LLMs competitive with unfrozen models. Soft prompting with a frozen GatorTron-8.9B model achieved the best performance for cross-institution evaluation. We demonstrate that (1) machines can learn soft prompts better than hard prompts composed by humans, (2) frozen LLMs have good few-shot learning ability and generalizability for cross-institution applications, (3) frozen LLMs reduce computing cost to 2.5% to 6% of previous methods using unfrozen LLMs, and (4) frozen LLMs require large models (e.g., over several billions of parameters) for good performance.
Citations: 0
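The abstract above reports "strict" F1-scores for concept extraction without defining the metric. As a minimal sketch, assuming the commonly used exact-match definition (a predicted entity counts only if its start offset, end offset, and label all match a gold entity), the score could be computed as follows. The function name, the tuple layout, and the sample annotations are hypothetical and are not taken from the paper.

```python
def strict_f1(gold, predicted):
    """Strict (exact-match) precision, recall and F1 over entity annotations.

    Each entity is a (start, end, label) tuple; a prediction counts as a
    true positive only if all three fields match a gold entity exactly.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations: the second prediction is off by one character,
# so it does not count under strict matching.
gold = [(0, 7, "Drug"), (25, 31, "ADE")]
predicted = [(0, 7, "Drug"), (25, 30, "ADE")]
print(strict_f1(gold, predicted))
```

Under a relaxed (overlap-based) variant the second prediction would be credited, which is why strict scores are typically the lower of the two.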
Developing deep learning-based strategies to predict the risk of hepatocellular carcinoma among patients with nonalcoholic fatty liver disease from electronic health records
IF 4.5, Zone 2 (Medicine)
Journal of Biomedical Informatics Pub Date : 2024-03-22 DOI: 10.1016/j.jbi.2024.104626
Zhao Li, Lan Lan, Yujia Zhou, Ruoxing Li, Kenneth D. Chavin, Hua Xu, Liang Li, David J.H. Shih, W. Jim Zheng
Abstract:
Objective: The accuracy of deep learning models for many disease prediction problems is affected by time-varying covariates, rare incidence, covariate imbalance and delayed diagnosis when using structured electronic health records data. The situation is further exacerbated when predicting the risk of one disease on condition of another disease, such as the hepatocellular carcinoma risk among patients with nonalcoholic fatty liver disease, due to slow, chronic progression, the scarcity of data with both disease conditions and the sex bias of the diseases. The goal of this study is to investigate the extent to which the aforementioned issues influence deep learning performance, and then to devise strategies to tackle these challenges. These strategies were applied to improve hepatocellular carcinoma risk prediction among patients with nonalcoholic fatty liver disease.
Methods: We evaluated two representative deep learning models in the task of predicting the occurrence of hepatocellular carcinoma in a cohort of patients with nonalcoholic fatty liver disease (n = 220,838) from a national EHR database. The disease prediction task was carefully formulated as a classification problem while taking censoring and the length of follow-up into consideration.
Results: We developed a novel backward masking scheme to deal with the issue of delayed diagnosis, which is very common in EHR data analysis, and evaluated how the length of longitudinal information after the index date affects disease prediction. We observed that modeling time-varying covariates improved the performance of the algorithms and that transfer learning mitigated the reduced performance caused by the lack of data. In addition, covariate imbalance, such as sex bias in data, impaired performance. Deep learning models trained on one sex and evaluated in the other sex showed reduced performance, indicating the importance of assessing covariate imbalance while preparing data for model training.
Conclusions: The strategies developed in this work can significantly improve the performance of hepatocellular carcinoma risk prediction among patients with nonalcoholic fatty liver disease. Furthermore, our novel strategies can be generalized to apply to other disease risk predictions using structured electronic health records, especially for disease risks on condition of another disease.
Citations: 0
A comprehensive performance evaluation, comparison, and integration of computational methods for detecting and estimating cross-contamination of human samples in cancer next-generation sequencing analysis
IF 4.5, Zone 2 (Medicine)
Journal of Biomedical Informatics Pub Date : 2024-03-12 DOI: 10.1016/j.jbi.2024.104625
Huijuan Chen, Bing Wang, Lili Cai, Xiaotian Yang, Yali Hu, Yiran Zhang, Xue Leng, Wen Liu, Dongjie Fan, Beifang Niu, Qiming Zhou
Abstract: Cross-sample contamination is one of the major issues in next-generation sequencing (NGS)-based molecular assays. This type of contamination, even at very low levels, can significantly impact the results of an analysis, especially in the detection of somatic alterations in tumor samples. Several contamination identification tools have been developed and implemented as a crucial quality-control step in the routine NGS bioinformatic pipeline. However, no study has been published to comprehensively and systematically investigate, evaluate, and compare these computational methods in cancer NGS analysis. In this study, we comprehensively investigated nine state-of-the-art computational methods for detecting cross-sample contamination. To explore their application in cancer NGS analysis, we further compared the performance of five representative tools by qualitative and quantitative analyses using in silico and simulated experimental NGS data. The results showed that Conpair achieved the best performance for identifying contamination and predicting the level of contamination in solid tumor NGS analysis. Moreover, based on Conpair, we developed a Python script, Contamination Source Predictor (ConSPr), to identify the source of contamination. We anticipate that this comprehensive survey and the proposed tool for predicting the source of contamination will assist researchers in selecting appropriate cross-contamination detection tools in cancer NGS analysis and inspire the development of computational methods for detecting sample cross-contamination and identifying its source in the future.
Citations: 0
ParTRE: A relational triple extraction model of complicated entities and imbalanced relations in Parkinson’s disease
IF 4.5, Zone 2 (Medicine)
Journal of Biomedical Informatics Pub Date : 2024-03-11 DOI: 10.1016/j.jbi.2024.104624
Xiaoming Zhang, Can Yu, Rui Yan
Abstract: The relational triple extraction of unstructured medical texts about Parkinson’s disease is critical for the construction of a medical knowledge graph. However, the triple entities in Parkinson’s disease are usually complicated and overlapped, which impedes the accuracy of triple extraction, especially when corpora are scarce. Therefore, this study first builds a corpus about Parkinson’s disease. Then, a tagging-based three-stage relational triple extraction model is proposed, named ParTRE. To enhance the contextual representation of sentences, the proposed model employs BiLSTM modules to capture fine-grained semantic information. Additionally, a conditional normalization layer is used so that entity pairs can be extracted accurately from two complementary directions. As for the imbalanced relationship categories, an adaptive loss function strategy based on focal loss is derived by assigning different weights to relationship categories and reducing the loss of easy-to-classify samples. The model performance is evaluated on the Parkinson’s corpus and public datasets. The results indicate that the proposed model achieves an overall F1-score of 93.3% on the Parkinson’s corpus and comparable performance on public datasets compared with the state-of-the-art methods. Moreover, the proposed model achieves satisfactory results in handling overlapped entities and imbalanced relationship categories. Owing to its demonstrated availability and validity, the proposed method can be integrated with medical knowledge graphs and therefore benefits medical intelligence.
Citations: 0
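The ParTRE abstract above mentions a loss based on focal loss that weights relationship categories differently and shrinks the contribution of easy-to-classify samples. The sketch below shows the standard class-weighted focal loss, FL = -w_c (1 - p_t)^gamma log(p_t), for a single sample. It is an illustration of the general technique only; the paper's adaptive weighting scheme is not described in the abstract, and the weights and gamma value used here are hypothetical.

```python
import math


def weighted_focal_loss(probs, target, class_weights, gamma=2.0):
    """Class-weighted focal loss for one sample.

    probs: predicted probabilities over relation classes (should sum to 1).
    target: index of the true class.
    class_weights: per-class weights (hypothetical values; typically larger
        for rare classes).
    gamma: focusing parameter; gamma = 0 recovers weighted cross-entropy.
    """
    p_t = max(probs[target], 1e-12)  # guard against log(0)
    return -class_weights[target] * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct ("easy") sample contributes far less than a hard one.
easy = weighted_focal_loss([0.9, 0.1], target=0, class_weights=[1.0, 3.0])
hard = weighted_focal_loss([0.3, 0.7], target=0, class_weights=[1.0, 3.0])
print(easy, hard)
```

The factor (1 - p_t)^gamma is what down-weights easy samples, while the per-class weight addresses the imbalance between frequent and rare relation categories.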
FedFSA: Hybrid and federated framework for functional status ascertainment across institutions
IF 4.5, Zone 2 (Medicine)
Journal of Biomedical Informatics Pub Date : 2024-03-06 DOI: 10.1016/j.jbi.2024.104623
Sunyang Fu, Heling Jia, Maria Vassilaki, Vipina K. Keloth, Yifang Dang, Yujia Zhou, Muskan Garg, Ronald C. Petersen, Jennifer St Sauver, Sungrim Moon, Liwei Wang, Andrew Wen, Fang Li, Hua Xu, Cui Tao, Jungwei Fan, Hongfang Liu, Sunghwan Sohn
Abstract:
Introduction: Functional status measures patients' independence in performing activities of daily living (ADLs), including basic ADLs (bADL) and more complex instrumental ADLs (iADL). Existing studies have discovered that patients’ functional status is a strong predictor of health outcomes, particularly in older adults. Despite its usefulness, much of the functional status information is stored in electronic health records (EHRs) in either semi-structured or free text formats. This indicates the pressing need to leverage computational approaches such as natural language processing (NLP) to accelerate the curation of functional status information. In this study, we introduced FedFSA, a hybrid and federated NLP framework designed to extract functional status information from EHRs across multiple healthcare institutions.
Methods: FedFSA consists of four major components: 1) individual sites (clients) with their private local data, 2) a rule-based information extraction (IE) framework for ADL extraction, 3) a BERT model for functional status impairment classification, and 4) a concept normalizer. The framework was implemented using the OHNLP Backbone for rule-based IE and the open-source Flower and PyTorch libraries for the federated BERT components. For gold standard data generation, we carried out corpus annotation to identify functional status-related expressions based on ICF definitions. Four healthcare institutions were included in the study. To assess FedFSA, we evaluated the performance of category- and institution-specific ADL extraction across different experimental designs.
Results: ADL extraction performance ranges from an F1-score of 0.907 to 0.986 for bADL and 0.825 to 0.951 for iADL across the four healthcare sites. The performance for ADL extraction with impairment ranges from an F1-score of 0.722 to 0.954 for bADL and 0.674 to 0.813 for iADL across the four healthcare sites. For category-specific ADL extraction, laundry and transferring yielded relatively high performance, while dressing, medication, bathing, and continence achieved moderate-high performance. Conversely, food preparation and toileting showed low performance.
Conclusion: NLP performance varied across ADL categories and healthcare sites. Federated learning using the FedFSA framework outperformed non-federated learning for impaired ADL extraction at all healthcare sites. Our study demonstrated the potential of the federated learning framework in functional status extraction and impairment classification in EHRs, exemplifying the importance of a large-scale, multi-institutional collaborative development effort.
Citations: 0
Artificial intelligence-powered pharmacovigilance: A review of machine and deep learning in clinical text-based adverse drug event detection for benchmark datasets
IF 4.5, Zone 2 (Medicine)
Journal of Biomedical Informatics Pub Date : 2024-03-05 DOI: 10.1016/j.jbi.2024.104621
Yiming Li, Wei Tao, Zehan Li, Zenan Sun, Fang Li, Susan Fenton, Hua Xu, Cui Tao
Abstract:
Objective: The primary objective of this review is to investigate the effectiveness of machine learning and deep learning methodologies in the context of extracting adverse drug events (ADEs) from clinical benchmark datasets. We conduct an in-depth analysis, aiming to compare the merits and drawbacks of both machine learning and deep learning techniques, particularly within the framework of named-entity recognition (NER) and relation classification (RC) tasks related to ADE extraction. Additionally, our focus extends to the examination of specific features and their impact on the overall performance of these methodologies. In a broader perspective, our research extends to ADE extraction from various sources, including biomedical literature, social media data, and drug labels, removing the limitation to exclusively machine learning or deep learning methods.
Methods: We conducted an extensive literature review on PubMed using the query “(((machine learning [Medical Subject Headings (MeSH) Terms]) OR (deep learning [MeSH Terms])) AND (adverse drug event [MeSH Terms])) AND (extraction)”, and supplemented this with a snowballing approach to review 275 references sourced from retrieved articles.
Results: In our analysis, we included twelve articles for review. For the NER task, deep learning models outperformed machine learning models. In the RC task, gradient boosting, multilayer perceptron and random forest models excelled. The Bidirectional Encoder Representations from Transformers (BERT) model consistently achieved the best performance in the end-to-end task. Future efforts in the end-to-end task should prioritize improving NER accuracy, especially for 'ADE' and 'Reason'.
Conclusion: These findings hold significant implications for advancing the field of ADE extraction and pharmacovigilance, ultimately contributing to improved drug safety monitoring and healthcare outcomes.
Citations: 0
Evaluation of ChatGPT-generated medical responses: A systematic review and meta-analysis
IF 4.5, Zone 2 (Medicine)
Journal of Biomedical Informatics Pub Date : 2024-03-01 DOI: 10.1016/j.jbi.2024.104620
Qiuhong Wei, Zhengxiong Yao, Ying Cui, Bo Wei, Zhezhen Jin, Ximing Xu
Abstract:
Objective: Large language models (LLMs) such as ChatGPT are increasingly explored in medical domains. However, the absence of standard guidelines for performance evaluation has led to methodological inconsistencies. This study aims to summarize the available evidence on evaluating ChatGPT’s performance in answering medical questions and provide direction for future research.
Methods: An extensive literature search was conducted on June 15, 2023, across ten medical databases. The keyword used was “ChatGPT,” without restrictions on publication type, language, or date. Studies evaluating ChatGPT's performance in answering medical questions were included. Exclusions comprised review articles, comments, patents, non-medical evaluations of ChatGPT, and preprint studies. Data were extracted on general study characteristics, question sources, conversation processes, assessment metrics, and performance of ChatGPT. An evaluation framework for LLMs in medical inquiries was proposed by integrating insights from selected literature. This study is registered with PROSPERO, CRD42023456327.
Results: A total of 3520 articles were identified, of which 60 were reviewed and summarized in this paper and 17 were included in the meta-analysis. ChatGPT displayed an overall integrated accuracy of 56% (95% CI: 51%–60%, I² = 87%) in addressing medical queries. However, the studies varied in question resource, question-asking process, and evaluation metrics. As per our proposed evaluation framework, many studies failed to report methodological details, such as the date of inquiry, version of ChatGPT, and inter-rater consistency.
Conclusion: This review reveals ChatGPT's potential in addressing medical inquiries, but the heterogeneity of the study design and insufficient reporting might affect the results’ reliability. Our proposed evaluation framework provides insights for the future study design and transparent reporting of LLMs in responding to medical questions.
Citations: 0
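The abstract above reports a pooled accuracy together with an I² heterogeneity statistic. As a rough sketch of how I² is conventionally obtained from Cochran's Q under inverse-variance weighting, consider the function below. The abstract does not state which pooling model the authors used, so the fixed-effect pooling here is an assumption, and the per-study accuracies and variances are made-up values chosen only for illustration.

```python
def pooled_estimate_and_i2(estimates, variances):
    """Inverse-variance pooled estimate, Cochran's Q, and I^2.

    estimates: per-study effect estimates (e.g. accuracy proportions).
    variances: per-study sampling variances of those estimates.
    Returns (pooled, q, i2), where i2 is a fraction between 0 and 1.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    degrees_of_freedom = len(estimates) - 1
    # I^2 is the share of total variation attributed to between-study
    # heterogeneity rather than chance; it is truncated at zero.
    i2 = max(0.0, (q - degrees_of_freedom) / q) if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical studies: two accuracy estimates with equal variance.
pooled, q, i2 = pooled_estimate_and_i2([0.4, 0.6], [0.001, 0.001])
print(pooled, q, i2)
```

A high I², such as the 87% reported in the abstract, indicates that most of the observed spread between studies reflects genuine differences between them rather than sampling error.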