Journal of Biomedical Informatics: Latest Articles

A survey of recent methods for addressing AI fairness and bias in biomedicine
IF 4.5, Zone 2 (Medicine)
Journal of Biomedical Informatics Pub Date: 2024-04-25 DOI: 10.1016/j.jbi.2024.104646
Yifan Yang, Mingquan Lin, Han Zhao, Yifan Peng, Furong Huang, Zhiyong Lu
Objectives: Artificial intelligence (AI) systems have the potential to revolutionize clinical practices, including improving diagnostic accuracy and surgical decision-making, while also reducing costs and manpower. However, it is important to recognize that these systems may perpetuate social inequities or demonstrate biases, such as those based on race or gender. Such biases can occur before, during, or after the development of AI models, making it critical to understand and address potential biases to enable the accurate and reliable application of AI models in clinical settings. To mitigate bias concerns during model development, we surveyed recent publications on different debiasing methods in the fields of biomedical natural language processing (NLP) or computer vision (CV). Then we discussed the methods, such as data perturbation and adversarial learning, that have been applied in the biomedical domain to address bias.
Methods: We searched PubMed, the ACM Digital Library, and IEEE Xplore for relevant articles published between January 2018 and December 2023 using multiple combinations of keywords. We then filtered the resulting 10,041 articles automatically with loose constraints, and manually inspected the abstracts of the remaining 890 articles to identify the 55 articles included in this review. Additional articles in the references are also included in this review. We discuss each method and compare its strengths and weaknesses. Finally, we review other potential methods from the general domain that could be applied to biomedicine to address bias and improve fairness.
Results: The bias of AIs in biomedicine can originate from multiple sources such as insufficient data, sampling bias and the use of health-irrelevant features or race-adjusted algorithms. Existing debiasing methods that focus on algorithms can be categorized as distributional or algorithmic. Distributional methods include data augmentation, data perturbation, data reweighting methods, and federated learning. Algorithmic approaches include unsupervised representation learning, adversarial learning, disentangled representation learning, loss-based methods and causality-based methods.
PDF: https://www.sciencedirect.com/science/article/pii/S1532046424000649/pdfft?md5=463472b5f244a8f4a4f49707c8ee30a5&pid=1-s2.0-S1532046424000649-main.pdf
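Of the distributional methods named in the Results above, data reweighting is the simplest to illustrate. The following is a minimal sketch and is not taken from the surveyed papers: it assumes one common scheme in which each sample is weighted inversely to the size of its demographic group, so that every group carries the same total weight in a weighted training loss. The function name `inverse_frequency_weights` and the group labels are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def inverse_frequency_weights(groups: Sequence[Hashable]) -> list[float]:
    """Return one weight per sample, inversely proportional to group size.

    Assumed scheme (illustrative only): w_g = N / (K * n_g), where N is the
    number of samples, K the number of distinct groups, and n_g the size of
    group g. The weights sum to N, and each group has total weight N / K.
    """
    if not groups:
        return []
    counts = Counter(groups)
    n_samples = len(groups)
    n_groups = len(counts)
    return [n_samples / (n_groups * counts[g]) for g in groups]


# Hypothetical example: group "A" is over-represented relative to group "B".
weights = inverse_frequency_weights(["A", "A", "A", "B"])
```

Weights like these would typically be supplied as per-sample weights to a loss function or model-fitting routine; whether that reduces a given bias depends on the data and the model.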
Citations: 0
A framework for longitudinal latent factor modelling of treatment response in clinical trials with applications to Psoriatic Arthritis and Rheumatoid Arthritis
IF 4.5, Zone 2 (Medicine)
Journal of Biomedical Informatics Pub Date: 2024-04-18 DOI: 10.1016/j.jbi.2024.104641
Fabian Falck, Xuan Zhu, Sahra Ghalebikesabi, Matthias Kormaksson, Marc Vandemeulebroecke, Cong Zhang, Ruvie Martin, Stephen Gardiner, Chun Hei Kwok, Dominique M. West, Luis Santos, Chengeng Tian, Yu Pang, Aimee Readie, Gregory Ligozio, Kunal K. Gandhi, Thomas E. Nichols, Ann-Marie Mallon, Luke Kelly, David Ohlssen, George Nicholson
Objective: Clinical trials involve the collection of a wealth of data, comprising multiple diverse measurements performed at baseline and follow-up visits over the course of a trial. The most common primary analysis is restricted to a single, potentially composite endpoint at one time point. While such an analytical focus promotes simple and replicable conclusions, it does not necessarily fully capture the multi-faceted effects of a drug in a complex disease setting. Therefore, to complement existing approaches, we set out here to design a longitudinal multivariate analytical framework that accepts as input an entire clinical trial database, comprising all measurements, patients, and time points across multiple trials.
Methods: Our framework composes probabilistic principal component analysis with a longitudinal linear mixed effects model, thereby enabling clinical interpretation of multivariate results, while handling data missing at random, and incorporating covariates and covariance structure in a computationally efficient and principled way.
Results: We illustrate our approach by applying it to four phase III clinical trials of secukinumab in Psoriatic Arthritis (PsA) and Rheumatoid Arthritis (RA). We identify three clinically plausible latent factors that collectively explain 74.5% of empirical variation in the longitudinal patient database. We estimate longitudinal trajectories of these factors, thereby enabling joint characterisation of disease progression and drug effect. We perform benchmarking experiments demonstrating our method’s competitive performance at estimating average treatment effects compared to existing statistical and machine learning methods, and showing that our modular approach leads to relatively computationally efficient model fitting.
Conclusion: Our multivariate longitudinal framework has the potential to illuminate the properties of existing composite endpoint methods, and to enable the development of novel clinical endpoints that provide enhanced and complementary perspectives on treatment response.
PDF: https://www.sciencedirect.com/science/article/pii/S1532046424000595/pdfft?md5=3de9b41dab738037e61bb81f8ae6793d&pid=1-s2.0-S1532046424000595-main.pdf
Citations: 0
Special issue on learning from multiple data sources for decision making in health care
IF 4.5, Zone 2 (Medicine)
Journal of Biomedical Informatics Pub Date: 2024-04-16 DOI: 10.1016/j.jbi.2024.104645
Fabio Stella, Francesco Calimeri, Mauro Dragoni
PDF: https://www.sciencedirect.com/science/article/pii/S1532046424000637/pdfft?md5=970bb673d585bee2ae375a6334fe3cb0&pid=1-s2.0-S1532046424000637-main.pdf
Citations: 0
Identifying gene expression programs in single-cell RNA-seq data using linear correlation explanation
IF 4.5, Zone 2 (Medicine)
Journal of Biomedical Informatics Pub Date: 2024-04-15 DOI: 10.1016/j.jbi.2024.104644
Yulia I. Nussbaum, K.S.M. Tozammel Hossain, Jussuf Kaifi, Wesley C. Warren, Chi-Ren Shyu, Jonathan B. Mitchem
Objective: Gene expression analysis through single-cell RNA sequencing (scRNA-seq) has revolutionized our understanding of gene regulation in diverse cell types, tissues, and organisms. While existing methods primarily focus on identifying cell type-specific gene expression programs (GEPs), the characterization of GEPs associated with biological processes and stimuli responses remains limited. In this study, we aim to infer biologically meaningful GEPs that are associated with both cellular phenotypes and activity programs directly from scRNA-seq data.
Methods: We applied linear CorEx, a machine-learning-based approach, to infer GEPs by grouping genes based on total correlation optimization function in simulated and real-world scRNA-seq datasets. Additionally, we utilized a transfer learning approach to project CorEx-inferred GEPs to other scRNA-seq datasets.
Results: By leveraging total correlation optimization, linear CorEx groups genes and demonstrates superior performance in identifying cell types and activity programs compared to similar methods using simulated data. Furthermore, we apply this same approach to real-world scRNA-seq data from the mouse dentate gyrus and embryonic colon development, uncovering biologically relevant GEPs related to cell types, developmental ages, and cell cycle programs. We also demonstrate the potential for transfer learning by evaluating similar datasets, showcasing the cross-species sensitivity of linear CorEx.
Conclusion: Our findings validate linear CorEx as a valuable tool for comprehensively analyzing complex signals in scRNA-seq data, leading to deeper insights into gene expression dynamics, cellular heterogeneity, and regulatory mechanisms.
Citations: 0
MixEHR-SurG: A joint proportional hazard and guided topic model for inferring mortality-associated topics from electronic health records
IF 4.5, Zone 2 (Medicine)
Journal of Biomedical Informatics Pub Date: 2024-04-15 DOI: 10.1016/j.jbi.2024.104638
Yixuan Li, Archer Y. Yang, Ariane Marelli, Yue Li
Survival models can help medical practitioners to evaluate the prognostic importance of clinical variables to patient outcomes such as mortality or hospital readmission and subsequently design personalized treatment regimes. Electronic Health Records (EHRs) hold the promise for large-scale survival analysis based on systematically recorded clinical features for each patient. However, existing survival models either do not scale to high dimensional and multi-modal EHR data or are difficult to interpret. In this study, we present a supervised topic model called MixEHR-SurG to simultaneously integrate heterogeneous EHR data and model survival hazard. Our contributions are threefold: (1) integrating EHR topic inference with Cox proportional hazards likelihood; (2) integrating patient-specific topic hyperparameters using the PheCode concepts such that each topic can be identified with exactly one PheCode-associated phenotype; (3) multi-modal survival topic inference. This leads to a highly interpretable survival topic model that can infer PheCode-specific phenotype topics associated with patient mortality. We evaluated MixEHR-SurG using a simulated dataset and two real-world EHR datasets: the Quebec Congenital Heart Disease (CHD) data consisting of 8211 subjects with 75,187 outpatient claim records of 1767 unique ICD codes; the MIMIC-III consisting of 1458 subjects with multi-modal EHR records. Compared to the baselines, MixEHR-SurG achieved a superior dynamic AUROC for mortality prediction, with a mean AUROC score of 0.89 in the simulation dataset and a mean AUROC of 0.645 on the CHD dataset. Qualitatively, MixEHR-SurG associates severe cardiac conditions with high mortality risk among the CHD patients after the first heart failure hospitalization and critical brain injuries with increased mortality among the MIMIC-III patients after their ICU discharge. Together, the integration of the Cox proportional hazards model and EHR topic inference in MixEHR-SurG not only leads to competitive mortality prediction but also meaningful phenotype topics for in-depth survival analysis. The software is available at GitHub: https://github.com/li-lab-mcgill/MixEHR-SurG.
PDF: https://www.sciencedirect.com/science/article/pii/S153204642400056X/pdfft?md5=2e00f7bfa6e4631f19ffcdcf5d0e1985&pid=1-s2.0-S153204642400056X-main.pdf
Citations: 0
Identifying social determinants of health from clinical narratives: A study of performance, documentation ratio, and potential bias
IF 4.5, Zone 2 (Medicine)
Journal of Biomedical Informatics Pub Date: 2024-04-14 DOI: 10.1016/j.jbi.2024.104642
Zehao Yu, Cheng Peng, Xi Yang, Chong Dang, Prakash Adekkanattu, Braja Gopal Patra, Yifan Peng, Jyotishman Pathak, Debbie L. Wilson, Ching-Yuan Chang, Wei-Hsuan Lo-Ciganic, Thomas J. George, William R. Hogan, Yi Guo, Jiang Bian, Yonghui Wu
Objective: To develop a natural language processing (NLP) package to extract social determinants of health (SDoH) from clinical narratives, examine the bias among race and gender groups, test the generalizability of extracting SDoH for different disease groups, and examine population-level extraction ratio.
Methods: We developed SDoH corpora using clinical notes identified at the University of Florida (UF) Health. We systematically compared 7 transformer-based large language models (LLMs) and developed an open-source package – SODA (i.e., SOcial DeterminAnts) to facilitate SDoH extraction from clinical narratives. We examined the performance and potential bias of SODA for different race and gender groups, tested the generalizability of SODA using two disease domains including cancer and opioid use, and explored strategies for improvement. We applied SODA to extract 19 categories of SDoH from the breast (n = 7,971), lung (n = 11,804), and colorectal cancer (n = 6,240) cohorts to assess patient-level extraction ratio and examine the differences among race and gender groups.
Results: We developed an SDoH corpus using 629 clinical notes of cancer patients with annotations of 13,193 SDoH concepts/attributes from 19 categories of SDoH, and another cross-disease validation corpus using 200 notes from opioid use patients with 4,342 SDoH concepts/attributes. We compared 7 transformer models and the GatorTron model achieved the best mean average strict/lenient F1 scores of 0.9122 and 0.9367 for SDoH concept extraction and 0.9584 and 0.9593 for linking attributes to SDoH concepts. There is a small performance gap (~4%) between Males and Females, but a large performance gap (>16%) among race groups. The performance dropped when we applied the cancer SDoH model to the opioid cohort; fine-tuning using a smaller opioid SDoH corpus improved the performance. The extraction ratio varied in the three cancer cohorts, in which 10 SDoH could be extracted from over 70% of cancer patients, but 9 SDoH could be extracted from less than 70% of cancer patients. Individuals from the White and Black groups have a higher extraction ratio than other minority race groups.
Conclusions: Our SODA package achieved good performance in extracting 19 categories of SDoH from clinical narratives. The SODA package with pre-trained transformer models is available at https://github.com/uf-hobi-informatics-lab/SODA_Docker.
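The Results above report strict and lenient F1 scores. The abstract does not define the matching criteria, so the following is only a sketch under a common convention: a strict match requires identical span boundaries and label, while a lenient match requires the same label and any character overlap. The `Span` type, the offsets, and the labels are hypothetical and are not part of the SODA package.

```python
from typing import NamedTuple


class Span(NamedTuple):
    start: int  # character offset where the mention begins (inclusive)
    end: int    # character offset where the mention ends (exclusive)
    label: str  # SDoH category, e.g. "smoking"


def _overlaps(a: Span, b: Span) -> bool:
    """True if both spans share a label and at least one character."""
    return a.label == b.label and a.start < b.end and b.start < a.end


def f1_score(gold: list[Span], pred: list[Span], lenient: bool) -> float:
    """F1 under strict (exact span) or lenient (any overlap) matching."""
    match = _overlaps if lenient else (lambda a, b: a == b)
    # Precision counts predictions that hit some gold span;
    # recall counts gold spans that are hit by some prediction.
    matched_pred = sum(any(match(p, g) for g in gold) for p in pred)
    matched_gold = sum(any(match(g, p) for p in pred) for g in gold)
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up annotations: one exact hit, one partial hit, one spurious prediction.
gold = [Span(0, 5, "smoking"), Span(10, 20, "employment")]
pred = [Span(0, 5, "smoking"), Span(12, 20, "employment"), Span(30, 35, "housing")]
strict_f1 = f1_score(gold, pred, lenient=False)
lenient_f1 = f1_score(gold, pred, lenient=True)
```

In this toy case the partially overlapping "employment" prediction counts only under lenient matching, which is why lenient scores are never lower than strict ones.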
Citations: 0
Variation in monitoring: Glucose measurement in the ICU as a case study to preempt spurious correlations
IF 4.5, Zone 2 (Medicine)
Journal of Biomedical Informatics Pub Date: 2024-04-14 DOI: 10.1016/j.jbi.2024.104643
Khushboo Teotia, Yueran Jia, Naira Link Woite, Leo Anthony Celi, João Matos, Tristan Struja
Objective: Health inequities can be influenced by demographic factors such as race and ethnicity, proficiency in English, and biological sex. Disparities may manifest as differential likelihood of testing which correlates directly with the likelihood of an intervention to address an abnormal finding. Our retrospective observational study evaluated the presence of variation in glucose measurements in the Intensive Care Unit (ICU).
Methods: Using the MIMIC-IV database (2008–2019), a single-center, academic referral hospital in Boston (USA), we identified adult patients meeting sepsis-3 criteria. Exclusion criteria were diabetic ketoacidosis, ICU length of stay under 1 day, and unknown race or ethnicity. We performed a logistic regression analysis to assess differential likelihoods of glucose measurements on day 1. A negative binomial regression was fitted to assess the frequency of subsequent glucose readings. Analyses were adjusted for relevant clinical confounders, and performed across three disparity proxy axes: race and ethnicity, sex, and English proficiency.
Results: We studied 24,927 patients, of which 19.5% represented racial and ethnic minority groups, 42.4% were female, and 9.8% had limited English proficiency. No significant differences were found for glucose measurement on day 1 in the ICU. This pattern was consistent irrespective of the axis of analysis, i.e. race and ethnicity, sex, or English proficiency. Conversely, subsequent measurement frequency revealed potential disparities. Specifically, males (incidence rate ratio (IRR) 1.06, 95% confidence interval (CI) 1.01 – 1.21), patients who identify themselves as Hispanic (IRR 1.11, 95% CI 1.01 – 1.21), or Black (IRR 1.06, 95% CI 1.01 – 1.12), and patients being English proficient (IRR 1.08, 95% CI 1.01 – 1.15) had higher chances of subsequent glucose readings.
Conclusion: We found disparities in ICU glucose measurements among patients with sepsis, albeit the magnitude was small. Variation in disease monitoring is a source of data bias that may lead to spurious correlations when modeling health data.
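The incidence rate ratios above come from a negative binomial regression. With the usual log link, a fitted coefficient is a log rate ratio, so the IRR is its exponential, and a Wald confidence interval is obtained by exponentiating the interval on the coefficient scale. The sketch below shows only that conversion; the coefficient and standard error are made-up inputs, not values from the study, and the function name is hypothetical.

```python
import math


def incidence_rate_ratio(coef: float, std_err: float, z: float = 1.96):
    """Convert a log-scale coefficient to (IRR, CI lower, CI upper).

    z = 1.96 gives an approximate 95% Wald interval.
    """
    return (
        math.exp(coef),
        math.exp(coef - z * std_err),
        math.exp(coef + z * std_err),
    )


# Hypothetical coefficient and standard error, for illustration only.
irr, lower, upper = incidence_rate_ratio(coef=0.06, std_err=0.02)
```

An interval whose lower bound stays above 1, as in the reported results, indicates a higher rate of subsequent readings for that group relative to the reference group.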
Citations: 0
Leveraging generative AI for clinical evidence synthesis needs to ensure trustworthiness
IF 4.5, Zone 2 (Medicine)
Journal of Biomedical Informatics Pub Date: 2024-04-10 DOI: 10.1016/j.jbi.2024.104640
Gongbo Zhang, Qiao Jin, Denis Jered McInerney, Yong Chen, Fei Wang, Curtis L. Cole, Qian Yang, Yanshan Wang, Bradley A Malin, Mor Peleg, Byron C. Wallace, Zhiyong Lu, Chunhua Weng, Yifan Peng
Evidence-based medicine promises to improve the quality of healthcare by empowering medical decisions and practices with the best available evidence. The rapid growth of medical evidence, which can be obtained from various sources, poses a challenge in collecting, appraising, and synthesizing the evidential information. Recent advancements in generative AI, exemplified by large language models, hold promise in facilitating the arduous task. However, developing accountable, fair, and inclusive models remains a complicated undertaking. In this perspective, we discuss the trustworthiness of generative AI in the context of automated summarization of medical evidence.
Citations: 0
Discovering clinical drug-drug interactions with known pharmacokinetics mechanisms using spontaneous reporting systems and electronic health records
IF 4.5, Zone 2 (Medicine)
Journal of Biomedical Informatics Pub Date: 2024-04-06 DOI: 10.1016/j.jbi.2024.104639
Eugene Jeong, Yu Su, Lang Li, You Chen
Objective: Although the mechanisms behind pharmacokinetic (PK) drug-drug interactions (DDIs) are well-documented, bridging the gap between this knowledge and clinical evidence of DDIs, especially for serious adverse drug reactions (SADRs), remains challenging. While leveraging the FDA Adverse Event Reporting System (FAERS) database along with disproportionality analysis tends to detect a vast number of DDI signals, this abundance complicates further investigation, such as validation through clinical trials. Our study proposed a framework to efficiently prioritize these signals and assessed their reliability using multi-source Electronic Health Records (EHR) to identify top candidates for further investigation.
Methods: We analyzed FAERS data spanning from January 2004 to March 2023, employing four established disproportionality methods: Proportional Reporting Ratio (PRR), Reporting Odds Ratio (ROR), Multi-item Gamma Poisson Shrinker (MGPS), and Bayesian Confidence Propagating Neural Network (BCPNN). Building upon these models, we developed four ranking models to prioritize DDI-SADR signals and cross-referenced signals with DrugBank. To validate the top-ranked signals, we employed longitudinal EHRs from Vanderbilt University Medical Center and the All of Us research program. The performance of each model was assessed by counting how many of the top-ranked signals were confirmed by EHRs and calculating the average ranking of these confirmed signals.
Results: Out of 189 DDI-SADR signals identified by all four disproportionality methods, only two were documented in the DrugBank database. By prioritizing the top 20 signals as determined by each of the four disproportionality methods and our four ranking models, 58 unique DDI-SADR signals were selected for EHR validations. Of these, five signals were confirmed. The ranking model, which integrated the MGPS and BCPNN, demonstrated superior performance by assigning the highest priority to those five EHR-confirmed signals.
Conclusion: The fusion of disproportionality analysis with ranking models, validated through multi-source EHRs, presents a groundbreaking approach to pharmacovigilance. Our study's confirmation of five significant DDI-SADRs, previously unrecorded in the DrugBank database, highlights the essential role of advanced data analysis techniques in identifying ADRs.
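Two of the disproportionality measures named in the Methods above, PRR and ROR, have simple closed forms on a 2x2 table of report counts. The sketch below uses the standard definitions; how the study maps drug pairs onto the table is not stated in the abstract, so treating "exposed" as "both drugs reported" is an assumption, and the class name and counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportCounts:
    """2x2 contingency table of spontaneous reports (hypothetical layout)."""
    a: int  # exposed to the drug pair, event reported
    b: int  # exposed to the drug pair, event not reported
    c: int  # not exposed, event reported
    d: int  # not exposed, event not reported


def prr(t: ReportCounts) -> float:
    """Proportional Reporting Ratio: event proportion, exposed vs. unexposed."""
    return (t.a / (t.a + t.b)) / (t.c / (t.c + t.d))


def ror(t: ReportCounts) -> float:
    """Reporting Odds Ratio: event odds, exposed vs. unexposed."""
    return (t.a / t.b) / (t.c / t.d)


# Made-up counts for illustration; not data from FAERS or the study.
counts = ReportCounts(a=20, b=80, c=100, d=9800)
prr_value = prr(counts)
ror_value = ror(counts)
```

Both functions divide by cell counts, so a table with an empty cell would need a continuity correction or a shrinkage estimator; MGPS and BCPNN are Bayesian measures of that kind and are not sketched here.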
Citations: 0
Automatic categorization of self-acknowledged limitations in randomized controlled trial publications
IF 4.5, Zone 2 (Medicine)
Journal of Biomedical Informatics Pub Date: 2024-04-01 DOI: 10.1016/j.jbi.2024.104628
Mengfei Lan, Mandy Cheng, Linh Hoang, Gerben ter Riet, Halil Kilicoglu
Objective: Acknowledging study limitations in a scientific publication is a crucial element in scientific transparency and progress. However, limitation reporting is often inadequate. Natural language processing (NLP) methods could support automated reporting checks, improving research transparency. In this study, our objective was to develop a dataset and NLP methods to detect and categorize self-acknowledged limitations (e.g., sample size, blinding) reported in randomized controlled trial (RCT) publications.
Methods: We created a data model of limitation types in RCT studies and annotated a corpus of 200 full-text RCT publications using this data model. We fine-tuned BERT-based sentence classification models to recognize the limitation sentences and their types. To address the small size of the annotated corpus, we experimented with data augmentation approaches, including Easy Data Augmentation (EDA) and Prompt-Based Data Augmentation (PromDA). We applied the best-performing model to a set of about 12K RCT publications to characterize self-acknowledged limitations at larger scale.
Results: Our data model consists of 15 categories and 24 sub-categories (e.g., Population and its sub-category DiagnosticCriteria). We annotated 1090 instances of limitation types in 952 sentences (4.8 limitation sentences and 5.5 limitation types per article). A fine-tuned PubMedBERT model for limitation sentence classification improved upon our earlier model by about 1.5 absolute percentage points in F1 score (0.821 vs. 0.8) with statistical significance (p < .001). Our best-performing limitation type classification model, PubMedBERT fine-tuning with PromDA (Output View), achieved an F1 score of 0.7, improving upon the vanilla PubMedBERT model by 2.7 percentage points, with statistical significance (p < .001).
Conclusion: The model could support automated screening tools which can be used by journals to draw the authors’ attention to reporting issues. Automatic extraction of limitations from RCT publications could benefit peer review and evidence synthesis, and support advanced methods to search and aggregate the evidence from the clinical trial literature.
PDF: https://www.sciencedirect.com/science/article/pii/S1532046424000467/pdfft?md5=21e7d266d966f37fd3d70f62db4c894b&pid=1-s2.0-S1532046424000467-main.pdf
Citations: 0