Journal of Biomedical Informatics: latest articles, page 3

Review of tools to support Target Trial Emulation

IF 4.5, Zone 2 (Medicine)

Journal of Biomedical Informatics. Pub Date: 2025-09-26. DOI: 10.1016/j.jbi.2025.104897

Christina A. van Hal, Elmer V. Bernstam, Todd R. Johnson

Abstract
Objective: Randomized Controlled Trials (RCTs) are the gold standard for clinical evidence, but ethical and practical constraints sometimes necessitate or warrant the use of observational data. The aim of this study is to identify informatics tools that support the design and conduct of Target Trial Emulations (TTEs), a framework for designing observational studies that closely emulate RCTs so as to minimize biases that often arise when using real-world evidence (RWE) to estimate causal effects.
Methods: We divided the process of conducting TTEs into three phases and seven steps. We then systematically reviewed the literature to identify currently available tools that support one or more of the seven steps required to conduct a TTE. For each tool, we noted which step or steps the tool supports.
Results: 7625 papers were included in the initial review, with 76 meeting our inclusion criteria. Our review identified 24 distinct tools applicable to the three phases of TTE. Specifically, 3 tools support the Design Phase, 5 support the Implementation Phase, and 19 support the Analysis Phase, with some tools applicable to multiple phases.
Conclusion: This review revealed significant gaps in tool support for the Design Phase of TTEs, while support for the Implementation and Analysis phases was highly variable. No single tool currently supports all aspects of TTEs from start to finish and few tools are interoperable, meaning they cannot be easily integrated into a unified workflow. The results highlight the need for further development of informatics tools for supporting TTEs.
Volume 171, Article 104897.

Citations: 0

Drug repositioning with metapath guidance and adaptive negative sampling enhancement

IF 4.5, Zone 2 (Medicine)

Journal of Biomedical Informatics. Pub Date: 2025-09-25. DOI: 10.1016/j.jbi.2025.104916

Yaozheng Zhou, Xingyu Shi, Lingfeng Wang, Jin Xu, Demin Li, Congzhou Chen

Abstract
Objective: Drug repositioning plays a pivotal role in expediting the drug discovery pipeline. The rapid development of computational methods has opened new avenues for predicting drug-disease associations (DDAs). Despite advances in existing methodologies, two challenges persist: insufficient exploration of the diverse relationships in heterogeneous biological networks, and inadequate quality of negative samples.
Methods: In this study, we introduce DRMGNE, a novel drug repositioning framework that harnesses metapath-guided learning and adaptive negative enhancement for DDA prediction. DRMGNE begins with an autoencoder that extracts semantic features from similarity matrices. Subsequently, a comprehensive set of metapaths is designed to generate subgraphs, and graph convolutional networks are used to extract enriched node representations reflecting topological structures. Furthermore, the adaptive negative enhancement strategy is employed to improve the quality of negative samples, ensuring balanced learning.
Results: Experimental evaluations demonstrate that DRMGNE outperforms state-of-the-art algorithms across three benchmark datasets. Additionally, case studies and molecular docking validations further underscore its potential in facilitating drug discovery and accelerating drug repurposing efforts.
Conclusion: DRMGNE is a novel framework for DDA prediction that leverages metapath-based guidance and adaptive negative enhancement. Experiments on benchmark datasets show superior performance over existing methods, underscoring its potential impact in drug discovery.
Volume 171, Article 104916.

Citations: 0

The crisis of biomedical foundation models.

IF 4.5, Zone 2 (Medicine)

Journal of Biomedical Informatics. Pub Date: 2025-09-25. DOI: 10.1016/j.jbi.2025.104917

Fei Wang

Citations: 0

Cross-scale semantic fusion integration of dual pathway models in drug repositioning

IF 4.5, Zone 2 (Medicine)

Journal of Biomedical Informatics. Pub Date: 2025-09-25. DOI: 10.1016/j.jbi.2025.104914

Mingxuan Li, Shuai Li, Zhen Li, Mandong Hu

Abstract
Drug Repositioning (DR) represents an innovative drug development strategy that significantly reduces both cost and time by identifying new therapeutic indications for approved drugs. Current methods primarily focus on extracting information from drug–disease networks, but often overlook critical local structural details between nodes. This study introduces CSDPDR, a novel dual-branch graph neural network that integrates Topology Feature Information and Salient Feature Information to enhance drug repositioning accuracy and efficiency. Through the Topology-aware branch with Adaptive Residual Graph Attention and the Saliency-aware branch with Score-Driven Top-K Convolutional Graph Pooling, the model can capture both large-scale topology patterns and fine-grained local information. Furthermore, our approach effectively alleviates graph sparsity issues through meta-path-based network enhancement and confidence-based filtering mechanisms. Comparative experiments on two benchmark datasets and an additional dataset demonstrate that CSDPDR significantly outperforms several state-of-the-art baseline methods. Case studies on Alzheimer’s disease and breast neoplasms further validate the model’s practical applicability and effectiveness.
Volume 171, Article 104914.
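For readers unfamiliar with score-driven top-K graph pooling, the general idea can be sketched as below. This is a generic illustration of the technique, not the CSDPDR implementation: the abstract gives no details, so the function name `topk_pool`, the projection vector `p` (learned in a real model, fixed here), and the tanh gating are all assumptions.

```python
import numpy as np

def topk_pool(x, adj, p, k):
    """Keep the k highest-scoring nodes of a graph (illustrative sketch).

    x   : (n, d) node feature matrix
    adj : (n, n) adjacency matrix
    p   : (d,) projection vector; learned in a real model, fixed here
    k   : number of nodes to keep
    """
    # One scalar score per node: projection of its features onto p.
    scores = x @ p / np.linalg.norm(p)
    # Indices of the k largest scores, returned in original node order.
    keep = np.sort(np.argsort(-scores, kind="stable")[:k])
    # Gate the kept features by their (squashed) scores.
    gate = np.tanh(scores[keep])[:, None]
    x_pooled = x[keep] * gate
    # Induced subgraph on the kept nodes.
    adj_pooled = adj[np.ix_(keep, keep)]
    return x_pooled, adj_pooled, keep

# Toy graph with 4 nodes and 2 features per node (made-up numbers).
x = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [-1.0, 0.0]])
adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]])
p = np.array([1.0, 1.0])

x_pooled, adj_pooled, keep = topk_pool(x, adj, p, k=2)
print(keep, x_pooled.shape, adj_pooled.shape)
```

The pooled graph keeps only the edges between surviving nodes, which is what lets a saliency-style branch concentrate on a small, high-scoring neighbourhood.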

Citations: 0

Measuring and visualizing healthcare process variability

IF 4.5, Zone 2 (Medicine)

Journal of Biomedical Informatics. Pub Date: 2025-09-23. DOI: 10.1016/j.jbi.2025.104918

Pengfei Yin, Abel Armas Cervantes, Daniel Capurro

Abstract
Importance: Understanding factors that contribute to clinical variability in patient care is critical, as unwarranted variability can lead to increased adverse events and prolonged hospital stays. Determining when this variability becomes excessive can be a step in optimizing patient outcomes and healthcare efficiency.
Objective: To explore the association between clinical variation and clinical outcomes. This study aims to identify the point in time when the relationship between clinical variation and length of stay (LOS) becomes significant.
Methods: This cohort study uses MIMIC-IV, a dataset collecting electronic health records of the Beth Israel Deaconess Medical Center in the United States. We focused on adult patients who underwent elective coronary bypass surgery, generating 847 patient observations. Demographic factors such as age, race, insurance type, and the Charlson Comorbidity Index (CCI) were recorded. We performed a variability analysis in which patients’ clinical processes are represented as sequences of events. The data were segmented based on the initial day of recorded activity to establish observation windows. Using a regression analysis, we identified the temporal window where variability’s impact on LOS becomes independently significant.
Results: Regression analysis revealed that patients in the top 20% of the variability distance group experienced an 81% increase in LOS (95% CI: 1.72 to 1.91, p < 0.001). Insurance types, such as Medicare and Other, were associated with 18% (95% CI: 0.73 to 0.92, p < 0.001) and 21% (95% CI: 0.71 to 0.88, p < 0.001) decreases in LOS, respectively. Neither age nor race significantly affected LOS, but a higher CCI was associated with a 3.3% increase in LOS (95% CI: 1.02 to 1.05, p < 0.001). These findings indicate that higher variability and CCI significantly influence LOS, with insurance type also playing a crucial role.
Conclusion: In the studied cohort, patient journeys with greater variability were associated with longer LOS with a dose–response relationship: the higher the variability, the longer the LOS. This study presents a standardized way to measure and visualize variability in clinical processes and measure its impact on patient-relevant outcomes.
Volume 170, Article 104918.
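A note on reading the figures above: the percentages and the confidence intervals line up if the intervals are on a multiplicative (ratio) scale, as they would be for exponentiated coefficients from a regression on log-transformed LOS. The abstract does not state the model form, so this is an assumption, and the helper names below are illustrative only. The sketch converts between the two scales and checks that each reported percentage falls inside its interval.

```python
# Assumption: the reported 95% CIs are ratios (for example, exponentiated
# coefficients from a regression on log-transformed LOS). The abstract
# does not say so explicitly.

def pct_change(ratio):
    """Convert a multiplicative ratio into a percent change."""
    return (ratio - 1.0) * 100.0

def ratio_from_pct(pct):
    """Convert a percent change into a multiplicative ratio."""
    return 1.0 + pct / 100.0

# (label, reported percent change, reported CI low, reported CI high)
reported = [
    ("Top 20% variability", 81.0, 1.72, 1.91),
    ("Medicare insurance", -18.0, 0.73, 0.92),
    ("Other insurance", -21.0, 0.71, 0.88),
    ("Higher CCI", 3.3, 1.02, 1.05),
]

for label, pct, lo, hi in reported:
    point = ratio_from_pct(pct)
    inside = lo <= point <= hi
    print(f"{label}: ratio {point:.3f}, CI as percent "
          f"{pct_change(lo):+.0f}% to {pct_change(hi):+.0f}%, "
          f"point inside CI: {inside}")
```

Under this reading, an interval of 1.72 to 1.91 corresponds to a 72% to 91% increase in LOS, and an interval of 0.73 to 0.92 to a 27% to 8% decrease.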

Citations: 0

LFVDNet: Low-frequency variable-driven network for medical time series

IF 4.5, Zone 2 (Medicine)

Journal of Biomedical Informatics. Pub Date: 2025-09-23. DOI: 10.1016/j.jbi.2025.104913

Yue Zhang, Dengqun Sun, Lei Li, Jian Zhou, Xiuquan Du, Shuo Li

Abstract
Objective: Medical time series, a type of multivariate time series with missing values, are widely used for predictive analysis, and the “impute first, then predict” end-to-end architecture is commonly used to handle the missing values. However, existing methods are likely to lose the uniqueness and key information of low-frequency sampled variables (LFSVs). In this paper, we aim to develop a method that effectively handles LFSVs, preserving their distinctive characteristics and essential information throughout the modeling process.
Methods: We propose a novel end-to-end method named Low-Frequency Variable-Driven network (LFVDNet) for medical time series analysis. Specifically, the Time-Aware Imputer (TA) module encodes the observed values and critical time information, and uses the attention mechanism to establish an association between the observed values and the missing values. TA adopts a channel-independent strategy to prevent interference from high-frequency sampled variables (HFSVs) on LFSVs, thereby preserving the unique information contained in LFSVs. The Offset-Selection Module (OS) independently selects data points for each variable through offsets, avoiding the natural disadvantages of LFSVs in selection-based imputation, thus solving the problem of the loss of key information of LFSVs. LFVDNet is the first method for analyzing multivariate time series with missing values that emphasizes the effective utilization of LFSVs.
Results: We carried out experiments on four public datasets, and the results indicate that LFVDNet has better robustness and performance. All code is available at https://github.com/dxqllp/LFVDNet.
Conclusions: This study proposes a novel method for medical time series analysis, namely LFVDNet, which aims to effectively utilize LFSVs. Specifically, we have designed the TA module, which performs imputation through temporal correlations. The OS module, on the other hand, performs selective imputation based on a data point selection strategy. We have verified the effectiveness of this method on four datasets constructed from PhysioNet 2012 and MIMIC-IV.
Volume 171, Article 104913.
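The channel-independent idea can be illustrated with a deliberately simplified sketch: each variable is imputed only from its own observations, using attention weights that depend on time distance. This is not the authors' TA module (their code is at the URL given in the abstract). The fixed softmax kernel stands in for learned attention, and the function names, the `tau` parameter, and the toy values are all assumptions.

```python
import numpy as np

def impute_channel(obs_t, obs_v, query_t, tau=1.0):
    """Impute one variable at query times from its own observations.

    Attention weights are a softmax over negative time distance, a fixed
    stand-in for the learned attention a real model would use.
    """
    obs_t = np.asarray(obs_t, dtype=float)
    obs_v = np.asarray(obs_v, dtype=float)
    query_t = np.asarray(query_t, dtype=float)
    logits = -np.abs(query_t[:, None] - obs_t[None, :]) / tau
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ obs_v

def impute_series(series, query_t, tau=1.0):
    """Channel-independent imputation: no variable sees another's data."""
    return {name: impute_channel(t, v, query_t, tau)
            for name, (t, v) in series.items()}

# Toy data: a densely sampled and a sparsely sampled variable (made up).
series = {
    "heart_rate": ([0, 1, 2, 3, 4, 5], [80, 82, 85, 83, 81, 79]),
    "lactate": ([0, 5], [1.0, 3.0]),
}
query_t = np.array([0.0, 2.5, 5.0])
imputed = impute_series(series, query_t)
print(imputed["lactate"])
```

Because the sparse variable is handled on its own, its two observations are never outvoted by the six observations of the dense variable, which is the kind of interference the abstract describes.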

Citations: 0

Prediction of Single-Cell perturbation response based on Direction-Constrained diffusion Schrödinger Bridge

IF 4.5, Zone 2 (Medicine)

Journal of Biomedical Informatics. Pub Date: 2025-09-21. DOI: 10.1016/j.jbi.2025.104915

Yiqing Luo, Lin Liu, Yaxin Fu, Yi Deng, Lin Tang

Abstract
Objective: Predicting transcriptional responses to external perturbations at the single-cell level is essential for understanding gene regulatory networks, drug discovery, and personalized interventions. The exponential increase in perturbation conditions creates data sparsity, making it difficult to capture dynamic responses and necessitating computational modeling.
Methods: We present Direction-Constrained Diffusion Schrödinger Bridge (DC-DSB), a generative framework that learns probabilistic trajectories between unperturbed and post-perturbation distributions by minimizing path-space KL divergence. To enhance conditional control, DC-DSB integrates hierarchical representations derived from experimental variables and biological prior knowledge. We further introduce a direction-constrained conditioning strategy that injects condition signals along the biologically relevant perturbation trajectory, thereby improving modeling quality and training stability.
Results: DC-DSB improves expression prediction accuracy and generalization to unseen combinations over baselines. By modeling dynamic expression trajectories and co-expression structures under perturbation, DC-DSB enables the discovery of synergistic and antagonistic gene interactions and supports the progressive reconstruction of regulatory pathways.
Conclusion: DC-DSB provides a biologically consistent and generalizable framework for single-cell perturbation modeling. Its trajectory-based and condition-aware architecture overcomes the limitations of static mappings and facilitates downstream analyses in gene regulation and drug discovery.
Volume 170, Article 104915.

Citations: 0

MPCM-RRG: Multi-modal Prompt Collaboration Mechanism for Radiology Report Generation

IF 4.5, Zone 2 (Medicine)

Journal of Biomedical Informatics. Pub Date: 2025-09-17. DOI: 10.1016/j.jbi.2025.104912

Yumian Yu, Guoheng Huang, Zhe Tan, Jiahui Shi, Ming Li, Chi-Man Pun, Fuchen Zheng, Shiqiang Ma, Shuqiang Wang, Long He

Abstract
The task of medical report generation involves automatically creating descriptive text reports from medical images, with the aim of alleviating the workload of physicians and enhancing diagnostic efficiency. However, although many existing medical report generation models based on the Transformer framework consider structural information in medical images, they ignore the interference of confounding factors on these structures, which limits the model’s ability to effectively capture rich and critical lesion information. Furthermore, these models often struggle to address the significant imbalance between normal and abnormal content in actual reports, leading to challenges in accurately describing abnormalities. To address these limitations, we propose the Multi-modal Prompt Collaboration Mechanism for Radiology Report Generation Model (MPCM-RRG). This model consists of three key components: the Visual Causal Prompting Module (VCP), the Textual Prompt-Guided Feature Enhancement Module (TPGF), and the Visual–Textual Semantic Consistency Module (VTSC). The VCP module uses chest X-ray masks as visual prompts and incorporates causal inference principles to help the model minimize the influence of irrelevant regions. Through causal intervention, the model can learn the causal relationships between the pathological regions in the image and the corresponding findings described in the report. The TPGF module tackles the imbalance between abnormal and normal text by integrating detailed textual prompts, which also guide the model to focus on lesion areas using a multi-head attention mechanism. The VTSC module promotes alignment between the visual and textual representations through a contrastive consistency loss, fostering greater interaction and collaboration between the visual and textual prompts. Experimental results demonstrate that MPCM-RRG outperforms other methods on the IU X-ray and MIMIC-CXR datasets, highlighting its effectiveness in generating high-quality medical reports.
Volume 170, Article 104912.
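To make the idea of a contrastive consistency loss between visual and textual representations concrete, here is a minimal sketch using a symmetric InfoNCE-style objective. The abstract does not specify the exact form used in the VTSC module, so this loss, the function names, and the temperature value are assumptions rather than the authors' formulation.

```python
import numpy as np

def _log_softmax(z):
    """Row-wise log-softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def contrastive_consistency_loss(img, txt, temperature=0.07):
    """Symmetric InfoNCE-style loss over a batch of paired embeddings.

    img, txt : (batch, dim) arrays; row i of img is paired with row i of txt.
    """
    # Cosine similarity between every image and every text embedding.
    img = img / np.linalg.norm(img, axis=1, keepdims=True)
    txt = txt / np.linalg.norm(txt, axis=1, keepdims=True)
    logits = img @ txt.T / temperature
    # Matching pairs sit on the diagonal; pull them up in both directions.
    i2t = -np.mean(np.diag(_log_softmax(logits)))
    t2i = -np.mean(np.diag(_log_softmax(logits.T)))
    return 0.5 * (i2t + t2i)

# Toy check with random embeddings (made-up data).
rng = np.random.default_rng(0)
img = rng.normal(size=(4, 8))
txt_aligned = img + 0.01 * rng.normal(size=(4, 8))   # nearly matching pairs
txt_shuffled = txt_aligned[[1, 2, 3, 0]]             # deliberately mismatched

loss_aligned = contrastive_consistency_loss(img, txt_aligned)
loss_shuffled = contrastive_consistency_loss(img, txt_shuffled)
print(loss_aligned < loss_shuffled)
```

The loss is small when each image embedding is closest to its own report embedding and grows when pairs are mismatched, which is the alignment pressure the abstract attributes to the VTSC module.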

Citations: 0

SynthMedic: Utilizing large language models for synthetic discharge summary generation, correction and validation

IF 4.5, Zone 2 (Medicine)

Journal of Biomedical Informatics. Pub Date: 2025-09-15. DOI: 10.1016/j.jbi.2025.104906

Georgi Grazhdanski, Vasil Vasilev, Sylvia Vassileva, Dimitar Taskov, Izabel Antova, Ivan Koychev, Svetla Boytcheva

Abstract
Background and Objective: Synthetic clinical texts can improve transparency and reduce bias and costs when training and evaluating specialized language models in the medical domain. Synthetic texts are freely shareable, as they contain no real patient information, and can be customized for a specific task. The objective of this study is to develop a methodology for generating, validating, and correcting synthetic discharge summaries using LLMs without requiring any real patient data.
Methods: The proposed approach uses an LLM to generate synthetic discharge summaries for specific diseases and standard medical references from Merck Manuals to ground the generation in internationally accepted medical practices. We validate the generated summaries using LLMs as well as by human expert validation. In addition, we propose a method for automatic correction of the generated discharge summaries using Knowledge Graphs to ensure medical factual correctness.
Results: The conducted human expert evaluation shows that the generated synthetic discharge summaries are credible and factually accurate when provided with the medical reference context. The generated summaries achieve a System Usability Score of 94.35% based on a comprehensive rubric evaluated by medical professionals and a score of 93.65% on the Faithfulness metric evaluated by an LLM.
Conclusions: The proposed methodology can be utilized to generate high-quality synthetic discharge summaries for various diseases. The generated synthetic corpus consists of 900 discharge summaries in English representing nine socially significant diseases and is publicly available under an open license. The community can take advantage of the corpus and proposed methodology to train complex machine learning models, helping medical professionals in their daily work without using real patient data.
Volume 170, Article 104906.

Citations: 0

Advancing causal inference in medicine using biobank data

IF 4.5, Zone 2 (Medicine)

Journal of Biomedical Informatics. Pub Date: 2025-09-13. DOI: 10.1016/j.jbi.2025.104903

Hadasa Kaufman, Nadav Rappoport, Amir Gilad, Michal Linial

Abstract
Causal inference from observational medical record data is critical for advancing precision and personalization in healthcare. Recently, biobanks (collections of biological samples linked with genetic, lifestyle, environmental, and health-related data) have emerged as valuable resources for large-scale population studies. By integrating these resources, biobanks offer a harmonized repository of diverse data for each individual, capturing real-world medical events, including procedures, treatments, and diagnoses. However, these resources are often affected by confounding factors, selection biases, and missing information, posing significant challenges to drawing valid causal conclusions. While randomized controlled trials (RCTs) remain the gold standard for drug development and medical decision-making, the growing availability of observational data highlights the need for robust causal inference methodologies. This study provides an overview of methods, applicable to biobank data, for inferring the effect of a treatment on an outcome from observational data, focusing on the unique challenges they address. Our objective is to introduce current methods used for causal discovery in observational medical data. We discuss classic and modern methodologies that offer significant opportunities, alongside the difficulty of establishing causality. We cover statistical methods designed for large-scale biobanks that have the potential to improve clinical decision-making, guide public health policies, and drive further research.
Volume 171, Article 104903.

Citations: 0