Journal of Biomedical Informatics: Latest Articles (Page 9)

An attention-based framework for integrating WSI and genomic data in cancer survival prediction

IF 4, Zone 2 (Medicine)

Journal of Biomedical Informatics Pub Date : 2025-06-06 DOI: 10.1016/j.jbi.2025.104836

Genlang Chen , Sixuan Sui , Jiajian Zhang , Xuan Liu , Ping Cai

Objective: Cancer survival prediction plays a vital role in enhancing medical decision-making and optimizing patient management. Accurate survival estimation enables healthcare providers to develop personalized treatment plans, improve treatment outcomes, and identify high-risk patients for timely intervention. However, existing methods often rely on single-modality data or suffer from excessive computational complexity, limiting their practical application and the full potential of multimodal integration.

Methods: To address these challenges, we propose a novel multimodal survival prediction framework that integrates Whole Slide Image (WSI) and genomic data. The framework employs attention mechanisms to model intra-modal and inter-modal correlations, capturing complex dependencies within and between modalities. Additionally, locality-sensitive hashing is applied to optimize the self-attention mechanism, significantly reducing computational costs while maintaining predictive performance, so that the model can handle large-scale or high-resolution WSI datasets efficiently.

Results: Extensive experiments on the TCGA-BLCA dataset validate the effectiveness of the proposed approach. The results demonstrate that integrating WSI and genomic data improves survival prediction accuracy compared to unimodal methods. The optimized self-attention mechanism further enhances model efficiency, allowing for practical implementation on large datasets.

Conclusion: The proposed framework provides a robust and efficient solution for cancer survival prediction by leveraging multimodal data integration and optimized attention mechanisms. This study highlights the importance of multimodal learning in medical applications and offers a promising direction for future advancements in AI-driven clinical decision support systems.

Volume 168, Article 104836.
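The abstract above says locality-sensitive hashing is used to cut the cost of self-attention. A minimal NumPy sketch of that general idea follows. It assumes random-hyperplane hashing and attention restricted to tokens that share a bucket; the function names and the hashing scheme are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def lsh_bucket_ids(x, n_planes, seed=0):
    """Hash each row of x to a bucket using random hyperplanes (assumed scheme)."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((x.shape[1], n_planes))
    bits = (x @ planes) > 0                      # sign pattern, shape (n, n_planes)
    weights = 1 << np.arange(n_planes)           # 1, 2, 4, ... to pack bits into an id
    return bits.astype(np.int64) @ weights

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)        # subtract the row max for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def bucketed_self_attention(x, bucket_ids):
    """Scaled dot-product self-attention computed only within each bucket."""
    out = np.zeros_like(x)
    d = x.shape[1]
    for b in np.unique(bucket_ids):
        idx = np.where(bucket_ids == b)[0]
        xb = x[idx]
        scores = xb @ xb.T / np.sqrt(d)          # cost is quadratic in the bucket size only
        out[idx] = softmax(scores) @ xb
    return out

# Toy input standing in for patch embeddings (hypothetical values).
x = np.random.default_rng(1).standard_normal((16, 8))
ids = lsh_bucket_ids(x, n_planes=2)
approx = bucketed_self_attention(x, ids)
```

With all tokens in a single bucket the function reduces to ordinary full self-attention, which is a convenient sanity check; more hyperplanes give smaller buckets and lower cost at the price of a coarser approximation.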

Citations: 0

Monitoring strategies for continuous evaluation of deployed clinical prediction models

IF 4, Zone 2 (Medicine)

Journal of Biomedical Informatics Pub Date : 2025-06-05 DOI: 10.1016/j.jbi.2025.104854

Grace Y.E. Kim , Conor K. Corbin , François Grolleau , Michael Baiocchi , Jonathan H. Chen

Objective: As machine learning adoption in clinical practice continues to grow, deployed classifiers must be continuously monitored and updated (retrained) to protect against data drift that stems from inevitable changes, including evolving medical practices and shifting patient populations. However, successful clinical machine learning classifiers will lead to a change in care, which may change the distribution of features, labels, and their relationship. For example, "high risk" cases that were correctly identified by the model may ultimately get labeled as "low risk" thanks to an intervention prompted by the model's alert. Classifier surveillance systems naive to such deployment-induced feedback loops will estimate lower model performance and lead to degraded future classifier retrains. The objective of this study is to simulate the impact of these feedback loops, propose feedback-aware monitoring strategies as a solution, and assess the performance of these alternative monitoring strategies through simulations.

Methods: We propose Adherence Weighted and Sampling Weighted Monitoring as two feedback loop-aware surveillance strategies. Through simulation we evaluate their ability to accurately appraise post-deployment model performance and to initiate safe and accurate classifier retraining.

Results: Measured across accuracy, area under the receiver operating characteristic curve, average precision, Brier score, expected calibration error, F1, precision, sensitivity, and specificity, in the presence of feedback loops, Adherence Weighted and Sampling Weighted strategies have the highest fidelity to the ground truth classifier performance, while standard approaches yield the most inaccurate estimations. Furthermore, in simulations with true data drift, retraining using standard unweighted approaches results in an AUROC score of 0.52 (a drop from 0.72). In contrast, retraining based on Adherence Weighted and Sampling Weighted strategies recovers performance to 0.67, which is comparable to what a new model trained from scratch on the existing and shifted data would obtain.

Conclusion: Compared to standard approaches, Adherence Weighted and Sampling Weighted strategies yield more accurate classifier performance estimates, measured according to the no-treatment potential outcome. Retraining based on these strategies brings stronger performance recovery when tested against data drift and feedback loops than do standard approaches.

Volume 168, Article 104854.
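The feedback loop described above can be made concrete with a toy calculation. The sketch below is a generic inverse-probability-weighting illustration, not the paper's Adherence Weighted or Sampling Weighted definitions, which the abstract does not spell out. All arrays are hypothetical: four truly high-risk patients are flagged, two of them receive an intervention and are therefore observed as low risk.

```python
import numpy as np

def naive_precision(y_obs, y_pred):
    """Precision computed on observed labels, ignoring who was treated."""
    flagged = y_pred == 1
    return float(np.mean(y_obs[flagged] == 1))

def weighted_precision(y_obs, y_pred, treated, p_untreated):
    """Precision on untreated flagged cases, weighted by 1 / P(untreated).

    Assumption: untreated cases reveal the no-treatment outcome, and
    p_untreated is known or estimated elsewhere.
    """
    keep = (~treated) & (y_pred == 1)
    w = 1.0 / p_untreated[keep]
    return float(np.sum(w * (y_obs[keep] == 1)) / np.sum(w))

# Hypothetical cohort: the model flags patients 0-3, all truly high risk.
y_pred = np.array([1, 1, 1, 1, 0, 0])
treated = np.array([True, True, False, False, False, False])
y_obs = np.array([0, 0, 1, 1, 0, 0])          # treated patients now look "low risk"
p_untreated = np.array([0.5, 0.5, 0.5, 0.5, 1.0, 1.0])

naive = naive_precision(y_obs, y_pred)                              # 0.5
weighted = weighted_precision(y_obs, y_pred, treated, p_untreated)  # 1.0
```

The naive estimate halves the apparent precision even though every alert was correct, which is the kind of underestimate the abstract attributes to monitoring that ignores deployment-induced feedback.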

Citations: 0

GRU-TV: Time- and Velocity-aware Gated Recurrent Unit for patient representation

IF 4, Zone 2 (Medicine)

Journal of Biomedical Informatics Pub Date : 2025-06-04 DOI: 10.1016/j.jbi.2025.104855

Ningtao Liu , Shuiping Gou , Ruoxi Gao , Binxiao Su , Wenbo Liu , Claire K.S. Park , Shuwei Xing , Jing Yuan , Aaron Fenster

Objective: The multivariate clinical temporal series (MCTS) extracted from electronic health records (EHRs) can characterize dynamic physiological processes. Previous deep patient representation models were proposed to handle imputed values and irregular sampling in MCTS. However, the change in physiological status, particularly instantaneous velocity, has not received adequate attention.

Methods: To address this gap, we propose a Time- and Velocity-aware Gated Recurrent Unit (GRU-TV) model for patient representation learning. In the GRU-TV model, we apply a neural ordinary differential equation to describe the instantaneous velocity of the patient's physiological status. This instantaneous velocity is embedded in the hidden state updating process of the basic GRU model to make it aware of uneven time intervals. In addition, the forward propagation of the GRU-TV model incorporates this instantaneous velocity to enable the perception of non-uniform changes in the patient's physiological status over time.

Results: The performance of the GRU-TV model is evaluated on multiple clinical concerns across two real-world datasets. The average AUCs for the sub-tasks on the complete, 70% sampled, and 50% sampled PhysioNet2012 datasets are 0.89, 0.84, and 0.83, respectively. The average AUCs for the acute care phenotype classification on the complete, 20% sampled, and 10% sampled MIMIC-III datasets are 0.84, 0.82, and 0.80, respectively. The mean absolute deviation of the length-of-stay regression task is 1.84 days.

Conclusion: The superior performance underscores the importance of instantaneous physiological changes in patient representation and clinical decision-making, particularly under challenging data conditions.

Volume 168, Article 104855.
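To illustrate why velocity matters under irregular sampling, the sketch below estimates the rate of change of one measurement with plain finite differences. This is a deliberately simple stand-in: the paper uses a neural ordinary differential equation, which is not reproduced here, and the timestamps and values are hypothetical.

```python
import numpy as np

def finite_difference_velocity(times, values):
    """Rate of change between consecutive observations at uneven time points."""
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    dt = np.diff(times)
    if np.any(dt <= 0):
        raise ValueError("times must be strictly increasing")
    return np.diff(values) / dt

# Hypothetical heart-rate readings at 0 h, 1 h and 4 h.
velocity = finite_difference_velocity([0.0, 1.0, 4.0], [80.0, 90.0, 96.0])
# velocity is [10.0, 2.0] beats per minute per hour
```

The two raw differences (10 and 6) look similar in size, but dividing by the elapsed time shows the first change was five times faster, information that a model reading only values would miss.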

Citations: 0

GatorCLR: Personalized predictions of patient outcomes on electronic health records using self-supervised contrastive graph representation

IF 4, Zone 2 (Medicine)

Journal of Biomedical Informatics Pub Date : 2025-06-02 DOI: 10.1016/j.jbi.2025.104851

Yuxi Liu , Zhenhao Zhang , Jiacong Mi , Shirui Pan , Tianlong Chen , Yi Guo , Xing He , Jiang Bian

Objective: Recently, there has been growing interest in analyzing large amounts of Electronic Health Record (EHR) data. Patient outcome prediction is a major area of interest in EHR analysis that focuses on predicting the future health status of patients using structured data types, such as diagnoses, medications, and procedures collected from longitudinal EHR data. We investigate and design self-supervised learning (SSL) paradigms to learn high-quality representations from longitudinal EHR data, aiming to effectively capture longitudinal relationships and patterns for improved patient outcome predictions.

Methods: We propose a novel, robust, end-to-end model called GatorCLR that follows the contrastive SSL paradigm. GatorCLR incorporates graph analysis-based patient modeling into longitudinal EHR data, generating graph representations whose nodes and edges represent patients, their relationships, and similarities. GatorCLR further incorporates a two-layer augmentation technique that generates consistent, identity-preserving augmentations from graph representations.

Results: We evaluate our approach using real-world EHR datasets. Experimental results indicate that GatorCLR delivers meaningful and robust performance across multiple clinical tasks and datasets and provides transparency of the model decisions.

Conclusion: The proposed approach presents a significant step toward developing a foundation model with longitudinal EHR data, capable of making informed predictions and adaptable to various downstream use cases and tasks. This study should, therefore, be of value to practitioners wishing to leverage longitudinal EHR data for predictive analytics.

Volume 168, Article 104851.

Citations: 0

Multi-view based heterogeneous graph contrastive learning for drug–target interaction prediction

IF 4, Zone 2 (Medicine)

Journal of Biomedical Informatics Pub Date : 2025-06-02 DOI: 10.1016/j.jbi.2025.104852

Chao Li , Lichao Zhang , Guoyi Sun , Lingtao Su

Drug–Target Interaction (DTI) prediction plays a pivotal role in accelerating drug discovery and development by identifying novel interactions between drugs and targets. Most previous studies on Drug–Protein Pair (DPP) networks have primarily focused on learning their topological structures. However, two key challenges remain: the integration of topological and semantic information is often insufficient, and the representation diversity may be diminished during graph convolution operations, affecting the expressiveness of learned features. To address these challenges, we propose a novel paradigm named Multi-view Based Heterogeneous Graph Contrastive Learning for Drug–Target Interaction Prediction (HGCML-DTI). Specifically, we first establish a drug–protein heterogeneous graph and then employ a weighted Graph Convolutional Network (GCN) to derive vector representations for both drug and protein nodes. Subsequently, we separately construct the topology and semantic graphs for DPP and integrate them to form a unified public graph. A multi-channel graph neural network is employed to learn DPP representations. To preserve representation diversity and enhance discriminative ability, a multi-view contrastive learning strategy is introduced. Then, a Multilayer Perceptron (MLP) neural network is used to recognize DTI. To demonstrate the effectiveness of this work, extensive experiments are conducted on six real-world datasets, and comparisons are made with seven competitive baselines. The results demonstrate that the proposed HGCML-DTI significantly outperforms state-of-the-art methods. This work highlights the importance of combining multi-view learning and contrastive strategies to advance the field of DTI prediction. Source code is available at https://github.com/7A13/HGCML-DTI.

Volume 168, Article 104852.
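The multi-view contrastive strategy mentioned above can be illustrated with a generic InfoNCE-style loss between two views of the same items. This is a sketch of the general technique only: the abstract does not state which contrastive loss HGCML-DTI uses, and the function name, temperature, and random embeddings are assumptions.

```python
import numpy as np

def info_nce(view_a, view_b, temperature=0.5):
    """InfoNCE loss: row i of view_a should match row i of view_b."""
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature                       # cosine similarities, scaled
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))            # positives sit on the diagonal

# Hypothetical embeddings for 8 drug-protein pairs under one view.
z = np.random.default_rng(0).standard_normal((8, 16))
aligned_loss = info_nce(z, z)            # second view agrees with the first
misaligned_loss = info_nce(z, z[::-1])   # second view paired with the wrong items
```

The loss is low when the two views of the same pair agree and high when they are mismatched, which is what pushes the learned representations to stay discriminative across views.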

Citations: 0

Focused digital cohort selection from social media using the metric backbone of biomedical knowledge graphs

IF 4, Zone 2 (Medicine)

Journal of Biomedical Informatics Pub Date : 2025-06-01 DOI: 10.1016/j.jbi.2025.104847

Ziqi Guo , Jack Felag , Jordan C. Rozum , Rion Brattig Correia , Xuan Wang , Luis M. Rocha

Social media data allows researchers to construct large digital cohorts (groups of users who post health-related content) to study the interplay between human behavior and medical treatment. Identifying the users most relevant to a specific health problem is, however, a challenge in that social media sites vary in the generality of their discourse. While X (formerly Twitter), Instagram, and Facebook cater to wide-ranging topics, Reddit subgroups and dedicated patient advocacy forums trade in much more specific, biomedically relevant discourse.

To filter relevant users on any social media, we have developed a general method and tested it on epilepsy discourse. We analyzed the text from posts by users who mention epilepsy drugs at least once in the general-purpose social media sites X and Instagram, the epilepsy-focused Reddit subgroup (r/Epilepsy), and the Epilepsy Foundation of America (EFA) forums. We used a curated medical terminology dictionary to generate a knowledge graph (KG) from each social media site, whereby nodes represent terms, and edge weights denote the strength of association between pairs of terms in the collected text.

Our method is based on computing the metric backbone of each KG, which yields the (sparsified) subgraph of edges that participate in shortest paths. By comparing the subset of users who contribute to the backbone to the subset who do not, we show that epilepsy-focused social media users contribute to the KG backbone in much higher proportion than do general-purpose social media users. Furthermore, using human annotation of Instagram posts, we demonstrate that users who do not contribute to the backbone are much more likely to use dictionary terms in a manner inconsistent with their biomedical meaning and are rightly excluded from the cohort of interest.

Our metric backbone approach thus has several benefits: it yields focused user cohorts who engage in discourse relevant to a targeted biomedical problem; unlike engagement-based approaches, it can retain low-engagement users who nonetheless contribute meaningful biomedical insights and filter out very vocal users who contribute no relevant content; it is parameter-free and algebraically principled; it does not require classifiers or human curation; and it is simple to compute with the open-source code we provide.

Volume 168, Article 104847.
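The metric backbone described above keeps only the edges that lie on shortest paths. A minimal sketch of that computation follows, assuming a symmetric matrix of edge distances with `inf` for missing edges. Converting association strengths into distances is a separate step not shown here, and this is not the authors' released code.

```python
import numpy as np

def metric_backbone(dist):
    """Return edges (i, j), i < j, whose direct distance equals the shortest-path distance."""
    d = np.array(dist, dtype=float)
    n = d.shape[0]
    sp = d.copy()
    np.fill_diagonal(sp, 0.0)
    for k in range(n):                                   # Floyd-Warshall all-pairs shortest paths
        sp = np.minimum(sp, sp[:, [k]] + sp[[k], :])
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            # A finite edge is kept only if no indirect path is shorter.
            if np.isfinite(d[i, j]) and np.isclose(d[i, j], sp[i, j]):
                edges.append((i, j))
    return edges

# Hypothetical 3-term graph: the direct 0-2 edge (3.0) is longer than 0-1-2 (2.0).
inf = np.inf
toy = [[0.0, 1.0, 3.0],
       [1.0, 0.0, 1.0],
       [3.0, 1.0, 0.0]]
backbone = metric_backbone(toy)   # [(0, 1), (1, 2)]
```

The edge between terms 0 and 2 is dropped because a shorter indirect route exists; in the paper's setting, users whose posts only support such redundant edges would not contribute to the backbone.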

Citations: 0

A trajectory-informed model for detecting drug-drug-host interaction from real-world data

IF 4, Zone 2 (Medicine)

Journal of Biomedical Informatics Pub Date : 2025-05-31 DOI: 10.1016/j.jbi.2025.104859

Yi Shi , Anna Sun , Hongmei Nan , Yuedi Yang , Jing Xu , Michael T Eadon , Jing Su , Pengyue Zhang

Objective: Adverse drug events (ADEs) are a significant challenge to public health. Data mining methods have been developed to identify signals of drug-drug interaction-induced (DDI-induced) or drug-host interaction-induced (DHI-induced) ADEs from real-world data. We aim to develop a new method to detect adverse drug-drug interactions with special awareness of patient characteristics.

Methods: We developed a trajectory-informed model (TIM) to identify signals of adverse DDI with special awareness of patient characteristics (i.e., drug-drug-host interaction [DDHI]). We also proposed a study design based on an optimal selection of within-subject and between-subjects controls for detecting ADEs from real-world data. We analyzed a large-scale US administrative claims dataset and conducted a simulation study.

Results: In the administrative claims data analysis, we developed optimally matched case-control datasets for potential ADEs including acute kidney injury and gastrointestinal bleeding. We found that an optimal selection of controls had a higher AUC than traditional designs for ADE detection (AUCs: 0.79–0.80 vs. 0.56–0.76). We observed that TIM detected more signals than reference methods (odds ratios: 1.13–3.18, P < 0.01), and found that 36% of all signals generated by TIM were DDHI signals. In a simulation study, we demonstrated that TIM had an empirical false discovery rate (FDR) below the desired value of 0.05, as well as >1.4-fold higher probabilities of detecting DDHI signals than reference methods.

Conclusions: TIM had a high probability of identifying signals of adverse DDI and DDHI in high-throughput ADE mining while controlling the false positive rate. A significant portion of drug-drug combinations were associated with an increased risk of ADEs only in specific patient subpopulations. Optimal selection of within-subject and between-subjects controls could improve the performance of ADE data mining.

Volume 168, Article 104859.

Citations: 0

SigPhi-Med: A lightweight vision-language assistant for biomedicine

IF 4, Zone 2 (Medicine)

Journal of Biomedical Informatics Pub Date : 2025-05-31 DOI: 10.1016/j.jbi.2025.104849

Feizhong Zhou, Xingyue Liu, Qiao Zeng, Zhuhan Li, Hanguang Xiao

Background: Recent advancements in general multimodal large language models (MLLMs) have led to substantial improvements in the performance of biomedical MLLMs across diverse medical tasks, exhibiting significant transformative potential. However, the large number of parameters in MLLMs necessitates substantial computational resources during both training and inference, thereby limiting their feasibility in resource-constrained clinical settings. This study aims to develop a lightweight biomedical multimodal small language model (MSLM) to mitigate this limitation.

Methods: We replaced the large language model (LLM) in MLLMs with a small language model (SLM), resulting in a significant reduction in the number of parameters. To ensure that the model maintains strong performance on biomedical tasks, we systematically analyzed the effects of key components of biomedical MSLMs, including the SLM, vision encoder, training strategy, and training data, on model performance. Based on these analyses, we implemented specific optimizations for the model.

Results: Experiments demonstrate that the performance of biomedical MSLMs is significantly influenced by the parameter count of the SLM component, the pre-training strategy and resolution of the vision encoder component, and both the quality and quantity of the training data. Compared to several state-of-the-art models, including LLaVA-Med-v1.5 (7B), LLaVA-Med (13B) and Med-MoE (2.7B × 4), our optimized model, SigPhi-Med, with only 4.2B parameters, achieves significantly superior overall performance across the VQA-RAD, SLAKE, and Path-VQA medical visual question-answering (VQA) benchmarks.

Conclusions: This study highlights the significant potential of biomedical MSLMs in biomedical applications, presenting a more cost-effective approach for deploying AI assistants in healthcare settings. Additionally, our analysis of the key components of MSLMs provides valuable insights for their development in other specialized domains. Our code is available at https://github.com/NyKxo1/SigPhi-Med.

Volume 167, Article 104849.

Citations: 0

Do it faster with PICOS: Generative AI-Assisted systematic review screening

IF 4, Zone 2 (Medicine)

Journal of Biomedical Informatics Pub Date : 2025-05-28 DOI: 10.1016/j.jbi.2025.104860

Sai Krishna Vallamchetla , Omar Abdelkader , Ali Elnaggar , Doaa Ramadan , Md Manjurul Islam Shourav , Irbaz B. Riaz , Michelle P. Lin

Background: Systematic reviews (SRs) require substantial time and human resources, especially during the screening phase. Large Language Models (LLMs) have shown the potential to expedite screening. However, their use in generating structured PICOS (Population, Intervention/Exposure, Comparison, Outcome, Study design) summaries from titles and abstracts to assist human reviewers during screening remains unexplored.

Objective: To assess the impact of structured PICOS summaries generated by an open-source LLM (Mistral-Nemo-Instruct-2407) on the speed and accuracy of title and abstract screening.

Methods: Four neurology trainees were grouped into two pairs based on previous screening experience. Pair A (A1, A2) consisted of less experienced trainees (1–2 SRs), while Pair B (B1, B2) consisted of more experienced trainees (≥3 SRs). Reviewers A1 and B1 received titles, abstracts, and LLM-generated structured PICOS summaries for each article. Reviewers A2 and B2 received only titles and abstracts. All reviewers independently screened the same set of 1,003 articles using predefined eligibility criteria. Screening times were recorded, and performance metrics were calculated.

Results: PICOS-assisted reviewers screened significantly faster (A1: 116 min; B1: 90 min) than those without (A2: 463 min; B2: 370 min), an approximately 75% reduction in screening workload. Sensitivity was perfect for PICOS-assisted reviewers (100%), whereas it was lower for those without assistance (88.0% and 92.0%). Furthermore, PICOS-assisted reviewers demonstrated higher accuracy (99.9%), specificity (99.9%), F1 scores (98.0%), and strong inter-rater reliability (Cohen's Kappa of 99.8%). The less experienced reviewer with PICOS assistance (A1) outperformed the experienced reviewer without assistance (B2) in both efficiency and sensitivity.

Conclusion: LLM-generated PICOS summaries enhance the speed and accuracy of title and abstract screening by providing an additional layer of structured information. With PICOS assistance, the less experienced reviewer surpassed a more experienced peer. Future research should explore the applicability of this novel method across diverse fields outside of neurology and its integration into fully automated systems.

Volume 168, Article 104860.
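The "approximately 75%" workload reduction can be checked directly from the screening times reported above. The short calculation below uses only those four figures; the helper name is illustrative.

```python
def time_reduction(assisted_min, unassisted_min):
    """Fraction of screening time saved relative to the unassisted reviewer."""
    return 1.0 - assisted_min / unassisted_min

pair_a = time_reduction(116, 463)              # A1 vs A2, about 0.749
pair_b = time_reduction(90, 370)               # B1 vs B2, about 0.757
overall = time_reduction(116 + 90, 463 + 370)  # pooled minutes, about 0.753
```

Both pairs and the pooled figure land within a percentage point of 75%, consistent with the reduction stated in the abstract.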

Citations: 0

Computational strategies in nutrigenetics: Constructing a reference dataset of nutrition-associated genetic polymorphisms

IF 4, Zone 2 (Medicine)

Journal of Biomedical Informatics Pub Date : 2025-05-26 DOI: 10.1016/j.jbi.2025.104845

Giovanni Maria De Filippis , Maria Monticelli , Alessandra Pollice , Tiziana Angrisano , Bruno Hay Mele , Viola Calabrò

Objective: This study aims to create a comprehensive dataset of human genetic polymorphisms associated with nutrition by integrating data from multiple sources, including the LitVar database, PubMed, and the GWAS catalog. This consolidated resource is intended to facilitate research in nutrigenetics by providing a reliable foundation to explore genetic polymorphisms linked to nutrition-related traits.

Methods: We developed a data integration pipeline to assemble and analyze the dataset. It retrieves data from LitVar and PubMed and merges them to produce a unified dataset. Comprehensive MeSH queries are defined to extract relevant genetic associations, which are then cross-referenced with the GWAS data.

Results: The resulting dataset aggregates extensive information on genetic polymorphisms and nutrition-related traits. Through MeSH queries, we identified key genes and SNPs associated with nutrition-related traits. Cross-referencing with GWAS data provided insights into potential effects or risk alleles associated with these genetic polymorphisms. The co-occurrence analysis revealed meaningful gene-diet interactions, advancing personalized nutrition and nutrigenomics research.

Conclusion: The dataset presented in this study consolidates and organizes information on genetic polymorphisms associated with nutrition, facilitating detailed exploration of gene-diet interactions. This resource advances personalized nutrition interventions and nutrigenomics research. The dataset is publicly accessible at https://zenodo.org/records/14052302, and its adaptable structure ensures applicability in a broad range of genetic investigations.

Volume 167, Article 104845.

Citations: 0