Journal of Biomedical Informatics: Latest Articles

Monitoring strategies for continuous evaluation of deployed clinical prediction models
IF 4 | Zone 2 | Medicine
Journal of Biomedical Informatics, Pub Date: 2025-06-05, DOI: 10.1016/j.jbi.2025.104854
Grace Y.E. Kim, Conor K. Corbin, François Grolleau, Michael Baiocchi, Jonathan H. Chen
Objective: As machine learning adoption in clinical practice continues to grow, deployed classifiers must be continuously monitored and updated (retrained) to protect against data drift that stems from inevitable changes, including evolving medical practices and shifting patient populations. However, successful clinical machine learning classifiers will lead to a change in care, which may change the distribution of features, labels, and their relationship. For example, “high risk” cases that were correctly identified by the model may ultimately get labeled as “low risk” thanks to an intervention prompted by the model’s alert. Classifier surveillance systems naive to such deployment-induced feedback loops will estimate lower model performance and lead to degraded future classifier retrains. The objective of this study is to simulate the impact of these feedback loops, propose feedback-aware monitoring strategies as a solution, and assess the performance of these alternative monitoring strategies through simulations.
Methods: We propose Adherence Weighted and Sampling Weighted Monitoring as two feedback-loop-aware surveillance strategies. Through simulation we evaluate their ability to accurately appraise post-deployment model performance and to initiate safe and accurate classifier retraining.
Results: Measured across accuracy, area under the receiver operating characteristic curve (AUROC), average precision, Brier score, expected calibration error, F1, precision, sensitivity, and specificity, in the presence of feedback loops, Adherence Weighted and Sampling Weighted strategies have the highest fidelity to the ground-truth classifier performance, while standard approaches yield the most inaccurate estimates. Furthermore, in simulations with true data drift, retraining using standard unweighted approaches results in an AUROC of 0.52 (a drop from 0.72). In contrast, retraining based on Adherence Weighted and Sampling Weighted strategies recovers performance to 0.67, which is comparable to what a new model trained from scratch on the existing and shifted data would obtain.
Conclusion: Compared to standard approaches, Adherence Weighted and Sampling Weighted strategies yield more accurate classifier performance estimates, measured according to the no-treatment potential outcome. Retraining based on these strategies brings stronger performance recovery when tested against data drift and feedback loops than do standard approaches.
Volume 168, Article 104854.
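The feedback loop described in the Objective can be made concrete with a toy calculation. The sketch below is not the paper's simulation and does not implement Adherence Weighted or Sampling Weighted Monitoring; the ten patients, the alert pattern, and the `prevented` flags are all made-up values chosen only to show how a naive monitor scores a classifier lower once alerts trigger effective interventions.

```python
# Toy illustration only: all patient values below are invented for the example.

def precision(alerts, labels):
    """Fraction of alerted patients whose label is positive."""
    flagged = [y for a, y in zip(alerts, labels) if a]
    return sum(flagged) / len(flagged)

def sensitivity(alerts, labels):
    """Fraction of positive patients who were alerted."""
    hits = sum(1 for a, y in zip(alerts, labels) if a and y)
    return hits / sum(labels)

# 1 = the adverse outcome would occur with no treatment
# (the "no-treatment potential outcome" mentioned in the Conclusion).
y_no_treatment = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
# 1 = the deployed model raised an alert for this patient.
alerts = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
# Hypothetical: the alert-triggered intervention prevents the outcome
# for the first two patients.
prevented = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# Labels a naive monitor would actually record after deployment.
y_observed = [0 if (a and p) else y
              for y, a, p in zip(y_no_treatment, alerts, prevented)]

true_precision = precision(alerts, y_no_treatment)    # 3 of 4 alerts correct
naive_precision = precision(alerts, y_observed)       # only 1 of 4 looks correct
true_sensitivity = sensitivity(alerts, y_no_treatment)
naive_sensitivity = sensitivity(alerts, y_observed)

print(true_precision, naive_precision)
print(true_sensitivity, naive_sensitivity)
```

In this toy case the model is equally good before and after deployment, yet the naive estimates fall because two correctly flagged patients were relabeled by a successful intervention.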
Citations: 0
GRU-TV: Time- and Velocity-aware Gated Recurrent Unit for patient representation
IF 4 | Zone 2 | Medicine
Journal of Biomedical Informatics, Pub Date: 2025-06-04, DOI: 10.1016/j.jbi.2025.104855
Ningtao Liu, Shuiping Gou, Ruoxi Gao, Binxiao Su, Wenbo Liu, Claire K.S. Park, Shuwei Xing, Jing Yuan, Aaron Fenster
Objective: The multivariate clinical temporal series (MCTS) extracted from electronic health records (EHRs) can characterize dynamic physiological processes. Previous deep patient representation models were proposed to address imputation values and irregular sampling in MCTS. However, the change in physiological status, particularly instantaneous velocity, has not received adequate attention.
Methods: To address this gap, we propose a Time- and Velocity-aware Gated Recurrent Unit (GRU-TV) model for patient representation learning. In the GRU-TV model, we apply a neural ordinary differential equation to describe the instantaneous velocity of the patient’s physiological status. This instantaneous velocity is embedded in the hidden state updating process of the basic GRU model to make it aware of uneven time intervals. In addition, the forward propagation of the GRU-TV model incorporates this instantaneous velocity to enable the perception of non-uniform changes in the patient’s physiological status over time.
Results: The performance of the GRU-TV model is evaluated on multiple clinical concerns across two real-world datasets. The average AUCs for the sub-tasks on the complete, 70% sampled, and 50% sampled PhysioNet2012 datasets are 0.89, 0.84, and 0.83, respectively. The average AUCs for acute care phenotype classification on the complete, 20% sampled, and 10% sampled MIMIC-III datasets are 0.84, 0.82, and 0.80, respectively. The mean absolute deviation of the length-of-stay regression task is 1.84 days.
Conclusion: The superior performance underscores the importance of instantaneous physiological changes in patient representation and clinical decision-making, particularly under challenging data conditions.
Volume 168, Article 104855.
Citations: 0
GatorCLR: Personalized predictions of patient outcomes on electronic health records using self-supervised contrastive graph representation
IF 4 | Zone 2 | Medicine
Journal of Biomedical Informatics, Pub Date: 2025-06-02, DOI: 10.1016/j.jbi.2025.104851
Yuxi Liu, Zhenhao Zhang, Jiacong Mi, Shirui Pan, Tianlong Chen, Yi Guo, Xing He, Jiang Bian
Objective: Recently, there has been growing interest in analyzing large amounts of Electronic Health Record (EHR) data. Patient outcome prediction is a major area of interest in EHR analysis that focuses on predicting the future health status of patients using structured data types, such as diagnoses, medications, and procedures collected from longitudinal EHR data. We investigate and design self-supervised learning (SSL) paradigms to learn high-quality representations from longitudinal EHR data, aiming to effectively capture longitudinal relationships and patterns for improved patient outcome predictions.
Methods: We propose an end-to-end, novel, and robust model called GatorCLR that aligns with the contrastive SSL paradigm. GatorCLR incorporates graph analysis-based patient modeling into longitudinal EHR data, generating graph representations of nodes and edges representing patients, their relationships, and similarities. GatorCLR further incorporates a two-layer augmentation technique that generates consistent, identity-preserving augmentations from graph representations.
Results: We evaluate our approach using real-world EHR datasets. Experimental results indicate that GatorCLR delivers meaningful and robust performance across multiple clinical tasks and datasets and provides transparency of the model decisions.
Conclusion: The proposed approach presents a significant step toward developing a foundation model with longitudinal EHR data, capable of making informed predictions and adaptable to various downstream use cases and tasks. This study should, therefore, be of value to practitioners wishing to leverage longitudinal EHR data for predictive analytics.
Volume 168, Article 104851.
Citations: 0
Multi-view based heterogeneous graph contrastive learning for drug–target interaction prediction
IF 4 | Zone 2 | Medicine
Journal of Biomedical Informatics, Pub Date: 2025-06-02, DOI: 10.1016/j.jbi.2025.104852
Chao Li, Lichao Zhang, Guoyi Sun, Lingtao Su
Abstract: Drug–Target Interaction (DTI) prediction plays a pivotal role in accelerating drug discovery and development by identifying novel interactions between drugs and targets. Most previous studies on Drug–Protein Pair (DPP) networks have primarily focused on learning their topological structures. However, two key challenges remain: the integration of topological and semantic information is often insufficient, and the representation diversity may be diminished during graph convolution operations, affecting the expressiveness of learned features. To address these challenges, we propose a novel paradigm named Multi-view Based Heterogeneous Graph Contrastive Learning for Drug–Target Interaction Prediction (HGCML-DTI). Specifically, we first establish a drug–protein heterogeneous graph, then employ a weighted Graph Convolutional Network (GCN) to derive vector representations for both drug and protein nodes. Subsequently, we individually construct the topology and semantic graphs for DPP and integrate them to form a unified public graph. A multi-channel graph neural network is employed to learn DPP representations. To preserve representation diversity and enhance discriminative ability, a multi-view contrastive learning strategy is introduced. Then, a Multilayer Perceptron (MLP) neural network is used to recognize DTI. To demonstrate the effectiveness of this work, extensive experiments are conducted on six real-world datasets, and comparisons are made with seven competitive baselines. The results demonstrate that the proposed HGCML-DTI significantly outperforms state-of-the-art methods. This work highlights the importance of combining multi-view learning and contrastive strategies to advance the field of DTI prediction. Source code is available at https://github.com/7A13/HGCML-DTI.
Volume 168, Article 104852.
Citations: 0
Focused digital cohort selection from social media using the metric backbone of biomedical knowledge graphs
IF 4 | Zone 2 | Medicine
Journal of Biomedical Informatics, Pub Date: 2025-06-01, DOI: 10.1016/j.jbi.2025.104847
Ziqi Guo, Jack Felag, Jordan C. Rozum, Rion Brattig Correia, Xuan Wang, Luis M. Rocha
Abstract: Social media data allows researchers to construct large digital cohorts (groups of users who post health-related content) to study the interplay between human behavior and medical treatment. Identifying the users most relevant to a specific health problem is, however, a challenge in that social media sites vary in the generality of their discourse. While X (formerly Twitter), Instagram, and Facebook cater to wide-ranging topics, Reddit subgroups and dedicated patient advocacy forums trade in much more specific, biomedically relevant discourse.
To filter relevant users on any social media, we have developed a general method and tested it on epilepsy discourse. We analyzed the text from posts by users who mention epilepsy drugs at least once in the general-purpose social media sites X and Instagram, the epilepsy-focused Reddit subgroup (r/Epilepsy), and the Epilepsy Foundation of America (EFA) forums. We used a curated medical terminology dictionary to generate a knowledge graph (KG) from each social media site, whereby nodes represent terms, and edge weights denote the strength of association between pairs of terms in the collected text.
Our method is based on computing the metric backbone of each KG, which yields the (sparsified) subgraph of edges that participate in shortest paths. By comparing the subset of users who contribute to the backbone to the subset who do not, we show that epilepsy-focused social media users contribute to the KG backbone in much higher proportion than do general-purpose social media users. Furthermore, using human annotation of Instagram posts, we demonstrate that users who do not contribute to the backbone are much more likely to use dictionary terms in a manner inconsistent with their biomedical meaning and are rightly excluded from the cohort of interest.
Our metric backbone approach thus has several benefits. It yields focused user cohorts who engage in discourse relevant to a targeted biomedical problem. Unlike engagement-based approaches, it can retain low-engagement users who nonetheless contribute meaningful biomedical insights and filter out very vocal users who contribute no relevant content. It is parameter-free, algebraically principled, does not require classifiers or human curation, and is simple to compute with the open-source code we provide.
Volume 168, Article 104847.
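The idea of keeping only the edges that participate in shortest paths can be sketched in a few lines. This is a minimal illustration, not the authors' open-source code: it assumes the association strengths have already been converted to distances (the conversion itself is not specified in the abstract), and the three terms and their distances are made up for the example. An edge is kept when its own length equals the shortest-path distance between its endpoints.

```python
from itertools import product

def metric_backbone(nodes, dist_edges):
    """Return the edges whose weight equals the shortest-path distance
    between their endpoints (undirected graph, non-negative distances)."""
    inf = float("inf")
    # All-pairs distance table, initialised from the direct edges.
    d = {(u, v): (0.0 if u == v else inf) for u, v in product(nodes, repeat=2)}
    for (u, v), w in dist_edges.items():
        d[u, v] = min(d[u, v], w)
        d[v, u] = min(d[v, u], w)
    # Floyd-Warshall: product() varies k slowest, so k is the outer loop.
    for k, i, j in product(nodes, repeat=3):
        if d[i, k] + d[k, j] < d[i, j]:
            d[i, j] = d[i, k] + d[k, j]
    # An edge survives only if no indirect path is strictly shorter.
    return {e: w for e, w in dist_edges.items() if w <= d[e]}

# Hypothetical term graph; the distances are invented for illustration.
nodes = ["seizure", "levetiracetam", "fatigue"]
dist_edges = {
    ("seizure", "levetiracetam"): 1.0,
    ("levetiracetam", "fatigue"): 1.0,
    ("seizure", "fatigue"): 3.0,  # longer than the 2.0 path via levetiracetam
}

backbone = metric_backbone(nodes, dist_edges)
print(sorted(backbone))
```

Here the direct seizure to fatigue edge is dropped because the two-step path is shorter. The cubic-time all-pairs computation is chosen for brevity; for large graphs a per-node Dijkstra search would be the more practical choice.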
Citations: 0
A trajectory-informed model for detecting drug-drug-host interaction from real-world data
IF 4 | Zone 2 | Medicine
Journal of Biomedical Informatics, Pub Date: 2025-05-31, DOI: 10.1016/j.jbi.2025.104859
Yi Shi, Anna Sun, Hongmei Nan, Yuedi Yang, Jing Xu, Michael T Eadon, Jing Su, Pengyue Zhang
Objective: Adverse drug events (ADEs) are a significant challenge to public health. Since data mining methods have been developed to identify signals of drug-drug interaction-induced (DDI-induced) or drug-host interaction-induced (DHI-induced) ADEs from real-world data, we aim to develop a new method to detect adverse drug-drug interactions with special awareness of patient characteristics.
Methods: We developed a trajectory-informed model (TIM) to identify signals of adverse DDI with special awareness of patient characteristics (i.e., drug-drug-host interaction [DDHI]). We also proposed a study design based on an optimal selection of within-subject and between-subjects controls for detecting ADEs from real-world data. We analyzed a large-scale US administrative claims dataset and conducted a simulation study.
Results: In the administrative claims data analysis, we developed optimally matched case-control datasets for potential ADEs including acute kidney injury and gastrointestinal bleeding. We found that an optimal selection of controls had a higher AUC compared to traditional designs for ADE detection (AUCs: 0.79–0.80 vs. 0.56–0.76). We observed that TIM detected more signals than reference methods (odds ratios: 1.13–3.18, P < 0.01), and found that 36% of all signals generated by TIM were DDHI signals. In a simulation study, we demonstrated that TIM had an empirical false discovery rate (FDR) less than the desired value of 0.05, as well as > 1.4-fold higher probabilities of detecting DDHI signals than reference methods.
Conclusions: TIM had a high probability of identifying signals of adverse DDI and DDHI in high-throughput ADE mining while controlling the false positive rate. A significant portion of drug-drug combinations were associated with an increased risk of ADEs only in specific patient subpopulations. Optimal selection of within-subject and between-subjects controls could improve the performance of ADE data mining.
Volume 168, Article 104859.
Citations: 0
SigPhi-Med: A lightweight vision-language assistant for biomedicine
IF 4 | Zone 2 | Medicine
Journal of Biomedical Informatics, Pub Date: 2025-05-31, DOI: 10.1016/j.jbi.2025.104849
Feizhong Zhou, Xingyue Liu, Qiao Zeng, Zhuhan Li, Hanguang Xiao
Background: Recent advancements in general multimodal large language models (MLLMs) have led to substantial improvements in the performance of biomedical MLLMs across diverse medical tasks, exhibiting significant transformative potential. However, the large number of parameters in MLLMs necessitates substantial computational resources during both training and inference stages, thereby limiting their feasibility in resource-constrained clinical settings. This study aims to develop a lightweight biomedical multimodal small language model (MSLM) to mitigate this limitation.
Methods: We replaced the large language model (LLM) in MLLMs with a small language model (SLM), resulting in a significant reduction in the number of parameters. To ensure that the model maintains strong performance on biomedical tasks, we systematically analyzed the effects of key components of biomedical MSLMs, including the SLM, vision encoder, training strategy, and training data, on model performance. Based on these analyses, we implemented specific optimizations for the model.
Results: Experiments demonstrate that the performance of biomedical MSLMs is significantly influenced by the parameter count of the SLM component, the pre-training strategy and resolution of the vision encoder component, and both the quality and quantity of the training data. Compared to several state-of-the-art models, including LLaVA-Med-v1.5 (7B), LLaVA-Med (13B) and Med-MoE (2.7B × 4), our optimized model, SigPhi-Med, with only 4.2B parameters, achieves significantly superior overall performance across the VQA-RAD, SLAKE, and Path-VQA medical visual question-answering (VQA) benchmarks.
Conclusions: This study highlights the significant potential of biomedical MSLMs in biomedical applications, presenting a more cost-effective approach for deploying AI assistants in healthcare settings. Additionally, our analysis of the key components of MSLMs provides valuable insights for their development in other specialized domains. Our code is available at https://github.com/NyKxo1/SigPhi-Med.
Volume 167, Article 104849.
Citations: 0
Do it faster with PICOS: Generative AI-Assisted systematic review screening
IF 4 | Zone 2 | Medicine
Journal of Biomedical Informatics, Pub Date: 2025-05-28, DOI: 10.1016/j.jbi.2025.104860
Sai Krishna Vallamchetla, Omar Abdelkader, Ali Elnaggar, Doaa Ramadan, Md Manjurul Islam Shourav, Irbaz B. Riaz, Michelle P. Lin
Background: Systematic reviews (SRs) require substantial time and human resources, especially during the screening phase. Large Language Models (LLMs) have shown the potential to expedite screening. However, their use in generating structured PICOS (Population, Intervention/Exposure, Comparison, Outcome, Study design) summaries from titles and abstracts to assist human reviewers during screening remains unexplored.
Objective: To assess the impact of structured PICOS summaries generated by an open-source LLM (Mistral-Nemo-Instruct-2407) on the speed and accuracy of title and abstract screening.
Methods: Four neurology trainees were grouped into two pairs based on previous screening experience. Pair A (A1, A2) consisted of less experienced trainees (1–2 SRs), while Pair B (B1, B2) consisted of more experienced trainees (≥3 SRs). Reviewers A1 and B1 received titles, abstracts, and LLM-generated structured PICOS summaries for each article. Reviewers A2 and B2 received only titles and abstracts. All reviewers independently screened the same set of 1,003 articles using predefined eligibility criteria. Screening times were recorded, and performance metrics were calculated.
Results: PICOS-assisted reviewers screened significantly faster (A1: 116 min; B1: 90 min) than those without (A2: 463 min; B2: 370 min), an approximately 75% reduction in screening workload. Sensitivity was perfect for PICOS-assisted reviewers (100%), whereas it was lower for those without assistance (88.0% and 92.0%). Furthermore, PICOS-assisted reviewers demonstrated higher accuracy (99.9%), specificity (99.9%), F1 scores (98.0%), and strong inter-rater reliability (Cohen’s Kappa of 99.8%). The less experienced reviewer with PICOS assistance (A1) outperformed the experienced reviewer without assistance (B2) in both efficiency and sensitivity.
Conclusion: LLM-generated PICOS summaries enhance the speed and accuracy of title and abstract screening by providing an additional layer of structured information. With PICOS assistance, the less experienced reviewer surpassed their more experienced peer. Future research should explore the applicability of this novel method across diverse fields outside of neurology and its integration into fully automated systems.
Volume 168, Article 104860.
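The "approximately 75%" workload figure in the Results can be checked directly from the reported screening times. The sketch below does that arithmetic and, as a reminder of how the reported metrics are defined, computes sensitivity, specificity, and F1 from a confusion matrix. The confusion counts in the second half are hypothetical and are not the study's data, since the abstract does not report them.

```python
def time_reduction(assisted_min, unassisted_min):
    """Relative reduction in screening time, as a fraction."""
    return 1 - assisted_min / unassisted_min

# Screening times reported in the abstract (minutes).
pair_a = time_reduction(116, 463)  # A1 (assisted) vs A2 (unassisted)
pair_b = time_reduction(90, 370)   # B1 (assisted) vs B2 (unassisted)
print(round(100 * pair_a, 1), round(100 * pair_b, 1))

def screening_metrics(tp, fp, fn, tn):
    """Standard screening metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # included articles correctly kept
    specificity = tn / (tn + fp)   # excluded articles correctly rejected
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}

# Hypothetical counts for illustration only, NOT taken from the study.
example = screening_metrics(tp=8, fp=2, fn=2, tn=88)
print(example)
```

Both pairs land close to a three-quarters reduction, which is consistent with the rounded figure quoted in the abstract.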
Citations: 0
Computational strategies in nutrigenetics: Constructing a reference dataset of nutrition-associated genetic polymorphisms
IF 4 | Zone 2 | Medicine
Journal of Biomedical Informatics, Pub Date: 2025-05-26, DOI: 10.1016/j.jbi.2025.104845
Giovanni Maria De Filippis, Maria Monticelli, Alessandra Pollice, Tiziana Angrisano, Bruno Hay Mele, Viola Calabrò
Objective: This study aims to create a comprehensive dataset of human genetic polymorphisms associated with nutrition by integrating data from multiple sources, including the LitVar database, PubMed, and the GWAS catalog. This consolidated resource is intended to facilitate research in nutrigenetics by providing a reliable foundation to explore genetic polymorphisms linked to nutrition-related traits.
Methods: We developed a data integration pipeline to assemble and analyze the dataset. It retrieves data from LitVar and PubMed and merges them to produce a unified dataset. Comprehensive MeSH queries are defined to extract relevant genetic associations, which are then cross-referenced with the GWAS data.
Results: The resulting dataset aggregates extensive information on genetic polymorphisms and nutrition-related traits. Through MeSH queries, we identified key genes and SNPs associated with nutrition-related traits. Cross-referencing with GWAS data provided insights into potential effects or risk alleles associated with these genetic polymorphisms. The co-occurrence analysis revealed meaningful gene-diet interactions, advancing personalized nutrition and nutrigenomics research.
Conclusion: The dataset presented in this study consolidates and organizes information on genetic polymorphisms associated with nutrition, facilitating detailed exploration of gene-diet interactions. This resource advances personalized nutrition interventions and nutrigenomics research. The dataset is publicly accessible at https://zenodo.org/records/14052302; its adaptable structure ensures applicability in a broad range of genetic investigations.
Volume 167, Article 104845.
Citations: 0
Navigating regulatory challenges across the life cycle of a SaMD
IF 4 | Zone 2 | Medicine
Journal of Biomedical Informatics, Pub Date: 2025-05-21, DOI: 10.1016/j.jbi.2025.104856
Martina Francesconi, Miriam Cangi, Silvia Tamarri, Noemi Conditi, Chiara Menicucci, Alice Ravizza, Luisa Cattaneo, Elisabetta Bianchini
Objective: Software as medical devices (SaMDs) have become part of clinical practice, and the management of the development and control processes of the documentation associated with them is an integral part of many medical realities. The European Regulation, MDR (EU) 2017/745, introduces a classification rule (rule 11, Annex VIII) specifically for software, which provides more explicit requirements than in the past, leading to the classification of many software products into higher risk classes and therefore to more complex certification processes. In this context, planning and awareness of possible regulatory strategies and related standards are fundamental for the key stakeholders, but this complex landscape can be perceived as fragmented. The aim of this work is to provide an amalgamated overview of how the current EU normative framework integrates into the various phases of the life cycle of medical device software, with the goal of ensuring its safe and effective development.
Methods: In addition to the MDR, the main normative references relevant to the medical device software sector were taken into consideration. Specifically, the IEC 62304 standard clarifies the main processes of the software life cycle, including the analysis of problems and changes, and the IEC 82304 standard completes its management by addressing activities relating to post-market phases and requirements. In addition, the various steps also include key points such as risk identification and control (ISO 14971); design, implementation and validation of usability requirements (IEC 62366); and, in general, the quality of the context in which the software is developed and maintained (ISO 13485). The application of these standards can support the activities of the various stakeholders and facilitate evidence of compliance with the regulatory requirements of the MDR.
Results: Based on the software life cycle, a mapping of the requirements from the entire normative framework analyzed over the various phases was implemented.
Conclusions: A detailed and integrated picture of the regulatory context behind the life cycle of a SaMD has been provided. This can facilitate the implementation of a balanced and effective approach, including key aspects such as risk management and usability processes, and ensuring safety for the end user.
Volume 167, Article 104856.
Citations: 0