Journal of Biomedical Informatics: Latest Articles

A trajectory-informed model for detecting drug-drug-host interaction from real-world data
IF 4, Zone 2, Medicine
Journal of Biomedical Informatics, Volume 168, Article 104859. Pub Date: 2025-05-31. DOI: 10.1016/j.jbi.2025.104859
Yi Shi, Anna Sun, Hongmei Nan, Yuedi Yang, Jing Xu, Michael T Eadon, Jing Su, Pengyue Zhang
Objective: Adverse drug events (ADEs) are a significant challenge to public health. Data mining methods have been developed to identify signals of drug-drug interaction-induced (DDI-induced) or drug-host interaction-induced (DHI-induced) ADEs from real-world data; we aim to develop a new method that detects adverse drug-drug interactions with special attention to patient characteristics.
Methods: We developed a trajectory-informed model (TIM) to identify signals of adverse DDI with special attention to patient characteristics (i.e., drug-drug-host interaction [DDHI]). We also proposed a study design based on an optimal selection of within-subject and between-subjects controls for detecting ADEs from real-world data. We analyzed large-scale US administrative claims data and conducted a simulation study.
Results: In the administrative claims data analysis, we developed optimally matched case-control datasets for potential ADEs including acute kidney injury and gastrointestinal bleeding. An optimal selection of controls yielded a higher AUC than traditional designs for ADE detection (AUCs: 0.79–0.80 vs. 0.56–0.76). TIM detected more signals than reference methods (odds ratios: 1.13–3.18, P < 0.01), and 36% of all signals generated by TIM were DDHI signals. In the simulation study, TIM had an empirical false discovery rate (FDR) below the desired value of 0.05, as well as > 1.4-fold higher probabilities of detecting DDHI signals than reference methods.
Conclusions: TIM had a high probability of identifying signals of adverse DDI and DDHI in high-throughput ADE mining while controlling the false positive rate. A significant portion of drug-drug combinations were associated with an increased risk of ADEs only in specific patient subpopulations. Optimal selection of within-subject and between-subjects controls could improve the performance of ADE data mining.
Citations: 0
SigPhi-Med: A lightweight vision-language assistant for biomedicine
IF 4, Zone 2, Medicine
Journal of Biomedical Informatics, Volume 167, Article 104849. Pub Date: 2025-05-31. DOI: 10.1016/j.jbi.2025.104849
Feizhong Zhou, Xingyue Liu, Qiao Zeng, Zhuhan Li, Hanguang Xiao
Background: Recent advancements in general multimodal large language models (MLLMs) have led to substantial improvements in the performance of biomedical MLLMs across diverse medical tasks, exhibiting significant transformative potential. However, the large number of parameters in MLLMs necessitates substantial computational resources during both training and inference, limiting their feasibility in resource-constrained clinical settings. This study aims to develop a lightweight biomedical multimodal small language model (MSLM) to mitigate this limitation.
Methods: We replaced the large language model (LLM) in MLLMs with a small language model (SLM), significantly reducing the number of parameters. To ensure that the model maintains strong performance on biomedical tasks, we systematically analyzed the effects of key components of biomedical MSLMs, including the SLM, vision encoder, training strategy, and training data, on model performance. Based on these analyses, we implemented specific optimizations for the model.
Results: Experiments demonstrate that the performance of biomedical MSLMs is significantly influenced by the parameter count of the SLM component, the pre-training strategy and resolution of the vision encoder component, and both the quality and quantity of the training data. Compared to several state-of-the-art models, including LLaVA-Med-v1.5 (7B), LLaVA-Med (13B) and Med-MoE (2.7B × 4), our optimized model, SigPhi-Med, with only 4.2B parameters, achieves significantly superior overall performance across the VQA-RAD, SLAKE, and Path-VQA medical visual question-answering (VQA) benchmarks.
Conclusions: This study highlights the significant potential of biomedical MSLMs in biomedical applications, presenting a more cost-effective approach for deploying AI assistants in healthcare settings. Additionally, our analysis of the key components of MSLMs provides valuable insights for their development in other specialized domains. Our code is available at https://github.com/NyKxo1/SigPhi-Med.
Citations: 0
Do it faster with PICOS: Generative AI-Assisted systematic review screening
IF 4, Zone 2, Medicine
Journal of Biomedical Informatics, Volume 168, Article 104860. Pub Date: 2025-05-28. DOI: 10.1016/j.jbi.2025.104860
Sai Krishna Vallamchetla, Omar Abdelkader, Ali Elnaggar, Doaa Ramadan, Md Manjurul Islam Shourav, Irbaz B. Riaz, Michelle P. Lin
Background: Systematic reviews (SRs) require substantial time and human resources, especially during the screening phase. Large Language Models (LLMs) have shown the potential to expedite screening. However, their use in generating structured PICOS (Population, Intervention/Exposure, Comparison, Outcome, Study design) summaries from title and abstract to assist human reviewers during screening remains unexplored.
Objective: To assess the impact of structured PICOS summaries generated by an open-source LLM (Mistral-Nemo-Instruct-2407) on the speed and accuracy of title and abstract screening.
Methods: Four neurology trainees were grouped into two pairs based on previous screening experience. Pair A (A1, A2) consisted of less experienced trainees (1–2 SRs), while Pair B (B1, B2) consisted of more experienced trainees (≥3 SRs). Reviewers A1 and B1 received titles, abstracts, and LLM-generated structured PICOS summaries for each article. Reviewers A2 and B2 received only titles and abstracts. All reviewers independently screened the same set of 1,003 articles using predefined eligibility criteria. Screening times were recorded, and performance metrics were calculated.
Results: PICOS-assisted reviewers screened significantly faster (A1: 116 min; B1: 90 min) than those without assistance (A2: 463 min; B2: 370 min), an approximately 75% reduction in screening workload. Sensitivity was perfect for PICOS-assisted reviewers (100%), whereas it was lower for those without assistance (88.0% and 92.0%). Furthermore, PICOS-assisted reviewers demonstrated higher accuracy (99.9%), specificity (99.9%), and F1 scores (98.0%), and strong inter-rater reliability (Cohen's Kappa of 99.8%). The less experienced reviewer with PICOS assistance (A1) outperformed the experienced reviewer without assistance (B2) in both efficiency and sensitivity.
Conclusion: LLM-generated PICOS summaries enhance the speed and accuracy of title and abstract screening by providing an additional layer of structured information. With PICOS assistance, the less experienced reviewer surpassed the more experienced reviewer who had no assistance. Future research should explore the applicability of this novel method across diverse fields outside of neurology and its integration into fully automated systems.
Citations: 0
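The "approximately 75%" workload reduction reported in the PICOS screening abstract can be checked directly from the four screening times it gives. The sketch below is only that arithmetic; the dictionary layout and the function name are illustrative choices, not anything taken from the paper.

```python
# Screening times in minutes, as reported in the abstract.
times = {
    "A": {"assisted": 116, "unassisted": 463},  # less experienced pair (A1 vs. A2)
    "B": {"assisted": 90, "unassisted": 370},   # more experienced pair (B1 vs. B2)
}

def relative_reduction(assisted: float, unassisted: float) -> float:
    """Fraction of screening time saved relative to the unassisted reviewer."""
    return 1.0 - assisted / unassisted

# One reduction per reviewer pair.
reductions = {pair: relative_reduction(**t) for pair, t in times.items()}

for pair, r in reductions.items():
    print(f"Pair {pair}: {r:.1%} less screening time")
```

Both pairs come out within one percentage point of 75%, consistent with the abstract's rounded figure.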
Computational strategies in nutrigenetics: Constructing a reference dataset of nutrition-associated genetic polymorphisms
IF 4, Zone 2, Medicine
Journal of Biomedical Informatics, Volume 167, Article 104845. Pub Date: 2025-05-26. DOI: 10.1016/j.jbi.2025.104845
Giovanni Maria De Filippis, Maria Monticelli, Alessandra Pollice, Tiziana Angrisano, Bruno Hay Mele, Viola Calabrò
Objective: This study aims to create a comprehensive dataset of human genetic polymorphisms associated with nutrition by integrating data from multiple sources, including the LitVar database, PubMed, and the GWAS catalog. This consolidated resource is intended to facilitate research in nutrigenetics by providing a reliable foundation to explore genetic polymorphisms linked to nutrition-related traits.
Methods: We developed a data integration pipeline to assemble and analyze the dataset. It retrieves data from LitVar and PubMed and merges them into a unified dataset. Comprehensive MeSH queries are defined to extract relevant genetic associations, which are then cross-referenced with the GWAS data.
Results: The resulting dataset aggregates extensive information on genetic polymorphisms and nutrition-related traits. Through MeSH queries, we identified key genes and SNPs associated with nutrition-related traits. Cross-referencing with GWAS data provided insights into potential effects or risk alleles associated with these genetic polymorphisms. The co-occurrence analysis revealed meaningful gene-diet interactions, advancing personalized nutrition and nutrigenomics research.
Conclusion: The dataset presented in this study consolidates and organizes information on genetic polymorphisms associated with nutrition, facilitating detailed exploration of gene-diet interactions. This resource advances personalized nutrition interventions and nutrigenomics research. The dataset is publicly accessible at https://zenodo.org/records/14052302, and its adaptable structure ensures applicability in a broad range of genetic investigations.
Citations: 0
Navigating regulatory challenges across the life cycle of a SaMD
IF 4, Zone 2, Medicine
Journal of Biomedical Informatics, Volume 167, Article 104856. Pub Date: 2025-05-21. DOI: 10.1016/j.jbi.2025.104856
Martina Francesconi, Miriam Cangi, Silvia Tamarri, Noemi Conditi, Chiara Menicucci, Alice Ravizza, Luisa Cattaneo, Elisabetta Bianchini
Objective: Software as a medical device (SaMD) has become part of clinical practice, and managing the development and control processes of the associated documentation is an integral part of many medical settings. The European Regulation, MDR (EU) 2017/745, introduces a classification rule (rule 11, Annex VIII) specifically for software, which provides more explicit requirements than in the past, placing many software products in higher risk classes and therefore leading to more complex certification processes. In this context, planning and awareness of possible regulatory strategies and related standards are fundamental for the key stakeholders, but this complex landscape can be perceived as fragmented. The aim of this work is to provide a consolidated overview of how the current EU normative framework integrates into the various phases of the life cycle of medical device software, with a view to ensuring its safe and effective development.
Methods: In addition to the MDR, the main normative references relevant to the medical device software sector were taken into consideration. Specifically, the IEC 62304 standard clarifies the main processes of the software life cycle, including the analysis of problems and changes, and the IEC 82304 standard completes its management by addressing activities relating to post-market phases and requirements. In addition, the various steps also include key points such as risk identification and control (ISO 14971); design, implementation and validation of usability requirements (IEC 62366); and, in general, the quality of the context in which the software is developed and maintained (ISO 13485). The application of these standards can support the activities of the various stakeholders and facilitate evidence of compliance with the regulatory requirements of the MDR.
Results: Based on the software life cycle, the requirements from the entire normative framework analyzed were mapped onto the various phases.
Conclusions: A detailed and integrated picture of the regulatory context behind the life cycle of a SaMD has been provided. This can facilitate the implementation of a balanced and effective approach that includes key aspects, such as risk management and usability processes, and ensures safety for the end user.
Citations: 0
Impacts of sample weighting on transferability of risk prediction models across EHR-Linked biobanks with different recruitment strategies
IF 4, Zone 2, Medicine
Journal of Biomedical Informatics, Volume 167, Article 104853. Pub Date: 2025-05-19. DOI: 10.1016/j.jbi.2025.104853
Maxwell Salvatore, Alison M Mondul, Christopher R Friese, David Hanauer, Hua Xu, Celeste Leigh Pearce, Bhramar Mukherjee
Objective: To evaluate whether using poststratification weights when training risk prediction models enhances transferability when the external test cohort has a different sampling strategy, a commonly encountered scenario when analyzing electronic health record (EHR)-linked biobanks.
Methods: Poststratification (PS) weights were calculated to align a health system-based biobank, the Michigan Genomics Initiative (MGI; n = 76,757), with a nationally recruited biobank, All of Us (AOU; n = 226,764), which oversamples underrepresented groups. Basic PS weights (PS_BASIC) captured age, sex, and race/ethnicity; full PS weights (PS_FULL) additionally included smoking, alcohol consumption, BMI, depression, hypertension, and the Charlson Comorbidity Index. Models for esophageal, liver, and pancreatic cancers were developed using EHR data from MGI at 0, 1, 2, and 5 years prior to diagnosis. Phenotype risk scores (PheRS) were constructed using six methods (e.g., regularized regression, random forest) and evaluated alongside covariates, risk factors, and symptoms. Evaluation metrics included the odds ratio (OR) for the top decile vs. the middle 40th-60th percentiles of the risk score distribution and the area under the receiver operating characteristic curve (AUC), both evaluated in the AOU test cohort for models trained with and without weighting.
Results: Elastic net and random forest methods generally performed well in risk stratification, but no single PheRS construction method consistently outperformed others. Applying PS weights did not consistently improve risk stratification performance. For example, in liver cancer risk stratification at t = 1, unweighted random forest PheRS yielded an OR of 13.73 (95% CI: 8.97, 21.01), compared to 14.55 (95% CI: 9.45, 22.42) with PS_BASIC and 13.62 (95% CI: 8.90, 20.85) with PS_FULL.
Conclusion: PS weights do not significantly enhance risk model transferability between biobanks. EHR-based PheRS are crucial for risk stratification and should be integrated with other multimodal data for improved risk prediction. Identifying high-risk populations for diseases like liver cancer early through health history mining shows promise.
Citations: 0
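For readers unfamiliar with poststratification, the sketch below shows one common textbook construction: each person's weight is the target cohort's share of their stratum divided by the source cohort's share. The biobank abstract does not specify the exact procedure used in the paper, so this is an assumption for illustration only; the function name, the single age-band stratum, and the toy proportions are all hypothetical.

```python
from collections import Counter

def poststratification_weights(sample_strata, target_proportions):
    """Return one weight per sample member: target share / sample share of its stratum."""
    counts = Counter(sample_strata)
    n = len(sample_strata)
    weights = []
    for stratum in sample_strata:
        sample_share = counts[stratum] / n
        weights.append(target_proportions[stratum] / sample_share)
    return weights

# Hypothetical toy data: a single stratifying variable (age band).
sample = ["<50"] * 2 + ["50+"] * 8        # source cohort: 20% / 80%
target = {"<50": 0.5, "50+": 0.5}         # target cohort: 50% / 50%

weights = poststratification_weights(sample, target)

# After weighting, the under-50 stratum carries the target cohort's share.
weighted_share_under_50 = sum(
    w for w, s in zip(weights, sample) if s == "<50"
) / sum(weights)
print(round(weighted_share_under_50, 3))
```

In practice the strata would be cross-classifications of several variables (such as the age, sex, and race/ethnicity cells the abstract lists), and the resulting weights would be passed to the model-fitting routine as sample weights.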
Adaptive debiasing learning for drug repositioning
IF 4, Zone 2, Medicine
Journal of Biomedical Informatics, Volume 167, Article 104843. Pub Date: 2025-05-17. DOI: 10.1016/j.jbi.2025.104843
Yajie Meng, Yi Wang, Xinrong Hu, Changcheng Lu, Xianfang Tang, Feifei Cui, Pan Zeng, Yuhua Yao, Jialiang Yang, Junlin Xu
Drug repositioning, pivotal in current pharmaceutical development, aims to find new uses for existing drugs, offering an efficient and cost-effective path to drug discovery. In recent years, graph neural network-based deep learning methods have achieved significant success in drug repositioning tasks. However, few studies have analyzed the characteristics of datasets to mitigate potential data biases. In this paper, we analyzed three commonly used drug repositioning datasets and identified a consistent characteristic among them: a trend of node polarization, characterized by the presence of popular entities (those commonly occurring and extensively associated) and long-tail entities (those appearing less frequently with fewer associations). Based on this finding, we propose a deep learning framework with a debiasing mechanism, called DRDM. The framework addresses the bias toward popular entities, which often overshadows the subtle patterns in long-tail entities that are key for novel insights. DRDM dynamically adjusts association weights during training, enhancing long-tail entity representation and reducing bias. In addition, we employ dual-view contrastive learning to provide rich supervisory signals, thereby further enhancing the model's robustness. We conducted experiments with our method on these three datasets, and the results demonstrated that our approach is strongly competitive with existing models. Case studies further highlighted the potential of the model in practical applications, which could provide new insights for future drug discovery.
Citations: 0
Biomedical text normalization through generative modeling
IF 4, Zone 2, Medicine
Journal of Biomedical Informatics, Volume 167, Article 104850. Pub Date: 2025-05-15. DOI: 10.1016/j.jbi.2025.104850
Jacob S. Berkowitz, Apoorva Srinivasan, Jose Miguel Acitores Cortina, Yasaman Fatapour, Nicholas P Tatonetti
Objective: A large proportion of electronic health record (EHR) data consists of unstructured medical language text. The formatting of this text is often flexible and inconsistent, making it challenging to use for predictive modeling, clinical decision support, and data mining. The ability of large language models (LLMs) to understand context and semantic variations makes them promising tools for standardizing medical text. In this study, we develop and assess clinical text normalization pipelines built using large language models.
Methods: We implemented four LLM-based normalization strategies (Zero-Shot Recall, Prompt Recall, Semantic Search, and Retrieval-Augmented Generation based normalization [RAGnorm]) and one baseline approach using TF-IDF based String Matching. We evaluated performance across three datasets of SNOMED-mapped condition terms: (1) an oncology-specific dataset, (2) a representative sample of institutional medical conditions, and (3) a dataset of commonly occurring condition codes (>1000 uses) from our institution. We measured performance by recording the mean shortest path length between predicted and true SNOMED CT terms. Additionally, we benchmarked our models against the TAC 2017 drug label annotations, which normalize terms to the Medical Dictionary for Regulatory Activities (MedDRA) Preferred Terms.
Results: We found that RAGnorm was the most effective on every dataset, achieving a mean shortest path length of 0.21 for the domain-specific dataset, 0.58 for the sampled dataset, and 0.90 for the top terms dataset. It achieved a micro F1 score of 88.01 on task 4 of TAC 2017, surpassing all other models without viewing the provided training data.
Conclusion: We find that retrieval-focused approaches overcome traditional LLM limitations for this task. RAGnorm and related retrieval techniques should be explored further for the normalization of biomedical free text.
Citations: 0
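The evaluation metric named in the text normalization abstract, the mean shortest path length between predicted and true terms, can be illustrated on a toy hierarchy. The sketch below assumes the terminology is treated as an undirected graph of parent-child edges and uses breadth-first search; the abstract does not say how the paper computes path lengths, and the terms and edges here are hypothetical, not real SNOMED CT content.

```python
from collections import deque

def build_undirected(edges):
    """Turn (child, parent) pairs into an undirected adjacency map."""
    graph = {}
    for child, parent in edges:
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set()).add(child)
    return graph

def shortest_path_length(graph, start, goal):
    """Breadth-first search distance, in edges, between two nodes."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    raise ValueError(f"no path between {start!r} and {goal!r}")

# Hypothetical toy hierarchy, given as (child, parent) pairs.
edges = [
    ("lung cancer", "cancer"),
    ("breast cancer", "cancer"),
    ("cancer", "disease"),
    ("asthma", "disease"),
]
graph = build_undirected(edges)

# (predicted term, true term) pairs: an exact match, a parent, and a sibling.
pairs = [
    ("lung cancer", "lung cancer"),
    ("cancer", "lung cancer"),
    ("breast cancer", "lung cancer"),
]
mean_spl = sum(shortest_path_length(graph, p, t) for p, t in pairs) / len(pairs)
print(mean_spl)
```

Under this reading, a score of 0 means every prediction is an exact match, and a mean below 1 (as reported for RAGnorm) means most predictions are either exact or only one edge away from the true term.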
Machine learning applications related to suicide in military and Veterans: A scoping literature review
IF 4, Zone 2, Medicine
Journal of Biomedical Informatics, Volume 167, Article 104848. Pub Date: 2025-05-13. DOI: 10.1016/j.jbi.2025.104848
Yuhan Zhang, Yishu Wei, Yanshan Wang, Yunyu Xiao, COL Ret. Ronald K. Poropatich, Gretchen L. Haas, Yiye Zhang, Chunhua Weng, Jinze Liu, Lisa A. Brenner, James M. Bjork, Yifan Peng
Objective: Suicide remains one of the main preventable causes of death among service members and veterans. Early detection and accurate prediction are essential components of effective suicide prevention strategies. Machine learning techniques have been explored in recent years with a specific focus on the assessment and prediction of multiple suicide-related outcomes, showing promising advancements. This study assesses and summarizes current research and provides a comprehensive review of the application of machine learning techniques in assessing and predicting suicidal ideation, attempts, and mortality among military and veteran populations.
Methods: A keyword search using PubMed, IEEE, ACM, and Google Scholar was conducted, and the PRISMA protocol was adopted for relevant study selection. Peer-reviewed original research in English targeting the assessment or prediction of suicide-related outcomes among service members and veteran populations was included. A total of 1,110 studies were retrieved, of which 32 satisfied the inclusion criteria and were included.
Results: Thirty-two articles met the inclusion criteria. Although these studies varied considerably in sample characteristics, data modalities, specific suicide-related outcomes, and the machine learning technologies employed, they consistently identified risk factors relevant to mental health issues such as depression, post-traumatic stress disorder (PTSD), suicidal ideation, prior attempts, physical health problems, and demographic characteristics. Machine learning models applied in this area have demonstrated reasonable predictive accuracy and have verified, on a large scale, risk factors previously detected by more manual analytic methods. Several research gaps remain. First, many studies have overlooked metrics that distinguish between false positives and false negatives, such as positive predictive value and negative predictive value, which are crucial in the context of suicide prevention policies. Second, more dedicated approaches to handling survival and longitudinal data should be explored. Lastly, most studies focused on machine learning methods, with limited discussion of their connection to clinical rationales.
Conclusion: In sum, machine learning analyses have identified risk factors associated with suicide in military populations, which span a wide range of psychological, biological, and sociocultural factors, highlighting the complexities involved in assessing suicide risk among service members and veterans. Some differences were noted between males and females. The diversity of these factors also demonstrates that effective prevention strategies must be comprehensive and flexible.
Citations: 0
引用次数: 0
Multi-agent norm perception and induction in distributed healthcare 分布式医疗中的多智能体规范感知与归纳。
IF 4 2区 医学
Journal of Biomedical Informatics Pub Date : 2025-05-11 DOI: 10.1016/j.jbi.2025.104835
Chao Li , Olga Petruchik , Elizaveta Grishanina , Sergey Kovalchuk
{"title":"Multi-agent norm perception and induction in distributed healthcare","authors":"Chao Li ,&nbsp;Olga Petruchik ,&nbsp;Elizaveta Grishanina ,&nbsp;Sergey Kovalchuk","doi":"10.1016/j.jbi.2025.104835","DOIUrl":"10.1016/j.jbi.2025.104835","url":null,"abstract":"<div><div>This paper presents a Multi-Agent Norm Perception and Induction Learning Model aimed at facilitating the integration of autonomous agent systems into distributed healthcare environments through dynamic interaction processes. The nature of the medical norm system and its sharing channels necessitates distinct approaches for Multi-Agent Systems to learn two types of norms. Building on this foundation, the model enables agents to simultaneously learn descriptive norms, which capture collective tendencies, and prescriptive norms, which dictate ideal behaviors. Through parameterized mixed probability density models and practice-enhanced Markov games, the multi-agent system perceives descriptive norms in dynamic interactions and captures emergent prescriptive norms. We conducted experiments using a dataset from a neurological medical center spanning from 2016 to 2020.</div><div>The descriptive norm-sharing experiment results demonstrate that the model can effectively perceive the descriptive collective medical norms – which embody the current best clinical practices – across medical communities of varying scales. By contrasting this with the fact that the real descriptive diagnostic practice patterns in the neurological medical center dataset gradually converged over a period of 5 years, we find that the model, through prolonged learning and sharing processes, progressively mirrors the actual descriptive diagnostic trends and collective behavioral tendencies present within the medical community. 
In the experiment where multiple agents infer prescriptive norms within a dynamic healthcare environment, the agents effectively learned the key clinical protocols within the norm space <span><math><mi>H</mi></math></span>, which includes control norms, without developing high belief in invalid norms. Furthermore, the agents’ belief update process was relatively smooth, avoiding any discontinuous stepwise updates.</div></div>","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":"166 ","pages":"Article 104835"},"PeriodicalIF":4.0,"publicationDate":"2025-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143984923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0