Bushra Hossain, Sarah M Preum, Md Fazle Rabbi, Rifat Ara, Mohammed Eunus Ali
{"title":"Extracting Symptoms of Complex Conditions From Online Discourse (Subreddit to Symptomatology): Lexicon-Based Approach.","authors":"Bushra Hossain, Sarah M Preum, Md Fazle Rabbi, Rifat Ara, Mohammed Eunus Ali","doi":"10.2196/70940","DOIUrl":"10.2196/70940","url":null,"abstract":"<p><strong>Background: </strong>Millions of people affected with complex medical conditions with diverse symptoms often turn to online discourse to share their experiences. While some studies have explored natural language processing methods and medical information extraction tools, these typically focus on generic symptoms in clinical notes and struggle to identify patient-reported, disease-specific, subtle symptoms from online health discourse.</p><p><strong>Objective: </strong>We aimed to extract patient-reported, disease-specific symptoms shared on social media reflecting the lived experiences of thousands of affected individuals and explore the characteristics, prevalence, and occurrence patterns of the symptoms.</p><p><strong>Methods: </strong>We propose a lexicon-based symptom extraction (LSE) method to identify a diverse list of disease-specific, patient-reported symptoms. We initially used a large language model to accelerate the extraction of symptom-related key phrases that formed the lexicon. We evaluated the effectiveness of lexicon extraction against human annotation using a Jaccard index score. We then leveraged BERT-Base, BioBERT, and Phrase-BERT-based embeddings to learn representations of these symptom-related key phrases and cluster similar symptoms using k-means and hierarchical density-based spatial clustering of applications with noise (HDBSCAN). Among the different options explored in our experiments, BioBERT-based k-means clustering was found to be the most effective. 
Finally, we applied symptom normalization to eliminate duplicate and redundant entries in the comprehensive symptom list.</p><p><strong>Results: </strong>In a real-world polycystic ovary syndrome (PCOS) subreddit dataset, we found that LSE significantly outperformed state-of-the-art baselines, achieving at least 41% and 20% higher F<sub>1</sub>-scores (mean 86.10) than automatic medical extraction tools and large language models, respectively. Notably, the comprehensive list of 64 PCOS symptoms generated via LSE ensured extensive coverage of symptoms reported in 7 reputable eHealth forums. Analyzing PCOS symptomatology revealed 28 potentially emerging symptoms and 8 self-reported comorbidities co-occurring with PCOS.</p><p><strong>Conclusions: </strong>The comprehensive patient-reported, disease-specific symptom list can help patients and health practitioners resolve uncertainties surrounding the disease, eliminating the variability of PCOS symptoms prevailing in the community. Analyzing PCOS symptomatology across varied dimensions provides valuable insights for public health research.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e70940"},"PeriodicalIF":3.8,"publicationDate":"2025-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12475878/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145056545","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Paul Römer, Jean-Jacques Ponciano, Katharina Kloster, Fabia Siegberg, Bastian Plaß, Shankeeth Vinayahalingam, Bilal Al-Nawas, Peer W Kämmerer, Thomas Klauer, Daniel Thiem
{"title":"Enhancing Oral Health Diagnostics With Hyperspectral Imaging and Computer Vision: Clinical Dataset Study.","authors":"Paul Römer, Jean-Jacques Ponciano, Katharina Kloster, Fabia Siegberg, Bastian Plaß, Shankeeth Vinayahalingam, Bilal Al-Nawas, Peer W Kämmerer, Thomas Klauer, Daniel Thiem","doi":"10.2196/76148","DOIUrl":"10.2196/76148","url":null,"abstract":"<p><strong>Background: </strong>Diseases of the oral cavity, including oral squamous cell carcinoma, pose major challenges to health care worldwide due to their late diagnosis and complicated differentiation of oral tissues. The combination of endoscopic hyperspectral imaging (HSI) and deep learning (DL) models offers a promising approach to the demand for modern, noninvasive tissue diagnostics. This study presents a large-scale in vivo dataset designed to support DL-based segmentation and classification of healthy oral tissues.</p><p><strong>Objective: </strong>This study aimed to develop a comprehensive, annotated endoscopic HSI dataset of the oral cavity and to demonstrate automated, reliable differentiation of intraoral tissue structures by integrating endoscopic HSI with advanced machine learning methods.</p><p><strong>Methods: </strong>A total of 226 participants (166 women [73.5%], 60 men [26.5%], aged 24-87 years) were examined using an endoscopic HSI system, capturing spectral data in the range of 500 to 1000 nm. Oral structures in red, green, and blue and HSI scans were annotated using RectLabel Pro (by Ryo Kawamura). DeepLabv3 (Google Research) with a ResNet-50 backbone was adapted for endoscopic HSI segmentation. The model was trained for 50 epochs on 70% of the dataset, with 30% for evaluation. 
Performance metrics (precision, recall, and F1-score) confirmed its efficacy in distinguishing oral tissue types.</p><p><strong>Results: </strong>DeepLabv3 (ResNet-101) and U-Net (EfficientNet-B0/ResNet-50) achieved the highest overall F1-scores of 0.857 and 0.84, respectively, particularly excelling in segmenting the mucosa (0.915), retractor (0.94), tooth (0.90), and palate (0.90). Variability analysis confirmed high spectral diversity across tissue classes, supporting the dataset's complexity and authenticity for realistic clinical conditions.</p><p><strong>Conclusions: </strong>The presented dataset addresses a key gap in oral health imaging by developing and validating robust DL algorithms for endoscopic HSI data. It enables accurate classification of oral tissue and paves the way for future applications in individualized noninvasive pathological tissue analysis, early cancer detection, and intraoperative diagnostics of oral diseases.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e76148"},"PeriodicalIF":3.8,"publicationDate":"2025-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12425605/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145042516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Johnathan R Lex, Aazad Abbas, Jacob Mosseri, Jay Singh Toor, Michael Simone, Bheeshma Ravi, Cari Whyne, Elias B Khalil
{"title":"Using Machine Learning to Predict-Then-Optimize Elective Orthopedic Surgery Scheduling to Improve Operating Room Utilization: Retrospective Study.","authors":"Johnathan R Lex, Aazad Abbas, Jacob Mosseri, Jay Singh Toor, Michael Simone, Bheeshma Ravi, Cari Whyne, Elias B Khalil","doi":"10.2196/70857","DOIUrl":"10.2196/70857","url":null,"abstract":"<p><strong>Background: </strong>Total knee and hip arthroplasty (TKA and THA) are among the most performed elective procedures. Rising demand and the resource-intensive nature of these procedures have contributed to longer wait times despite significant health care investment. Current scheduling methods often rely on average surgical durations, overlooking patient-specific variability.</p><p><strong>Objective: </strong>To determine the potential for improving elective surgery scheduling for TKA and THA, respectively, by using a 2-stage approach that incorporates machine learning (ML) prediction of the duration of surgery (DOS) with scheduling optimization.</p><p><strong>Methods: </strong>In total, 2 ML models (one each for TKA and THA) were trained to predict DOS using patient factors based on 302,490 and 196,942 patients, respectively, from a large international database. In total, 3 optimization formulations based on varying surgeon flexibility were compared: Any (surgeons could operate in any operating room at any time), Split (limitation of 2 surgeons per operating room per day), and multiple subset sum problem (MSSP; limit of 1 surgeon per operating room per day). Two years of daily scheduling simulations were performed for each optimization problem using ML prediction or mean DOS over a range of schedule parameters. Constraints and resources were based on a high-volume arthroplasty hospital in Canada.</p><p><strong>Results: </strong>The TKA and THA prediction models achieved test accuracy (with a 30 min buffer) of 78.1% (mean squared error 0.898) and 75.4% (mean squared error 0.916), respectively. 
Any scheduling formulation performed significantly worse than the Split and MSSP formulations with respect to overtime and underutilization (P<.001). The latter 2 problems performed similarly (P>.05) over most schedule parameters. The ML prediction schedules outperformed those generated using a mean DOS for most scheduling parameters, with overtime reduced on average by 300-500 minutes per week (12-20 min per operating room per day; P<.001). However, there was more operating room underutilization with the ML prediction schedules, with it ranging from 70-192 minutes more underutilization (P<.001). Using a 15-minute schedule granularity with a waitlist pool of a minimum of 1 month generated the ML schedule that outperformed the mean schedule 97.1% of times.</p><p><strong>Conclusions: </strong>Assuming a full waiting list, optimizing an individual surgeon's elective operating room time using an ML-assisted predict-then-optimize scheduling system improves overall operating room efficiency, significantly decreasing overtime. This has significant potential implications for health care systems struggling with pressures of rising costs and growing operative waitlists.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e70857"},"PeriodicalIF":3.8,"publicationDate":"2025-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12422739/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145034790","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting Unplanned Readmission Risk in Patients With Cirrhosis: Complication-Aware Dynamic Classifier Selection Approach.","authors":"Zixin Shi, Linjun Huang, Xiaomei Xu, Kexue Pu, Qingpeng Zhang, Haolin Wang","doi":"10.2196/63581","DOIUrl":"10.2196/63581","url":null,"abstract":"<p><strong>Background: </strong>Cirrhosis is a leading cause of noncancer deaths in gastrointestinal diseases, resulting in high hospitalization and readmission rates. Early identification of high-risk patients is vital for proactive interventions and improving health care outcomes. However, the quality and integrity of real-world electronic health records (EHRs) limit their utility in developing risk assessment tools.</p><p><strong>Objective: </strong>Despite the widespread application of classical and ensemble machine learning for EHR-based predictive tasks, the diversity of health conditions among patients and the inherent limitations of the data, such as incompleteness, sparsity, and temporal dynamics, have not been fully addressed. To tackle those challenges, we explored a framework that characterizes patient subgroups and adaptively selects optimal predictive models for each patient on the fly to enable individualized decision support.</p><p><strong>Methods: </strong>The proposed framework uniquely addresses patient heterogeneity by aligning diverse subgroups with dynamically selected classifiers. First, patient subgroups are generated and characterized using rules indicating medical diagnosis patterns. Next, a meta-learning framework trains a meta-classifier for optimal dynamic model selection, which identifies suitable models for individual patients. Notably, we incorporated a tailored region of competence to refine model selection, specifically accounting for cirrhosis complications. 
This approach not only enhances predictive performance but also elucidates why individualized predictions are better supported by selected classifiers trained on specific data subsets.</p><p><strong>Results: </strong>The proposed framework was evaluated for predicting 14-day and 30-day readmission in patients with cirrhosis using multicenter data obtained from 6 hospitals. The final dataset comprised 3307 patients with at least 2 admission records, along with a range of factors including demographic information, complications, and laboratory test results. The proposed framework achieved an average AUC (area under the curve) improvement of 5% and 4% compared to the best baseline models, respectively.</p><p><strong>Conclusions: </strong>By leveraging the expertise of the most competent classifiers for each patient subgroup, our approach enables interpretable training and dynamic selection of heterogeneous predictive models. This advancement not only improves prediction accuracy but also highlights its considerable potential for clinical applications, facilitating the alignment of diverse patient subgroups with tailored decision-support algorithms.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e63581"},"PeriodicalIF":3.8,"publicationDate":"2025-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12422527/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145034795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Youmei Chen, Mengshi Dong, Jie Sun, Zhanao Meng, Yiqing Yang, Abudushalamu Muhetaier, Chao Li, Jie Qin
{"title":"Leveraging GPT-4o for Automated Extraction and Categorization of CAD-RADS Features From Free-Text Coronary CT Angiography Reports: Diagnostic Study.","authors":"Youmei Chen, Mengshi Dong, Jie Sun, Zhanao Meng, Yiqing Yang, Abudushalamu Muhetaier, Chao Li, Jie Qin","doi":"10.2196/70967","DOIUrl":"10.2196/70967","url":null,"abstract":"<p><strong>Background: </strong>Despite the Coronary Artery Reporting and Data System (CAD-RADS) providing a standardized approach, radiologists continue to favor free-text reports. This preference creates significant challenges for data extraction and analysis in longitudinal studies, potentially limiting large-scale research and quality assessment initiatives.</p><p><strong>Objective: </strong>To evaluate the ability of the generative pre-trained transformer (GPT)-4o model to convert real-world coronary computed tomography angiography (CCTA) free-text reports into structured data and automatically identify CAD-RADS categories and P categories.</p><p><strong>Methods: </strong>This retrospective study analyzed CCTA reports from January 2024 and July 2024. A subset of 25 reports was used for prompt engineering to instruct the large language models (LLMs) in extracting CAD-RADS categories, P categories, and the presence of myocardial bridges and noncalcified plaques. Reports were processed using the GPT-4o API (application programming interface) and custom Python scripts. The ground truth was established by radiologists based on the CAD-RADS 2.0 guidelines. Model performance was assessed using accuracy, sensitivity, specificity, and F1-score. Intrarater reliability was assessed using Cohen κ coefficient.</p><p><strong>Results: </strong>Among 999 patients (median age 66 y, range 58-74; 650 males), CAD-RADS categorization showed accuracy of 0.98-1.00 (95% CI 0.9730-1.0000), sensitivity of 0.95-1.00 (95% CI 0.9191-1.0000), specificity of 0.98-1.00 (95% CI 0.9669-1.0000), and F1-score of 0.96-1.00 (95% CI 0.9253-1.0000). 
P categories demonstrated accuracy of 0.97-1.00 (95% CI 0.9569-0.9990), sensitivity from 0.90 to 1.00 (95% CI 0.8085-1.0000), specificity from 0.97 to 1.00 (95% CI 0.9533-1.0000), and F1-score from 0.91 to 0.99 (95% CI 0.8377-0.9967). Myocardial bridge detection achieved an accuracy of 0.98 (95% CI 0.9680-0.9870), and noncalcified coronary plaques detection showed an accuracy of 0.98 (95% CI 0.9680-0.9870). Cohen κ values for all classifications exceeded 0.98.</p><p><strong>Conclusions: </strong>The GPT-4o model efficiently and accurately converts CCTA free-text reports into structured data, excelling in CAD-RADS classification, plaque burden assessment, and detection of myocardial bridges and calcified plaques.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e70967"},"PeriodicalIF":3.8,"publicationDate":"2025-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12422720/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145034740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yili Wen, Zhiqiang Wan, Huiling Ren, Xu Wang, Weijie Wang
{"title":"Interpretable Machine Learning Model for Predicting and Risk Assessment of Diabetic Nephropathy.","authors":"Yili Wen, Zhiqiang Wan, Huiling Ren, Xu Wang, Weijie Wang","doi":"10.2196/64979","DOIUrl":"https://doi.org/10.2196/64979","url":null,"abstract":"<p><strong>Unstructured: </strong>Introduction: Diabetic Nephropathy (DN), a severe complication of diabetes, is characterized by proteinuria, hypertension, and progressive renal function decline, potentially leading to end-stage renal disease. The International Diabetes Federation projects that by 2045, 783 million people will have diabetes, with 30%-40% of them developing DN. Current diagnostic approaches lack sufficient sensitivity and specificity for early detection and diagnosis, underscoring the need for an accurate, interpretable predictive model to enable timely intervention, reduce cardiovascular risks, and optimize healthcare costs. Methods: Our retrospective cohort study investigated 1,000 type-2 diabetes patients using data from electronic medical records collected between 2015 and 2020. The study design incorporated a sample of 444 patients with diabetic nephropathy and 556 without, focusing on demographics, clinical metrics such as blood pressure and glucose levels, and renal function markers. Data collection relied on electronic records, with missing values handled via multiple imputation and dataset balance achieved using SMOTE. In this study, advanced machine learning algorithms, namely XGBoost, CatBoost, and LightGBM, were utilized due to their robustness in handling complex datasets. Key metrics, including accuracy, precision, recall, F1 score, specificity, and area under the curve (AUC), were employed to provide a comprehensive assessment of model performance. Additionally, Explainable Machine Learning (XML) techniques, such as LIME and SHAP, were applied to enhance the transparency and interpretability of the models, offering valuable insights into their decision-making processes. 
Results: XGBoost and LightGBM demonstrated superior performance, with XGBoost achieving the highest accuracy of 86.87%, a precision of 88.90%, a recall of 84.40%, an F1 score of 86.44%, and a specificity of 89.12%. LIME and SHAP analyses provided insights into the contribution of individual features to elucidate the decision-making processes of these models, identifying serum creatinine, albumin, and lipoproteins as significant predictors. Conclusion: The developed machine learning model not only provides a robust predictive tool for early diagnosis and risk assessment of DN but also ensures transparency and interpretability, crucial for clinical integration. By enabling early intervention and personalized treatment strategies, this model has the potential to improve patient outcomes and optimize healthcare resource utilization.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":" ","pages":""},"PeriodicalIF":3.8,"publicationDate":"2025-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145056567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-World Evaluation of AI-Driven Diabetic Retinopathy Screening in Public Health Settings: Validation and Implementation Study.","authors":"Mona Duggal, Anshul Chauhan, Vishali Gupta, Ankita Kankaria, Deepmala Budhija, Priyanka Verma, Vaibhav Miglani, Preeti Syal, Gagandeep Kaur, Lakshay Kumar, Naveen Mutyala, Rishabh Bezbaruah, Nayanshi Sood, Ashleigh Kernohan, Geeta Menon, Luke Vale","doi":"10.2196/67529","DOIUrl":"10.2196/67529","url":null,"abstract":"<p><strong>Background: </strong>Artificial intelligence (AI) algorithms offer an effective solution to alleviate the burden of diabetic retinopathy (DR) screening in public health settings. However, there are challenges in translating diagnostic performance and its application when deployed in real-world conditions.</p><p><strong>Objective: </strong>This study aimed to assess the technical feasibility of integration and diagnostic performance of validated DR screening (DRS) AI algorithms in real-world outpatient public health settings.</p><p><strong>Methods: </strong>Prior to integrating an AI algorithm for DR screening, the study involved several steps: (1) Five AI companies, including four from India and one international company, were invited to evaluate their diagnostic performance using low-cost nonmydriatic fundus cameras in public health settings; (2) The AI algorithms were prospectively validated on fundus images from 250 people with diabetes mellitus, captured by a trained optometrist in public health settings in Chandigarh Tricity in North India. The performance evaluation used diagnostic metrics, including sensitivity, specificity, and accuracy, compared to human grader assessments; (3) The AI algorithm with better diagnostic performance was integrated into a low-cost screening camera deployed at a community health center (CHC) in the Moga district of Punjab, India. 
For AI algorithm analysis, a trained health system optometrist captured nonmydriatic images of 343 patients.</p><p><strong>Results: </strong>Three web-based AI screening companies agreed to participate, while one declined and one chose to withdraw due to low specificity identified during the interim analysis. The three AI algorithms demonstrated variable diagnostic performance, with sensitivity (60%-80%) and specificity (14%-96%). Upon integration, the better-performing algorithm AI-3 (sensitivity: 68%, specificity: 96%, and accuracy: 88.43%) demonstrated high sensitivity of image gradability (99.5%), DR detection (99.6%), and referral DR (79%) at the CHC.</p><p><strong>Conclusions: </strong>This study highlights the importance of systematic AI validation for responsible clinical integration, demonstrating the potential of DRS to improve health care access in resource-limited public health settings.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e67529"},"PeriodicalIF":3.8,"publicationDate":"2025-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12419978/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145031451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated Literature Screening for Hepatocellular Carcinoma Treatment Through Integration of 3 Large Language Models: Methodological Study.","authors":"Chen Pan, Wei Lu, Bingliang Chen, Gang Zhang, Zhiming Yang, Jingcheng Hao","doi":"10.2196/76252","DOIUrl":"10.2196/76252","url":null,"abstract":"<p><strong>Background: </strong>Primary liver cancer, particularly hepatocellular carcinoma (HCC), poses significant clinical challenges due to late-stage diagnosis, tumor heterogeneity, and rapidly evolving therapeutic strategies. While systematic reviews and meta-analyses are essential for updating clinical guidelines, their labor-intensive nature limits timely evidence synthesis.</p><p><strong>Objective: </strong>This study proposes an automated literature screening workflow powered by large language models (LLMs) to accelerate evidence synthesis for HCC treatment guidelines.</p><p><strong>Methods: </strong>We developed a tripartite LLM framework integrating Doubao-1.5-pro-32k, Deepseek-v3, and DeepSeek-R1-Distill-Qwen-7B to simulate collaborative decision-making for study inclusion and exclusion. The system was evaluated across 9 reconstructed datasets derived from published HCC meta-analyses, with performance assessed using accuracy, agreement metrics (κ and prevalence-adjusted bias-adjusted κ), recall, precision, F<sub>1</sub>-scores, and computational efficiency parameters (processing time and cost).</p><p><strong>Results: </strong>The framework demonstrated good performance, with a weighted accuracy of 0.96 and substantial agreement (prevalence-adjusted bias-adjusted κ=0.91), achieving high weighted recall (0.90) but modest weighted precision (0.15) and F<sub>1</sub>-scores (0.22). 
Computational efficiency varied across datasets (processing time: 248-5850 s; cost: US $0.14-$3.68 per dataset).</p><p><strong>Conclusions: </strong>This LLM-driven approach shows promise for accelerating evidence synthesis in HCC care by reducing screening time while maintaining methodological rigor. Key limitations related to clinical context sensitivity and error propagation highlight the need for reinforcement learning integration and domain-specific fine-tuning. LLM agent architectures with reinforcement learning offer a practical path for streamlining guideline updates, though further optimization is needed to improve specialization and reliability in complex clinical settings.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e76252"},"PeriodicalIF":3.8,"publicationDate":"2025-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12455167/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145024830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lauren M Wasser, Hai-Wei Liang, Chenyu Li, Julie Cassidy, Pooja Tallapaneni, Hunter Osterhoudt, Yanshan Wang, Andrew M Williams
{"title":"Identifying Transportation Needs in Ophthalmology Clinic Notes Using Natural Language Processing: Retrospective, Cross-Sectional Study.","authors":"Lauren M Wasser, Hai-Wei Liang, Chenyu Li, Julie Cassidy, Pooja Tallapaneni, Hunter Osterhoudt, Yanshan Wang, Andrew M Williams","doi":"10.2196/69216","DOIUrl":"10.2196/69216","url":null,"abstract":"<p><strong>Background: </strong>Transportation insecurity is a known barrier to accessing eye care and is associated with poorer visual outcomes for patients. However, its mention is seldom captured in structured data fields in electronic health records, limiting efforts to identify and support affected patients. Free-text clinical documentation may more efficiently capture information on transportation-related challenges than structured data.</p><p><strong>Objective: </strong>In this study, we aimed to identify mention of transportation insecurity in free-text ophthalmology clinic notes using natural language processing (NLP).</p><p><strong>Methods: </strong>In this retrospective, cross-sectional study, we examined ophthalmology clinic notes of adult patients with an encounter at a tertiary academic eye center from 2016 to 2023. Demographic information and free text from clinical notes were extracted from electronic health records and deidentified for analysis. Free text was used to develop a rule-based NLP algorithm to identify transportation insecurity. The NLP algorithm was trained and validated using a gold-standard expert review, and precision, recall, and F1-scores were used to evaluate the algorithm's performance. Logistic regression evaluated associations between demographics and transportation insecurity.</p><p><strong>Results: </strong>A total of 1,801,572 clinical notes of 118,518 unique patients were examined, and the NLP algorithm identified 726 (0.6%) patients with transportation insecurity. 
The algorithm's precision, recall, and F1-score were 0.860, 0.960, and 0.778, respectively, indicating high agreement with the gold-standard expert review. Patients with identified transportation insecurity were more likely to be older (OR 3.01, 95% CI 2.38-3.78 for those aged ≥80 vs 18-60 y) and less likely to identify as Asian (OR 0.04, 95% CI 0-0.18 for Asian patients vs White patients). There was no difference by sex (OR 1.13, 95% CI 0.97-1.31) or between the Black and White races (OR 0.98, 95% CI 0.79-1.22).</p><p><strong>Conclusions: </strong>NLP has the potential to identify patients experiencing transportation insecurity from ophthalmology clinic notes, which may help to facilitate referrals to transportation resources.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e69216"},"PeriodicalIF":3.8,"publicationDate":"2025-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12413321/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145006920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dewi Nur Aisyah, Agus Heri Setiawan, Chyntia Aryanti Mayadewi, Alfiano Fawwaz Lokopessy, Zisis Kozlakidis, Logan Manikam
{"title":"Understanding Health Information Systems Utilization Across Public Health Centers in Indonesia: Cross-Sectional Study.","authors":"Dewi Nur Aisyah, Agus Heri Setiawan, Chyntia Aryanti Mayadewi, Alfiano Fawwaz Lokopessy, Zisis Kozlakidis, Logan Manikam","doi":"10.2196/68613","DOIUrl":"10.2196/68613","url":null,"abstract":"<p><strong>Background: </strong>The primary health care service in Indonesia consists of 10,260 public health centers (Puskesmas), which play a major role in providing health care in the community, recording and reporting health data using digital health information systems (HIS) or manual reports. The utilization of HIS across Puskesmas is crucial to capture the dynamic evolution of health problems and monitor interventions, thus providing effective primary health care services for the community.</p><p><strong>Objective: </strong>This paper provides a national-level baseline mapping of HIS utilization in Indonesian Puskesmas. It evaluates the number of HIS used, associated challenges, and contextual factors influencing system adoption.</p><p><strong>Methods: </strong>A cross-sectional survey was carried out covering all Puskesmas across 34 Indonesian provinces between January and February 2022. The questionnaire covered a list of HIS used by Puskesmas, which developed the HIS, and the utilization and challenges during HIS implementation. Descriptive statistical analysis and bivariate analysis were applied.</p><p><strong>Results: </strong>A total of 2606 (25.5%) public health centers across 34 provinces participated in this study. On average, Puskesmas reported using 30 different HIS platforms, with notable variation across provinces and islands. Most systems (n=62,060, 72.94%) were developed by national ministries, though local governments and third parties also contributed. 
Despite 91.5% of respondents reporting that HIS aligned with their needs and 90% claiming data use for decision-making, many centers faced operational barriers: 49% (n=132,300) of systems required excessive data entry, 33% (n=89,100) experienced frequent downtime, and 29% (n=78,300) lacked automated analysis features. In terms of the infrastructure supporting HIS implementation, 9.45% (n=138) of Puskesmas have no access to the internet, while only 28.9% (n=422) have access to robust and efficient internet connections. As for the human resources, the study reveals that each health personnel manages up to six different HIS for data reporting tasks, 74.30% (n=1133) of Puskesmas only received training at the initial system's implementation stage, and 80.51% (n=1225) of respondents report the existence of an informal knowledge transfer process among the staff. The bivariate analysis shows that Puskesmas with the characteristics of being located in Java island and urban areas possessed higher accreditation levels, had more training and knowledge transfer, and had a greater chance to use >30 HIS.</p><p><strong>Conclusions: </strong>This descriptive study highlights substantial fragmentation in Indonesia's HIS environment and reveals critical disparities in system infrastructure, usability, and workforce capacity. 
Recommendations should be tailored to different contexts: offline-compatible systems and basic digital literacy training are needed in rural areas, while urban Puskesmas m","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e68613"},"PeriodicalIF":3.8,"publicationDate":"2025-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12407222/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144994098","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}