Berk B Ozmen, Nishant Singh, Kavach Shah, Ibrahim Berber, Damanjit Singh, Eugene Pinsky, Antonio Rampazzo, Graham S Schwarz
Development of a novel artificial intelligence clinical decision support tool for hand surgery: HandRAG
Journal of Hand and Microsurgery, volume 17, issue 4, article 100293. Published 2025-06-11.
DOI: https://doi.org/10.1016/j.jham.2025.100293
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12210289/pdf/
Citations: 0
Abstract
Purpose: Hand surgery decision-making requires integrating complex anatomical understanding, diverse patient-specific factors, and nuanced operative techniques. Although artificial intelligence (AI), large language models (LLMs), and retrieval-augmented generation (RAG) have advanced significantly in various fields, no AI-driven clinical decision support systems currently exist for hand surgery. A novel retrieval-augmented large language model tailored to hand surgery was developed to draw on peer-reviewed hand surgery literature for clinical decision support in real time at the point of care.
Methods: An AI clinical decision support system was developed that integrates all 4,510 available open-access, peer-reviewed hand surgery publications from 2000 to 2024, identified through hand surgery-relevant keywords. Documents were processed using a hierarchical pipeline based on the RAPTOR methodology, which breaks large texts into smaller segments to improve retrieval accuracy. The system was evaluated on 15 standardized clinical queries using automated computational metrics for correctness and for semantic similarity to source documents.
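The segment-and-retrieve idea described in the Methods can be illustrated with a minimal sketch. This is not the authors' pipeline: RAPTOR builds a hierarchy over the segments, and a real system would use a learned embedding model. Here plain word counts stand in for embeddings, and the names split_into_segments, term_vector, cosine, and retrieve, as well as the sample segments, are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def split_into_segments(text, max_words=40):
    """Split text into consecutive segments of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def term_vector(text):
    """Toy stand-in for an embedding (assumption): lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, segments, top_k=2):
    """Return the top_k (score, segment) pairs, highest similarity first."""
    q = term_vector(query)
    scored = [(cosine(q, term_vector(s)), s) for s in segments]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical segments, invented for this example (not from the study corpus).
demo_segments = [
    "carpal tunnel release outcomes",
    "distal radius fracture fixation",
    "flexor tendon repair rehabilitation",
]
for score, segment in retrieve("flexor tendon repair", demo_segments, top_k=1):
    print(round(score, 3), segment)
```

Smaller segments keep each vector focused on one topic, which is why a query about tendon repair matches the tendon segment rather than a whole mixed-topic document.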
Results: The AI system performed consistently, with an average G-Eval correctness score of 0.79 and, for SEM, an average similarity score of 0.75 (range: 0.54-0.86) and an average maximum similarity score of 0.80 (range: 0.56-0.91), predominantly at moderate confidence levels. Generated recommendations were contextually appropriate and reliably linked to relevant hand surgery literature, providing accurate and clinically meaningful guidance.
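The abstract does not state how the per-query scores were aggregated. One plausible reading, shown here purely as an assumption, is that each query yields one similarity score per source document, the mean and the maximum are taken per query, and those per-query values are then averaged and ranged across queries. The function name summarize_similarity and the input numbers are hypothetical; the numbers are invented and are not the study's data.

```python
from statistics import mean


def summarize_similarity(per_query_scores):
    """Aggregate similarity scores across queries.

    per_query_scores: list of lists; each inner list holds the similarity of
    one generated answer to each of its source documents (assumed layout).
    """
    per_query_mean = [mean(scores) for scores in per_query_scores]
    per_query_max = [max(scores) for scores in per_query_scores]
    return {
        "avg_similarity": mean(per_query_mean),
        "avg_similarity_range": (min(per_query_mean), max(per_query_mean)),
        "avg_max_similarity": mean(per_query_max),
        "max_similarity_range": (min(per_query_max), max(per_query_max)),
    }


# Invented scores for three hypothetical queries.
example_scores = [[0.5, 0.7], [0.8, 1.0], [0.6, 0.9, 0.75]]
print(summarize_similarity(example_scores))
```

Reporting both the average and the maximum separates how well an answer matches its sources overall from how well it matches its single closest source.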
Conclusion: HandRAG, an AI system that combines RAG with an LLM, offers potential benefits for evidence-based clinical decision support and education in hand surgery.