Integrating domain-specific resources: Advancing AI for foot and ankle surgery

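Foot & ankle surgery (New York, N.Y.), Volume 5, Issue 1, Article 100445. Pub Date: 2024-11-28. DOI: 10.1016/j.fastrc.2024.100445
Full text: https://www.sciencedirect.com/science/article/pii/S2667396724000855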

Steven R. Cooperman DPM, MBA, AACFAS, Roberto A. Brandão DPM, FACFAS


Citations: 0


Integrating domain-specific resources: Advancing AI for foot and ankle surgery

Large language models (LLMs) such as ChatGPT offer significant potential for applications in medicine, including patient education and clinical support. This study evaluates the performance of ChatGPT-4, ChatGPT-4 enhanced with retrieval-augmented generation (RAG), and Gemini AI in responding to clinical vignette questions on hallux rigidus, a condition requiring specialized knowledge in foot and ankle surgery. The ChatGPT-4 + RAG model, augmented with the 2024 ACFAS clinical consensus statements, showed the highest agreement with surveyor majority responses (83.26%), compared with ChatGPT-4 (59.54%) and Gemini AI (53.02%). All models provided clinically appropriate responses to most questions. The ChatGPT-4 + RAG model was the most accurate, although the rationales for its answers were rated the most difficult to read. These findings highlight the limitations of generic AI models, which may propagate misinformation when patients use them to seek health information. By incorporating domain-specific resources, the RAG-augmented model showed greater reliability and contextual accuracy, suggesting that such models have potential as tools for both clinical decision-making and patient education. This study emphasizes the importance of integrating verified medical resources to advance AI in healthcare, addressing critical gaps in existing capabilities while minimizing the risk of misinformation.
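The headline metric in the abstract, agreement with surveyor majority responses, can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names, the answer labels, and the toy data below are all hypothetical, and the abstract does not state how ties among surveyors were handled (here the answer seen first wins).

```python
from collections import Counter


def majority_response(votes):
    """Return the most common surveyor answer for one question.

    Assumption: ties are broken in favor of the answer seen first,
    which is how Counter.most_common orders equal counts.
    """
    return Counter(votes).most_common(1)[0][0]


def agreement_rate(model_answers, survey_votes):
    """Percentage of questions where the model matches the surveyor majority."""
    matches = sum(
        model_answers[question] == majority_response(votes)
        for question, votes in survey_votes.items()
    )
    return 100.0 * matches / len(survey_votes)


# Hypothetical toy data, not taken from the study.
survey_votes = {
    "q1": ["agree", "agree", "disagree"],
    "q2": ["disagree", "disagree", "agree"],
    "q3": ["agree", "neutral", "agree"],
    "q4": ["neutral", "neutral", "agree"],
}
model_answers = {"q1": "agree", "q2": "agree", "q3": "agree", "q4": "neutral"}

rate = agreement_rate(model_answers, survey_votes)
print(f"Agreement with surveyor majority: {rate:.2f}%")
```

In this toy case the model matches the majority on three of four questions. Running the same function once per model over a shared question set would give figures comparable in form to the percentages reported above.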


Source journal

Foot & ankle surgery (New York, N.Y.) Orthopedics, Sports Medicine and Rehabilitation, Podiatry

Self-citation rate

0.00%

Review time

75 days