Assessment of ChatGPT's adherence to EULAR diagnostic criteria and therapeutic protocols for rheumatoid arthritis at two distinct time points, 14 days apart, utilizing binary and multiple-choice inquiries.
Authors: Neşe Çabuk Çelik, Elif Altunel Kılınç
Journal: Clinical Rheumatology (impact factor 2.9; JCR Q2, Rheumatology)
Published: 2025-04-22
DOI: 10.1007/s10067-025-07417-9 (https://doi.org/10.1007/s10067-025-07417-9)
Objectives: Artificial intelligence (AI) holds considerable promise in healthcare as a decision-support aid in specific domains, including rheumatoid arthritis (RA). This study assesses the adherence of the advanced AI model ChatGPT-v4 to the European League Against Rheumatism (EULAR) recommendations.
Methods: The study used a 100-item questionnaire of true/false and multiple-choice questions, accompanied by real-world clinical scenarios, developed in line with EULAR recommendations for the management of RA. The questions addressed diagnostic criteria, therapeutic alternatives, and follow-up procedures. Two rheumatologists rated ChatGPT's responses for accuracy, consistency, and comprehensiveness on a 6-point Likert scale.
Results: Responses were evaluated at baseline and on day 14. In the paired questions, the AI corrected most of its baseline errors, although some responses did not improve. Of the two initially incongruent responses, one remained unchanged and the other was corrected, so the number of congruent responses rose from 48 at baseline to 49 on day 14. The AI was more consistent on binary questions than on multiple-choice questions. At baseline, 43 (86%) of the multiple-choice items were answered correctly; on re-evaluation, 42 (84%) were correct. One response was erroneous on day 14. Three of the seven initially erroneous responses remained unchanged, and four were later corrected.
Conclusion: ChatGPT performed well on binary and multiple-choice questions formulated according to EULAR guidelines for RA. The findings support the use of AI as a clinical support tool in RA and show that its performance can improve. The AI was accurate on objective information and promptly corrected its errors.
Key Points
• AI in healthcare: The integration of artificial intelligence, specifically ChatGPT-v4, in clinical practice aims to enhance decision-making in RA by adhering to EULAR recommendations for diagnosis, treatment, and follow-up.
• Inter-rater reliability: Agreement between the evaluators was high, with Cohen's kappa coefficients of 0.92 for binary questions and 0.94 for multiple-choice questions.
• AI learning dynamics: ChatGPT improved over time in understanding and answering more complex questions, unlike in previous studies, where AI struggled with consistency.
• Implications for clinical practice: The findings support the growing role of AI as a reliable tool in rheumatology, suggesting potential for personalized, evidence-based patient care.
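The inter-rater agreement reported in the Key Points is Cohen's kappa, which compares the observed agreement between two raters with the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The abstract does not say how the coefficient was computed or whether a weighted variant was used for the Likert ratings, so the following is only a minimal sketch of the unweighted form; the function name and the sample ratings are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length sequences of labels."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty item set")

    n = len(rater_a)

    # p_o: fraction of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # p_e: chance agreement, from each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    # If both raters used a single identical label, kappa is undefined;
    # this sketch reports perfect agreement in that case (an assumption).
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings (1 = response judged correct, 0 = judged incorrect).
example_a = [1, 1, 0, 0]
example_b = [1, 1, 0, 1]
print(cohens_kappa(example_a, example_b))
```

Values close to 1, such as the 0.92 and 0.94 reported above, indicate that the two raters agreed far more often than chance alone would predict.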
About the journal:
Clinical Rheumatology is an international English-language journal devoted to publishing original clinical investigation and research in the general field of rheumatology, with an emphasis on clinical aspects at the postgraduate level.
The journal succeeds Acta Rheumatologica Belgica, originally founded in 1945 as the official journal of the Belgian Rheumatology Society. Clinical Rheumatology aims to cover all modern trends in clinical and experimental research as well as the management and evaluation of diagnostic and treatment procedures connected with the inflammatory, immunologic, metabolic, genetic and degenerative soft and hard connective tissue diseases.