Maximilian Frederik Russe, Alexander Rau, Michael Andreas Ermer, René Rothweiler, Sina Wenger, Klara Klöble, Ralf K W Schulze, Fabian Bamberg, Rainer Schmelzeisen, Marco Reisert, Wiebke Semper-Hogg
{"title":"A content-aware chatbot based on GPT 4 provides trustworthy recommendations for Cone-Beam CT guidelines in dental imaging.","authors":"Maximilian Frederik Russe, Alexander Rau, Michael Andreas Ermer, René Rothweiler, Sina Wenger, Klara Klöble, Ralf K W Schulze, Fabian Bamberg, Rainer Schmelzeisen, Marco Reisert, Wiebke Semper-Hogg","doi":"10.1093/dmfr/twad015","DOIUrl":null,"url":null,"abstract":"<p><strong>Objectives: </strong>To develop a content-aware chatbot based on GPT-3.5-Turbo and GPT-4 with specialized knowledge on the German S2 Cone-Beam CT (CBCT) dental imaging guideline and to compare the performance against humans.</p><p><strong>Methods: </strong>The LlamaIndex software library was used to integrate the guideline context into the chatbots. Based on the CBCT S2 guideline, 40 questions were posed to content-aware chatbots and early career and senior practitioners with different levels of experience served as reference. The chatbots' performance was compared in terms of recommendation accuracy and explanation quality. Chi-square test and one-tailed Wilcoxon signed rank test evaluated accuracy and explanation quality, respectively.</p><p><strong>Results: </strong>The GPT-4 based chatbot provided 100% correct recommendations and superior explanation quality compared to the one based on GPT3.5-Turbo (87.5% vs. 57.5% for GPT-3.5-Turbo; P = .003). Moreover, it outperformed early career practitioners in correct answers (P = .002 and P = .032) and earned higher trust than the chatbot using GPT-3.5-Turbo (P = 0.006).</p><p><strong>Conclusions: </strong>A content-aware chatbot using GPT-4 reliably provided recommendations according to current consensus guidelines. The responses were deemed trustworthy and transparent, and therefore facilitate the integration of artificial intelligence into clinical decision-making.</p>","PeriodicalId":11261,"journal":{"name":"Dento maxillo facial radiology","volume":" ","pages":"109-114"},"PeriodicalIF":2.9000,"publicationDate":"2024-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11003655/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Dento maxillo facial radiology","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1093/dmfr/twad015","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"DENTISTRY, ORAL SURGERY & MEDICINE","Score":null,"Total":0}
引用次数: 0
Abstract
Objectives: To develop a content-aware chatbot based on GPT-3.5-Turbo and GPT-4 with specialized knowledge on the German S2 Cone-Beam CT (CBCT) dental imaging guideline and to compare the performance against humans.
Methods: The LlamaIndex software library was used to integrate the guideline context into the chatbots. Based on the CBCT S2 guideline, 40 questions were posed to content-aware chatbots and early career and senior practitioners with different levels of experience served as reference. The chatbots' performance was compared in terms of recommendation accuracy and explanation quality. Chi-square test and one-tailed Wilcoxon signed rank test evaluated accuracy and explanation quality, respectively.
Results: The GPT-4 based chatbot provided 100% correct recommendations and superior explanation quality compared to the one based on GPT3.5-Turbo (87.5% vs. 57.5% for GPT-3.5-Turbo; P = .003). Moreover, it outperformed early career practitioners in correct answers (P = .002 and P = .032) and earned higher trust than the chatbot using GPT-3.5-Turbo (P = 0.006).
Conclusions: A content-aware chatbot using GPT-4 reliably provided recommendations according to current consensus guidelines. The responses were deemed trustworthy and transparent, and therefore facilitate the integration of artificial intelligence into clinical decision-making.
期刊介绍:
Dentomaxillofacial Radiology (DMFR) is the journal of the International Association of Dentomaxillofacial Radiology (IADMFR) and covers the closely related fields of oral radiology and head and neck imaging.
Established in 1972, DMFR is a key resource keeping dentists, radiologists and clinicians and scientists with an interest in Head and Neck imaging abreast of important research and developments in oral and maxillofacial radiology.
The DMFR editorial board features a panel of international experts including Editor-in-Chief Professor Ralf Schulze. Our editorial board provide their expertise and guidance in shaping the content and direction of the journal.
Quick Facts:
- 2015 Impact Factor - 1.919
- Receipt to first decision - average of 3 weeks
- Acceptance to online publication - average of 3 weeks
- Open access option
- ISSN: 0250-832X
- eISSN: 1476-542X