{"title":"借助OpenAI的o1预览版推进牙科诊断","authors":"Arman Danesh BMSc, Arsalan Danesh DDS, Farzad Danesh DDS, MSC","doi":"10.1016/j.adaj.2025.04.003","DOIUrl":null,"url":null,"abstract":"<div><h3>Background</h3><div>The introduction of o1-preview (OpenAI) has stirred discussions surrounding its potential applications for diagnosing complex patient cases. The authors gauged changes in o1-preview’s capacity to diagnose complex cases compared with its predecessors ChatGPT-3.5 (OpenAI) and ChatGPT-4 (legacy) (OpenAI).</div></div><div><h3>Methods</h3><div>The authors used diagnostic challenges retrieved from the literature using 2 different approaches to elucidate o1-preview’s capacity to produce plausible differential diagnoses (DDs) and final diagnoses (FDs). The first approach instructed the chatbot to independently construct a DD before selecting a final diagnosis. The second approach instructed the chatbot to rely on DDs retrieved from the literature accompanying the diagnostic challenge. A 2-tailed <em>t</em> test was used to compare sample means, and a 2-tailed χ<sup>2</sup> test was used to compare sample proportions. A <em>P</em> value < .05 was considered statistically significant.</div></div><div><h3>Results</h3><div>The o1-preview model produced a plausible DD and a correct diagnosis for 94% and 80% of cases, respectively, when relying on an independent diagnostic approach, marking a significant increase from ChatGPT-3.5 (DD: difference, 32%; <em>P =</em> .001; FD: difference, 40%; <em>P</em> < .001) and ChatGPT-4 (legacy) (DD: difference, 18%; <em>P =</em> .012; FD: difference, 18%; <em>P</em> = .048). 
When relying on DDs retrieved from the literature, the model achieved a diagnostic accuracy of 86%, displaying a superior performance than its predecessors, although these results were not significant (ChatGPT-3.5: difference, 16%; <em>P</em> = .055; ChatGPT-4 (legacy): difference, 6%; <em>P</em> = .427).</div></div><div><h3>Conclusions</h3><div>Although further validation is required, the transformative findings of this investigation shift the discussion surrounding ChatGPT’s integration as a diagnostic tool to be not a question of if, but instead a matter of when.</div></div><div><h3>Practical Implications</h3><div>Although o1-preview has yet to achieve a proficient diagnostic accuracy, the model served well in generating DDs for complex cases.</div></div>","PeriodicalId":17197,"journal":{"name":"Journal of the American Dental Association","volume":"156 7","pages":"Pages 555-562.e3"},"PeriodicalIF":3.5000,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Advancing dental diagnostics with OpenAI's o1-preview\",\"authors\":\"Arman Danesh BMSc, Arsalan Danesh DDS, Farzad Danesh DDS, MSC\",\"doi\":\"10.1016/j.adaj.2025.04.003\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><h3>Background</h3><div>The introduction of o1-preview (OpenAI) has stirred discussions surrounding its potential applications for diagnosing complex patient cases. The authors gauged changes in o1-preview’s capacity to diagnose complex cases compared with its predecessors ChatGPT-3.5 (OpenAI) and ChatGPT-4 (legacy) (OpenAI).</div></div><div><h3>Methods</h3><div>The authors used diagnostic challenges retrieved from the literature using 2 different approaches to elucidate o1-preview’s capacity to produce plausible differential diagnoses (DDs) and final diagnoses (FDs). 
The first approach instructed the chatbot to independently construct a DD before selecting a final diagnosis. The second approach instructed the chatbot to rely on DDs retrieved from the literature accompanying the diagnostic challenge. A 2-tailed <em>t</em> test was used to compare sample means, and a 2-tailed χ<sup>2</sup> test was used to compare sample proportions. A <em>P</em> value < .05 was considered statistically significant.</div></div><div><h3>Results</h3><div>The o1-preview model produced a plausible DD and a correct diagnosis for 94% and 80% of cases, respectively, when relying on an independent diagnostic approach, marking a significant increase from ChatGPT-3.5 (DD: difference, 32%; <em>P =</em> .001; FD: difference, 40%; <em>P</em> < .001) and ChatGPT-4 (legacy) (DD: difference, 18%; <em>P =</em> .012; FD: difference, 18%; <em>P</em> = .048). When relying on DDs retrieved from the literature, the model achieved a diagnostic accuracy of 86%, displaying a superior performance than its predecessors, although these results were not significant (ChatGPT-3.5: difference, 16%; <em>P</em> = .055; ChatGPT-4 (legacy): difference, 6%; <em>P</em> = .427).</div></div><div><h3>Conclusions</h3><div>Although further validation is required, the transformative findings of this investigation shift the discussion surrounding ChatGPT’s integration as a diagnostic tool to be not a question of if, but instead a matter of when.</div></div><div><h3>Practical Implications</h3><div>Although o1-preview has yet to achieve a proficient diagnostic accuracy, the model served well in generating DDs for complex cases.</div></div>\",\"PeriodicalId\":17197,\"journal\":{\"name\":\"Journal of the American Dental Association\",\"volume\":\"156 7\",\"pages\":\"Pages 555-562.e3\"},\"PeriodicalIF\":3.5000,\"publicationDate\":\"2025-07-01\",\"publicationTypes\":\"Journal 
Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of the American Dental Association\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0002817725002223\",\"RegionNum\":2,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"DENTISTRY, ORAL SURGERY & MEDICINE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of the American Dental Association","FirstCategoryId":"3","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0002817725002223","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"DENTISTRY, ORAL SURGERY & MEDICINE","Score":null,"Total":0}
Advancing dental diagnostics with OpenAI's o1-preview
Background
The introduction of o1-preview (OpenAI) has stirred discussions surrounding its potential applications for diagnosing complex patient cases. The authors gauged changes in o1-preview’s capacity to diagnose complex cases compared with its predecessors ChatGPT-3.5 (OpenAI) and ChatGPT-4 (legacy) (OpenAI).
Methods
The authors used diagnostic challenges retrieved from the literature, applying 2 different approaches to elucidate o1-preview’s capacity to produce plausible differential diagnoses (DDs) and final diagnoses (FDs). The first approach instructed the chatbot to independently construct a DD before selecting a final diagnosis. The second approach instructed the chatbot to rely on the DDs published in the literature accompanying each diagnostic challenge. A 2-tailed t test was used to compare sample means, and a 2-tailed χ2 test was used to compare sample proportions. A P value < .05 was considered statistically significant.
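The abstract reports only percentages, not raw counts or group sizes, so the figures below are illustrative assumptions (n = 50 per group is hypothetical). A minimal stdlib sketch of a 2-tailed comparison of two sample proportions, using the two-proportion z test — whose squared statistic equals the 1-degree-of-freedom χ2 statistic without continuity correction, a stand-in for whatever χ2 implementation the authors used:

```python
from math import erf, sqrt

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-tailed z test for equality of two proportions.

    z**2 equals the 1-df chi-square statistic on the 2x2 table
    (without Yates continuity correction).
    """
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)  # pooled proportion under H0: p1 == p2
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # standard normal CDF via the error function
    phi = lambda t: 0.5 * (1 + erf(t / sqrt(2)))
    p_value = 2 * (1 - phi(abs(z)))  # two-tailed
    return z, p_value

# Hypothetical counts: 47/50 vs 31/50 plausible DDs (94% vs 62%);
# the abstract gives only the percentages, so n = 50 is an assumption.
z, p = two_proportion_z_test(47, 50, 31, 50)
print(f"z = {z:.2f}, two-tailed P = {p:.4f}")
```

With these assumed counts the difference is significant at the .05 level, consistent in direction with the comparisons the abstract reports.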
Results
The o1-preview model produced a plausible DD and a correct final diagnosis for 94% and 80% of cases, respectively, when relying on an independent diagnostic approach, a significant improvement over ChatGPT-3.5 (DD: difference, 32%; P = .001; FD: difference, 40%; P < .001) and ChatGPT-4 (legacy) (DD: difference, 18%; P = .012; FD: difference, 18%; P = .048). When relying on DDs retrieved from the literature, the model achieved a diagnostic accuracy of 86%, outperforming its predecessors, although these differences were not statistically significant (ChatGPT-3.5: difference, 16%; P = .055; ChatGPT-4 (legacy): difference, 6%; P = .427).
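The predecessor models' accuracies are stated only as differences from o1-preview; the short script below recovers the implied absolute percentages from the figures reported above (simple subtraction, no new data):

```python
# o1-preview accuracies reported in the abstract (%).
o1 = {"DD": 94, "FD": 80, "literature_FD": 86}

# Reported differences (o1-preview minus predecessor, percentage points).
diffs = {
    "ChatGPT-3.5": {"DD": 32, "FD": 40, "literature_FD": 16},
    "ChatGPT-4 (legacy)": {"DD": 18, "FD": 18, "literature_FD": 6},
}

# Implied predecessor accuracies.
implied = {model: {k: o1[k] - d for k, d in model_diffs.items()}
           for model, model_diffs in diffs.items()}
print(implied)
# ChatGPT-3.5: DD 62%, FD 40%, literature-based FD 70%
# ChatGPT-4 (legacy): DD 76%, FD 62%, literature-based FD 80%
```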
Conclusions
Although further validation is required, the findings of this investigation shift the discussion surrounding ChatGPT’s integration as a diagnostic tool from a question of whether it will happen to a question of when.
Practical Implications
Although o1-preview has yet to achieve proficient diagnostic accuracy, the model performed well in generating DDs for complex cases.
Journal overview
There is not a single source or solution to help dentists in their quest for lifelong learning, improving dental practice, and dental well-being. JADA+, along with The Journal of the American Dental Association, is striving to do just that, bringing together practical content covering dentistry topics and procedures to help dentists—both general dentists and specialists—provide better patient care and improve oral health and well-being. This is a work in progress; as we add more content, covering more topics of interest, it will continue to expand, becoming an ever-more essential source of oral health knowledge.