Can deepseek and ChatGPT be used in the diagnosis of oral pathologies?
Authors: Ömer Faruk Kaygisiz, Mehmet Turhan Teke
Journal: BMC Oral Health, vol. 25, no. 1, p. 638 (Journal Article, published 2025-04-25). Impact factor 2.6; JCR Q1 (Dentistry, Oral Surgery & Medicine).
DOI: https://doi.org/10.1186/s12903-025-06034-x
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12023442/pdf/
Objective: Artificial intelligence (AI) has been widely used in various medical fields to support diagnosis, and the development of different AI techniques has made important contributions to early diagnosis. This research compares and evaluates the diagnostic accuracy of two AI applications, ChatGPT-4o and DeepSeek-V3, in 16 clinical case scenarios of oral pathologies.
Methodology: The authors prepared clinical case scenarios for 16 hypothetical oral pathologies. Two AI applications, DeepSeek-V3 and ChatGPT-4o, were asked to provide 3 possible preliminary diagnoses for each case and to cite the literature supporting these diagnoses. The diagnoses from both AI applications were rated on a Likert scale by 20 specialists from two specialties.
Results: The mean score was 4.02 ± 0.36 for DeepSeek-V3 and 3.15 ± 0.41 for ChatGPT-4o. According to the average scores, both models performed at a moderate to high level. Comparing the two models, DeepSeek-V3 was statistically better in 9 of the 16 clinical scenarios, while ChatGPT-4o was statistically better in 1 scenario. Overall, DeepSeek-V3 was statistically more successful in the comparison of the two models (p = 0.024). In terms of references, ChatGPT-4o cited 62 references, 50 of which were fake, while 8 of the 48 references cited by DeepSeek-V3 were fake.
Conclusions: Chatbot applications have the potential to become valuable consultants for clinicians in the future thanks to their fast processing ability, and they can help healthcare services by reducing the workload of clinicians. DeepSeek-V3 produced better results than ChatGPT-4o, but both applications need to be improved before routine use. Releasing versions of AI models that search only within the medical field and respond to clinicians with more reliable sources may make these models more valuable.
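The reference counts reported in the Results can be turned into fabrication rates for easier comparison. The following is a minimal sketch, not part of the study: the function name `fabricated_rate` and the dictionary layout are illustrative assumptions, and only the four counts (50 of 62, 8 of 48) come from the abstract.

```python
def fabricated_rate(fake: int, total: int) -> float:
    """Return the share of fabricated references as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * fake / total


# (fake, total) reference counts as reported in the Results paragraph.
reported = {
    "ChatGPT-4o": (50, 62),
    "DeepSeek-V3": (8, 48),
}

# Percentage of fabricated references per model (hypothetical helper output).
rates = {
    model: fabricated_rate(fake, total)
    for model, (fake, total) in reported.items()
}

for model, rate in rates.items():
    print(f"{model}: {rate:.1f}% of cited references were fabricated")
```

Under these counts, roughly 80.6% of the references cited by ChatGPT-4o and 16.7% of those cited by DeepSeek-V3 were fabricated.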
About the journal:
BMC Oral Health is an open access, peer-reviewed journal that considers articles on all aspects of the prevention, diagnosis and management of disorders of the mouth, teeth and gums, as well as related molecular genetics, pathophysiology, and epidemiology.