Dall-E in hand surgery: Exploring the utility of ChatGPT image generation

Impact factor: 1.4 · JCR quartile: Q3 (Surgery)
Daniel Soroudi, Daniel S. Rouhani, Alap Patel, Ryan Sadjadi, Reta Behnam-Hanona, Nicholas C. Oleck, Israel Falade, Merisa Piper, Scott L. Hansen
Journal: Surgery Open Science, volume 26, pages 64-78
Published: 2025-05-10
Publication type: Journal Article
DOI: 10.1016/j.sopen.2025.04.012
Article URL: https://www.sciencedirect.com/science/article/pii/S2589845025000387
Citations: 0

Abstract


Background

Artificial intelligence (AI) has significantly influenced various medical fields, including plastic surgery. Large language model (LLM) chatbots such as ChatGPT and text-to-image tools such as DALL-E and GPT-4o are gaining broader adoption. This study explores the capabilities and limitations of these tools in hand surgery, focusing on their application in patient and medical education.

Methods

Using Google Trends data, common search terms were identified in five categories (“Hand Anatomy”, “Hand Fracture”, “Hand Joint Injury”, “Hand Tumor”, and “Hand Dislocation”) and queried on ChatGPT-4.5 and ChatGPT-3.5. Responses were graded for accuracy on a 1–5 scale and evaluated using the Flesch-Kincaid Grade Level, the Patient Education Materials Assessment Tool (PEMAT), and the DISCERN instrument. GPT-4o, DALL-E 3, and DALL-E 2 then generated visual representations of selected ChatGPT responses in each category, and these images were further evaluated.
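
The Flesch-Kincaid Grade Level named above is a fixed formula over words per sentence and syllables per word. The Python sketch below shows how such a score could be computed for a chatbot response. It is an illustration, not the authors' scoring pipeline: the function names are hypothetical, and the syllable counter is a rough vowel-group heuristic assumed here for simplicity, so its results may differ from those of dedicated readability tools.

```python
import re


def count_syllables(word: str) -> int:
    """Estimate syllables with a vowel-group heuristic (an assumption, not a dictionary lookup)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent, except in endings such as 'le' or 'ee'.
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level: 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    # Hypothetical response text, used only to exercise the function.
    sample = "A boxer's fracture is a break in the bone below the little finger. It often heals in a splint."
    print(round(flesch_kincaid_grade(sample), 2))
```

A higher score corresponds to a higher school grade level needed to read the text comfortably, which is how the grade levels reported in the Results should be read.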

Results

ChatGPT-4.5 achieved a DISCERN overall score of 3.80 ± 0.23. Its responses averaged 91.67 ± 0.29 for PEMAT understandability and 54.67 ± 0.55 for actionability. Accuracy was 4.47 ± 0.52, with a Flesch-Kincaid Grade Level of 9.26 ± 1.04. ChatGPT-4.5 consistently outperformed ChatGPT-3.5 across all evaluation metrics. For text-to-image generation, GPT-4o produced more accurate visuals than DALL-E 3 and DALL-E 2.

Conclusions

This study highlights the strengths and limitations of ChatGPT-4.5 and GPT-4o in hand surgery education. While combining accurate text generation with image creation shows promise, these AI tools still need further refinement before widespread clinical adoption.
Source journal
CiteScore: 1.30
Self-citation rate: 0.00%
Articles published: 0
Review time: 66 days