Areeb Shah, Luke Schwetschenau, Lisa Velez-Velez, Rohun Gupta, Kevin Chen, Collin Chen
Facial Plastic Surgery. Published 2025-09-04. DOI: 10.1055/a-2689-2685 (https://doi.org/10.1055/a-2689-2685)
Evaluating AI Responses to Postoperative Questions in Mohs Reconstruction.
Patients frequently ask questions after Mohs facial reconstruction. AI tools, particularly large language models (LLMs), may optimize this communication. We evaluated four LLMs (Claude AI, ChatGPT, Microsoft Copilot, and Google Gemini) on responses to postoperative questions, hypothesizing variation in quality, accuracy, comprehensiveness, and readability. Prospective observational study following STROBE guidelines. A total of 31 common postoperative questions were created. Each was submitted to all four LLMs using a standardized prompt. Responses were evaluated by blinded facial plastic surgeons using validated scoring tools (EQIP, Likert scales, readability formulas). IRB exemption was granted. Claude AI outperformed others in quality (EQIP: 90.3), accuracy (4.55/5), and comprehensiveness (4.60/5). All LLMs exceeded the recommended 6th-grade reading level. LLMs show potential for supporting postoperative communication, but variation in readability and content depth highlights the continued need for physician oversight.
About the journal:
Facial Plastic Surgery is a journal that publishes topic-specific issues covering areas of aesthetic and reconstructive plastic surgery as it relates to the head, neck, and face. The journal's scope includes issues devoted to scar revision, periorbital and mid-face rejuvenation, facial trauma, facial implants, rhinoplasty, neck reconstruction, cleft palate, face lifts, as well as various other emerging minimally invasive procedures.
Authors provide a global perspective on each topic, critically evaluate recent works in the field, and apply them to clinical practice.