Accuracy of ChatGPT-4o in Text and Video Analysis of Laryngeal Malignant and Premalignant Diseases.

IF 2.5, Zone 4 (Medicine), JCR Q1: AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY

Journal of Voice Pub Date : 2025-03-26 DOI:10.1016/j.jvoice.2025.03.006

Carlos M Chiesa-Estomba, Maider Andueza-Guembe, Antonino Maniaci, Miguel Mayo-Yanez, Frank Betances-Reinoso, Luigi A Vaira, Alberto Maria Saibene, Jerome R Lechien

Citations: 0

Abstract

Accuracy of ChatGPT-4o in Text and Video Analysis of Laryngeal Malignant and Premalignant Diseases.

Introduction: Chatbot Generative Pretrained Transformer (ChatGPT), a multimodal generative AI, has been studied for potential applications in healthcare, including otolaryngology-head and neck surgery. In this study, the authors investigate the consistency of ChatGPT-4o in analyzing clinical fiberoptic videos of suspected laryngeal malignancies compared to expert clinicians.

Methods: This experimental study involved twenty patients with primary laryngeal disease consulting at a tertiary academic center. Data, including laryngeal fiberoptic video examinations, were retrospectively analyzed using the ChatGPT-4o application programming interface. Responses were assessed for diagnostic accuracy, consistency, and clinical recommendations. Three otolaryngology-head and neck surgery consultants independently evaluated ChatGPT-4o's performance using the Artificial Intelligence Performance Instrument and a five-point Likert scale for complexity and consistency.
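Several raters scoring the same cases on an ordinal scale is the usual setting for an intraclass correlation coefficient (ICC). The abstract does not say which ICC form was used, so the sketch below assumes the two-way random-effects, single-rater form, often written ICC(2,1). The function name `icc_2_1` and the ratings are hypothetical and are not data from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one row per case, one column per rater.
    Assumption: this ICC form is chosen for illustration only; the
    abstract does not state which form the authors computed.
    """
    n = len(ratings)      # number of cases
    k = len(ratings[0])   # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    # Mean squares for cases, raters, and residual error
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical five-point Likert scores: 5 cases x 3 raters (not study data)
toy_ratings = [
    [2, 3, 2],
    [4, 4, 5],
    [1, 2, 1],
    [3, 3, 4],
    [5, 4, 5],
]
print(f"ICC(2,1) on toy data: {icc_2_1(toy_ratings):.3f}")
```

Values near 1 mean the raters largely agree on how cases rank and score; the function divides by zero if every rating is identical, which a production version would need to guard against.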

Results: ChatGPT-4o identified malignant diagnoses as the primary diagnosis in 30% of cases, while proposing malignancies as one of the top three diagnoses in 90% of cases. Despite high sensitivity, specificity was limited. The mean consistency score for image analysis was 2.36 ± 1.13, with an intraclass correlation coefficient of 0.890 (P = 0.03). The model showed a tendency to prioritize text over visual data, limiting the improvement in diagnostic accuracy from video input.
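To relate the reported percentages to case counts, a short sketch may help. With twenty patients, 30% corresponds to 6 cases and 90% to 18 cases. The sensitivity and specificity helpers use the standard definitions; the abstract does not report a confusion matrix, so the counts in `example` are hypothetical and deliberately sum to 100 rather than 20 to avoid being mistaken for study data.

```python
from dataclasses import dataclass

N_PATIENTS = 20  # from the Methods: twenty patients


def cases_from_rate(rate: float, n: int = N_PATIENTS) -> int:
    """Convert a reported proportion back to a whole number of cases."""
    return round(rate * n)


@dataclass
class ConfusionCounts:
    tp: int  # malignant cases flagged as malignant
    fp: int  # non-malignant cases flagged as malignant
    tn: int  # non-malignant cases not flagged
    fn: int  # malignant cases missed


def sensitivity(c: ConfusionCounts) -> float:
    """True-positive rate: TP / (TP + FN)."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """True-negative rate: TN / (TN + FP)."""
    return c.tn / (c.tn + c.fp)


top1_cases = cases_from_rate(0.30)  # malignancy as primary diagnosis: 6 cases
top3_cases = cases_from_rate(0.90)  # malignancy in the top three: 18 cases

# Hypothetical counts for illustration only (not reported in the abstract)
example = ConfusionCounts(tp=45, fp=30, tn=20, fn=5)
print(top1_cases, top3_cases)
print(f"sensitivity={sensitivity(example):.2f}, specificity={specificity(example):.2f}")
```

The hypothetical counts show the pattern the abstract describes: a model that flags malignancy liberally misses few true cases (high sensitivity) but raises many false alarms (limited specificity).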

Conclusion: While ChatGPT-4o demonstrates potential in analyzing laryngeal pathologies through multimodal data, current limitations in specificity and image interpretation indicate the need for further refinement. Ongoing advancements could enhance its integration into clinical workflows, supporting accurate diagnoses and decision-making in otolaryngology.

Source journal

Journal of Voice (Medicine: Otorhinolaryngology)

CiteScore: 4.00

Self-citation rate: 13.60%

Articles published: 395

Review time: 59 days

About the journal: The Journal of Voice is widely regarded as the world's premier journal for voice medicine and research. This peer-reviewed publication is listed in Index Medicus and is indexed by the Institute for Scientific Information. The journal contains articles written by experts throughout the world on all topics in voice sciences, voice medicine and surgery, and speech-language pathologists' management of voice-related problems. The journal includes clinical articles, clinical research, and laboratory research. Members of the Foundation receive the journal as a benefit of membership.