Multimodal large language models as assistance for evaluation of thyroid-associated ophthalmopathy

Impact factor: 7.0 | Zone 2 (Medicine) | JCR Q1 (Biology)
Bo Ram Kim , Joon Yul Choi , Tae Keun Yoo
Journal: Computers in Biology and Medicine, volume 192, Article 110301 (journal article, not open access)
Publication date: 2025-05-01
DOI: 10.1016/j.compbiomed.2025.110301
URL: https://www.sciencedirect.com/science/article/pii/S0010482525006523
Citations: 0

Abstract


This study evaluated the potential of multimodal AI chatbots, specifically ChatGPT-4o, in assessing thyroid-associated ophthalmopathy (TAO) through the Clinical Activity Score (CAS). Using publicly available case reports and datasets, ChatGPT-4o was tasked with generating a web-based CAS calculator and estimating CAS from external ocular photographs. Its predictions were compared with CAS evaluations by ophthalmologists and convolutional neural network (CNN) models, including ResNet50. Receiver operating characteristic (ROC) areas under the curve (AUCs) were calculated for the assessment of active TAO (CAS ≥3). ChatGPT-4o demonstrated high accuracy, with mean absolute errors of 0.39 and 0.45 compared to reference ophthalmologist scores across two datasets, outperforming both Gemini Advanced and ResNet50 in identifying active TAO. In the preoperative and pre-treatment datasets, ChatGPT-4o achieved ROC-AUCs of 0.974 and 0.990, respectively, significantly exceeding the performance of ResNet50 (0.770 and 0.623). Both ChatGPT-4o and Customized GPTs achieved identical results, suggesting robust performance without the need for further customization. The AI chatbot effectively processed both text- and image-based inputs, providing detailed explanations for its CAS estimates and creating a user-friendly calculator for rapid and accessible TAO evaluation. ChatGPT-4o thus can offer a reliable tool for TAO assessment, outperforming traditional CNN-based models. Its ability to generate a CAS calculator without prior training or coding expertise highlights its practical utility for clinical ophthalmology. This study's limitations included a small sample size, lack of real-world validation, reliance on photos without patient metadata, and challenges in repeatability. Future studies should aim to validate its effectiveness in real-world clinical settings.
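The abstract names three quantities: a CAS total, the mean absolute error (MAE) against ophthalmologist scores, and the ROC-AUC for active TAO (CAS ≥3). The sketch below shows one way these could be computed. It assumes the commonly cited 7-item CAS with one point per finding, which the abstract does not spell out. The field names, the helper functions, and the toy numbers are hypothetical illustrations; they are not the calculator generated in the study and not study data.

```python
from dataclasses import dataclass, fields


@dataclass
class CasFindings:
    """Assumed 7-item CAS checklist; each finding present adds one point."""
    spontaneous_retrobulbar_pain: bool = False
    pain_on_gaze: bool = False
    eyelid_redness: bool = False
    conjunctival_redness: bool = False
    eyelid_swelling: bool = False
    caruncle_or_plica_inflammation: bool = False
    chemosis: bool = False


def cas_score(findings: CasFindings) -> int:
    """Count the findings that are present (0 to 7)."""
    return sum(bool(getattr(findings, f.name)) for f in fields(findings))


def is_active(score: int, threshold: int = 3) -> bool:
    """Active TAO as defined in the abstract: CAS >= 3."""
    return score >= threshold


def mean_absolute_error(predicted, reference) -> float:
    """Average absolute difference between predicted and reference scores."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def roc_auc(scores, labels) -> float:
    """ROC-AUC as the probability that a positive case outranks a negative one.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else (0.5 if p == n else 0.0) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example with made-up numbers (not taken from the study).
model_cas = [3, 2, 4, 1]       # hypothetical chatbot estimates
reference_cas = [3, 3, 4, 0]   # hypothetical ophthalmologist scores
active = [is_active(r) for r in reference_cas]

print(mean_absolute_error(model_cas, reference_cas))
print(roc_auc(model_cas, active))
```

The pairwise AUC is quadratic in the number of cases, which is acceptable for small datasets like those described here; a library routine would be the usual choice at scale.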
Source journal
Computers in Biology and Medicine (Engineering and Technology: Biomedical Engineering)
CiteScore: 11.70
Self-citation rate: 10.40%
Articles published: 1086
Review time: 74 days
About the journal: Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.