Multimodal large language models as assistance for evaluation of thyroid-associated ophthalmopathy
Authors: Bo Ram Kim, Joon Yul Choi, Tae Keun Yoo
Journal: Computers in Biology and Medicine, volume 192, Article 110301 (impact factor 7.0; JCR Q1, Biology)
Publication date: 2025-05-01
Publication type: Journal Article
DOI: 10.1016/j.compbiomed.2025.110301
URL: https://www.sciencedirect.com/science/article/pii/S0010482525006523
This study evaluated the potential of multimodal AI chatbots, specifically ChatGPT-4o, in assessing thyroid-associated ophthalmopathy (TAO) through the Clinical Activity Score (CAS). Using publicly available case reports and datasets, ChatGPT-4o was tasked with generating a web-based CAS calculator and estimating CAS from external ocular photographs. Its predictions were compared with CAS evaluations by ophthalmologists and with convolutional neural network (CNN) models, including ResNet50. Areas under the receiver operating characteristic curve (ROC-AUCs) were calculated for the detection of active TAO (CAS ≥3). ChatGPT-4o demonstrated high accuracy, with mean absolute errors of 0.39 and 0.45 relative to reference ophthalmologist scores across two datasets, and it outperformed both Gemini Advanced and ResNet50 in identifying active TAO. In the preoperative and pre-treatment datasets, ChatGPT-4o achieved ROC-AUCs of 0.974 and 0.990, respectively, significantly exceeding the performance of ResNet50 (0.770 and 0.623). ChatGPT-4o and customized GPTs achieved identical results, suggesting robust performance without the need for further customization. The AI chatbot effectively processed both text- and image-based inputs, providing detailed explanations for its CAS estimates and creating a user-friendly calculator for rapid and accessible TAO evaluation. ChatGPT-4o can thus offer a reliable tool for TAO assessment, outperforming traditional CNN-based models. Its ability to generate a CAS calculator without prior training or coding expertise highlights its practical utility for clinical ophthalmology. The study's limitations include a small sample size, lack of real-world validation, reliance on photos without patient metadata, and challenges in repeatability. Future studies should aim to validate ChatGPT-4o's effectiveness in real-world clinical settings.
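The quantities named in the abstract (a CAS total, the CAS ≥3 cutoff for active TAO, mean absolute error, and ROC-AUC) can be illustrated with a minimal Python sketch. This is not the calculator generated in the study. The item list assumes the widely used 7-item CAS (one point per finding), which the abstract does not spell out, and all function names and sample values are hypothetical.

```python
from typing import Mapping, Sequence

# Assumption: the widely used 7-item CAS, one point per positive finding.
# The abstract does not list the items, so these names are illustrative.
CAS_ITEMS = (
    "spontaneous_retrobulbar_pain",
    "pain_on_gaze",
    "eyelid_redness",
    "conjunctival_redness",
    "eyelid_swelling",
    "caruncle_or_plica_inflammation",
    "chemosis",
)

ACTIVE_THRESHOLD = 3  # the abstract defines active TAO as CAS >= 3


def cas_score(findings: Mapping[str, bool]) -> int:
    """Sum one point for each CAS item marked present; missing items count as absent."""
    unknown = set(findings) - set(CAS_ITEMS)
    if unknown:
        raise ValueError(f"unknown CAS items: {sorted(unknown)}")
    return sum(1 for item in CAS_ITEMS if findings.get(item, False))


def is_active(score: int) -> bool:
    """Apply the CAS >= 3 cutoff for active disease."""
    return score >= ACTIVE_THRESHOLD


def mean_absolute_error(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Average absolute difference between predicted and reference scores."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def roc_auc(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (the Mann-Whitney formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical values for illustration only; not data from the study.
    example = {"eyelid_redness": True, "chemosis": True, "eyelid_swelling": True}
    score = cas_score(example)
    print(score, is_active(score))

    model_cas = [3, 2, 5, 1]
    reference_cas = [3, 3, 4, 1]
    reference_active = [is_active(c) for c in reference_cas]
    print(mean_absolute_error(model_cas, reference_cas))
    print(roc_auc(model_cas, reference_active))
```

Here the predicted CAS itself serves as the ranking score for the ROC-AUC, with the reference label being whether the ophthalmologist score reaches the cutoff. That is one plausible reading of the abstract rather than a confirmed detail of the study's method.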
About the journal:
Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.