Interpretation of Clinical Retinal Images Using an Artificial Intelligence Chatbot

IF 3.2 Q1 OPHTHALMOLOGY

Ophthalmology Science, 4(6), Article 100556 · Pub Date: 2024-05-23 · DOI: 10.1016/j.xops.2024.100556

Full text: https://www.sciencedirect.com/science/article/pii/S2666914524000927


Abstract

Purpose

To assess the performance of Chat Generative Pre-Trained Transformer-4 (GPT-4) in providing accurate diagnoses for retina teaching cases from OCTCases.

Design

Cross-sectional study.

Subjects

Retina teaching cases from OCTCases.

Methods

We prompted a custom chatbot with 69 retina cases containing multimodal ophthalmic images, asking it to provide the most likely diagnosis. In a sensitivity analysis, we input increasing amounts of clinical information for each case until the chatbot reached a correct diagnosis. We performed multivariable logistic regressions in Stata v17.0 (StataCorp LLC) to investigate associations between the amount of text-based information provided per prompt and the odds of the chatbot reaching a correct diagnosis, adjusting for case laterality, the number of ophthalmic images provided, and imaging modalities.
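The stepwise sensitivity analysis described above can be sketched roughly as follows. This is a minimal illustration and not the authors' procedure: the function names, the stub chatbot, the example text levels, and the substring-based grading rule are all hypothetical, and the ophthalmic images that accompanied each prompt are omitted.

```python
from typing import Callable, Optional, Sequence


def first_correct_level(
    levels: Sequence[str],
    reference_diagnosis: str,
    ask: Callable[[str], str],
) -> Optional[int]:
    """Return the 1-based index of the first information level at which
    the answer contains the reference diagnosis, or None if none does.

    `levels` holds cumulative text prompts, from least to most clinical
    information. Substring matching is an assumed grading rule.
    """
    for index, text in enumerate(levels, start=1):
        answer = ask(text)
        if reference_diagnosis.lower() in answer.lower():
            return index
    return None


def stub_chatbot(text: str) -> str:
    # Hypothetical stand-in for a real chatbot call; it only "recognizes"
    # the case once a specific symptom appears in the prompt.
    if "metamorphopsia" in text:
        return "Most likely diagnosis: epiretinal membrane"
    return "Most likely diagnosis: uncertain"


# Invented example levels: age and sex first, full description last.
example_levels = [
    "65-year-old woman.",
    "65-year-old woman. No relevant ocular history.",
    "65-year-old woman. No relevant ocular history. "
    "Reports metamorphopsia in the right eye.",
]

level = first_correct_level(example_levels, "epiretinal membrane", stub_chatbot)
print(level)
```

Recording the level at which each case is first answered correctly gives the per-case "amount of text-based information" that the regression then relates to the odds of a correct diagnosis.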

Main Outcome Measures

Our primary outcome was the proportion of cases for which the chatbot was able to provide a correct diagnosis. Our secondary outcome was the chatbot’s performance in relation to the amount of text-based information accompanying ophthalmic images.

Results

Across 69 retina cases collectively containing 139 ophthalmic images, the chatbot provided a definitive, correct diagnosis for 35 (50.7%) cases. The amount of clinical information the chatbot needed to reach a correct diagnosis varied; for most correctly diagnosed cases (23 of 35 cases, 65.7%), it required the entire patient description as presented by OCTCases. Compared with prompts containing only a patient's age and sex, prompts containing the entire patient description were associated with higher odds of a correct diagnosis (odds ratio = 10.1, 95% confidence interval = 3.3–30.3, P < 0.01). Of the 34 (49.3%) cases it diagnosed incorrectly, the chatbot listed the correct diagnosis within its differential diagnosis for 7 (20.6%).
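The reported percentages and the odds ratio interval can be checked with a few lines of arithmetic. The sketch below uses only the counts and estimates quoted above. The second half assumes the interval is a Wald interval, symmetric on the log-odds scale, which is the usual form for logistic regression output but is not stated in the abstract. All variable names are illustrative.

```python
import math


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


total_cases = 69
correct = 35
incorrect = total_cases - correct   # 34
needed_full_description = 23        # among the 35 correct cases
in_differential = 7                 # among the 34 incorrect cases

proportions = {
    "correct": pct(correct, total_cases),
    "incorrect": pct(incorrect, total_cases),
    "needed_full_description": pct(needed_full_description, correct),
    "in_differential": pct(in_differential, incorrect),
}
print(proportions)

# Consistency check on the odds ratio, ASSUMING a Wald 95% interval:
# log(OR) +/- 1.96 * SE, so the bounds are symmetric around log(OR).
odds_ratio, ci_low, ci_high = 10.1, 3.3, 30.3
z_crit = 1.959964

# Geometric midpoint of the bounds should land close to the point estimate.
midpoint_or = math.exp((math.log(ci_low) + math.log(ci_high)) / 2)

# Standard error implied by the width of the interval on the log scale.
se_log_or = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)

# Two-sided p-value from the implied z statistic (normal approximation).
z_stat = math.log(odds_ratio) / se_log_or
p_value = math.erfc(z_stat / math.sqrt(2))

print(round(midpoint_or, 1), round(se_log_or, 2), p_value < 0.01)
```

Under that assumption, the midpoint implied by the bounds sits within rounding distance of the reported odds ratio, and the implied p-value is well below 0.01, so the quoted figures are internally consistent.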

Conclusions

This custom chatbot accurately diagnosed approximately half of the retina cases requiring multimodal input, although it relied heavily on the text-based contextual information that accompanied the ophthalmic images. The chatbot's ability to interpret multimodal imaging without text-based information is currently limited. Given bioethical concerns, appropriate use of the chatbot in this setting is of utmost importance.

Financial Disclosure(s)

Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
