Assessing the diagnostic accuracy of ChatGPT-4 in the histopathological evaluation of liver fibrosis in MASH.

IF 5.6 | Zone 2, Medicine | JCR Q1, Gastroenterology & Hepatology
Hepatology Communications | Pub Date: 2025-04-30 | eCollection Date: 2025-05-01 | DOI: 10.1097/HC9.0000000000000695
Davide Panzeri, Thiyaphat Laohawetwanit, Reha Akpinar, Camilla De Carlo, Vincenzo Belsito, Luigi Terracciano, Alessio Aghemo, Nicola Pugliese, Giuseppe Chirico, Donato Inverso, Julien Calderaro, Laura Sironi, Luca Di Tommaso
Citations: 0

Abstract

Background: Large language models like ChatGPT have demonstrated potential in medical image interpretation, but their efficacy in liver histopathological analysis remains largely unexplored. This study aims to assess ChatGPT-4-vision's diagnostic accuracy, compared to liver pathologists' performance, in evaluating liver fibrosis (stage) in metabolic dysfunction-associated steatohepatitis.

Methods: Digitized Sirius Red-stained images of 59 metabolic dysfunction-associated steatohepatitis tissue biopsy specimens were evaluated by ChatGPT-4 and 4 pathologists using the NASH-CRN staging system. Fields of view at increasing magnification levels, either extracted by a senior pathologist or randomly selected, were shown to ChatGPT-4, which was asked to stage the fibrosis. The diagnostic accuracy of ChatGPT-4 was compared with the pathologists' evaluations and correlated with the collagen proportionate area for additional insights. All cases were further analyzed with an in-context learning approach, in which the model learns from exemplary images provided during prompting.
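The two quantities named in the Methods, agreement with a reference staging and correlation with collagen proportionate area, can be illustrated with a minimal sketch. The abstract does not say which correlation coefficient was used, so Spearman rank correlation is an assumption here, as are the function names and the toy values, which are not data from the study.

```python
from statistics import mean


def accuracy(predicted, reference):
    """Fraction of cases where the predicted stage equals the reference stage."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    return sum(p == r for p, r in zip(predicted, reference)) / len(reference)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy values for illustration only (NOT from the study).
reference_stage = [0, 1, 2, 3, 4, 4]   # assumed reference NASH-CRN stages
model_stage = [0, 1, 3, 3, 4, 3]       # assumed model-assigned stages
cpa_percent = [1.2, 2.5, 6.0, 9.1, 18.4, 15.0]  # assumed CPA values

print("accuracy:", accuracy(model_stage, reference_stage))
print("spearman:", spearman(model_stage, cpa_percent))
```

A rank-based coefficient is used in this sketch because fibrosis stage is ordinal while collagen proportionate area is continuous; the study itself may have used a different statistic.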

Results: ChatGPT-4's diagnostic accuracy was 81% when using images selected by a pathologist, and it decreased to 54% with randomly cropped fields of view. With the in-context learning approach, accuracy increased to 88% and 77% for selected and random fields of view, respectively. This method enabled the model to fully and correctly identify the tissue structures characteristic of stage F4, which had previously been misclassified. The study also highlighted a moderate to strong correlation between ChatGPT-4's fibrosis staging and collagen proportionate area.
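The size of the in-context learning effect can be read off the reported percentages with simple arithmetic. The sketch below uses only the four accuracies stated above; the dictionary layout and function names are illustrative assumptions.

```python
# Accuracies in percent, as reported in the Results paragraph.
reported = {
    "selected": {"baseline": 81, "in_context": 88},
    "random": {"baseline": 54, "in_context": 77},
}


def gain(field_type):
    """Percentage-point gain from in-context learning for one field-of-view type."""
    return reported[field_type]["in_context"] - reported[field_type]["baseline"]


def gap(setting):
    """Percentage-point gap between selected and random fields of view."""
    return reported["selected"][setting] - reported["random"][setting]


for field_type in reported:
    print(field_type, "gain (percentage points):", gain(field_type))
for setting in ("baseline", "in_context"):
    print(setting, "selected-vs-random gap (percentage points):", gap(setting))
```

Read this way, the gain is larger for randomly cropped fields of view (23 points) than for pathologist-selected ones (7 points), and the gap between the two narrows from 27 to 11 points.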

Conclusions: ChatGPT-4 showed remarkable results, with a diagnostic accuracy overlapping that of expert liver pathologists. The in-context learning analysis, applied here for the first time to assess fibrosis deposition in metabolic dysfunction-associated steatohepatitis samples, was crucial in accurately identifying the key features of F4 cases, which are critical for early therapeutic decision-making. These findings suggest the potential for integrating large language models as supportive tools in diagnostic pathology.

Source journal: Hepatology Communications (Gastroenterology & Hepatology)
CiteScore: 8.00
Self-citation rate: 2.00%
Publication volume: 248
Review time: 8 weeks
About the journal: Hepatology Communications is a peer-reviewed, online-only, open access journal for fast dissemination of high-quality basic, translational, and clinical research in hepatology. Hepatology Communications maintains high standards and rigorous peer review. Because it is open access, authors retain the copyright to their works, all articles are immediately available and free to read and share, and the journal is fully compliant with funder and institutional mandates. The journal is committed to fast publication and author satisfaction.