IF 7.6 · Tier 1 (Medicine) · JCR Q1 (HEALTH CARE SCIENCES & SERVICES)
Junyi Shen, Anqi Lin, Ting Wei, Jian Zhang, Peng Luo
The Lancet Regional Health: Western Pacific, volume 55, Article 101352. Published 2025-02-01. Journal Article.
DOI: 10.1016/j.lanwpc.2024.101352
https://www.sciencedirect.com/science/article/pii/S2666606524003468
Citations: 0

Abstract

Evaluating generative AI models for explainable pathological feature extraction in lung adenocarcinoma: grading assessment and prognostic model construction

Background

With the widespread application of generative AI (GenAI) models, it is crucial to systematically evaluate their performance in lung adenocarcinoma histopathological assessment. This study aimed to evaluate and compare the performance of three GenAI models with visual capabilities (GPT-4o, Claude-3.5-Sonnet, and Gemini-1.5-Pro) in lung adenocarcinoma histological pattern recognition and grading, and to explore the construction of prognostic prediction models based on GenAI feature extraction.

Methods

This retrospective study extracted 310 diagnostic slides from the TCGA-LUAD database for model evaluation. An additional 87 diagnostic pathology slides from local lung adenocarcinoma surgical patients were used for external validation of the prognostic model. Primary outcomes were GenAI grading accuracy and stability, measured by the area under the receiver operating characteristic curve (AUC) and the intraclass correlation coefficient (ICC), respectively. Secondary outcomes included the construction and assessment of machine learning-based prognostic prediction models using features extracted by GenAI, with model performance evaluated using the concordance index (C-index).
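
For readers unfamiliar with the C-index named above, the following is a minimal sketch of Harrell's concordance index in plain Python. It is not the authors' implementation: the function name, argument names, and the toy data are hypothetical, and pairs with tied follow-up times are simply skipped, which is a simplification.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable patient pairs ranked correctly.

    times:       follow-up time per patient
    events:      1 if the event (e.g. death) was observed, 0 if censored
    risk_scores: model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed event.
        # Tied times are skipped here (a simplification of this sketch).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event had the higher risk: correct
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # a tie in predicted risk counts as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the second one censored.
toy_c = concordance_index([5, 10, 12, 20], [1, 0, 1, 1], [0.9, 0.4, 0.1, 0.2])
print(toy_c)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which a reported C-index should be read.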

Findings

Claude-3.5-Sonnet demonstrated the best overall performance, combining high grading accuracy (average AUC = 0.82) with moderate stability (ICC = 0.59). The optimal machine learning-based prognostic model, constructed using features extracted by Claude-3.5-Sonnet and incorporating clinical variables, showed good performance in both internal and external validation, with an average C-index of 0.72. Meta-analysis demonstrated that this prognostic model effectively stratified patients into risk groups, with the high-risk group showing significantly worse outcomes (hazard ratio = 6.44, 95% confidence interval = 3.42-12.14).
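
The reported hazard ratio and its interval can be sanity-checked with a short calculation. The sketch below assumes the interval is a Wald-type interval, symmetric on the log scale, which the abstract does not state. Under that assumption the point estimate should sit near the geometric mean of the bounds, and the width of the interval gives the implied standard error of log(HR).

```python
import math

# Values reported in the abstract.
hr, lower, upper = 6.44, 3.42, 12.14

# If the interval is symmetric on the log scale (an assumption),
# the point estimate should be close to the geometric mean of the bounds.
geometric_mean = math.sqrt(lower * upper)

# Implied standard error of log(HR) for a 95% interval (z = 1.96).
se_log_hr = (math.log(upper) - math.log(lower)) / (2 * 1.96)

print(round(geometric_mean, 2), round(se_log_hr, 3))
```

The geometric mean of the bounds rounds to the reported estimate of 6.44, so the three numbers are internally consistent under this assumption.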

Interpretation

This study demonstrates the potential application value of GenAI models in lung adenocarcinoma histopathological assessment. Claude-3.5-Sonnet demonstrated the highest grading accuracy, and the machine learning-based prognostic model that utilized its feature extraction showed good predictive capabilities. These findings provide new research directions for AI-assisted pathological diagnosis and prognostic prediction, with the potential to improve the management of lung adenocarcinoma patients.
Source journal
The Lancet Regional Health: Western Pacific
Medicine-Pediatrics, Perinatology and Child Health
CiteScore: 8.80
Self-citation rate: 2.80%
Articles published: 305
Review time: 11 weeks
About the journal: The Lancet Regional Health – Western Pacific, a gold open access journal, is an integral part of The Lancet's global initiative advocating for healthcare quality and access worldwide. It aims to advance clinical practice and health policy in the Western Pacific region, contributing to enhanced health outcomes. The journal publishes high-quality original research shedding light on clinical practice and health policy in the region. It also includes reviews, commentaries, and opinion pieces covering diverse regional health topics, such as infectious diseases, non-communicable diseases, child and adolescent health, maternal and reproductive health, aging health, mental health, the health workforce and systems, and health policy.