GPT-4o and Specialized AI in Breast Ultrasound Imaging: A Comparative Study on Accuracy, Agreement, Limitations, and Diagnostic Potential

IF 2.1 | Zone 4, Medicine | JCR Q2, ACOUSTICS
Deniz Esin Tekcan Sanli, Ahmet Necati Sanli, Yildiz Buyukdereli Atadag, Atakan Kurt, Emel Esmerer
Journal of Ultrasound in Medicine. Publication date: 2025-06-23. Publication type: Journal Article. DOI: 10.1002/jum.16749 (https://doi.org/10.1002/jum.16749)
Citations: 0

Abstract


Objectives: This study aimed to evaluate the ability of ChatGPT and Breast Ultrasound Helper, a specialized ChatGPT-based subprogram trained on ultrasound image analysis, to analyze breast lesions on ultrasound images and to differentiate benign from malignant lesions.

Methods: Ultrasound images from patients with histopathologically confirmed breast cancer or fibroadenoma were read by GPT-4o (the latest ChatGPT version) and by Breast Ultrasound Helper (BUH), a tool from the "Explore" section of ChatGPT. Both were prompted in English using ACR BI-RADS Breast Ultrasound Lexicon criteria: lesion shape, orientation, margin, internal echo pattern, echogenicity, posterior acoustic features, microcalcifications or hyperechoic foci, perilesional hyperechoic rim, edema or architectural distortion, lesion size, and BI-RADS category. Two experienced radiologists evaluated the images and the responses of the programs in consensus. The outputs, BI-RADS category agreement, and benign/malignant discrimination were statistically compared.
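
As an illustration only, the feature list above can be pictured as one structured record per reader per image, which makes a feature-by-feature comparison against the radiologists' consensus straightforward. The sketch below is in Python; the `LesionReading` class, its field names, the `matching_features` helper, and the example values are all hypothetical and are not the authors' prompt, data format, or results.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class LesionReading:
    """One reader's description of one image (hypothetical layout).

    Field names mirror the criteria listed in the Methods paragraph.
    """
    shape: str
    orientation: str
    margin: str
    echo_pattern: str
    echogenicity: str
    posterior_features: str
    calcifications_or_foci: str
    hyperechoic_rim: str
    edema_or_distortion: str
    size_mm: float
    birads_category: str


def matching_features(reference: LesionReading, candidate: LesionReading) -> dict[str, bool]:
    """Map each field name to True when both readings give the same value."""
    return {
        f.name: getattr(reference, f.name) == getattr(candidate, f.name)
        for f in fields(LesionReading)
    }


# Made-up example values (NOT study data).
consensus = LesionReading(
    shape="irregular", orientation="not parallel", margin="spiculated",
    echo_pattern="heterogeneous", echogenicity="hypoechoic",
    posterior_features="shadowing", calcifications_or_foci="absent",
    hyperechoic_rim="absent", edema_or_distortion="absent",
    size_mm=14.0, birads_category="5",
)
ai_output = LesionReading(
    shape="irregular", orientation="not parallel", margin="indistinct",
    echo_pattern="heterogeneous", echogenicity="hypoechoic",
    posterior_features="shadowing", calcifications_or_foci="absent",
    hyperechoic_rim="absent", edema_or_distortion="absent",
    size_mm=14.0, birads_category="4",
)

result = matching_features(consensus, ai_output)
disagreements = [name for name, same in result.items() if not same]
print(disagreements)
```

Collecting such per-feature matches over all images is what would feed a per-feature agreement statistic such as kappa.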

Results: A total of 232 ultrasound images were analyzed, of which 133 (57.3%) were malignant and 99 (42.7%) were benign. In comparative analysis, BUH performed better overall, with higher kappa values and statistically significant results across multiple features (P < .001). However, the range of agreement with the radiologists' consensus across all features was similar for BUH (κ: 0.387-0.755) and GPT-4o (κ: 0.317-0.803). BI-RADS category agreement was slightly higher for GPT-4o than for BUH (69.4% versus 65.9%), whereas BUH was slightly more successful at distinguishing benign from malignant lesions (GPT-4o 65.9% versus BUH 67.7%).
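
For readers unfamiliar with the statistics quoted above, the sketch below checks the case-mix percentages and shows how an unweighted Cohen's kappa is computed from two sets of labels. It assumes the unweighted form; the abstract does not say which kappa variant the authors used. The helper names (`cohen_kappa`, `percent`) and the toy labels are hypothetical and are not study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: derived from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1.0 - p_e)


def percent(part, total):
    """Percentage of total, rounded to one decimal place."""
    return round(100.0 * part / total, 1)


# Case mix reported in the abstract.
malignant, benign = 133, 99
total = malignant + benign
print(total, percent(malignant, total), percent(benign, total))

# Made-up toy labels (NOT study data), only to exercise cohen_kappa.
radiologists = ["oval", "oval", "irregular", "irregular", "round", "oval"]
ai_tool = ["oval", "irregular", "irregular", "irregular", "round", "oval"]
print(round(cohen_kappa(radiologists, ai_tool), 3))
```

Kappa discounts the agreement expected by chance, which is why a tool can match the radiologists on a majority of images and still show only moderate kappa for a feature.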

Conclusions: Although both AI tools show moderate-to-good performance in ultrasound image analysis, their limited agreement with radiologists' evaluations and BI-RADS categorization suggests that their clinical use in breast ultrasound interpretation is still at an early stage and not yet reliable.

Source journal
CiteScore: 5.10
Self-citation rate: 4.30%
Articles published: 205
Review time: 1.5 months
About the journal: The Journal of Ultrasound in Medicine (JUM) is dedicated to the rapid, accurate publication of original articles dealing with all aspects of medical ultrasound, particularly its direct application to patient care but also relevant basic science, advances in instrumentation, and biological effects. The journal is an official publication of the American Institute of Ultrasound in Medicine and publishes articles in a variety of categories, including Original Research papers, Review Articles, Pictorial Essays, Technical Innovations, Case Series, Letters to the Editor, and more, from an international bevy of countries in a continual effort to showcase and promote advances in the ultrasound community. Represented through these efforts are a wide variety of disciplines of ultrasound, including, but not limited to: Basic Science; Breast Ultrasound; Contrast-Enhanced Ultrasound; Dermatology; Echocardiography; Elastography; Emergency Medicine; Fetal Echocardiography; Gastrointestinal Ultrasound; General and Abdominal Ultrasound; Genitourinary Ultrasound; Gynecologic Ultrasound; Head and Neck Ultrasound; High Frequency Clinical and Preclinical Imaging; Interventional-Intraoperative Ultrasound; Musculoskeletal Ultrasound; Neurosonology; Obstetric Ultrasound; Ophthalmologic Ultrasound; Pediatric Ultrasound; Point-of-Care Ultrasound; Public Policy; Superficial Structures; Therapeutic Ultrasound; Ultrasound Education; Ultrasound in Global Health; Urologic Ultrasound; Vascular Ultrasound.