Radiology-Artificial Intelligence: Latest Publications

Estimating Total Lung Volume from Pixel-Level Thickness Maps of Chest Radiographs Using Deep Learning.
IF 13.2
Radiology-Artificial Intelligence Pub Date: 2025-07-01 DOI: 10.1148/ryai.240484
Authors: Tina Dorosti, Manuel Schultheiß, Philipp Schmette, Jule Heuchert, Johannes Thalhammer, Florian T Gassert, Thorsten Sellerer, Rafael Schick, Kirsten Taphorn, Korbinian Mechlem, Lorenz Birnbacher, Florian Schaff, Franz Pfeiffer, Daniela Pfeiffer
Abstract: Purpose To estimate the total lung volume (TLV) from real and synthetic frontal chest radiographs on a pixel level using lung thickness maps generated by a U-Net deep learning model. Materials and Methods This retrospective study included 5959 chest CT scans from two public datasets: the Lung Nodule Analysis 2016 (Luna16) dataset (n = 656) and the Radiological Society of North America Pulmonary Embolism Detection Challenge 2020 dataset (n = 5303). Additionally, 72 participants were selected from the Klinikum Rechts der Isar (KRI) dataset (October 2018 through December 2019), each with a corresponding chest radiograph obtained within 7 days. Synthetic radiographs and lung thickness maps were generated by forward projection of the CT scans and their lung segmentations. A U-Net model was trained on synthetic radiographs to predict lung thickness maps and estimate TLV. Model performance was assessed using mean squared error (MSE), the Pearson correlation coefficient, and two-sided Student t tests. Results The study included 72 participants (45 male and 27 female; 33 healthy participants: mean age, 62 years [range, 34-80 years]; 39 with chronic obstructive pulmonary disease: mean age, 69 years [range, 47-91 years]). TLV predictions showed low error rates (MSE_Public-Synthetic, 0.16 L²; MSE_KRI-Synthetic, 0.20 L²; MSE_KRI-Real, 0.35 L²) and strong correlations with CT-derived reference standard TLV (n_Public-Synthetic = 1191, r = 0.99, P < .001; n_KRI-Synthetic = 72, r = 0.97, P < .001; n_KRI-Real = 72, r = 0.91, P < .001). When evaluated across datasets, the U-Net model achieved the highest TLV estimation performance on the Luna16 test dataset, with the lowest MSE (0.09 L²) and strongest correlation (r = 0.99; P < .001) relative to CT-derived TLV. Conclusion The U-Net-generated pixel-level lung thickness maps successfully estimated TLV for both synthetic and real radiographs. Keywords: Frontal Chest Radiographs, Lung Thickness Map, Pixel-Level, Total Lung Volume, U-Net. Supplemental material is available for this article. © RSNA, 2025.
Citations: 0
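The final quantitative step described here, turning a predicted per-pixel thickness map into a volume, reduces to integrating thickness over the image area. The minimal Python sketch below illustrates that step under stated assumptions; the function name, pixel spacing, and dummy thickness map are illustrative, not the authors' code.

```python
import numpy as np

def total_lung_volume(thickness_map_cm: np.ndarray,
                      pixel_spacing_cm: tuple[float, float]) -> float:
    """Integrate a per-pixel lung thickness map (cm) into a volume in liters.

    Each pixel contributes thickness x pixel area; summing over the image
    yields the total lung volume. Assumed post-processing for a U-Net that
    outputs thickness in centimeters.
    """
    pixel_area_cm2 = pixel_spacing_cm[0] * pixel_spacing_cm[1]
    volume_cm3 = float(thickness_map_cm.sum()) * pixel_area_cm2
    return volume_cm3 / 1000.0  # 1 L = 1000 cm³

# Dummy example: a 512 x 512 map with thickness values up to 10 cm
# and an assumed 0.7 mm pixel spacing.
rng = np.random.default_rng(0)
dummy_map = rng.uniform(0.0, 10.0, size=(512, 512))
print(f"TLV ≈ {total_lung_volume(dummy_map, (0.07, 0.07)):.2f} L")
```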
Artificial Intelligence in Breast US Diagnosis and Report Generation.
IF 13.2
Radiology-Artificial Intelligence Pub Date: 2025-07-01 DOI: 10.1148/ryai.240625
Authors: Jian Wang, HongTian Tian, Xin Yang, HuaiYu Wu, XiLiang Zhu, RuSi Chen, Ao Chang, YanLin Chen, HaoRan Dou, RuoBing Huang, Jun Cheng, YongSong Zhou, Rui Gao, KeEn Yang, GuoQiu Li, Jing Chen, Dong Ni, JinFeng Xu, Ning Gu, FaJin Dong
Abstract: Purpose To develop and evaluate an artificial intelligence (AI) system for generating breast US reports. Materials and Methods This retrospective study included 104 364 cases from three hospitals (January 2020-December 2022). The AI system was trained on 82 896 cases, validated on 10 385 cases, and tested on an internal set (10 383 cases) and two external sets (300 and 400 cases). Under blind review, three senior radiologists (each with >10 years of experience) evaluated AI-generated reports and those written by one midlevel radiologist (7 years of experience), as well as reports from three junior radiologists (each with 2-3 years of experience) with and without AI assistance. The primary outcomes were the acceptance rates of Breast Imaging Reporting and Data System (BI-RADS) categories and lesion characteristics. Statistical analysis included one-sided and two-sided McNemar tests for noninferiority and significance testing. Results In external test set 1 (300 cases), the midlevel radiologist and the AI system achieved BI-RADS acceptance rates of 95.00% (285 of 300) and 92.33% (277 of 300), respectively (P < .001, noninferiority test with a prespecified margin of 10%). In external test set 2 (400 cases), the three junior radiologists had BI-RADS acceptance rates of 87.00% (348 of 400) versus 90.75% (363 of 400) (P = .06), 86.50% (346 of 400) versus 92.00% (368 of 400) (P = .007), and 84.75% (339 of 400) versus 90.25% (361 of 400) (P = .02) without and with AI assistance, respectively. Conclusion The AI system performed comparably to a midlevel radiologist and aided junior radiologists in BI-RADS classification. Keywords: Neural Networks, Computer-aided Diagnosis, CAD, Ultrasound. Supplemental material is available for this article. © RSNA, 2025.
Citations: 0
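The paired design above (the same cases read with and without AI assistance) is the textbook setting for McNemar's test, which uses only the discordant pairs. A small sketch with simulated acceptance outcomes follows; the data, rates, and variable names are invented for illustration and are not the study's pipeline.

```python
import numpy as np
from statsmodels.stats.contingency_tables import mcnemar

rng = np.random.default_rng(1)
n = 400
# Simulated paired outcomes: 1 = report accepted by the senior reviewers.
without_ai = rng.binomial(1, 0.86, size=n)
# With AI assistance: some rejected reports become acceptable, a few regress.
with_ai = np.where(without_ai == 0,
                   rng.binomial(1, 0.40, size=n),       # gains
                   1 - rng.binomial(1, 0.03, size=n))   # losses

# Paired 2 x 2 table; McNemar's test uses only the off-diagonal cells.
table = np.array([
    [np.sum((without_ai == 1) & (with_ai == 1)),
     np.sum((without_ai == 1) & (with_ai == 0))],
    [np.sum((without_ai == 0) & (with_ai == 1)),
     np.sum((without_ai == 0) & (with_ai == 0))],
])
result = mcnemar(table, exact=True)
print(f"discordant pairs {table[0, 1]} vs {table[1, 0]}; P = {result.pvalue:.4f}")
```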
The Duke Lung Cancer Screening (DLCS) Dataset: A Reference Dataset of Annotated Low-Dose Screening Thoracic CT.
IF 13.2
Radiology-Artificial Intelligence Pub Date: 2025-07-01 DOI: 10.1148/ryai.240248
Authors: Avivah J Wang, Fakrul Islam Tushar, Michael R Harowicz, Betty C Tong, Kyle J Lafata, Tina D Tailor, Joseph Y Lo
Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12319698/pdf/
Citations: 0
Retrieval-Augmented Generation with Large Language Models in Radiology: From Theory to Practice.
IF 13.2
Radiology-Artificial Intelligence Pub Date: 2025-07-01 DOI: 10.1148/ryai.240790
Authors: Anna Fink, Alexander Rau, Marco Reisert, Fabian Bamberg, Maximilian F Russe
Abstract: Large language models (LLMs) hold substantial promise for addressing the growing workload in radiology, but recent studies also reveal limitations, such as hallucinations and opacity about the sources of LLM responses. Retrieval-augmented generation (RAG)-based LLMs offer a promising approach to streamlining radiology workflows by integrating reliable, verifiable, and customizable information. Ongoing refinement is critical to enable RAG models to manage large amounts of input data and to engage in complex multiagent dialogues. This report provides an overview of recent advances in LLM architecture, including few-shot and zero-shot learning, RAG integration, multistep reasoning, and agentic RAG, and identifies future research directions. Exemplary cases demonstrate the practical application of these techniques in radiology practice. Keywords: Artificial Intelligence, Deep Learning, Natural Language Processing, Tomography, x-Ray. © RSNA, 2025.
Citations: 0
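The retrieve-then-generate pattern this report surveys can be sketched in a few lines: retrieve the most relevant passage from a trusted local corpus, then ground the LLM prompt in it. The sketch below uses TF-IDF retrieval as a simple stand-in for embedding-based search; the guideline snippets and question are invented for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy snippets standing in for a trusted local guideline corpus (invented).
docs = [
    "Fleischner 2017: a solid 6-8 mm nodule in a low-risk patient warrants CT at 6-12 months.",
    "BI-RADS 4 lesions warrant tissue diagnosis.",
    "Lung-RADS 2: benign appearance or behavior; continue annual screening.",
]
question = "How should a 7 mm solid incidental pulmonary nodule be followed up?"

# Retrieve: rank corpus passages by similarity to the question.
vectorizer = TfidfVectorizer().fit(docs + [question])
similarity = cosine_similarity(vectorizer.transform([question]),
                               vectorizer.transform(docs))
best_passage = docs[similarity.argmax()]

# Augment: ground the generation prompt in the retrieved, citable passage.
prompt = (
    "Answer using only the context below and cite it verbatim.\n"
    f"Context: {best_passage}\n"
    f"Question: {question}"
)
print(prompt)  # this grounded prompt is what would be sent to the LLM
```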
The BraTS-Africa Dataset: Expanding the Brain Tumor Segmentation Data to Capture African Populations.
IF 13.2
Radiology-Artificial Intelligence Pub Date: 2025-07-01 DOI: 10.1148/ryai.240528
Authors: Maruf Adewole, Jeffrey D Rudie, Anu Gbadamosi, Dong Zhang, Confidence Raymond, James Ajigbotoshso, Oluyemisi Toyobo, Kenneth Aguh, Olubukola Omidiji, Rachel Akinola, Mohammad Abba Suwaid, Adaobi Emegoakor, Nancy Ojo, Chinasa Kalaiwo, Gabriel Babatunde, Afolabi Ogunleye, Yewande Gbadamosi, Kator Iorpagher, Mayomi Onuwaje, Bamidele Betiku, Jasmine Cakmak, Björn Menze, Ujjwal Baid, Spyridon Bakas, Farouk Dako, Abiodun Fatade, Udunna C Anazodo
Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12319694/pdf/
Citations: 0
Impact of Scanner Manufacturer, Endorectal Coil Use, and Clinical Variables on Deep Learning-assisted Prostate Cancer Classification Using Multiparametric MRI.
IF 8.1
Radiology-Artificial Intelligence Pub Date: 2025-05-01 DOI: 10.1148/ryai.230555
Authors: José Guilherme de Almeida, Nuno M Rodrigues, Ana Sofia Castro Verde, Ana Mascarenhas Gaivão, Carlos Bilreiro, Inês Santiago, Joana Ip, Sara Belião, Celso Matos, Sara Silva, Manolis Tsiknakis, Kostantinos Marias, Daniele Regge, Nikolaos Papanikolaou
Abstract: Purpose To assess the effect of scanner manufacturer and scanning protocol on the performance of deep learning models for classifying the aggressiveness of prostate cancer (PCa) at biparametric MRI (bpMRI). Materials and Methods In this retrospective study, 5478 cases from ProstateNet, a PCa bpMRI dataset with examinations from 13 centers, were used to develop five deep learning (DL) models to predict PCa aggressiveness with minimal lesion information and to test how using data from different subgroups (scanner manufacturers and endorectal coil [ERC] use: Siemens, Philips, GE with and without ERC, and the full dataset) affects model performance. Performance was assessed using the area under the receiver operating characteristic curve (AUC). The effect of clinical features (age, prostate-specific antigen level, Prostate Imaging Reporting and Data System score) on model performance was also evaluated. Results DL models were trained on 4328 bpMRI cases, and the best model achieved an AUC of 0.73 when trained and tested using data from all manufacturers. Held-out test set performance was higher when models trained with data from one manufacturer were tested on the same manufacturer (within- versus between-manufacturer AUC difference of 0.05 on average; P < .001). The addition of clinical features did not improve performance (P = .24). Learning curve analyses showed that performance remained stable as training data increased. Analysis of DL features showed that scanner manufacturer and scanning protocol heavily influenced feature distributions. Conclusion In automated classification of PCa aggressiveness using bpMRI data, scanner manufacturer and ERC use had a major effect on DL model performance and features. Keywords: Convolutional Neural Network (CNN), Computer-aided Diagnosis (CAD), Computer Applications-General (Informatics), Oncology. Supplemental material is available for this article. Published under a CC BY 4.0 license. See also the commentary by Suri and Hsu in this issue.
Citations: 0
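The within- versus between-manufacturer comparison reduces to stratifying a held-out test set by acquisition subgroup and scoring each stratum separately. A minimal illustration with simulated data follows; the labels, scores, and vendor assignments are invented and this is not the study's pipeline.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n = 600
vendor = rng.choice(["Siemens", "Philips", "GE"], size=n)
labels = rng.integers(0, 2, size=n)
# Simulated model scores with a vendor-dependent offset to mimic domain shift.
scores = 0.8 * labels + rng.normal(0.0, 1.0, size=n) + 0.3 * (vendor == "GE")

# Stratified evaluation: one AUC per scanner manufacturer, plus the pooled AUC.
for v in np.unique(vendor):
    mask = vendor == v
    print(f"{v}: AUC = {roc_auc_score(labels[mask], scores[mask]):.3f}")
print(f"pooled: AUC = {roc_auc_score(labels, scores):.3f}")
```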
Deep Learning-based Aligned Strain from Cine Cardiac MRI for Detection of Fibrotic Myocardial Tissue in Patients with Duchenne Muscular Dystrophy.
IF 8.1
Radiology-Artificial Intelligence Pub Date: 2025-05-01 DOI: 10.1148/ryai.240303
Authors: Sven Koehler, Julian Kuhm, Tyler Huffaker, Daniel Young, Animesh Tandon, Florian André, Norbert Frey, Gerald Greil, Tarique Hussain, Sandy Engelhardt
Abstract: Purpose To develop a deep learning (DL) model that derives aligned strain values from cine (noncontrast) cardiac MRI and to evaluate the performance of these values in predicting myocardial fibrosis in patients with Duchenne muscular dystrophy (DMD). Materials and Methods This retrospective study included 139 male patients with DMD who underwent cardiac MRI at a single center between February 2018 and April 2023. A DL pipeline was developed to detect five key frames throughout the cardiac cycle and the respective dense deformation fields, allowing for phase-specific strain analysis across patients and from one key frame to the next. The effectiveness of these strain values in identifying abnormal deformations associated with fibrotic segments was evaluated in 57 patients (mean age ± SD, 15.2 years ± 3.1), and reproducibility was assessed in 82 patients by comparing the study method with existing feature-tracking and DL-based methods. Statistical analysis compared strain values using t tests, mixed models, and more than 2000 machine learning models; accuracy, F1 score, sensitivity, and specificity are reported. Results DL-based aligned strain identified five times more differences (29 vs 5; P < .01) between fibrotic and nonfibrotic segments compared with traditional strain values and identified abnormal diastolic deformation patterns often missed by traditional methods. In addition, aligned strain values enhanced the performance of predictive models for myocardial fibrosis detection, improving specificity by 40%, overall accuracy by 17%, and accuracy in patients with preserved ejection fraction by 61%. Conclusion The proposed aligned strain technique enables motion-based detection of myocardial dysfunction at noncontrast cardiac MRI, facilitating detailed interpatient strain analysis and allowing precise tracking of disease progression in DMD. Keywords: Pediatrics, Image Postprocessing, Heart, Cardiac, Convolutional Neural Network (CNN), Duchenne Muscular Dystrophy. Supplemental material is available for this article. © RSNA, 2025.
Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12127955/pdf/
Citations: 0
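Deriving strain from a dense deformation field is a standard continuum-mechanics computation: form the deformation gradient F = I + ∇u and take the Green-Lagrange strain E = ½(FᵀF − I). The 2D sketch below shows this generic formulation, not the authors' exact implementation; the array-axis conventions and synthetic displacement field are assumptions.

```python
import numpy as np

def green_lagrange_strain(u_y: np.ndarray, u_x: np.ndarray):
    """Green-Lagrange strain components from a dense 2D displacement field.

    u_y, u_x: displacement along image rows (y) and columns (x), in pixels.
    Returns the per-pixel tensor components (E11, E12, E22).
    """
    duy_dy, duy_dx = np.gradient(u_y)   # axis 0 = y, axis 1 = x
    dux_dy, dux_dx = np.gradient(u_x)
    # Deformation gradient F = I + grad(u).
    F11, F12 = 1.0 + dux_dx, dux_dy
    F21, F22 = duy_dx, 1.0 + duy_dy
    # E = 0.5 * (F^T F - I), computed per pixel.
    E11 = 0.5 * (F11**2 + F21**2 - 1.0)
    E22 = 0.5 * (F12**2 + F22**2 - 1.0)
    E12 = 0.5 * (F11 * F12 + F21 * F22)
    return E11, E12, E22

# Demo on a synthetic displacement field: a uniform 2% stretch along x.
yy, xx = np.mgrid[0:64, 0:64].astype(float)
u_x = 0.02 * xx
u_y = np.zeros_like(yy)
E11, E12, E22 = green_lagrange_strain(u_y, u_x)
print(f"E11 ≈ {E11.mean():.4f} (expect ~0.0202 for a 2% stretch)")
```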
Natural Language Processing for Everyone.
IF 13.2
Radiology-Artificial Intelligence Pub Date: 2025-05-01 DOI: 10.1148/ryai.250218
Authors: Quirin D Strotzer
Citations: 0
Development and Validation of a Sham-AI Model for Intracranial Aneurysm Detection at CT Angiography.
IF 8.1
Radiology-Artificial Intelligence Pub Date: 2025-05-01 DOI: 10.1148/ryai.240140
Authors: Zhao Shi, Bin Hu, Mengjie Lu, Manting Zhang, Haiting Yang, Bo He, Jiyao Ma, Chunfeng Hu, Li Lu, Sheng Li, Shiyu Ren, Yonggao Zhang, Jun Li, Mayidili Nijiati, Jiake Dong, Hao Wang, Zhen Zhou, Fandong Zhang, Chengwei Pan, Yizhou Yu, Zijian Chen, Chang Sheng Zhou, Yongyue Wei, Junlin Zhou, Long Jiang Zhang
Abstract: Purpose To evaluate a sham artificial intelligence (AI) model acting as a placebo control for a standard AI model for the diagnosis of intracranial aneurysms. Materials and Methods This retrospective crossover, blinded, multireader, multicase study was conducted from November 2022 to March 2023. A sham-AI model with near-zero sensitivity and specificity similar to that of a standard AI model was developed using 16 422 CT angiography examinations. Digital subtraction angiography-verified CT angiography examinations from four hospitals were collected; half were processed by the standard AI and the others by the sham AI to generate sequence A, and sequence B was generated in the reverse order. Twenty-eight radiologists from seven hospitals were randomly assigned to either sequence and then assigned to the other sequence after a washout period. The diagnostic performance of radiologists alone, radiologists with standard-AI assistance, and radiologists with sham-AI assistance was compared using sensitivity and specificity, and radiologists' susceptibility to sham-AI suggestions was assessed. Results The testing dataset included 300 patients (median age, 61.0 years [IQR, 52.0-67.0 years]; 199 male), 50 of whom had aneurysms. The standard AI and sham AI performed as expected (sensitivity, 96.0% vs 0.0%; specificity, 82.0% vs 76.0%). The differences in sensitivity and specificity between standard AI-assisted and sham AI-assisted readings were 20.7% (95% CI: 15.8, 25.5 [superiority]) and 0.0% (95% CI: -2.0, 2.0 [noninferiority]), respectively. The difference between sham AI-assisted readings and readings by radiologists alone was -2.6% (95% CI: -3.8, -1.4 [noninferiority]) for both sensitivity and specificity. After sham-AI suggestions, 5.3% (44 of 823) of true-positive and 1.2% (7 of 577) of false-negative results of radiologists alone were changed. Conclusion Radiologists' diagnostic performance was not compromised when aided by the proposed sham-AI model compared with their unassisted performance. Keywords: CT Angiography, Vascular, Intracranial Aneurysm, Sham AI. Supplemental material is available for this article. Published under a CC BY 4.0 license. See also the commentary by Mayfield and Romero in this issue.
Citations: 0
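The endpoints here are plain sensitivity and specificity differences judged against prespecified noninferiority margins. A compact sketch of that bookkeeping follows; the simulated labels, reading probabilities, and margin are invented for illustration.

```python
import numpy as np

def sens_spec(y_true: np.ndarray, y_pred: np.ndarray) -> tuple[float, float]:
    """Sensitivity and specificity from binary ground truth and readings."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    return tp / (tp + fn), tn / (tn + fp)

rng = np.random.default_rng(7)
y = rng.binomial(1, 1 / 6, size=300)  # roughly 50 of 300 cases with aneurysm
# Simulated readings: unaided vs sham AI-assisted, per case.
unaided = np.where(y == 1, rng.binomial(1, 0.80, 300), rng.binomial(1, 0.18, 300))
sham_aided = np.where(y == 1, rng.binomial(1, 0.78, 300), rng.binomial(1, 0.19, 300))

(se_u, sp_u), (se_s, sp_s) = sens_spec(y, unaided), sens_spec(y, sham_aided)
margin = 0.05  # hypothetical noninferiority margin
print(f"Δsensitivity = {se_s - se_u:+.3f}, noninferior: {se_s - se_u > -margin}")
print(f"Δspecificity = {sp_s - sp_u:+.3f}, noninferior: {sp_s - sp_u > -margin}")
```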
Open-Weight Language Models and Retrieval-Augmented Generation for Automated Structured Data Extraction from Diagnostic Reports: Assessment of Approaches and Parameters.
IF 8.1
Radiology-Artificial Intelligence Pub Date: 2025-05-01 DOI: 10.1148/ryai.240551
Authors: Mohamed Sobhi Jabal, Pranav Warman, Jikai Zhang, Kartikeye Gupta, Ayush Jain, Maciej Mazurowski, Walter Wiggins, Kirti Magudia, Evan Calabrese
Abstract: Purpose To develop and evaluate an automated system for extracting structured clinical information from unstructured radiology and pathology reports using open-weight language models (LMs) and retrieval-augmented generation (RAG), and to assess the effects of model configuration variables on extraction performance. Materials and Methods This retrospective study used two datasets: 7294 radiology reports annotated for Brain Tumor Reporting and Data System (BT-RADS) scores and 2154 pathology reports annotated for IDH mutation status (January 2017-July 2021). An automated pipeline was developed to benchmark the accuracy of structured data extraction across various LM and RAG configurations. The effects of model size, quantization, prompting strategy, output formatting, and inference parameters on accuracy were systematically evaluated. Results The best-performing models achieved up to 98% accuracy in extracting BT-RADS scores from radiology reports and greater than 90% accuracy in extracting IDH mutation status from pathology reports. The best-performing model was a medically fine-tuned Llama 3. Larger, newer, and domain fine-tuned models consistently outperformed older and smaller models (mean accuracy, 86% vs 75%; P < .001). Model quantization had minimal effect on performance. Few-shot prompting significantly improved accuracy (mean ± SD increase, 32% ± 32; P = .02). RAG improved performance for complex pathology reports by a mean of 48% ± 11 (P = .001) but not for shorter radiology reports (-8% ± 31; P = .39). Conclusion This study demonstrates the potential of open LMs for automated extraction of structured clinical data from unstructured clinical reports in a local, privacy-preserving application. Careful model selection, prompt engineering, and semiautomated optimization using annotated data are critical for optimal performance. Keywords: Large Language Models, Retrieval-Augmented Generation, Radiology, Pathology, Health Care Reports. Supplemental material is available for this article. © RSNA, 2025. See also the commentary by Tejani and Rauschecker in this issue.
Citations: 0
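Few-shot prompting for structured extraction, the single biggest accuracy lever reported above, amounts to prepending labeled report-to-JSON examples and parsing the model's JSON reply. The sketch below shows that pattern; the example reports, labels, and schema are invented, not taken from the study data.

```python
import json

# Hypothetical few-shot examples pairing report text with the target JSON.
examples = [
    ("Stable postsurgical changes; no new enhancement.", {"bt_rads": "2"}),
    ("Enlarging enhancing mass with worsening edema.", {"bt_rads": "4"}),
]

def build_prompt(report: str) -> str:
    """Assemble a few-shot prompt that requests structured JSON output."""
    shots = "\n\n".join(
        f"Report: {text}\nJSON: {json.dumps(label)}" for text, label in examples
    )
    return (
        "Extract the BT-RADS score from the report as JSON with the key 'bt_rads'.\n\n"
        f"{shots}\n\nReport: {report}\nJSON:"
    )

print(build_prompt("Slightly increased enhancement along the resection cavity."))
completion = '{"bt_rads": "3a"}'  # stand-in for the open-weight LM's reply
print("parsed:", json.loads(completion))
```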