Hongna Tan, Qingxia Wu, Yaping Wu, Bingjie Zheng, Bo Wang, Yan Chen, Lijuan Du, Jing Zhou, Fangfang Fu, Huihui Guo, Cong Fu, Lun Ma, Pei Dong, Zhong Xue, Dinggang Shen, Meiyun Wang
{"title":"Mammography-based artificial intelligence for breast cancer detection, diagnosis, and BI-RADS categorization using multi-view and multi-level convolutional neural networks.","authors":"Hongna Tan, Qingxia Wu, Yaping Wu, Bingjie Zheng, Bo Wang, Yan Chen, Lijuan Du, Jing Zhou, Fangfang Fu, Huihui Guo, Cong Fu, Lun Ma, Pei Dong, Zhong Xue, Dinggang Shen, Meiyun Wang","doi":"10.1186/s13244-025-01983-x","DOIUrl":"10.1186/s13244-025-01983-x","url":null,"abstract":"<p><strong>Purpose: </strong>We developed an artificial intelligence system (AIS) using multi-view multi-level convolutional neural networks for breast cancer detection, diagnosis, and BI-RADS categorization support in mammography.</p><p><strong>Methods: </strong>Twenty-four thousand eight hundred sixty-six breasts from 12,433 Asian women between August 2012 and December 2018 were enrolled. The study consisted of three parts: (1) evaluation of AIS performance in malignancy diagnosis; (2) stratified analysis of BI-RADS 3-4 subgroups with AIS; and (3) reassessment of BI-RADS 0 breasts with AIS assistance. We further evaluate AIS by conducting a counterbalance-designed AI-assisted study, where ten radiologists read 1302 cases with/without AIS assistance. The area under the receiver operating characteristic curve (AUC), sensitivity, specificity, accuracy, and F1 score were measured.</p><p><strong>Results: </strong>The AIS yielded AUC values of 0.995, 0.933, and 0.947 for malignancy diagnosis in the validation set, testing set 1, and testing set 2, respectively. Within BI-RADS 3-4 subgroups with pathological results, AIS downgraded 83.1% of false-positives into benign groups, and upgraded 54.1% of false-negatives into malignant groups. AIS also successfully assisted radiologists in identifying 7 out of 43 malignancies initially diagnosed with BI-RADS 0, with a specificity of 96.7%. 
In the counterbalance-designed AI-assisted study, the average AUC across ten readers significantly improved with AIS assistance (p = 0.001).</p><p><strong>Conclusion: </strong>AIS can accurately detect and diagnose breast cancer on mammography and further serve as a supportive tool for BI-RADS categorization.</p><p><strong>Critical relevance statement: </strong>An AI risk assessment tool employing deep learning algorithms was developed and validated for enhancing breast cancer diagnosis from mammograms, to improve risk stratification accuracy, particularly in patients with dense breasts, and serve as a decision support aid for radiologists.</p><p><strong>Key points: </strong>The false positive and negative rates of mammography diagnosis remain high. The AIS can yield a high AUC for malignancy diagnosis. The AIS is important in stratifying BI-RADS categorization.</p>","PeriodicalId":13639,"journal":{"name":"Insights into Imaging","volume":"16 1","pages":"109"},"PeriodicalIF":4.1,"publicationDate":"2025-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12095762/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144110836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
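The evaluation metrics this abstract names (AUC, sensitivity, specificity) are standard and easy to compute directly. As a minimal illustrative sketch, not the authors' code, the AUC follows from its rank interpretation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative one.

```python
import numpy as np

def auc_mann_whitney(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = recall on positives; specificity = recall on negatives."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return tp / (tp + fn), tn / (tn + fp)

y = np.array([0, 0, 1, 1])
s = np.array([0.1, 0.4, 0.35, 0.8])
print(auc_mann_whitney(y, s))                             # 0.75
print(sensitivity_specificity(y, (s >= 0.5).astype(int)))
```

The same functions apply per-breast or per-case; a reader-study analysis would average the AUC over readers before comparing the assisted and unassisted conditions.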
{"title":"Deep learning feature-based model for predicting lymphovascular invasion in urothelial carcinoma of bladder using CT images.","authors":"Bangxin Xiao, Yang Lv, Canjie Peng, Zongjie Wei, Qiao Xv, Fajin Lv, Qing Jiang, Huayun Liu, Feng Li, Yingjie Xv, Quanhao He, Mingzhao Xiao","doi":"10.1186/s13244-025-01988-6","DOIUrl":"10.1186/s13244-025-01988-6","url":null,"abstract":"<p><strong>Objectives: </strong>Lymphovascular invasion significantly impacts the prognosis of urothelial carcinoma of the bladder. Traditional lymphovascular invasion detection methods are time-consuming and costly. This study aims to develop a deep learning-based model to preoperatively predict lymphovascular invasion status in urothelial carcinoma of bladder using CT images.</p><p><strong>Methods: </strong>Data and CT images of 577 patients across four medical centers were retrospectively collected. The largest tumor slices from the transverse, coronal, and sagittal planes were selected and used to train CNN models (InceptionV3, DenseNet121, ResNet18, ResNet34, ResNet50, and VGG11). Deep learning features were extracted and visualized using Grad-CAM. Principal Component Analysis reduced features to 64. Using the extracted features, Decision Tree, XGBoost, and LightGBM models were trained with 5-fold cross-validation and ensembled in a stacking model. Clinical risk factors were identified through logistic regression analyses and combined with DL scores to enhance lymphovascular invasion prediction accuracy.</p><p><strong>Results: </strong>The ResNet50-based model achieved an AUC of 0.818 in the validation set and 0.708 in the testing set. 
The combined model showed an AUC of 0.794 in the validation set and 0.767 in the testing set, demonstrating robust performance across diverse data.</p><p><strong>Conclusion: </strong>We developed a robust radiomics model based on deep learning features from CT images to preoperatively predict lymphovascular invasion status in urothelial carcinoma of the bladder. This model offers a non-invasive, cost-effective tool to assist clinicians in personalized treatment planning.</p><p><strong>Critical relevance statement: </strong>We developed a robust radiomics model based on deep learning features from CT images to preoperatively predict lymphovascular invasion status in urothelial carcinoma of the bladder.</p><p><strong>Key points: </strong>We developed a deep learning feature-based stacking model to predict lymphovascular invasion in patients with urothelial carcinoma of the bladder using CT. The largest tumor cross-sections from three planes of the CT image are used to train the CNN model. We made comparisons across six CNN networks, including ResNet50.</p>","PeriodicalId":13639,"journal":{"name":"Insights into Imaging","volume":"16 1","pages":"108"},"PeriodicalIF":4.1,"publicationDate":"2025-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12086130/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144093530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
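The feature-reduction and stacking pipeline this abstract describes (PCA to 64 components, base learners trained with 5-fold cross-validation and ensembled by a stacking meta-learner) can be sketched with scikit-learn. This is a hedged illustration on synthetic features, not the study's code: `GradientBoostingClassifier` stands in for XGBoost/LightGBM, and the synthetic matrix stands in for CNN-extracted CT features.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for deep-learning features extracted from CT slices.
X, y = make_classification(n_samples=200, n_features=300, n_informative=20,
                           random_state=0)

# Base learners ensembled via 5-fold stacking; a logistic-regression meta-learner
# combines their out-of-fold predictions.
stack = StackingClassifier(
    estimators=[("dt", DecisionTreeClassifier(max_depth=4, random_state=0)),
                ("gb", GradientBoostingClassifier(random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

# PCA reduces the high-dimensional features to 64 components before stacking.
model = make_pipeline(PCA(n_components=64), stack)
model.fit(X, y)
print(model.predict(X[:5]))
```

In the paper, the clinical risk factors identified by logistic regression would then be concatenated with the stacking model's DL score to form the combined model.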
Wenyi Yue, Ruxue Han, Haijie Wang, Xiaoyun Liang, He Zhang, Hua Li, Qi Yang
{"title":"Development and validation of clinical-radiomics deep learning model based on MRI for endometrial cancer molecular subtypes classification.","authors":"Wenyi Yue, Ruxue Han, Haijie Wang, Xiaoyun Liang, He Zhang, Hua Li, Qi Yang","doi":"10.1186/s13244-025-01966-y","DOIUrl":"10.1186/s13244-025-01966-y","url":null,"abstract":"<p><strong>Objectives: </strong>This study aimed to develop and validate a clinical-radiomics deep learning (DL) model based on MRI for endometrial cancer (EC) molecular subtypes classification.</p><p><strong>Methods: </strong>This multicenter retrospective study included EC patients undergoing surgery, MRI, and molecular pathology diagnosis across three institutions from January 2020 to March 2024. Patients were divided into training, internal, and external validation cohorts. A total of 386 handcrafted radiomics features were extracted from each MR sequence, and MoCo-v2 was employed for contrastive self-supervised learning to extract 2048 DL features per patient. Feature selection integrated selected features into 12 machine learning methods. Model performance was evaluated with the AUC.</p><p><strong>Results: </strong>A total of 526 patients were included (mean age, 55.01 ± 11.07). The radiomics model and clinical model demonstrated comparable performance across the internal and external validation cohorts, with macro-average AUCs of 0.70 vs 0.69 and 0.70 vs 0.67 (p = 0.51), respectively. The radiomics DL model, compared to the radiomics model, improved AUCs for POLEmut (0.68 vs 0.79), NSMP (0.71 vs 0.74), and p53abn (0.76 vs 0.78) in the internal validation (p = 0.08). 
The clinical-radiomics DL model outperformed both the clinical model and the radiomics DL model (macro-average AUC = 0.79 vs 0.69 and 0.73 in the internal validation [p = 0.02]; 0.74 vs 0.67 and 0.69 in the external validation [p = 0.04]).</p><p><strong>Conclusions: </strong>The clinical-radiomics DL model based on MRI effectively distinguished EC molecular subtypes and demonstrated strong potential, with robust validation across multiple centers. Future research should explore larger datasets to further uncover DL's potential.</p><p><strong>Critical relevance statement: </strong>Our clinical-radiomics DL model based on MRI has the potential to distinguish EC molecular subtypes. This insight aids in guiding clinicians in tailoring individualized treatments for EC patients.</p><p><strong>Key points: </strong>Accurate classification of EC molecular subtypes is crucial for prognostic risk assessment. The clinical-radiomics DL model outperformed both the clinical model and the radiomics DL model. The MRI features exhibited better diagnostic performance for POLEmut and p53abn.</p>","PeriodicalId":13639,"journal":{"name":"Insights into Imaging","volume":"16 1","pages":"107"},"PeriodicalIF":4.1,"publicationDate":"2025-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12084453/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144077859","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
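The macro-average AUC reported in this abstract is the unweighted mean of one-vs-rest AUCs over the molecular subtypes. A minimal scikit-learn illustration on toy three-class probabilities (not the study's data; the class labels below merely echo the subtype naming):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Toy 3-class example (classes 0/1/2 stand in for, e.g., POLEmut / NSMP / p53abn).
# Each row is the predicted probability over the three classes and sums to 1.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.5, 0.3, 0.2],
                   [0.2, 0.6, 0.2],
                   [0.3, 0.4, 0.3],
                   [0.1, 0.2, 0.7],
                   [0.2, 0.2, 0.6]])

# Macro-average one-vs-rest AUC: one binary AUC per subtype, averaged equally.
macro_auc = roc_auc_score(y_true, y_prob, multi_class="ovr", average="macro")
print(macro_auc)  # 1.0 here, since each class's probability ranks its cases first
```

Because the average weights each subtype equally, a rare subtype such as POLEmut contributes as much to the macro AUC as the common ones, which is why per-class AUCs are also worth reporting.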
Adrian P Brady, Christian Loewe, Boris Brkljacic, Graciano Paulo, Martina Szucsich, Monika Hierath
{"title":"Correction: Guidelines and recommendations for radiologist staffing, education and training.","authors":"Adrian P Brady, Christian Loewe, Boris Brkljacic, Graciano Paulo, Martina Szucsich, Monika Hierath","doi":"10.1186/s13244-025-01982-y","DOIUrl":"10.1186/s13244-025-01982-y","url":null,"abstract":"","PeriodicalId":13639,"journal":{"name":"Insights into Imaging","volume":"16 1","pages":"105"},"PeriodicalIF":4.1,"publicationDate":"2025-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12081806/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144077853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Workplace equity in radiology: a nationwide survey by the Radiological Society of Finland.","authors":"Suvi Marjasuo, Milja Holstila, Jussi Hirvonen","doi":"10.1186/s13244-025-01975-x","DOIUrl":"10.1186/s13244-025-01975-x","url":null,"abstract":"<p><strong>Objectives: </strong>The issue of equity among medical professionals has been extensively discussed in recent literature. Gender inequity, in particular, is a well-documented phenomenon within scientific communities. The Radiological Society of Finland undertook a national survey to assess equity among radiologists in Finland, with the primary hypothesis of equity prevailing in the radiological community.</p><p><strong>Methods: </strong>A cross-sectional study in the form of an online questionnaire was developed to investigate occupational equity and demographic variables. This survey was disseminated to the heads of radiological departments in all Finnish public healthcare units and the largest radiological units within the private sector, with instructions to distribute to their medical staff. The questionnaire was accessible for responses from May 1 to June 16, 2024.</p><p><strong>Results: </strong>A total of 259 answers were received, representing 31% of all radiologists and residents working in Finland. Among the respondents, 137/259 (52.9%) identified as female, 118/259 (45.6%) male, and 1/259 (0.4%) other, with three choosing not to answer. A significant proportion, 63/259 (24.3%), reported having witnessed discriminatory behavior, while 41/259 (15.8%) had personally experienced discrimination. The prevalence of respondents having witnessed workplace discrimination was notably higher in female respondents (42/131, 32.1%) than in males (18/113, 15.9%) or others (0%) (p = 0.012). The most cited bases for discrimination included gender, opinion, age, and cultural background.</p><p><strong>Conclusions: </strong>Perceived discrimination is prevalent within the Finnish radiological community. 
Gender was the most commonly suspected ground for the perceived discriminatory behavior.</p><p><strong>Critical relevance statement: </strong>This study is the first to explore equity and diversity among radiologists in Finland. This broader approach offers a more comprehensive perspective, and the findings aim to support efforts toward greater inclusivity and equity within the field.</p><p><strong>Key points: </strong>One-quarter of radiologists in Finland reported witnessing and one-sixth reported personally experiencing discrimination in the workplace. Gender was suspected to be the most common basis for discrimination, followed by differences in opinion, age, and cultural background. Respondents were largely unaware of whether the reported incidents had been addressed. Increasing transparency and communication may help reduce perceived discrimination.</p>","PeriodicalId":13639,"journal":{"name":"Insights into Imaging","volume":"16 1","pages":"106"},"PeriodicalIF":4.1,"publicationDate":"2025-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12081815/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144077864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Roberto Farì, Giulia Besutti, Pierpaolo Pattacini, Guido Ligabue, Francesco Piroli, Francesca Mantovani, Alessandro Navazio, Mario Larocca, Carmine Pinto, Paolo Giorgi Rossi, Luigi Tarantini
{"title":"Correction: The role of imaging in defining cardiovascular risk to help cancer patient management: a scoping review.","authors":"Roberto Farì, Giulia Besutti, Pierpaolo Pattacini, Guido Ligabue, Francesco Piroli, Francesca Mantovani, Alessandro Navazio, Mario Larocca, Carmine Pinto, Paolo Giorgi Rossi, Luigi Tarantini","doi":"10.1186/s13244-025-01981-z","DOIUrl":"10.1186/s13244-025-01981-z","url":null,"abstract":"","PeriodicalId":13639,"journal":{"name":"Insights into Imaging","volume":"16 1","pages":"104"},"PeriodicalIF":4.1,"publicationDate":"2025-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12081778/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144077855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Huancheng Yang, Yueyue Zhang, Fan Li, Weihao Liu, Haoyang Zeng, Haoyuan Yuan, Zixi Ye, Zexin Huang, Yangguang Yuan, Ye Xiang, Kai Wu, Hanlin Liu
{"title":"CT-based AI framework leveraging multi-scale features for predicting pathological grade and Ki67 index in clear cell renal cell carcinoma: a multicenter study.","authors":"Huancheng Yang, Yueyue Zhang, Fan Li, Weihao Liu, Haoyang Zeng, Haoyuan Yuan, Zixi Ye, Zexin Huang, Yangguang Yuan, Ye Xiang, Kai Wu, Hanlin Liu","doi":"10.1186/s13244-025-01980-0","DOIUrl":"https://doi.org/10.1186/s13244-025-01980-0","url":null,"abstract":"<p><strong>Purpose: </strong>To explore whether a CT-based AI framework, leveraging multi-scale features, can offer a non-invasive approach to accurately predict pathological grade and Ki67 index in clear cell renal cell carcinoma (ccRCC).</p><p><strong>Methods: </strong>In this multicenter retrospective study, a total of 1073 pathologically confirmed ccRCC patients from seven cohorts were split into internal cohorts (training and validation sets) and an external test set. The AI framework comprised an image processor, a 3D-kidney and tumor segmentation model by 3D-UNet, a multi-scale features extractor built upon unsupervised learning, and a multi-task classifier utilizing XGBoost. A quantitative model interpretation technique, known as SHapley Additive exPlanations (SHAP), was employed to explore the contribution of multi-scale features.</p><p><strong>Results: </strong>The 3D-UNet model showed excellent performance in segmenting both the kidney and tumor regions, with Dice coefficients exceeding 0.92. The proposed multi-scale features model exhibited strong predictive capability for pathological grading and Ki67 index, with AUROC values of 0.84 and 0.87, respectively, in the internal validation set, and 0.82 and 0.82, respectively, in the external test set. 
The SHAP results demonstrated that features from radiomics, the 3D Auto-Encoder, and dimensionality reduction all made significant contributions to both prediction tasks.</p><p><strong>Conclusions: </strong>The proposed AI framework, leveraging multi-scale features, accurately predicts the pathological grade and Ki67 index of ccRCC.</p><p><strong>Critical relevance statement: </strong>The CT-based AI framework leveraging multi-scale features offers a promising avenue for accurately predicting the pathological grade and Ki67 index of ccRCC preoperatively, indicating a direction for non-invasive assessment.</p><p><strong>Key points: </strong>Non-invasively determining pathological grade and Ki67 index in ccRCC could guide treatment decisions. The AI framework integrates segmentation, classification, and model interpretation, enabling fully automated analysis. The AI framework enables non-invasive preoperative detection of high-risk tumors, assisting clinical decision-making.</p>","PeriodicalId":13639,"journal":{"name":"Insights into Imaging","volume":"16 1","pages":"102"},"PeriodicalIF":4.1,"publicationDate":"2025-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12078187/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144077857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
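The Dice coefficient used above to evaluate the 3D-UNet segmentations (values exceeding 0.92) is straightforward to compute. A self-contained numpy sketch, illustrative rather than the study's pipeline:

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice similarity between two binary segmentation masks:
    2 * |pred AND target| / (|pred| + |target|)."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Two overlapping 3D "tumor" masks on a small grid: 64 voxels each,
# shifted by one voxel along the first axis, so 48 voxels overlap.
a = np.zeros((8, 8, 8), dtype=bool); a[2:6, 2:6, 2:6] = True
b = np.zeros((8, 8, 8), dtype=bool); b[3:7, 2:6, 2:6] = True
print(dice_coefficient(a, b))  # 2*48 / (64+64) = 0.75
```

The `eps` term keeps the ratio defined when both masks are empty; a soft variant of the same formula, applied to predicted probabilities, is a common training loss for segmentation networks.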
{"title":"MRI-based quantification of intratumoral heterogeneity for intrahepatic mass-forming cholangiocarcinoma grading: a multicenter study.","authors":"Liyong Zhuo, Wenjing Chen, Lihong Xing, Xiaomeng Li, Zijun Song, Jinghui Dong, Yanyan Zhang, Hongjun Li, Jingjing Cui, Yuxiao Han, Jiawei Hao, Jianing Wang, Xiaoping Yin, Caiying Li","doi":"10.1186/s13244-025-01985-9","DOIUrl":"https://doi.org/10.1186/s13244-025-01985-9","url":null,"abstract":"<p><strong>Objective: </strong>This study aimed to develop a quantitative approach to measure intratumor heterogeneity (ITH) using MRI scans and predict the pathological grading of intrahepatic mass-forming cholangiocarcinoma (IMCC).</p><p><strong>Methods: </strong>Preoperative MRI scans from IMCC patients were retrospectively obtained from five academic medical centers, covering the period from March 2018 to April 2024. Radiomic features were extracted from the whole tumor and its subregions, which were segmented using K-means clustering. An ITH index was derived from a habitat model integrating output probabilities of the subregions-based models. Significant variables from clinical laboratory-imaging features, radiomics, and the habitat model were integrated into a predictive model, and its performance was evaluated using the area under the receiver operating characteristic curve (AUC).</p><p><strong>Results: </strong>The final training and internal validation datasets included 197 patients (median age, 59 years [IQR, 52-65 years]); the external validation dataset included 43 patients (median age, 58.5 years [IQR, 52.25-69.75 years]). The habitat model achieved AUCs of 0.847 (95% CI: 0.783, 0.911) in the training set and 0.753 (95% CI: 0.595, 0.911) in the internal validation set. 
Furthermore, the combined model, integrating imaging variables, the habitat model, and radiomics model, demonstrated improved predictive performance, with AUCs of 0.895 (95% CI: 0.845, 0.944) in the training dataset, 0.790 (95% CI: 0.65, 0.931) in the internal validation dataset, and 0.815 (95% CI: 0.68, 0.951) in the external validation dataset.</p><p><strong>Conclusion: </strong>The combined model based on MRI-derived quantification of ITH, along with clinical, laboratory, radiological, and radiomic features, showed good performance in predicting IMCC grading.</p><p><strong>Critical relevance statement: </strong>This model, integrating MRI-derived intrahepatic mass-forming cholangiocarcinoma (IMCC) classification metrics with quantitative radiomic analysis of intratumor heterogeneity (ITH), demonstrates enhanced accuracy in tumor grade prediction, advancing risk stratification for clinical decision-making in IMCC management.</p><p><strong>Key points: </strong>Grading of intrahepatic mass-forming cholangiocarcinoma (IMCC) is important for risk stratification, clinical decision-making, and personalized therapeutic optimization. Quantitative intratumor heterogeneity can accurately predict the pathological grading of IMCC. This combined model provides higher diagnostic accuracy.</p>","PeriodicalId":13639,"journal":{"name":"Insights into Imaging","volume":"16 1","pages":"101"},"PeriodicalIF":4.1,"publicationDate":"2025-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12078897/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144077861","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
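The habitat approach described above partitions intratumoral voxels into subregions with K-means before extracting subregion-level radiomics. A minimal 2D sketch with scikit-learn on synthetic data (the study clusters real multi-sequence MRI voxels; the grid, cluster count, and intensity model here are illustrative assumptions):

```python
import numpy as np
from sklearn.cluster import KMeans

def habitat_labels(image, mask, n_clusters=3, seed=0):
    """Cluster voxel intensities inside the tumor mask into habitat
    subregions; returns a label map where 0 = background, 1..n = subregions."""
    voxels = image[mask].reshape(-1, 1)          # intensities inside the tumor
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(voxels)
    labels = np.zeros(image.shape, dtype=int)
    labels[mask] = km.labels_ + 1                # shift so background stays 0
    return labels

rng = np.random.default_rng(0)
img = rng.normal(size=(16, 16))                  # synthetic "MR slice"
mask = np.zeros((16, 16), dtype=bool)
mask[4:12, 4:12] = True                          # synthetic tumor mask
lab = habitat_labels(img, mask)
print(sorted(np.unique(lab).tolist()))           # [0, 1, 2, 3]
```

Radiomic features would then be extracted per subregion, and the subregion models' output probabilities combined into the ITH index the abstract describes.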
{"title":"Large language models for efficient whole-organ MRI score-based reports and categorization in knee osteoarthritis.","authors":"Yuxue Xie, Zhonghua Hu, Hongyue Tao, Yiwen Hu, Haoyu Liang, Xinmin Lu, Lei Wang, Xiangwen Li, Shuang Chen","doi":"10.1186/s13244-025-01976-w","DOIUrl":"10.1186/s13244-025-01976-w","url":null,"abstract":"<p><strong>Objectives: </strong>To evaluate the performance of large language models (LLMs) in automatically generating whole-organ MRI score (WORMS)-based structured MRI reports and predicting osteoarthritis (OA) severity for the knee.</p><p><strong>Methods: </strong>A total of 160 consecutive patients suspected of OA were included. Knee MRI reports were reviewed by three radiologists to establish the WORMS reference standard for 39 key features. GPT-4o and GPT-4o-mini were prompted using in-context knowledge (ICK) and chain-of-thought (COT) to generate WORMS-based structured reports from original reports and to automatically predict the OA severity. Four Orthopedic surgeons reviewed original and LLM-generated reports to conduct pairwise preference and difficulty tests, and their review times were recorded.</p><p><strong>Results: </strong>GPT-4o demonstrated perfect performance in extracting the laterality of the knee (accuracy = 100%). GPT-4o outperformed GPT-4o mini in generating WORMS reports (Accuracy: 93.9% vs 76.2%, respectively). GPT-4o achieved higher recall (87.3% s 46.7%, p < 0.001), while maintaining higher precision compared to GPT-4o mini (94.2% vs 71.2%, p < 0.001). For predicting OA severity, GPT-4o outperformed GPT-4o mini across all prompt strategies (best accuracy: 98.1% vs 68.7%). 
Surgeons found it easier to extract information from, and expressed a stronger preference for, LLM-generated reports over the original reports (both p < 0.001), while spending less time on each report (51.27 ± 9.41 vs 87.42 ± 20.26 s, p < 0.001).</p><p><strong>Conclusion: </strong>GPT-4o generated expert-level, multi-feature WORMS-based reports from original free-text knee MRI reports. GPT-4o with COT achieved high accuracy in categorizing OA severity. Surgeons reported greater preference and higher efficiency when using LLM-generated reports.</p><p><strong>Critical relevance statement: </strong>The strong performance in generating WORMS-based reports and the high efficiency and ease of use suggest that integrating LLMs into clinical workflows could greatly enhance productivity and alleviate the documentation burden faced by clinicians in knee OA.</p><p><strong>Key points: </strong>GPT-4o successfully generated WORMS-based knee MRI reports. GPT-4o with COT prompting achieved impressive accuracy in categorizing knee OA severity. Greater preference and higher efficiency were reported for LLM-generated reports.</p>","PeriodicalId":13639,"journal":{"name":"Insights into Imaging","volume":"16 1","pages":"100"},"PeriodicalIF":4.1,"publicationDate":"2025-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12078906/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144018594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
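The ICK + COT prompting strategy the abstract describes amounts to prepending worked examples (the in-context knowledge) and a step-by-step instruction (the chain of thought) to each free-text report. The sketch below assembles such a prompt as a plain string; the function name, prompt wording, and feature names are hypothetical, the paper's actual prompts are not reproduced here, and no API call is made.

```python
def build_worms_prompt(report_text, features, examples):
    """Assemble a hypothetical in-context-knowledge (ICK) + chain-of-thought (COT)
    prompt asking an LLM to score WORMS features from a free-text knee MRI report."""
    lines = [
        "You are a musculoskeletal radiology assistant.",
        "Score each requested WORMS feature from the report below.",
        "Reason step by step, then output one 'feature: score' line per feature.",
        "",
    ]
    for ex in examples:  # worked examples supply the in-context knowledge
        lines += [f"Report: {ex['report']}", f"Scores: {ex['scores']}", ""]
    lines += [
        "Features: " + ", ".join(features),
        f"Report: {report_text}",
        "Scores:",
    ]
    return "\n".join(lines)

prompt = build_worms_prompt(
    "Full-thickness cartilage loss in the medial femoral condyle.",
    ["cartilage_medial_femur", "meniscus_medial"],
    [{"report": "Intact cartilage throughout.",
      "scores": "cartilage_medial_femur: 0"}],
)
print(prompt.splitlines()[0])
```

In a real pipeline this string would be sent to the model once per report, and the 'feature: score' lines parsed back into the 39-feature structured record.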