Large Language Models Versus Human Readers in CAD-RADS 2.0 Categorization of Coronary CT Angiography Reports.
Won-Seok Yoo, Jinwoo Son, Jin Young Kim, Jun Hye Park, Hee Jun Park, Cherry Kim, Byoung Wook Choi, Young Joo Suh
Journal of Imaging Informatics in Medicine, 2025-10-07. DOI: 10.1007/s10278-025-01704-2

This study evaluated the accuracy of large language models (LLMs) in assigning Coronary Artery Disease Reporting and Data System (CAD-RADS) 2.0 categories and modifiers based on real-world coronary CT angiography (CCTA) reports and compared their accuracy with that of human readers. From 2752 eligible CCTA reports generated at an academic hospital between January and September 2024, 180 were randomly selected to fit a balanced distribution of categories and modifiers. The reference standard was established by consensus between two expert cardiac radiologists with 15 and 14 years of experience, respectively. Four LLMs (O1, GPT-4o, GPT-4, GPT-3.5-turbo) and four human readers (a cardiac radiologist, a fellow, and two residents) independently assigned CAD-RADS categories and modifiers for each report. For the LLMs, the input prompt consisted of the report and a summary of CAD-RADS 2.0. The accuracy of each evaluator in full CAD-RADS categorization was compared with O1 using McNemar tests. O1 demonstrated the highest accuracy (90.7%) in full CAD-RADS categorization, outperforming GPT-4o (73.8%), GPT-4 (59.7%), GPT-3.5-turbo (25.8%), the fellow (83.3%), and resident 1 (83.3%; all P-values ≤ 0.01). However, there was no significant difference in accuracy compared with the cardiac radiologist (86.1%; P = 0.12) or resident 2 (89.4%; P = 0.68). Processing time per report ranged from 1.34 to 16.61 s for the LLMs, whereas human readers required 32.10 to 55.06 s. In the external validation dataset (n = 327), derived from two independent institutions, O1 achieved 95.7% accuracy for full CAD-RADS categorization. In conclusion, compared to human readers, O1 exhibited similar or higher accuracy and shorter processing times in producing a full CAD-RADS 2.0 categorization based on CCTA reports.

Teaching AI for Radiology Applications: a Multisociety-Recommended Syllabus from the AAPM, ACR, RSNA, and SIIM.
Felipe Kitamura, Timothy Kline, Daniel Warren, Linda Moy, Roxana Daneshjou, Farhad Maleki, Igor Santos, Judy Gichoya, Walter Wiggins, Brian Bialecki, Kevin O'Donnell, Adam E Flanders, Matt Morgan, Nabile Safdar, Katherine P Andriole, Raym Geis, Bibb Allen, Keith Dreyer, Matt Lungren, Monica J Wood, Marc Kohli, Steve Langer, George Shih, Eduardo Farina, Charles E Kahn, Ingrid Reiser, Maryellen Giger, Christoph Wald, John Mongan, Tessa Cook, Neil Tenenholtz
Journal of Imaging Informatics in Medicine, 2025-10-01. DOI: 10.1007/s10278-025-01485-8

Deep Learning-Based Cardiac CT Coronary Motion Correction Method with Temporal Weight Adjustment: Clinical Data Evaluation.
Dan Yao, Chengxi Yan, Wang Du, Jingchao Zhang, Zhenzhen Wang, Sha Zhang, Minglei Yang, Shuangfeng Dai
Journal of Imaging Informatics in Medicine, 2025-09-30. DOI: 10.1007/s10278-025-01683-4

Cardiac motion artifacts frequently degrade the quality and interpretability of coronary computed tomography angiography (CCTA) images, making it difficult for radiologists to identify and evaluate the details of the coronary vessels accurately. In this paper, a deep learning-based approach for coronary artery motion compensation, namely a temporal-weighted motion correction network (TW-MoCoNet), was proposed. Firstly, the motion data required for TW-MoCoNet training were generated using a motion artifact simulation method based on the original artifact-free CCTA images. Secondly, TW-MoCoNet, consisting of a temporal weighting correction module and a differentiable spatial transformer module, was trained on these generated paired images. Finally, the proposed method was evaluated on 67 clinical cases with objective metrics including peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), fold-overlap ratio (FOR), low-intensity region score (LIRS), and motion artifact score (MAS). Additionally, subjective image quality was evaluated using a 4-point Likert scale to assess visual improvements. The experimental results demonstrated a substantial improvement in both the objective and subjective evaluations of image quality after motion correction was applied. The proportion of segments with moderate artifacts (scored 2 points) decreased markedly by 80.2% (from 26.37% to 5.22%), and the proportion of artifact-free segments (scored 4 points) reached 50.0%, which is of great clinical significance. In conclusion, the deep learning-based motion correction method proposed in this paper can effectively reduce motion artifacts, enhance image clarity, and improve clinical interpretability, thus effectively assisting doctors in accurately identifying and evaluating the details of coronary vessels.

3D Convolutional Neural Network for Predicting Clinical Outcome from Coronary Computed Tomography Angiography in Patients with Suspected Coronary Artery Disease.
Era Stambollxhiu, Leonard Freißmuth, Lukas Jakob Moser, Rafael Adolf, Albrecht Will, Eva Hendrich, Keno Bressem, Martin Hadamitzky
Journal of Imaging Informatics in Medicine, 2025-09-30. DOI: 10.1007/s10278-025-01667-4

This study aims to develop and assess an optimized three-dimensional convolutional neural network (3D CNN) model for predicting major cardiac events from coronary computed tomography angiography (CCTA) images in patients with suspected coronary artery disease. Patients undergoing CCTA for suspected coronary artery disease (CAD) were retrospectively included in this single-center study and split into training and test sets. The endpoint was defined as a composite of all-cause death, myocardial infarction, unstable angina, or revascularization events. Cardiovascular risk assessment relied on the Morise score and the extent of CAD (eoCAD). An optimized 3D CNN mimicking the DenseNet architecture was trained on CCTA images to predict the clinical endpoints; the images were not annotated for the presence of coronary plaque. A total of 5562 patients were assigned to the training group (66.4% male, median age 61.1 ± 11.2 years) and 714 to the test group (69.3% male, 61.5 ± 11.4 years). Over a 7.2-year follow-up, the composite endpoint occurred in 760 training-group and 83 test-group patients. In the test cohort, the CNN achieved an AUC of 0.872 ± 0.020 for predicting the composite endpoint. The predictive performance improved in a stepwise manner: from an AUC of 0.652 ± 0.031 using the Morise score alone, to 0.901 ± 0.016 when adding eoCAD, and finally to 0.920 ± 0.015 when combining the Morise score, eoCAD, and the CNN (p < 0.001 and p = 0.012, respectively). Deep learning-based analysis of CCTA images improves prognostic risk stratification when combined with clinical and imaging risk factors in patients with suspected CAD.

Investigating the Effects of Image Scaling Techniques in Radiographic Measurements of Spinal Alignment and Motion: A Comparative Analysis.
Seth C Coomer, Alexander N Merkle, Vikas V Patel, Pierce D Nunley, John A Hipp, Trevor F Grieco
Journal of Imaging Informatics in Medicine, 2025-09-29. DOI: 10.1007/s10278-025-01680-7

Radiographic measurements from spinal radiographs are crucial in many diagnostic and therapeutic decisions. However, widely used manual line drawing techniques in DICOM viewers are associated with significant errors, partly due to the unreliability of DICOM scale factors. A recently developed algorithm has been engineered to scale radiographs using assumed vertebral endplate widths (EPWs). Use of EPW to scale radiographs can eliminate the need to determine image magnification. This study was designed to (1) quantify the measurement error associated with the standard-of-care DICOM scale factor and (2) evaluate the efficacy of the EPW scaling algorithm. Previously collected cervical and lumbar lateral radiographs with a calibration marker of known size were used to collect radiographic measures of spinal alignment and spinal motion. Three different scale factors were used to acquire mm-based measurements: (1) the ground truth scale factor, (2) the DICOM scale factor, and (3) the EPW scale factor. DICOM and EPW scaled measurements were compared with respect to ground truth measurements. The DICOM scaled radiographic measurements demonstrated significantly more error than the EPW scaled radiographic measurements across multiple X-ray types: 19.6% vs. 9.7% in cervical neutral lateral (p < 0.001), 19.8% vs. 9.9% in cervical flexion/extension (p < 0.001), 35.0% vs. 7.5% in lumbar neutral lateral (p < 0.001), and 35.7% vs. 7.7% in lumbar flexion/extension (p < 0.001). The EPW algorithm provides adequate scaling accuracy for many clinical applications and is associated with considerably lower error than traditional DICOM scaled measurements. Clinicians should be aware of the potential scale factor inaccuracies in their imaging and be cautious when making diagnostic and therapeutic decisions based on improperly scaled radiographs.

Evaluation of Context-Aware Prompting Techniques for Classification of Tumor Response Categories in Radiology Reports Using Large Language Model.
Jiwoo Park, Woo Seob Sim, Jae Yong Yu, Yu Rang Park, Young Han Lee
Journal of Imaging Informatics in Medicine, 2025-09-29. DOI: 10.1007/s10278-025-01685-2

Radiology reports are essential for medical decision-making, providing crucial data for diagnosing diseases, devising treatment plans, and monitoring disease progression. While large language models (LLMs) have shown promise in processing free-text reports, research on effective prompting techniques for radiologic applications remains limited. To evaluate the effectiveness of LLM-driven classification of tumor response categories (TRCs) from radiology reports, and to optimize the model by comparing four prompt engineering techniques for this classification task in clinical applications, we included 3062 whole-spine contrast-enhanced magnetic resonance imaging (MRI) radiology reports for prompt engineering and validation. TRCs were labeled by two radiologists based on criteria modified from the Response Evaluation Criteria in Solid Tumors (RECIST) guidelines. The Llama3 instruct model was used to classify TRCs through four different prompts: general, in-context learning (ICL), chain-of-thought (CoT), and ICL with CoT. AUROC, accuracy, precision, recall, and F1-score were calculated for each prompt and model size (8B, 70B) on the test report dataset. The average AUROC for the ICL (0.96 internally, 0.93 externally) and ICL-with-CoT prompts (0.97 internally, 0.94 externally) outperformed the other prompts. Errors increased with prompt complexity, including 0.8% incomplete-sentence errors and 11.3% probability-classification inconsistencies. This study demonstrates that context-aware LLM prompts substantially improved the efficiency and effectiveness of classifying TRCs from radiology reports, despite potential intrinsic hallucinations. While further improvements are required for real-world application, our findings suggest that context-aware prompts have significant potential for segmenting complex radiology reports and enhancing oncology clinical workflows.
{"title":"FairDITA: Disentangled Image-Text Alignment for Fair Skin Cancer Diagnosis.","authors":"Jiwon Park, Seunggyu Lee, Younghoon Lee","doi":"10.1007/s10278-025-01693-2","DOIUrl":"https://doi.org/10.1007/s10278-025-01693-2","url":null,"abstract":"<p><p>Recent advances in deep learning have significantly improved skin cancer classification, yet concerns regarding algorithmic fairness persist because of performance disparities across skin tone groups. Existing methods often attempt to mitigate bias by suppressing sensitive attributes within images. However, they are fundamentally limited by the entanglement of lesion characteristics and skin tone in visual inputs. To address this challenge, we propose a novel contrastive learning framework that leverages explicitly constructed image-text pairs to disentangle lesion condition features from skin tone attributes. Our architecture consists of a shared text encoder and two specialized image encoders that independently align image features with the corresponding textual descriptions of lesion characteristics and skin tone. Furthermore, we measure the semantic distance between lesion conditions and skin color embeddings in both image- and text-embedding spaces and perform optimal representation alignment by matching the distances in the image space to those in the text space. We validated our method using two benchmark datasets, PAD-UFES-20 and Fitzpatrick17k, which span a wide range of skin tones. The experimental results demonstrate that our approach consistently improves both classification accuracy and fairness across multiple evaluation metrics.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145152659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

A Framework for Guiding DDPM-Based Reconstruction of Damaged CT Projections Using Traditional Methods.
Ziheng Zhang, Yishan Yang, Minghan Yang, Hu Guo, Jiazhao Yang, Xianyue Shen, Jianye Wang
Journal of Imaging Informatics in Medicine, 2025-09-26. DOI: 10.1007/s10278-025-01697-y

Denoising diffusion probabilistic models (DDPM) have emerged as a promising generative framework for sample synthesis, yet their limitations in detail preservation hinder practical applications in computed tomography (CT) image reconstruction. To address these technical constraints and enhance reconstruction quality from compromised CT projection data, this study proposes the Projection Hybrid Inverse Reconstruction Framework (PHIRF), a novel paradigm integrating conventional reconstruction methodologies with the DDPM architecture. The framework implements a dual-phase approach. Initially, conventional CT reconstruction algorithms (e.g., filtered back projection (FBP), algebraic reconstruction technique (ART), maximum-likelihood expectation maximization (ML-EM)) are employed to generate preliminary reconstructions from incomplete projections, establishing low-dimensional feature representations. These features are subsequently parameterized and embedded as conditional constraints in the reverse diffusion process of the DDPM, thereby guiding the generative model to synthesize enhanced tomographic images with improved structural fidelity. Comprehensive evaluations were conducted on three representative ill-posed projection scenarios: limited-angle projections, sparse-view acquisitions, and low-dose measurements. Experimental results demonstrate that PHIRF achieves state-of-the-art performance across all compromised data conditions, particularly in preserving fine anatomical details and suppressing reconstruction artifacts. Quantitative metrics and visual assessments confirm the framework's consistent superiority over existing deep learning-based reconstruction approaches, substantiating its adaptability to diverse projection degradation patterns. This hybrid architecture establishes a new paradigm for combining physical prior knowledge with data-driven generative models in medical image reconstruction tasks.
{"title":"Transformer-Based Feature Extraction and Optimized Deep Neural Network for Gastric Cancer Detection.","authors":"Emine Uçar","doi":"10.1007/s10278-025-01699-w","DOIUrl":"https://doi.org/10.1007/s10278-025-01699-w","url":null,"abstract":"<p><p>Gastric cancer is among the most common diseases worldwide and can lead to fatal outcomes. Early diagnosis significantly increases the success of treatment, and accurate and rapid analysis of histopathological images is of enormous importance. However, since manual evaluation of these images is time-consuming and open to observational errors, the need for automatic diagnosis systems supported by artificial intelligence is increasing. In this study, a multi-stage artificial intelligence-based model that performs cancer detection on gastric histopathological images is proposed. In the first stage, features were extracted from the images using 11 different state-of-the-art vision transformer models. Then, the most significant features were determined by using feature selection methods such as ANOVA F-Test, Recursive Feature Elimination, and Ridge regression, and separate feature sets consisting of the intersections and unions of these features were created. The obtained feature sets were trained with a deep neural network model optimized with the Particle Swarm Optimization algorithm to increase the classification performance, and the detection of gastric tissues was achieved. Among the tested configurations, the highest classification performance was obtained using 160 × 160 image resolution, the DPT model, and union-based feature selection. This configuration achieved 97.96% accuracy, 96.95% sensitivity, 98.61% specificity, 97.85% precision, and a 97.40% F1-score. Additionally, strong results were observed with other configurations, such as 97.21% accuracy using the DPT model with 120 × 120 images, and 95.78% accuracy with the BEiT model at 80 × 80 resolution. These findings demonstrate that transformer-based feature extraction methods, when combined with effective feature selection strategies, can significantly enhance diagnostic performance.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145152696","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Automatic Body Region Classification in CT Scans Using Deep Learning.
Morteza Golzan, Hyunwoo Lee, Telex M N Ngatched, Lihong Zhang, Maciej Michalak, Vincent Chow, Mirza Faisal Beg, Karteek Popuri
Journal of Imaging Informatics in Medicine, 2025-09-26. DOI: 10.1007/s10278-025-01662-9

Accurate classification of anatomical regions in computed tomography (CT) scans is essential for optimizing downstream diagnostic and analytic workflows in medical imaging. We demonstrate the high performance that deep learning (DL) algorithms can achieve in classifying whole-body parts in CT images acquired under various protocols. Our model was trained using a dataset of 5485 anonymized Neuroimaging Informatics Technology Initiative (NIfTI) CT scans collected from 45 different health centers. The dataset was split into 3290 scans for training, 1097 for validation, and 1098 for testing. Each body CT scan was classified into one of six classes covering the whole body: chest, abdomen, pelvis, chest and abdomen, abdomen and pelvis, and chest and abdomen and pelvis. The DL model achieved an accuracy, precision, recall, and F1-score of 97.53% (95% CI: 96.62%, 98.45%), 97.56% (95% CI: 96.6%, 98.4%), 97.6% (95% CI: 96.7%, 98.5%), and 97.56% (95% CI: 96.6%, 98.4%), respectively, in identifying the different body regions. These findings demonstrate the robustness of our approach to annotating CT images across wide variation in both acquisition protocols and patient demographics. This study underlines the potential that DL holds for medical imaging, and in particular for the automation of body region classification in CT. Our findings confirm that these models could be implemented in clinical routines to improve diagnostic efficiency and consistency.