Deep Learning Classification of Ischemic Stroke Territory on Diffusion-Weighted MRI: Added Value of Augmenting the Input with Image Transformations
Ilker Ozgur Koska, Alper Selver, Fazil Gelal, Muhsin Engin Uluc, Yusuf Kenan Çetinoğlu, Nursel Yurttutan, Mehmet Serindere, Oğuz Dicle
Journal of Imaging Informatics in Medicine, 2025-06-01, pp. 1374-1387. DOI: 10.1007/s10278-024-01277-6. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12092885/pdf/

Abstract: Our primary aim with this study was to build a patient-level classifier for stroke territory in DWI using AI to facilitate fast triage of stroke to a dedicated stroke center. A retrospective collection of DWI images of 271 and 122 consecutive acute ischemic stroke patients from two centers was carried out. Pretrained MobileNetV2 and EfficientNetB0 architectures were used to classify territorial subtypes as middle cerebral artery (MCA), posterior circulation, or watershed infarcts, along with normal slices. Various input combinations using edge maps, thresholding, and hard-attention versions were explored, and the effect of augmenting the three-channel inputs of pre-trained models on classification performance was analyzed. ROC analyses and confusion matrix-derived performance metrics of the models were reported. Of the 271 patients from center 1, 151 (55.7%) were male and 120 (44.3%) were female; 129 (47.6%) had MCA, 65 (24%) had posterior circulation, and 77 (28.4%) had watershed infarcts. Of the 122 patients from center 2, 78 (64%) were male and 44 (36%) were female; 52 (43%) had MCA, 51 (42%) had posterior circulation, and 19 (15%) had watershed infarcts. The Mobile-Crop model had the best performance, with 0.95 accuracy and a 0.91 mean F1-score for slice-wise classification, and 0.88 accuracy on external test sets along with a 0.92 mean AUC. In conclusion, modified pre-trained models may be augmented with image transformations to provide a more accurate classification of the territory affected by stroke in DWI.
{"title":"Effect of Deep Learning Image Reconstruction on Image Quality and Pericoronary Fat Attenuation Index.","authors":"Junqing Mei, Chang Chen, Ruoting Liu, Hongbing Ma","doi":"10.1007/s10278-024-01234-3","DOIUrl":"10.1007/s10278-024-01234-3","url":null,"abstract":"<p><p>To compare the image quality and fat attenuation index (FAI) of coronary artery CT angiography (CCTA) under different tube voltages between deep learning image reconstruction (DLIR) and adaptive statistical iterative reconstruction V (ASIR-V). Three hundred one patients who underwent CCTA with automatic tube current modulation were prospectively enrolled and divided into two groups: 120 kV group and low tube voltage group. Images were reconstructed using ASIR-V level 50% (ASIR-V50%) and high-strength DLIR (DLIR-H). In the low tube voltage group, the voltage was selected according to Chinese BMI classification: 70 kV (BMI < 24 kg/m<sup>2</sup>), 80 kV (24 kg/m<sup>2</sup> ≤ BMI < 28 kg/m<sup>2</sup>), 100 kV (BMI ≥ 28 kg/m<sup>2</sup>). At the same tube voltage, the subjective and objective image quality, edge rise distance (ERD), and FAI between different algorithms were compared. Under different tube voltages, we used DLIR-H to compare the differences between subjective, objective image quality, and ERD. Compared with the 120 kV group, the DLIR-H image noise of 70 kV, 80 kV, and 100 kV groups increased by 36%, 25%, and 12%, respectively (all P < 0.001); contrast-to-noise ratio (CNR), subjective score, and ERD were similar (all P > 0.05). In the 70 kV, 80 kV, 100 kV, and 120 kV groups, compared with ASIR-V50%, DLIR-H image noise decreased by 50%, 53%, 47%, and 38-50%, respectively; CNR, subjective score, and FAI value increased significantly (all P < 0.001), ERD decreased. Compared with 120 kV tube voltage, the combination of DLIR-H and low tube voltage maintains image quality. At the same tube voltage, compared with ASIR-V, DLIR-H improves image quality and FAI value.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"1881-1890"},"PeriodicalIF":0.0,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12092906/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142305611","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cross-site Validation of AI Segmentation and Harmonization in Breast MRI
Yu Huang, Nicholas J Leotta, Lukas Hirsch, Roberto Lo Gullo, Mary Hughes, Jeffrey Reiner, Nicole B Saphier, Kelly S Myers, Babita Panigrahi, Emily Ambinder, Philip Di Carlo, Lars J Grimm, Dorothy Lowell, Sora Yoon, Sujata V Ghate, Lucas C Parra, Elizabeth J Sutton
Journal of Imaging Informatics in Medicine, 2025-06-01, pp. 1642-1652. DOI: 10.1007/s10278-024-01266-9. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12092851/pdf/

Abstract: This work aims to perform a cross-site validation of automated segmentation of breast cancers in MRI and to compare the performance to radiologists. A three-dimensional (3D) U-Net was trained to segment cancers in dynamic contrast-enhanced axial MRIs using a large dataset from Site 1 (n = 15,266; 449 malignant and 14,817 benign). Performance was validated on site-specific test data from this and two additional sites, and on common, publicly available test data. Four radiologists from each of the three clinical sites provided two-dimensional (2D) segmentations as ground truth. Segmentation performance did not differ between the network and the radiologists on the test data from Sites 1 and 2 or on the common public data (median Dice score, Site 1: network 0.86 vs. radiologists 0.85, n = 114; Site 2: 0.91 vs. 0.91, n = 50; common: 0.93 vs. 0.90). For Site 3, an affine input layer was fine-tuned using segmentation labels, resulting in comparable performance between the network and the radiologists (0.88 vs. 0.89, n = 42). Radiologist performance differed on the common test data, and the network numerically outperformed 11 of the 12 radiologists (median Dice 0.85-0.94, n = 20). In conclusion, a deep network with a novel supervised harmonization technique matches radiologists' performance in MRI tumor segmentation across clinical sites. We make code and weights publicly available to promote reproducible AI in radiology.
{"title":"MobileNet-V2: An Enhanced Skin Disease Classification by Attention and Multi-Scale Features.","authors":"Nirupama, Virupakshappa","doi":"10.1007/s10278-024-01271-y","DOIUrl":"10.1007/s10278-024-01271-y","url":null,"abstract":"<p><p>The increasing prevalence of skin diseases necessitates accurate and efficient diagnostic tools. This research introduces a novel skin disease classification model leveraging advanced deep learning techniques. The proposed architecture combines the MobileNet-V2 backbone, Squeeze-and-Excitation (SE) blocks, Atrous Spatial Pyramid Pooling (ASPP), and a Channel Attention Mechanism. The model was trained on four diverse datasets such as PH2 dataset, Skin Cancer MNIST: HAM10000 dataset, DermNet. dataset, and Skin Cancer ISIC dataset. Data preprocessing techniques, including image resizing, and normalization, played a crucial role in optimizing model performance. In this paper, the MobileNet-V2 backbone is implemented to extract hierarchical features from the preprocessed dermoscopic images. The multi-scale contextual information is fused by the ASPP model for generating a feature map. The attention mechanisms contributed significantly, enhancing the extraction ability of inter-channel relationships and multi-scale contextual information for enhancing the discriminative power of the features. Finally, the output feature map is converted into probability distribution through the softmax function. The proposed model outperformed several baseline models, including traditional machine learning approaches, emphasizing its superiority in skin disease classification with 98.6% overall accuracy. Its competitive performance with state-of-the-art methods positions it as a valuable tool for assisting dermatologists in early classification. The study also identified limitations and suggested avenues for future research, emphasizing the model's potential for practical implementation in the field of dermatology.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"1734-1754"},"PeriodicalIF":0.0,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12092329/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142368153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BCCHI-HCNN: Breast Cancer Classification from Histopathological Images Using Hybrid Deep CNN Models.","authors":"Saroj Kumar Pandey, Yogesh Kumar Rathore, Manoj Kumar Ojha, Rekh Ram Janghel, Anurag Sinha, Ankit Kumar","doi":"10.1007/s10278-024-01297-2","DOIUrl":"10.1007/s10278-024-01297-2","url":null,"abstract":"<p><p>Breast cancer is the most common cancer in women globally, imposing a significant burden on global public health due to high death rates. Data from the World Health Organization show an alarming annual incidence of nearly 2.3 million new cases, drawing the attention of patients, healthcare professionals, and governments alike. Through the examination of histopathological pictures, this study aims to revolutionize the early and precise identification of breast cancer by utilizing the capabilities of a deep convolutional neural network (CNN)-based model. The model's performance is improved by including numerous classifiers, including support vector machine (SVM), decision tree, and K-nearest neighbors (KNN), using transfer learning techniques. The studies include evaluating two separate feature vectors, one with and one without principal component analysis (PCA). Extensive comparisons are made to measure the model's performance against current deep learning models, including critical metrics such as false positive rate, true positive rate, accuracy, precision, and recall. The data show that the SVM algorithm with PCA features achieves excellent speed and accuracy, with an amazing accuracy of 99.5%. Furthermore, although being somewhat slower than SVM, the decision tree model has the greatest accuracy of 99.4% without PCA. This study suggests a viable strategy for improving early breast cancer diagnosis, opening the path for more effective healthcare treatments and better patient outcomes.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"1690-1703"},"PeriodicalIF":0.0,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12092882/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142485256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prediction of Malignancy and Pathological Types of Solid Lung Nodules on CT Scans Using a Volumetric SWIN Transformer.","authors":"Huicong Chen, Yanhua Wen, Wensheng Wu, Yingying Zhang, Xiaohuan Pan, Yubao Guan, Dajiang Qin","doi":"10.1007/s10278-024-01090-1","DOIUrl":"10.1007/s10278-024-01090-1","url":null,"abstract":"<p><p>Lung adenocarcinoma and squamous cell carcinoma are the two most common pathological lung cancer subtypes. Accurate diagnosis and pathological subtyping are crucial for lung cancer treatment. Solitary solid lung nodules with lobulation and spiculation signs are often indicative of lung cancer; however, in some cases, postoperative pathology finds benign solid lung nodules. It is critical to accurately identify solid lung nodules with lobulation and spiculation signs before surgery; however, traditional diagnostic imaging is prone to misdiagnosis, and studies on artificial intelligence-assisted diagnosis are few. Therefore, we introduce a volumetric SWIN Transformer-based method. It is a multi-scale, multi-task, and highly interpretable model for distinguishing between benign solid lung nodules with lobulation and spiculation signs, lung adenocarcinomas, and lung squamous cell carcinoma. The technique's effectiveness was improved by using 3-dimensional (3D) computed tomography (CT) images instead of conventional 2-dimensional (2D) images to combine as much information as possible. The model was trained using 352 of the 441 CT image sequences and validated using the rest. The experimental results showed that our model could accurately differentiate between benign lung nodules with lobulation and spiculation signs, lung adenocarcinoma, and squamous cell carcinoma. On the test set, our model achieves an accuracy of 0.9888, precision of 0.9892, recall of 0.9888, and an F1-score of 0.9888, along with a class activation mapping (CAM) visualization of the 3D model. Consequently, our method could be used as a preoperative tool to assist in diagnosing solitary solid lung nodules with lobulation and spiculation signs accurately and provide a theoretical basis for developing appropriate clinical diagnosis and treatment plans for the patients.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"1509-1517"},"PeriodicalIF":0.0,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12092873/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142485259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep Learning Segmentation of Chromogenic Dye RNAscope From Breast Cancer Tissue
Andrew Davidson, Arthur Morley-Bunker, George Wiggins, Logan Walker, Gavin Harris, Ramakrishnan Mukundan, kConFab Investigators
Journal of Imaging Informatics in Medicine, 2025-06-01, pp. 1704-1721. DOI: 10.1007/s10278-024-01301-9. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12092323/pdf/

Abstract: RNAscope staining of breast cancer tissue allows pathologists to deduce genetic characteristics of the cancer by inspection at the microscopic level, which can lead to better diagnosis and treatment. Chromogenic RNAscope staining is easy to fit into existing pathology workflows, but manually analyzing the resulting tissue samples is time-consuming, and there is a lack of peer-reviewed, performant solutions for automated analysis of chromogenic RNAscope staining. This paper covers the development and optimization of a novel deep learning method focused on accurate segmentation of RNAscope dots (which signify gene expression) from breast cancer tissue. The deep learning network is convolutional and uses ConvNeXt as its backbone. The upscaling portions of the network use custom, heavily regularized blocks to prevent overfitting and early convergence on suboptimal solutions. The resulting network is modest in size for a segmentation network and functions well with little training data. The network was also able to outperform manual expert annotation at finding the positions of RNAscope dots, with a final F1-score of 0.745; in comparison, the expert inter-rater F1-score was 0.596.
A Multi-model Deep Learning Architecture for Diagnosing Multi-class Skin Diseases
Mohamed Badr, Abdullah Elkasaby, Mohammed Alrahmawy, Sara El-Metwally
Journal of Imaging Informatics in Medicine, 2025-06-01, pp. 1776-1795. DOI: 10.1007/s10278-024-01300-w. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12092911/pdf/

Abstract: Skin diseases are a significant global public health concern, affecting 21-85% of the world's population, particularly those in low- and middle-income countries. Accurate and timely diagnosis is crucial for effective treatment and improved patient outcomes. This study introduces a novel multi-model deep learning architecture designed for high-precision skin disease diagnosis. The system employs a five-category Xception model to classify skin lesions into five classes: Atopic Dermatitis, Acne and Rosacea, Skin Cancer, Bullous, and Others. Trained on 25,010 images, the model achieved 95% accuracy and an AUROC of 99.4%. To further enhance accuracy, transfer learning was applied to build specialized models for each class, with strong performance across 40 skin conditions. Specifically, the Acne and Rosacea model achieved 90.0% accuracy, 90.7% precision, 90.1% recall, a 90.2% F1-score, and a 99.0% AUROC. The Skin Cancer model demonstrated 94.0% accuracy, 94.8% precision, 94.2% recall, a 94.1% F1-score, and a 99.5% AUROC. The Atopic Dermatitis model reported 91.8% accuracy, 92.2% precision, 91.8% recall, a 91.9% F1-score, and a 98.8% AUROC. Finally, the Bullous model showed 90.0% accuracy, 90.6% precision, 90.0% recall, a 90.0% F1-score, and a 98.9% AUROC. This approach surpasses previous studies, offering a more comprehensive diagnostic tool for skin diseases. To facilitate reproducibility, the training and testing code is available at the GitHub repository (https://github.com/SaraEl-Metwally/A-Multi-Model-Deep-Learning-for-Diagnosing-Skin-Diseases).
{"title":"Fusion of Texture Features Applied to H. pylori Infection Classification from Histopathological Images.","authors":"André Ricardo Backes","doi":"10.1007/s10278-025-01562-y","DOIUrl":"https://doi.org/10.1007/s10278-025-01562-y","url":null,"abstract":"<p><p>Helicobacter pylori (H. pylori) is a globally prevalent pathogenic bacterium. It affects over 4 billion people worldwide and contributes to many gastric diseases such as gastritis, peptic ulcers, and cancer. Its diagnosis traditionally relies on histopathological analysis of endoscopic biopsies by trained pathologists. It is a labor-intensive and time-consuming process that risks overlooking small bacterial populations. Another limiting factor is the cost, which can vary from a few dozen to hundreds of dollars. In order to automate this process, our study evaluated the potential of various texture features for binary classification of 204 histopathological images (H. pylori-positive and H. pylori-negative cases). Texture is an important attribute and describes the appearance of a surface based on its composition and structure. In our study, we discarded the color information present in the samples and computed texture features from various methods, selected based on their performance, novelty, and ability to highlight different aspects of the image. We also investigated how the combination of these features, performed by the application of Particle Swarm Optimization (PSO) algorithm, impact on the performance of classification. Results demonstrated that well known texture analysis methods are still competitive in terms of performance, obtaining the highest accuracy (94.61%) and F1-score (94.47%), suggesting a robust balance between precision and recall, surpassing state-of-the-art techniques such as ResNet-101 by a margin of 4.41%.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144188712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Wavelet Transform and Hierarchical Hybrid Matching for Enhancing End-to-End Pediatric Wrist Fracture Detection.","authors":"Bin Yan, Yuliang Zhang, Qiuming He","doi":"10.1007/s10278-025-01512-8","DOIUrl":"https://doi.org/10.1007/s10278-025-01512-8","url":null,"abstract":"<p><p>With the increasing frequency of daily physical activities among children and adolescents, the incidence of wrist fractures has been rising annually. Without precise and prompt diagnosis, these fractures may remain undetected, potentially leading to complications. Recent advancements in computer-aided diagnosis (CAD) technologies have facilitated the development of sophisticated diagnostic tools, which significantly improve the accuracy of fracture detection. To enhance the capability of detecting pediatric wrist fractures, this study presents the WH-DETR model, specifically designed for pediatric wrist fracture detection. WH-DETR is configured as a DEtection TRansformer framework, an end-to-end object detection algorithm that obviates the need for non-maximum suppression post-processing. To further enhance its performance, this study first introduces a wavelet transform projection module to capture different frequency features from the feature maps extracted by the backbone. This module allows the network to effectively capture multi-scale and multi-frequency information, improving the detection of subtle and complex features in medical images. Secondly, this study designs a hierarchical hybrid matching framework that decouples the prediction tasks of different decoder layers during training, thereby improving the final predictive capabilities of the model. The framework improves prediction robustness while maintaining inference efficiency. Extensive experiments on the GRAZPEDWRI-DX dataset demonstrate that our WH-DETR model achieves state-of-the-art performance with only 43 M parameters, attaining an <math><msub><mtext>mAP</mtext> <mn>50</mn></msub> </math> score of 68.8%, an <math><msub><mtext>mAP</mtext> <mrow><mn>50</mn> <mo>-</mo> <mn>90</mn></mrow> </msub> </math> score of 48.3%, and an F1 score of 64.1%. These results represent improvements of 1.78% in <math><msub><mtext>mAP</mtext> <mn>50</mn></msub> </math> , 1.69% in <math><msub><mtext>mAP</mtext> <mrow><mn>50</mn> <mo>-</mo> <mn>90</mn></mrow> </msub> </math> , and 1.75% in F1 score, respectively, over the next best-performing model, highlighting its superior efficiency and robustness in pediatric wrist fracture detection.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144188713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}