SurGrID: Controllable surgical simulation via Scene Graph to Image Diffusion
Yannik Frisch, Ssharvien Kumar Sivakumar, Çağhan Köksal, Elsa Böhm, Felix Wagner, Adrian Gericke, Ghazal Ghazaei, Anirban Mukhopadhyay
International Journal of Computer Assisted Radiology and Surgery, published online 2025-05-21. DOI: 10.1007/s11548-025-03397-y

Purpose: Surgical simulation offers a promising addition to conventional surgical training. However, available simulation tools lack photorealism and rely on hard-coded behaviour. Denoising Diffusion Models are a promising alternative for high-fidelity image synthesis, but existing state-of-the-art conditioning methods fall short in providing precise control or interactivity over the generated scenes.
Methods: We introduce SurGrID, a Scene Graph to Image Diffusion Model, allowing for controllable surgical scene synthesis by leveraging Scene Graphs. These graphs encode the spatial and semantic information of a surgical scene's components, which is then translated into an intermediate representation using our novel pre-training step that explicitly captures local and global information.
Results: Our proposed method improves the fidelity of generated images and their coherence with the graph input over the state of the art. Further, we demonstrate the simulation's realism and controllability in a user assessment study involving clinical experts.
Conclusion: Scene Graphs can be effectively used for precise and interactive conditioning of Denoising Diffusion Models for simulating surgical scenes, enabling high-fidelity and interactive control over the generated content.
Contrastive prototype federated learning against noisy labels in fetal standard plane detection
Maria Chiara Fiorentino, Giovanna Migliorelli, Francesca Pia Villani, Emanuele Frontoni, Sara Moccia
International Journal of Computer Assisted Radiology and Surgery, published online 2025-05-21. DOI: 10.1007/s11548-025-03400-6

Purpose: This study aims to improve federated learning (FL) for ultrasound fetal standard plane detection by addressing noisy labels and data size variability across decentralized clients. We propose a federated denoising framework leveraging prototypes from the largest dataset in the federation to refine noisy labels and enhance predictions in all clients while preserving privacy.
Methods: The proposed framework consists of two main steps. First, contrastive learning (SimCLR) is applied to the images of the largest client, generating robust embeddings. These embeddings are used to refine noisy labels in the same client by exploiting the latent space structure with a threshold-based k-nearest neighbors re-labeling strategy. Second, image prototypes computed from the embeddings with noise-free labels, along with the SimCLR-trained backbone, are shared with the smallest client to guide the FL process effectively, without requiring labels from the smallest client. To address possible image distribution shifts, an ensemble strategy is introduced, which uses a majority voting scheme to optimize label refinement in the smallest dataset while minimizing image discard.
Results: Our framework showed improved performance compared to traditional FL approaches in standard plane detection, achieving the highest mean F1-score across planes.
Conclusions: The proposed strategy effectively improves fetal standard plane detection by leveraging high-quality prototypes, enabling robust performance even with noisy labels and heterogeneous data sizes across clients, while preserving privacy.
A cooperatively controlled robotic system with active constraints for enhancing efficacy in bilateral sagittal split osteotomy
Jesse Haworth, Manish Sahu, Katherine Zhu, Jacob Hammond, Hisashi Ishida, Adnan Munawar, Robin Yang, Russell Taylor
International Journal of Computer Assisted Radiology and Surgery, published online 2025-05-21. DOI: 10.1007/s11548-025-03403-3

Purpose: Precise osteotomies are vital in maxillofacial procedures such as the bilateral sagittal split osteotomy (BSSO), where surgical accuracy and precision directly impact patient outcomes. Conventional freehand drilling can lead to unfavorable splits, negatively impacting surgical outcome.
Methods: This paper presents the development of a cooperatively controlled robot system designed to enhance the efficacy of osteotomies during BSSO. The system features two assistive modes for the execution of a patient-specific surgical plan: (1) a haptic guidance mode that helps the surgeon align the surgical drill with the planned cutting plane to improve the accuracy of the cut and (2) an active constraint mode that restricts deviations from the cutting plane to enhance precision during drilling. We validated the system through feasibility experiments involving 36 mandible phantoms and a cadaveric specimen, with a surgeon, a surgical resident, and a medical student performing osteotomies freehand and with robotic assistance. Additionally, NASA TLX surveys were conducted to assess the perceived ease of use of the robotic system.
Results: Compared to freehand methods, the robotic system improved the efficacy of the cut from 2.16 ± 0.98 to 0.71 ± 0.53 mm for the medical student, 1.74 ± 0.95 to 0.53 ± 0.35 mm for the resident, and 1.64 ± 0.85 to 0.63 ± 0.24 mm for the surgeon, while reducing the task load.
Conclusion: Our experimental results demonstrate that the proposed robotic system can enhance the precision of surgical drilling in the BSSO compared to a freehand approach. These findings indicate the potential of robotic systems to reduce errors and enhance patient outcomes in maxillofacial surgery.
Optimization of an artificial neural network for predicting stress in robot-assisted laparoscopic surgery based on EDA sensor data
Daniel Caballero, Manuel J Pérez-Salazar, Juan A Sánchez-Margallo, Francisco M Sánchez-Margallo
International Journal of Computer Assisted Radiology and Surgery, published online 2025-05-20. DOI: 10.1007/s11548-025-03399-w

Purpose: This study aims to optimize the tunable hyperparameters of a multilayer perceptron (MLP) setup. The optimization procedure is aimed at more accurately predicting potential health risks to the surgeon during robotic-assisted surgery (RAS).
Methods: Data related to physiological parameters (electrodermal activity (EDA), blood pressure and body temperature) were collected during twenty RAS sessions completed by nine surgeons with different levels of experience. Once the dataset was generated, two preprocessing techniques (scaling and normalization) were applied. These datasets were divided into two subsets: 80% of the data for training and cross-validation and 20% for testing. MLP was selected as the prediction technique. Three MLP hyperparameters were selected for optimization: number of epochs, learning rate and momentum. A central composite design (CCD) was applied, comprising a full factorial design with five center points, for a total of 31 combinations per dataset. Once the models were generated on the training dataset, the optimized models were selected and then validated on the cross-validation and test datasets.
Results: The optimized models were generated with an optimal number of epochs of 500; the most frequently selected learning rate was 0.01 and the most frequently selected momentum was 0.05. These results showed significant improvement for EDA (R² = 0.9722), blood pressure (R² = 0.9977) and body temperature (R² = 0.9941).
Conclusions: The MLP hyperparameters were successfully optimized, and the enhanced models were validated on the cross-validation and test datasets. This encourages optimizing other AI techniques that could improve results in clinical practice.
{"title":"SASVi: segment any surgical video.","authors":"Ssharvien Kumar Sivakumar, Yannik Frisch, Amin Ranem, Anirban Mukhopadhyay","doi":"10.1007/s11548-025-03408-y","DOIUrl":"https://doi.org/10.1007/s11548-025-03408-y","url":null,"abstract":"<p><strong>Purpose: </strong>Foundation models, trained on multitudes of public datasets, often require additional fine-tuning or re-prompting mechanisms to be applied to visually distinct target domains such as surgical videos. Further, without domain knowledge, they cannot model the specific semantics of the target domain. Hence, when applied to surgical video segmentation, they fail to generalise to sections where previously tracked objects leave the scene or new objects enter.</p><p><strong>Methods: </strong>We propose SASVi, a novel re-prompting mechanism based on a frame-wise object detection Overseer model, which is trained on a minimal amount of scarcely available annotations for the target domain. This model automatically re-prompts the foundation model SAM2 when the scene constellation changes, allowing for temporally smooth and complete segmentation of full surgical videos.</p><p><strong>Results: </strong>Re-prompting based on our Overseer model significantly improves the temporal consistency of surgical video segmentation compared to similar prompting techniques and especially frame-wise segmentation, which neglects temporal information, by at least 2.4%. Our proposed approach allows us to successfully deploy SAM2 to surgical videos, which we quantitatively and qualitatively demonstrate for three different cholecystectomy and cataract surgery datasets.</p><p><strong>Conclusion: </strong>SASVi can serve as a new baseline for smooth and temporally consistent segmentation of surgical videos with scarcely available annotation data. Our method allows us to leverage scarce annotations and obtain complete annotations for full videos of the large-scale counterpart datasets. We make those annotations publicly available, providing extensive annotation data for the future development of surgical data science models.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144112645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatic multi-view X-ray/CT registration using bone substructure contours
Roman Flepp, Leon Nissen, Bastian Sigrist, Arend Nieuwland, Nicola Cavalcanti, Philipp Fürnstahl, Thomas Dreher, Lilian Calvet
International Journal of Computer Assisted Radiology and Surgery, published online 2025-05-20. DOI: 10.1007/s11548-025-03391-4

Purpose: Accurate intraoperative X-ray/CT registration is essential for surgical navigation in orthopedic procedures. However, existing methods struggle to consistently achieve sub-millimeter accuracy, lack robustness under broad initial pose estimates, or require manual key-point annotations. This work addresses these challenges by proposing a novel multi-view X-ray/CT registration method for intraoperative bone registration.
Methods: The proposed registration method consists of a multi-view, contour-based iterative closest point (ICP) optimization. Unlike previous methods, which attempt to match bone contours across the entire silhouette in both imaging modalities, we focus on matching specific subcategories of contours corresponding to bone substructures. This reduces ambiguity in the ICP matches, resulting in a more robust and accurate registration solution. The approach requires only two X-ray images and operates fully automatically. Additionally, we contribute a dataset of five cadaveric specimens, including real X-ray images, X-ray image poses and the corresponding CT scans.
Results: The proposed registration method is evaluated on real X-ray images using the mean reprojection error (mRPD). The method consistently achieves sub-millimeter accuracy, with an mRPD of 0.67 mm compared to 5.35 mm for a commercial solution requiring manual intervention. Furthermore, being fully automatic, the method offers improved practical applicability.
Conclusion: Our method offers a practical, accurate, and efficient solution for multi-view X-ray/CT registration in orthopedic surgeries, which can be easily combined with tracking systems. By improving registration accuracy and minimizing manual intervention, it enhances intraoperative navigation, contributing to more accurate and effective surgical outcomes in computer-assisted surgery (CAS). The source code and the dataset are publicly available at: https://github.com/rflepp/MultiviewXrayCT-Registration
AI-assisted mesh generation for subject-specific modeling of facial soft tissues
Nathan Lampen, Xuanang Xu, Daeseung Kim, Tianshu Kuang, Jungwook Lee, Hannah H Deng, Jaime Gateno, Pingkun Yan
International Journal of Computer Assisted Radiology and Surgery, published online 2025-05-19. DOI: 10.1007/s11548-025-03419-9

Purpose: Simulation of reconstructive and cosmetic facial surgeries, such as orthognathic surgery, requires precise, patient-specific soft tissue meshes for outcome prediction. Conventional meshing methods rely on labor-intensive processes, including manual landmark digitization and mesh editing, and often lack point correspondence among subjects. These limitations reduce their efficiency, scalability, and utility in fast-paced clinical environments, highlighting the need for innovative and streamlined meshing techniques.
Methods: This study presents a novel AI-assisted mesh generation (AAMG) approach using Google MediaPipe for real-time facial landmark detection to automate the creation of volumetric meshes of facial soft tissues. By leveraging these landmarks as reference points, the AAMG method generates detailed meshes that accurately reflect individual facial anatomy without manual intervention. To evaluate performance, we compared our automated method with a clinically validated, expert-guided mesh generation (EGMG) method that relies on manual landmark digitization and mesh editing. Both methods were tested on a dataset of 29 subjects who had undergone orthognathic surgery.
Results: The AAMG method demonstrated high-quality metrics, with a mean Jacobian ratio of 0.83, skewness of 0.25, and an aspect ratio of 2.15, comparable to the EGMG method. Additionally, Chamfer distance analysis showed no significant differences affecting simulation performance between the two methods.
Conclusion: The proposed AI-assisted mesh generation method significantly reduces mesh generation time from several hours to under a minute, while maintaining comparable mesh quality and accuracy to a clinically validated, expert-guided mesh generation method. Our method ensures consistent subject-specific meshing by leveraging real-time landmark detection and automated interpolation, improving workflow efficiency for surgical planning.
Virtual fluoroscopy for interventional guidance using magnetic tracking
Shuwei Xing, Inaara Ahmed-Fazal, Utsav Pardasani, Uditha Jayarathne, Scott Illsley, Aaron Fenster, Terry M Peters, Elvis C S Chen
International Journal of Computer Assisted Radiology and Surgery, published online 2025-05-16. DOI: 10.1007/s11548-025-03395-0

Purpose: In conventional fluoroscopy-guided interventions, the 2D projective nature of X-ray imaging limits depth perception and leads to prolonged radiation exposure. Virtual fluoroscopy, combined with spatially tracked surgical instruments, is a promising strategy to mitigate these limitations. While magnetic tracking shows unique advantages, particularly in tracking flexible instruments, it remains under-explored due to interference from ferromagnetic materials in the C-arm room. This work proposes a virtual fluoroscopy workflow that effectively integrates magnetic tracking and demonstrates its clinical efficacy.
Methods: An automatic virtual fluoroscopy workflow was developed using a radiolucent tabletop field generator prototype. Specifically, we developed a fluoro-CT registration approach with automatic 2D-3D shared landmark correspondence to establish the C-arm-patient relationship, along with a general C-arm modelling approach to calculate desired poses and generate corresponding virtual fluoroscopic images.
Results: Testing on a dataset with views ranging from RAO 90° to LAO 90°, simulated fluoroscopic images showed visually imperceptible differences from the real ones, achieving a mean target projection distance error of 1.55 mm. An "endoleak" phantom insertion experiment highlighted the effectiveness of simulating multiplanar views with real-time instrument overlays, achieving a mean needle tip error of 3.42 mm.
Conclusions: Results demonstrated the efficacy of virtual fluoroscopy integrated with magnetic tracking, improving depth perception during navigation. The broad capture range of virtual fluoroscopy showed promise in improving the users' understanding of X-ray imaging principles, facilitating more efficient image acquisition.
A deep learning-based approach to automated rib fracture detection and CWIS classification
Victoria Marting, Noor Borren, Max R van Diepen, Esther M M van Lieshout, Mathieu M E Wijffels, Theo van Walsum
International Journal of Computer Assisted Radiology and Surgery, published online 2025-05-16. DOI: 10.1007/s11548-025-03390-5

Purpose: Trauma-induced rib fractures are a common injury. The number and characteristics of these fractures influence whether a patient is treated nonoperatively or surgically. Rib fractures are typically diagnosed using CT scans, yet 19.2-26.8% of fractures are still missed during assessment. Another challenge in managing rib fractures is the interobserver variability in their classification. The purpose of this study was to develop and assess an automated method that detects rib fractures in CT scans and classifies them according to the Chest Wall Injury Society (CWIS) classification.
Methods: 198 CT scans were collected, of which 170 were used for training and internal validation, and 28 for external validation. Fractures and their classifications were manually annotated in each of the scans. A detection and classification network was trained for each of the three components of the CWIS classification. In addition, a rib number labeling network was trained for obtaining the rib number of a fracture. Experiments were performed to assess the method performance.
Results: On the internal test set, the method achieved a detection sensitivity of 80%, at a precision of 87% and an F1-score of 83%, with a mean number of false positives per scan (FPPS) of 1.11. Classification sensitivity varied, with the lowest being 25% for complex fractures and the highest being 97% for posterior fractures. The correct rib number was assigned to 94% of the detected fractures. The custom-trained nnU-Net correctly labeled 95.5% of all ribs and 98.4% of fractured ribs in 30 patients. The detection and classification performance on the external validation dataset was slightly better, with a fracture detection sensitivity of 84%, precision of 85%, F1-score of 84%, FPPS of 0.96, and 95% of the fractures assigned the correct rib number.
Conclusion: The method developed is able to accurately detect and classify rib fractures in CT scans, although there is room for improvement for the rare and underrepresented classes in the training set.
Application of deep learning with fractal images to sparse-view CT
Ren Kawaguchi, Tomoya Minagawa, Kensuke Hori, Takeyuki Hashimoto
International Journal of Computer Assisted Radiology and Surgery, published online 2025-05-15. DOI: 10.1007/s11548-025-03378-1

Purpose: Deep learning has been widely used in research on sparse-view computed tomography (CT) image reconstruction. While sufficient training data can lead to high accuracy, collecting medical images is often challenging due to legal or ethical concerns, making it necessary to develop methods that perform well with limited data. To address this issue, we explored the use of nonmedical images for pre-training. In this study, we investigated whether fractal images could improve the quality of sparse-view CT images, even with a reduced number of medical images.
Methods: Fractal images generated by an iterated function system (IFS) were used as the nonmedical images, and medical images were obtained from the CHAOS dataset. Sinograms were then generated using 36 projections in the sparse-view setting, and the images were reconstructed by filtered back-projection (FBP). FBPConvNet and WNet (first module: learning fractal images; second module: testing medical images; third module: learning output) were used as networks. The effectiveness of pre-training was then investigated for each network. The quality of the reconstructed images was evaluated using two indices: structural similarity (SSIM) and peak signal-to-noise ratio (PSNR).
Results: The network parameters pre-trained with fractal images showed reduced artifacts compared to the network trained exclusively with medical images, resulting in improved SSIM. WNet outperformed FBPConvNet in terms of PSNR. Pre-training WNet with fractal images produced the best image quality, and the number of medical images required for main-training was reduced from 5000 to 1000 (an 80% reduction).
Conclusion: Using fractal images for network training can reduce the number of medical images required for artifact reduction in sparse-view CT. Therefore, fractal images can improve accuracy even with a limited amount of training data in deep learning.