Md Motaleb Hossen Manik, Md Zabirul Islam, Ge Wang
"MILU: a consensus ensemble benchmark for multimodal medical imaging lecture understanding."
Journal of Medical Imaging 13(6):062202, November 2026. DOI: 10.1117/1.JMI.13.6.062202. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13082354/pdf/

Purpose: Vision-language models (VLMs) are increasingly used to interpret multimodal educational materials, yet their reliability on diagram-, equation-, and text-dense scientific lecture slides remains poorly understood. This work introduces Medical Imaging Lecture Understanding (MILU), a large-scale benchmark designed to characterize cross-model variability in structured understanding of real medical imaging lectures.

Approach: MILU includes 23 lecture sets with 1117 slides. LLaVA-OneVision, InternVL3-14B, Qwen2-VL-7B, and Qwen3-VL-4B were evaluated using unified prompts to generate structured JSON. We assessed parsing coverage, pairwise agreement, lecture-level patterns, and how well outputs aligned with a simple consensus ensemble that identifies concepts and relations shared across slides and models.

Results: All models produced valid JSON for most slides (92% to 99% coverage), but semantic agreement was extremely low. Pairwise concept Jaccard indices ranged from 0.03 to 0.09, and triple-level F1 scores from 0.001 to 0.033. Lecture-level patterns revealed higher stability in mathematically structured lectures and lower stability in diagram-heavy content. The consensus ensemble showed modest alignment with individual models (concept Jaccard 0.056 to 0.179; triple F1 0.014 to 0.044), exposing areas of consistent convergence while also highlighting systematic disagreement.

Conclusions: MILU provides the first comprehensive benchmark for evaluating structured understanding of scientific lecture slides. The results show that current VLMs achieve high formatting reliability but low semantic consistency. MILU establishes a foundation for future expert-annotated benchmarks, diagram- and math-aware modeling, and improved methods for scientific lecture interpretation.
Charles Guan, Alexander P Rockhill, Masashi Sode, Gianmarco Pinton
"mach: ultrafast ultrasound beamforming."
Journal of Medical Imaging 13(6):062203, November 2026. DOI: 10.1117/1.JMI.13.6.062203. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13053060/pdf/

Purpose: Volumetric ultrafast ultrasound produces massive datasets with high frame rates, dense reconstruction grids, and large channel counts. The computational demands of beamforming limit research throughput and prevent real-time applications in emerging modalities such as elastography, functional neuroimaging, and microscopy.

Approach: We developed mach, an open-source, GPU-accelerated beamformer with a highly optimized delay-and-sum CUDA kernel and an accessible Python interface. mach uses a hybrid delay computation strategy that substantially reduces memory overhead compared with fully precomputed approaches. The CUDA implementation optimizes memory layout for coalesced access and reuses delay computations across frames via shared memory. We benchmarked mach on the PyMUST rotating disk dataset and validated numerical accuracy against existing open-source beamformers.

Results: mach processes 1.1 trillion points per second on a consumer-grade GPU, achieving >10× faster performance than existing open-source GPU beamformers. On the PyMUST rotating disk benchmark, mach completes reconstruction in 0.23 ms, 6× faster than the acoustic round-trip time to the imaging depth. Validation against other beamformers confirms numerical accuracy, with errors below −60 dB for power Doppler and −120 dB for B-mode.

Conclusions: mach achieves 1.1 trillion points per second throughput, enabling real-time 3D ultrafast ultrasound reconstruction for the first time on consumer-grade hardware. By eliminating the beamforming bottleneck, mach enables real-time applications such as 3D functional neuroimaging, intraoperative guidance, and ultrasound localization microscopy. mach is freely available at https://github.com/Forest-Neurotech/mach.
Lena Giebeler, Deepa Krishnaswamy, David Clunie, Jakob Wasserthal, Lalith Kumar Shiyam Sundar, Andres Diaz-Pinto, Klaus H Maier-Hein, Murong Xu, Bjoern Menze, Steve Pieper, Ron Kikinis, Andrey Fedorov
"In search of truth: evaluating concordance of AI-based anatomy segmentation models."
Journal of Medical Imaging 13(6):062204, November 2026. DOI: 10.1117/1.JMI.13.6.062204. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13050620/pdf/

Purpose: Artificial-intelligence-based methods for anatomy segmentation can help automate the characterization of large imaging datasets. The growing number of models with similar functionality raises the challenge of evaluating them on datasets that lack ground-truth annotations. We introduce a practical framework to assist in this task.

Approach: We harmonize segmentation results into a standard, interoperable representation, which enables consistent, terminology-based labeling of the structures. We extend 3D Slicer to streamline loading and comparison of these harmonized segmentations and demonstrate how the standard representation simplifies review of the results through interactive summary plots and browser-based visualization with the OHIF Viewer. To demonstrate the utility of the approach, we apply it to the segmentation of 31 anatomical structures (lungs, vertebrae, ribs, and heart) by six open-source models (TotalSegmentator 1.5 and 2.6, Auto3DSeg, MOOSE, MultiTalent, and CADS) on a sample of computed tomography scans from the publicly available National Lung Screening Trial dataset.

Results: The framework automates loading, structure-wise inspection, and comparison across models. Preliminary results confirm its practical utility for quickly detecting and reviewing problematic results. The comparison shows excellent agreement for some structures (e.g., the lungs) but not all: some models produce invalid vertebra or rib segmentations.

Conclusions: The open-source resources developed include segmentation harmonization scripts, interactive summary plots, and visualization tools. These resources assist in segmentation model evaluation in the absence of ground truth, ultimately enabling informed model selection.
Yurim Lee, Maxwell J Kiernan, Carol C Mitchell, Shahriar Salamat, Stephanie M Wilbrand, Robert J Dempsey, Tomy Varghese
"Comparison of 2D and 3D carotid plaque analysis and longitudinal in vivo ultrasound registration using 3D histology."
Journal of Medical Imaging 13(2):027501, March 2026. DOI: 10.1117/1.JMI.13.2.027501. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12973674/pdf/

Purpose: Characterizing carotid plaque specimens from two-dimensional (2D) "representative" histology sections is standard clinical practice. Three-dimensional (3D) histology has the potential to provide far more useful, volumetric information about carotid plaques, but because it requires much more manual labor, it has seen little use in plaque characterization. Evaluating the representativeness of 2D histology and exploring clinical applications of 3D carotid plaque histology, particularly registration to and correlation with in vivo ultrasound, could be insightful.

Approach: Using 3D carotid plaque histology models, we evaluated the representativeness of 2D histology by comparing the specimen composition predicted from 2D histology with the actual composition measured from 3D histology. We introduce a workflow that properly orients 3D carotid plaque histology using transverse ultrasound and takes virtual histology slices at an angle to register histology to longitudinal ultrasound. We correlated 3D histology composition with in vivo ultrasound parameters such as strain and grayscale features.

Results: 2D histology successfully predicted specimen composition (to within 3%) for 11 of 34 specimens. The 2D representative-slice predictions generally overestimated calcification in more calcified specimens (≳30% calcified). Using 3D histology, we registered virtual histology to in vivo longitudinal B-mode and strain ultrasound. For B-mode, these registrations had a higher IoU with respect to the ultrasonographer's annotations (0.54 ± 0.05) than registrations based on conventional 2D histology (0.30 ± 0.08). 3D histology composition was only loosely related to the strain indices and grayscale features used in the study. In one case, hemorrhage corresponded to opposing strains.

Conclusions: 3D histology can aid carotid plaque characterization by enabling a better understanding of plaque composition and better registration of histology to in vivo ultrasound imaging.
{"title":"Accuracy and reliability of artery-vein differentiation in small-field macular OCT angiography.","authors":"Haneen Alfauri, Tugce Ilayda Turer, Cyriac Manjaly, Aditya Santoki, Senyue Hao, Marin Woronets, Chao Zhou, Rithwick Rajagopal","doi":"10.1117/1.JMI.13.2.025501","DOIUrl":"10.1117/1.JMI.13.2.025501","url":null,"abstract":"<p><strong>Purpose: </strong>Accurate artery-vein (AV) differentiation in small-field macular optical coherence tomography angiography (OCTA) remains challenging due to a lack of standardized guidelines. We propose and validate criteria for <math><mrow><mn>3</mn> <mo>×</mo> <mn>3</mn> <mtext> </mtext> <msup><mrow><mi>mm</mi></mrow> <mrow><mn>2</mn></mrow> </msup> </mrow> </math> ( <math><mrow><mn>10</mn> <mtext> </mtext> <mi>deg</mi> <mo>×</mo> <mtext> </mtext> <mn>10</mn> <mtext> </mtext> <mi>deg</mi></mrow> </math> on Spectralis; <math><mrow><mo>∼</mo> <mn>2.9</mn> <mo>×</mo> <mn>2.9</mn> <mtext> </mtext> <msup><mrow><mi>mm</mi></mrow> <mrow><mn>2</mn></mrow> </msup> </mrow> </math> ) macular scans.</p><p><strong>Approach: </strong>Small field-of-view (FOV) OCTA scans were analyzed using established AV criteria for large-field ( <math><mrow><mn>12</mn> <mo>×</mo> <mn>12</mn> <mtext> </mtext> <msup><mrow><mi>mm</mi></mrow> <mrow><mn>2</mn></mrow> </msup> </mrow> </math> ) OCTA, as applied by two masked readers and validated against color fundus photographs (CFPs) and near-infrared reflectance (NIR) images. Accuracy and reliability (Cohen's <math><mrow><mi>κ</mi></mrow> </math> ) were assessed. Pixel-level AV masks were annotated with a standardized threshold. Vessel diameters and intensities were compared within our dataset and in the publicly available OCTA-500 dataset to assess whether intrinsic vessel features support AV differentiation.</p><p><strong>Results: </strong>A total of 465 vessels from 20 healthy eyes were evaluated across 3 pseudo-branching orders using the criteria for OCTA. Annotators achieved high accuracy (95.1%, 92.3%) and strong intra/inter-rater reliability ( <math><mrow><mi>κ</mi> <mo>=</mo> <mn>0.84</mn></mrow> </math> ) with similarly high AV classification accuracy within pseudo-third-order vessels (97.15%). No significant AV diameter differences were observed in either dataset ( <math><mrow><mi>p</mi> <mo>=</mo> <mn>0.261</mn></mrow> </math> and 0.442). The mean intensity was similar in our dataset ( <math><mrow><mi>p</mi> <mo>=</mo> <mn>0.277</mn></mrow> </math> ; <math><mrow><mo>|</mo> <mi>Δ</mi> <mo>|</mo> <mo>=</mo> <mn>3.28</mn></mrow> </math> , 1.45% relative difference) but higher for veins in OCTA-500 ( <math><mrow><mi>p</mi> <mo><</mo> <mn>0.0001</mn></mrow> </math> ; <math><mrow><mo>|</mo> <mi>Δ</mi> <mo>|</mo> <mo>=</mo> <mn>3.42</mn></mrow> </math> , 1.63% relative difference).</p><p><strong>Conclusions: </strong>Accurate and reproducible AV labeling is feasible in <math><mrow><mn>3</mn> <mo>×</mo> <mn>3</mn> <mtext> </mtext> <msup><mrow><mi>mm</mi></mrow> <mrow><mn>2</mn></mrow> </msup> </mrow> </math> scans, with strong inter- and intra-rater agreement. Vessel diameter and intensity add limited value. 
NIR-based alignment of OCTA with CFP provides reliable ground truth, supporting consistent manual labeling and OCTA segmentation.</p>","PeriodicalId":47707,"journal":{"name":"Journal of Medical Imaging","volume":"13 2","pages":"025501"},"PeriodicalIF":1.7,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13004414/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147500362","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
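The reliability statistic reported above is Cohen's κ, which corrects raw agreement for chance. A minimal sketch over hypothetical artery/vein labels from two readers:

```python
from collections import Counter

def cohen_kappa(y1, y2):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y1)
    observed = sum(a == b for a, b in zip(y1, y2)) / n
    c1, c2 = Counter(y1), Counter(y2)
    # Chance agreement from the two raters' marginal label frequencies.
    expected = sum(c1[k] * c2[k] for k in set(c1) | set(c2)) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical artery (A) / vein (V) calls for ten vessels by two readers.
reader1 = ["A", "V", "A", "A", "V", "V", "A", "V", "A", "A"]
reader2 = ["A", "V", "A", "V", "V", "V", "A", "V", "A", "A"]
print(f"kappa = {cohen_kappa(reader1, reader2):.2f}")  # 0.80
```
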
Florian Goldmann, Michael Wels, Thomas Allmendinger, Manuela Goldmann, Ralf Gutjahr, Markus Jürgens, Jonas Neumann, Leonhard Rist, Karl Stierstorfer, Michael Sühling, Andreas Maier
"Challenging Hounsfield Unit cutoffs: spectral thresholding for synthetic coronary plaque phantoms on photon-counting CT."
Journal of Medical Imaging 13(2):024003, March 2026. DOI: 10.1117/1.JMI.13.2.024003. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13048716/pdf/

Purpose: To assess whether photon-counting computed tomography (PCCT) improves discrimination of vulnerable coronary soft-plaque components by extending one-dimensional Hounsfield unit (HU) thresholding to a simple, interpretable two-dimensional linear rule.

Approach: We generated a synthetic cohort of N = 225 coronary plaque phantoms with randomized anatomy, tissue composition (lipid-rich, fibrotic, calcified), and iodine concentrations. Ultra-high-resolution PCCT data were reconstructed into polychromatic T3D, high-energy-threshold, material-specific, and virtual monoenergetic images (VMIs). Voxel-wise logistic regression implemented single-image (1D) and dual-image (2D) decision rules; performance was assessed by the area under the receiver operating characteristic curve (ROC-AUC). Partial-volume behavior was quantified as correctness versus Euclidean distance to the nearest out-of-class voxel, using isotonic regression with a phantom-level bootstrap.

Results: Combining T3D with a low-keV VMI yielded the best separation of lipid-rich and fibrous soft-plaque subtypes. A 2D linear rule on T3D + VMI50 achieved AUC = 0.925 (95% CI: [0.912, 0.937]), exceeding 1D thresholding on T3D (AUC = 0.850; 95% CI: [0.821, 0.875]) and on VMI50 (AUC = 0.814; 95% CI: [0.780, 0.843]). Correctness increased with distance to the nearest out-of-class voxel and was ≥95% for voxels at distances D ≥ 0.28 mm (lipid-rich) and D ≥ 0.43 mm (fibrous) (lower 95% CI bounds: 0.20 and 0.41 mm). Accuracy degraded below these thresholds.

Conclusions: A transparent, affine 2D threshold that combines routinely reconstructed PCCT images improves voxel-wise discrimination of lipid-rich versus fibrous plaque over conventional HU binning, yielding higher AUCs with tighter 95% confidence intervals. The derived boundary-distance guidance indicates where voxel-level decisions remain reliable, supporting interpretable, clinically pragmatic plaque assessment.
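The core comparison, 1D HU thresholding versus a 2D linear rule on paired reconstructions, can be sketched with voxel-wise logistic regression. The Gaussian intensity distributions below are stand-ins, not the phantom data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
n = 5000
# Hypothetical HU values: fibrous voxels (class 0) first, then lipid-rich (class 1).
t3d = np.concatenate([rng.normal(70, 15, n), rng.normal(40, 15, n)])
vmi50 = np.concatenate([rng.normal(80, 25, n), rng.normal(45, 25, n)])
y = np.concatenate([np.zeros(n), np.ones(n)])

rules = {
    "1D, T3D only": t3d.reshape(-1, 1),
    "1D, VMI50 only": vmi50.reshape(-1, 1),
    "2D, T3D + VMI50": np.column_stack([t3d, vmi50]),  # affine 2D rule
}
for name, X in rules.items():
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    auc = roc_auc_score(y, clf.predict_proba(X)[:, 1])
    print(f"{name}: AUC = {auc:.3f}")  # the 2D rule beats either 1D cutoff
```
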
Yipeng Sun, Linda-Sophie Schneider, Siyuan Mei, Jinhua Wang, Ge Hu, Mingxuan Gu, Chengze Ye, Fabian Wagner, Lan Song, Siming Bayer, Andreas Maier
"Filter2Noise: a framework for interpretable and zero-shot low-dose CT image denoising."
Journal of Medical Imaging 13(2):024004, March 2026. DOI: 10.1117/1.JMI.13.2.024004. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13082356/pdf/

Purpose: Deep learning has achieved remarkable progress in low-dose computed tomography (LDCT) denoising; however, radiologists struggle to trust black-box models they cannot verify or control. Zero-shot methods eliminate training-data requirements but fail on the spatially correlated noise of computed tomography (CT). We demonstrate that a transparent mathematical operator, when made content-adaptive, can match deep learning performance while remaining fully interpretable.

Approach: We introduce Filter2Noise (F2N), which replaces conventional deep networks with an attention-guided bilateral filter that adapts to local anatomy. A lightweight attention module (3.6k parameters) predicts optimal filtering strategies for each image region by analyzing tissue type, texture, and noise characteristics. To enable robust learning from a single noisy image with correlated noise, we develop Euclidean local shuffle, which strategically disrupts noise correlations while preserving anatomical structure, and a multi-scale self-supervised loss that enforces consistency across resolutions.

Results: On the Mayo Clinic LDCT Grand Challenge, F2N achieves a 39.76 dB peak signal-to-noise ratio, outperforming the next-best zero-shot method by 1.88 dB while using 360× fewer parameters (3.6k versus 1.3M). Clinical validation on photon-counting CT demonstrates that F2N elevates low-dose images to full-dose quality (no statistically significant difference in contrast-to-noise ratio, p = 0.10). The learned filtering strategy is fully visualizable: parameter maps reveal content-aware behavior. Radiologists can interactively adjust these parameters post-training to refine denoising in diagnostically critical regions.

Conclusions: F2N reconciles competitive performance with complete interpretability and user control, providing radiologists with a verifiable tool that works across scanners and protocols without retraining.
Yannuo Wen, Kathleen M Curran, Xinzhu Wang, Nuala A Healy, John J Healy
"Synthesizing breast cancer ultrasound images from healthy samples using latent diffusion models."
Journal of Medical Imaging 13(2):024002, March 2026. DOI: 10.1117/1.JMI.13.2.024002. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12999972/pdf/

Purpose: Breast ultrasound is widely used for cancer screening, but data scarcity and annotation challenges hinder deep learning adoption. Synthetic image generation offers a promising way to enhance training datasets while preserving patient privacy. However, inadequate quality of synthesized images and the large amounts of data needed to train synthesis models remain significant problems.

Approach: We propose a three-stage latent diffusion model (LDM) workflow, enhanced by Vision Transformers and fine-tuned with low-rank adaptation, that synthesizes realistic malignant and benign breast ultrasound images directly from healthy samples while simultaneously generating accurate segmentation masks. The stage division significantly reduces the task complexity faced by any single synthesis model. Applied to the BUSI dataset (133 healthy, 487 benign, and 210 malignant images), the method generates synthetic cases of each tumor type.

Results: A ResNet101 classifier could not reliably distinguish synthetic from real images (AUC = 0.563), indicating high visual plausibility. Quantitative metrics confirmed strong fidelity: Fréchet inception distance = 15.2 and inception score = 1.79, indicating low distributional divergence in feature space and high similarity to real data. When used to train a U-Net segmentation model, the augmented dataset improved the F1-score from 0.870 to 0.896, demonstrating substantial gains in diagnostic accuracy.

Conclusions: These results show that the proposed three-stage LDM can generate high-quality, anatomically coherent breast cancer images from healthy controls, effectively alleviating data scarcity and enabling more robust training of medical AI systems without compromising clinical realism.
{"title":"Parameter-efficient deep-learning-based model for segmentation with radiomic feature extraction.","authors":"Daniel Sleiman, Navchetan Awasthi","doi":"10.1117/1.JMI.13.2.024502","DOIUrl":"https://doi.org/10.1117/1.JMI.13.2.024502","url":null,"abstract":"<p><strong>Purpose: </strong>Magnetic resonance imaging (MRI), particularly dynamic contrast-enhanced MRI (DCE-MRI), plays a vital role in breast cancer assessment by highlighting tumor regions. Accurate 3D segmentation of tumors can significantly aid in diagnosis, disease monitoring, and treatment planning. Current state-of-the-art models such as nnU-Net are computationally expensive, with high parameter counts and memory requirements. In this work, we propose a parameter-efficient convolutional neural network-based architecture tailored for breast tumor segmentation in DCE-MRI.</p><p><strong>Approach: </strong>The model integrates lightweight residual blocks into the SegResNet backbone and is trained on the first 3 DCE-MRI phases. We test the addition of the FRLoss from the HCMA-UNet model in the FRLoss ablation. Its encoder-decoder design also enables exploration of the treatment response prediction to neoadjuvant chemotherapy measured by the pathological complete response, using the output of the last encoder block. The last encoder block output is average-pooled and used as input to an XGBoost model with two estimators (max depth 5, learning rate 1).</p><p><strong>Results: </strong>Evaluated on the public MAMA-MIA dataset, our proposed model achieves comparable performance to nnU-Net and SegResNet, with a 0.99% higher Dice score than nnU-Net, while reducing parameter count by 91.5%, FLOPs by 85.05%, and memory usage by 31.94% compared with nnU-Net. Therefore, the proposed model is significantly more efficient than nnU-Net and also offers superior accuracy to the Mamba-based baseline, even though the Mamba baseline remains computationally lighter demonstrated by its faster inference speed and lower giga floating point operations per second (38 versus 87). The XGBoost model for treatment response prediction does not demonstrate competitive performance with a balanced accuracy of 57.2% and receiver operating characteristic-area under curve of 0.542.</p><p><strong>Conclusion: </strong>Our results demonstrate that parameter-efficient models can achieve competitive performance in DCE-MRI tumor segmentation.</p>","PeriodicalId":47707,"journal":{"name":"Journal of Medical Imaging","volume":"13 2","pages":"024502"},"PeriodicalIF":1.7,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13082353/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147700336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic dental crown generation with spatial constraint modeling.","authors":"Golriz Hosseinimanesh, Farida Cheriet, Ammar Alsheghri, Victoria-Mae Carrière, Julia Keren, Francois Guibault","doi":"10.1117/1.JMI.13.2.023501","DOIUrl":"https://doi.org/10.1117/1.JMI.13.2.023501","url":null,"abstract":"<p><strong>Purpose: </strong>Deep learning algorithms offer the potential to automate dental crown generation, reducing time-intensive manual design in dental laboratories. However, achieving crowns suitable for direct clinical use requires both geometric precision and functional accuracy to minimize post-generation adjustments. Current approaches focus primarily on shape completion without explicitly modeling critical spatial relationships, including margin line boundaries, occlusal contact patterns, and adjacent tooth interactions. These limitations result in generated crowns lacking spatial accuracy necessary for direct clinical application.</p><p><strong>Approach: </strong>We present a comprehensive framework employing transformer encoder-decoder architecture integrated with differentiable Poisson surface reconstruction for direct dental crown mesh generation. The framework incorporates two key innovations to address clinical limitations. First, margin line data is integrated as direct network input, concatenated with master and antagonist arch geometries, providing explicit boundary constraints during crown generation. Second, spatial constraint losses ensure anatomically valid relationships through antagonist interaction loss for proper occlusal contact patterns and intersection loss to prevent crown penetration into adjacent teeth.</p><p><strong>Results: </strong>The proposed framework achieves substantial improvements over existing state-of-the-art methods, with geometric accuracy gains ranging from 35.9% to 40.6% across evaluation metrics. Margin line integration yields a 31.2% improvement in geometric precision, with maximum boundary errors reduced from 1.37 to 0.74 mm and a 58.4% reduction in variability. Antagonist interaction loss provides 9.51% improvement in occlusal alignment, while intersection loss substantially reduces crown penetration into adjacent teeth.</p><p><strong>Conclusions: </strong>Substantial performance improvements validate the effectiveness of integrating spatial constraint modeling and direct margin line input into the generation process, establishing a foundation for clinical deployment of automated dental crown design systems.</p>","PeriodicalId":47707,"journal":{"name":"Journal of Medical Imaging","volume":"13 2","pages":"023501"},"PeriodicalIF":1.7,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13109996/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147785802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}