Latest articles in Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention

RT-GAN: Recurrent Temporal GAN for Adding Lightweight Temporal Consistency to Frame-Based Domain Translation Approaches.
Shawn Mathew, Saad Nadeem, Arie Kaufman
DOI: 10.1007/978-3-032-05127-1_43 | Vol. 15969, pp. 446-455 | Published: 2026-01-01
Abstract: Fourteen million colonoscopies are performed annually in the U.S. alone. However, the videos from these colonoscopies are not saved due to storage constraints (each video from a high-definition colonoscope camera can run to tens of gigabytes). Instead, a few relevant individual frames are saved for documentation/reporting purposes, and these are the frames on which most current colonoscopy AI models are trained. While developing new unsupervised domain translation methods for colonoscopy (e.g., to translate between real optical and virtual/CT colonoscopy), it is thus typical to start with approaches that initially work for individual frames without temporal consistency. Once an individual-frame model has been finalized, additional contiguous frames are added with a modified deep learning architecture to train a new model from scratch for temporal consistency. This transition to temporally consistent deep learning models, however, requires significantly more computational and memory resources for training. In this paper, we present a lightweight solution with a tunable temporal parameter, RT-GAN (Recurrent Temporal GAN), for adding temporal consistency to individual frame-based approaches that reduces training requirements by a factor of 5. We demonstrate the effectiveness of our approach on two challenging use cases in colonoscopy: haustral fold segmentation (indicative of missed surface) and realistic colonoscopy simulator video generation. We also release a first-of-its-kind temporal dataset for colonoscopy for the above use cases. The datasets, accompanying code, and pretrained models will be made available on our Computational Endoscopy Platform GitHub (https://github.com/nadeemlab/CEP).
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12906703/pdf/
Citations: 0
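The abstract's key mechanism is a recurrent wrapper around a frozen frame-based translator, with a tunable weight trading per-frame fidelity against temporal consistency. Below is a minimal PyTorch sketch of that idea; the refiner architecture, the loss form, and all names are our illustrative assumptions, not the released RT-GAN code (see the CEP repository for that).

```python
# Minimal sketch: a lightweight recurrent refiner on top of a frozen
# per-frame translator, with a tunable temporal weight `lam`.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RecurrentRefiner(nn.Module):
    """Conditions the current translated frame on the previous output."""
    def __init__(self, channels=3, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2 * channels, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, channels, 3, padding=1),
        )

    def forward(self, translated_t, prev_out):
        return self.net(torch.cat([translated_t, prev_out], dim=1))

def clip_loss(frames, refiner, frame_model, lam=0.5):
    """frames: (T, B, C, H, W) clip; frame_model: frozen frame-based translator."""
    prev = torch.zeros_like(frames[0])
    fidelity = temporal = 0.0
    for t in range(frames.shape[0]):
        with torch.no_grad():                             # baseline stays frozen
            translated = frame_model(frames[t])
        out = refiner(translated, prev)
        fidelity = fidelity + F.l1_loss(out, translated)  # match baseline per frame
        temporal = temporal + F.l1_loss(out, prev)        # change smoothly over time
        prev = out.detach()
    return fidelity + lam * temporal
```

Raising `lam` favors smoother videos at the cost of per-frame agreement with the baseline translator.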
Unpaired Multi-Site Brain MRI Harmonization with Image Style-Guided Latent Diffusion.
Mengqi Wu, Minhui Yu, Weili Lin, Pew-Thian Yap, Mingxia Liu
DOI: 10.1007/978-3-032-04947-6_65 | Vol. 15962, pp. 683-693 | Published: 2026-01-01
Abstract: Multi-site brain MRI heterogeneity caused by differences in scanner field strengths, acquisition protocols, and software versions poses a significant challenge for consistent analysis. Image-level harmonization, leveraging advanced learning methods, has attracted increasing attention. However, existing methods often rely on paired data (e.g., human traveling phantoms) for training, which are not always available. Some methods perform MRI harmonization by transferring target-style features to source images but require explicitly learning disentangled image styles (e.g., contrast) via encoder-decoder networks, which increases computational complexity. This paper presents an unpaired MRI harmonization (UMH) framework based on a new image style-guided diffusion model. UMH operates in two stages: (1) a coarse harmonizer that aligns multi-site MRIs to a unified domain via a conditional latent diffusion model while preserving anatomical content; and (2) a fine harmonizer that adapts coarsely harmonized images to a specific target using style embeddings derived from a pre-trained Contrastive Language-Image Pre-training (CLIP) encoder, which captures semantic style differences between the original MRIs and their coarsely-aligned counterparts, eliminating the need for paired data. By leveraging rich semantic style representations of CLIP, UMH avoids learning image styles explicitly, thereby reducing computation costs. We evaluate UMH on 4,123 MRIs from three distinct multi-site datasets, with results suggesting its superiority over several state-of-the-art (SOTA) methods across image-level comparison, downstream classification, and brain tissue segmentation tasks.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12706746/pdf/
Citations: 0
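The fine-harmonization stage hinges on CLIP-derived style embeddings that contrast an original MRI with its coarsely aligned counterpart. A sketch of that embedding step follows, using the Hugging Face transformers CLIP vision encoder; taking the embedding difference as the style signal is our assumption of one plausible formulation, not the paper's exact conditioning.

```python
# Sketch: derive a semantic "style" vector from a frozen CLIP image encoder
# by contrasting an original MRI slice with its coarsely harmonized version.
import torch
from transformers import CLIPImageProcessor, CLIPVisionModel

processor = CLIPImageProcessor.from_pretrained("openai/clip-vit-base-patch32")
encoder = CLIPVisionModel.from_pretrained("openai/clip-vit-base-patch32").eval()

@torch.no_grad()
def style_embedding(original_img, coarse_img):
    """Inputs are PIL images (e.g., MRI slices rendered to RGB)."""
    batch = processor(images=[original_img, coarse_img], return_tensors="pt")
    feats = encoder(**batch).pooler_output      # (2, D) pooled CLIP features
    return feats[1] - feats[0]                  # style shift used for conditioning
```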
FluoroSAM: A Language-promptable Foundation Model for Flexible X-ray Image Segmentation.
Benjamin D Killeen, Liam J Wang, Blanca Iñígo, Han Zhang, Mehran Armand, Russell H Taylor, Greg Osgood, Mathias Unberath
DOI: 10.1007/978-3-032-04981-0_24 | Vol. 15966, pp. 248-258 | Published: 2026-01-01
Abstract: Language-promptable X-ray image segmentation would enable greater flexibility for human-in-the-loop workflows in diagnostic and interventional precision medicine. Prior efforts have contributed task-specific models capable of solving problems within a narrow scope, but expanding to broader use requires additional data, annotations, and training time. Recently, language-aligned foundation models (LFMs), machine learning models trained on large amounts of highly variable image and text data thus enabling broad applicability, have emerged as promising tools for automated image analysis. Existing foundation models for medical image analysis focus on scenarios and modalities where large, richly annotated datasets are available. However, the X-ray imaging modality features highly variable image appearance and applications, from diagnostic chest X-rays to interventional fluoroscopy, with varying availability of data. To pave the way toward an LFM for comprehensive and language-aligned analysis of arbitrary medical X-ray images, we introduce FluoroSAM, a language-promptable variant of the Segment-Anything Model, trained from scratch on 3M synthetic X-ray images from a wide variety of human anatomies, imaging geometries, and viewing angles. These include pseudo-ground truth masks for 128 organ types and 464 tools with associated text descriptions. FluoroSAM is capable of segmenting myriad anatomical structures and tools based on natural language prompts, thanks to the novel incorporation of vector quantization (VQ) of text embeddings in the training process. We demonstrate FluoroSAM's performance quantitatively on real X-ray images and showcase on several applications how FluoroSAM is a key enabler for rich human-machine interaction in the X-ray image acquisition and analysis context. Code is available at https://github.com/arcadelab/fluorosam.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12822567/pdf/
Citations: 0
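The abstract credits language promptability to vector quantization (VQ) of text embeddings during training. The following is a generic straight-through VQ layer over prompt embeddings, a textbook construction we supply for illustration; the codebook size and dimensions are arbitrary, and FluoroSAM's actual integration may differ.

```python
# Generic straight-through vector quantization of text embeddings.
import torch
import torch.nn as nn

class TextVQ(nn.Module):
    def __init__(self, num_codes=512, dim=768):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)

    def forward(self, text_emb):                             # text_emb: (B, dim)
        dists = torch.cdist(text_emb, self.codebook.weight)  # (B, num_codes)
        idx = dists.argmin(dim=1)
        quantized = self.codebook(idx)
        # Straight-through estimator: gradients bypass the discrete choice.
        return text_emb + (quantized - text_emb).detach(), idx
```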
Core-Periphery Principle Guided State Space Model for Functional Connectome Classification.
Minheng Chen, Xiaowei Yu, Jing Zhang, Tong Chen, Chao Cao, Yan Zhuang, Yanjun Lyu, Lu Zhang, Tianming Liu, Dajiang Zhu
DOI: 10.1007/978-3-032-05162-2_23 | Vol. 15971, pp. 236-246 | Published: 2026-01-01
Abstract: Understanding the organization of human brain networks has become a central focus in neuroscience, particularly in the study of functional connectivity, which plays a crucial role in diagnosing neurological disorders. Advances in functional magnetic resonance imaging and machine learning techniques have significantly improved brain network analysis. However, traditional machine learning approaches struggle to capture the complex relationships between brain regions, while deep learning methods, particularly Transformer-based models, face computational challenges due to their quadratic complexity in long-sequence modeling. To address these limitations, we propose a Core-Periphery State-Space Model (CP-SSM), an innovative framework for functional connectome classification. Specifically, we introduce Mamba, a selective state-space model with linear complexity, to effectively capture long-range dependencies in functional brain networks. Furthermore, inspired by the core-periphery (CP) organization, a fundamental characteristic of brain networks that enhances efficient information transmission, we design CP-MoE, a CP-guided Mixture-of-Experts that improves the representation learning of brain connectivity patterns. We evaluate CP-SSM on two benchmark fMRI datasets: ABIDE and ADNI. Experimental results demonstrate that CP-SSM surpasses Transformer-based models in classification performance while significantly reducing computational complexity. These findings highlight the effectiveness and efficiency of CP-SSM in modeling brain functional connectivity, offering a promising direction for neuroimaging-based neurological disease diagnosis. Our code is available at https://github.com/m1nhengChen/cpssm.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12715851/pdf/
Citations: 0
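To make the core-periphery-guided Mixture-of-Experts idea concrete, here is one way a binary core mask over brain regions could shape expert routing: core regions densely blend all experts while peripheral regions use hard top-1 routing. This gating scheme and all shapes are our assumptions for illustration, not the paper's CP-MoE design.

```python
# Sketch: MoE whose routing is gated by a core-periphery mask over regions.
import torch
import torch.nn as nn

class CPMoE(nn.Module):
    def __init__(self, dim, core_mask, num_experts=4):
        super().__init__()
        self.experts = nn.ModuleList([nn.Linear(dim, dim) for _ in range(num_experts)])
        self.gate = nn.Linear(dim, num_experts)
        self.register_buffer("core_mask", core_mask)   # (N,) bool over brain regions

    def forward(self, x):                              # x: (B, N, dim) region features
        logits = self.gate(x)                          # (B, N, E)
        soft = logits.softmax(dim=-1)                  # dense mixing for core regions
        top1 = torch.zeros_like(soft).scatter_(
            -1, logits.argmax(-1, keepdim=True), 1.0)  # sparse routing elsewhere
        weights = torch.where(self.core_mask[None, :, None], soft, top1)
        expert_out = torch.stack([e(x) for e in self.experts], dim=-1)  # (B, N, dim, E)
        return (expert_out * weights.unsqueeze(2)).sum(dim=-1)
```

Usage would look like `CPMoE(64, core_mask=torch.zeros(200, dtype=torch.bool))` for 200 regions with the core set chosen from the network's CP decomposition.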
LNODE: Uncovering the Latent Dynamics of Aβ in Alzheimer's Disease.
Zheyu Wen, George Biros
DOI: 10.1007/978-3-032-05182-0_31 | Vol. 15974, pp. 313-322 | Published: 2026-01-01
Abstract: Aβ Positron Emission Tomography (PET) is often used to manage Alzheimer's disease (AD). To better understand Aβ progression, we introduce and evaluate a mathematical model that couples Aβ at parcellated gray matter regions. We term this model LNODE, for "latent network ordinary differential equations". At each region, we track normal Aβ, abnormal Aβ, and m latent states intended to capture unobservable mechanisms coupled to Aβ progression. LNODE is parameterized by subject-specific parameters and cohort parameters. We jointly invert for these parameters by fitting the model to Aβ-PET data from 585 subjects from the ADNI dataset. Although underparameterized, our model achieves population R² ≥ 98% compared to R² ≤ 60% when fitting without latent states. Furthermore, these preliminary results suggest the existence of different subtypes of Aβ progression.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12784421/pdf/
Citations: 0
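A toy version of the LNODE construction helps fix ideas: each region carries two observed Aβ states plus m latent states, coupled across regions through a learned connectivity matrix and integrated in time. The linear right-hand side and the forward-Euler integrator below are deliberate simplifications; the paper's parameterization (subject-specific plus cohort parameters) is not reproduced here.

```python
# Toy latent network ODE: per-region local dynamics plus network coupling.
import torch
import torch.nn as nn

class LatentNetworkODE(nn.Module):
    def __init__(self, n_regions, m_latent=2):
        super().__init__()
        self.f = nn.Linear(2 + m_latent, 2 + m_latent)           # local dynamics
        self.coupling = nn.Parameter(0.01 * torch.randn(n_regions, n_regions))

    def rhs(self, x):                     # x: (n_regions, 2 + m_latent)
        return self.f(x) + self.coupling @ x   # local term + spread along network

    def forward(self, x0, t_steps=100, dt=0.05):
        traj, x = [x0], x0
        for _ in range(t_steps):
            x = x + dt * self.rhs(x)      # forward Euler step
            traj.append(x)
        return torch.stack(traj)          # (t_steps + 1, n_regions, 2 + m_latent)
```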
Hallucination-Aware Multimodal Benchmark for Gastrointestinal Image Analysis with Large Vision-Language Models.
Bidur Khanal, Sandesh Pokhrel, Sanjay Bhandari, Ramesh Rana, Nikesh Shrestha, Ram B Gurung, Cristian Linte, Angus Watson, Yash R Shrestha, Binod Bhattarai
DOI: 10.1007/978-3-032-05127-1_23 | Vol. 15969, pp. 235-245 | Published: 2026-01-01
Abstract: Vision-Language Models (VLMs) are becoming increasingly popular in the medical domain, bridging the gap between medical images and clinical language. Existing VLMs demonstrate an impressive ability to comprehend medical images and text queries to generate detailed, descriptive diagnostic medical reports. However, hallucination, the tendency to generate descriptions that are inconsistent with the visual content, remains a significant issue in VLMs, with particularly severe implications in the medical field. To facilitate VLM research on gastrointestinal (GI) image analysis and to study hallucination, we curate a multimodal image-text GI dataset: Gut-VLM. This dataset is created using a two-stage pipeline: first, descriptive medical reports of Kvasir-v2 images are generated using ChatGPT, which introduces some hallucinated or incorrect texts. In the second stage, medical experts systematically review these reports, identifying and correcting potential inaccuracies to ensure high-quality, clinically reliable annotations. Unlike traditional datasets that contain only descriptive texts, our dataset also features tags identifying hallucinated sentences and their corresponding corrections. A common approach to reducing hallucination in VLMs is to finetune the model on a small-scale, problem-specific dataset. However, we take a different strategy using our dataset. Instead of finetuning the VLM solely for generating textual reports, we finetune it to detect and correct hallucinations, an approach we call hallucination-aware finetuning. Our results show that this approach is better than simply finetuning for descriptive report generation. Additionally, we conduct an extensive evaluation of state-of-the-art VLMs across several metrics, establishing a benchmark.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13007979/pdf/
Citations: 0
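The dataset's distinguishing feature is per-sentence hallucination tags paired with expert corrections. A hypothetical record illustrating such a schema is shown below; the field names and file path are ours, not the published Gut-VLM format.

```python
# Hypothetical Gut-VLM-style record: report sentences tagged as hallucinated,
# each carrying an expert correction where applicable.
example_record = {
    "image": "kvasir_v2/polyps/0042.jpg",            # illustrative path
    "report_sentences": [
        {"text": "A sessile polyp is visible in the sigmoid colon.",
         "hallucinated": False, "correction": None},
        {"text": "Active bleeding is present at the polyp base.",
         "hallucinated": True,
         "correction": "No active bleeding is visible."},
    ],
}
```

Records of this shape support both of the paper's finetuning targets: generating the corrected report, and detecting which sentences are hallucinated.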
Geometry-Guided Local Alignment for Multi-View Visual Language Pre-Training in Mammography.
Yuexi Du, Lihui Chen, Nicha C Dvornek
DOI: 10.1007/978-3-032-04978-0_29 | Vol. 15965, pp. 299-310 | Published: 2026-01-01
Abstract: Mammography screening is an essential tool for early detection of breast cancer. The speed and accuracy of mammography interpretation have the potential to be improved with deep learning methods. However, the development of a foundation visual language model (VLM) is hindered by limited data and domain differences between natural and medical images. Existing mammography VLMs, adapted from natural images, often ignore domain-specific characteristics, such as multi-view relationships in mammography. Unlike radiologists, who analyze both views together to process ipsilateral correspondence, current methods treat them as independent images or do not properly model multi-view correspondence, losing critical geometric context and resulting in suboptimal prediction. We propose GLAM: Global and Local Alignment for Multi-view mammography for VLM pretraining using geometry guidance. By leveraging prior knowledge about the multi-view imaging process of mammograms, our model learns local cross-view alignments and fine-grained local features through joint global and local, visual-visual, and visual-language contrastive learning. Pretrained on EMBED [14], one of the largest open mammography datasets, our model outperforms baselines across multiple datasets under different settings.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13127458/pdf/
Citations: 0
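The joint global-and-local contrastive objective can be sketched as two InfoNCE terms: one over whole-view (CC/MLO) embeddings and one over geometry-matched local patch embeddings. How patches are matched by the geometry guidance is assumed given here, and the weighting and temperature are illustrative; this is not the paper's full loss, which also includes visual-language terms.

```python
# Sketch: joint global + local contrastive loss over paired view embeddings.
import torch
import torch.nn.functional as F

def info_nce(a, b, tau=0.07):
    """InfoNCE over row-paired embeddings a and b."""
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    logits = a @ b.t() / tau                       # (B, B) cosine similarities
    targets = torch.arange(a.shape[0], device=a.device)
    return F.cross_entropy(logits, targets)

def glam_style_loss(global_cc, global_mlo, local_cc, local_mlo, w_local=1.0):
    """global_*: (B, D) whole-view embeddings; local_*: (M, D) matched patch pairs."""
    return info_nce(global_cc, global_mlo) + w_local * info_nce(local_cc, local_mlo)
```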
WASABI: A Metric for Evaluating Morphometric Plausibility of Synthetic Brain MRIs.
Bahram Jafrasteh, Wei Peng, Cheng Wan, Yimin Luo, Ehsan Adeli, Qingyu Zhao
DOI: 10.1007/978-3-032-04937-7_65 | Vol. 15961, pp. 684-694 | Published: 2026-01-01
Abstract: Generative models enhance neuroimaging through data augmentation, quality improvement, and rare-condition studies. Despite advances in generating realistic synthetic MRIs, existing evaluations focus on texture and perceptual quality, lacking sensitivity to the crucial aspect of morphometric fidelity. This study proposes a new metric, called WASABI (Wasserstein-Based Anatomical Brain Index), to assess the morphometric plausibility of synthetic brain MRIs. WASABI leverages SynthSeg, a deep learning-based brain parcellation tool, to derive volumetric measures of brain regions in each MRI and uses the multivariate Wasserstein distance to compare distributions between real and synthetic anatomies. Based on controlled experiments on two real datasets and synthetic MRIs from five generative models, WASABI demonstrates higher sensitivity in quantifying morphometric discrepancies compared to traditional image-level metrics, even when synthetic images achieve near-perfect visual quality. Our findings advocate for shifting the evaluation paradigm beyond visual inspection and conventional metrics, emphasizing morphometric fidelity as a crucial benchmark for clinically meaningful brain MRI synthesis. Our code is available at https://github.com/BahramJafrasteh/wasabi-mri.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13102318/pdf/
Citations: 0
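A sketch of the metric's computation using the POT optimal-transport library (pip install POT): SynthSeg-style regional volume matrices for real and synthetic cohorts are compared via an empirical multivariate Wasserstein distance. The per-region z-scoring and the exact-OT estimator are our assumptions of one reasonable implementation; consult the wasabi-mri repository for the authors' version.

```python
# Sketch: multivariate Wasserstein distance between regional-volume cohorts.
import numpy as np
import ot  # POT: Python Optimal Transport

def wasabi(real_vols, synth_vols):
    """real_vols, synth_vols: (n_subjects, n_regions) regional volume matrices."""
    # z-score each region using the real cohort's statistics
    mu, sd = real_vols.mean(0), real_vols.std(0) + 1e-8
    X, Y = (real_vols - mu) / sd, (synth_vols - mu) / sd
    M = ot.dist(X, Y, metric="euclidean")          # pairwise ground cost
    a = np.full(len(X), 1.0 / len(X))              # uniform empirical weights
    b = np.full(len(Y), 1.0 / len(Y))
    return ot.emd2(a, b, M)                        # exact optimal-transport cost
```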
Robust Fetal Pose Estimation across Gestational Ages via Cross-Population Augmentation.
Sebastian Diaz, Benjamin Billot, Neel Dey, Molin Zhang, Esra Abaci Turk, P Ellen Grant, Polina Golland, Elfar Adalsteinsson
DOI: 10.1007/978-3-032-04981-0_52 | Vol. 15966, pp. 549-559 | Published: 2026-01-01
Abstract: Fetal motion is a critical indicator of neurological development and intrauterine health, yet its quantification remains challenging, particularly at earlier gestational ages (GA). Current methods track fetal motion by predicting the location of annotated landmarks on 3D echo planar imaging (EPI) time-series, primarily in third-trimester fetuses. The predicted landmarks enable simplification of the fetal body for downstream analysis. While these methods perform well within their training age distribution, they consistently fail to generalize to early GAs due to significant anatomical changes in both mother and fetus across gestation, as well as the difficulty of obtaining annotated early GA EPI data. In this work, we develop a cross-population data augmentation framework that enables pose estimation models to robustly generalize to younger GA clinical cohorts using only annotated images from older GA cohorts. Specifically, we introduce a fetal-specific augmentation strategy that simulates the distinct intrauterine environment and fetal positioning of early GAs. Our experiments find that cross-population augmentation yields reduced variability and significant improvements across both older GA and challenging early GA cases. By enabling more reliable pose estimation across gestation, our work potentially facilitates early clinical detection and intervention in challenging 4D fetal imaging settings. Code is available at https://github.com/sebodiaz/cross-population-pose.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13021261/pdf/
Citations: 0
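One concrete instance of a fetal-specific cross-population augmentation is shrinking an annotated older-GA fetus within its field of view to mimic the smaller early-GA anatomy, transforming keypoints consistently. The transform family and scale range below are illustrative assumptions, not the paper's exact strategy.

```python
# Sketch: simulate an early-GA appearance by isotropically shrinking the
# volume content (zeros fill the freed space) and scaling keypoints to match.
import torch
import torch.nn.functional as F

def simulate_early_ga(volume, keypoints, scale_range=(0.6, 0.85)):
    """volume: (1, 1, D, H, W); keypoints: (K, 3) in normalized [-1, 1] coords."""
    s = torch.empty(1).uniform_(*scale_range).item()
    inv = 1.0 / s                                   # grid maps output -> input coords
    theta = torch.tensor([[[inv, 0.0, 0.0, 0.0],
                           [0.0, inv, 0.0, 0.0],
                           [0.0, 0.0, inv, 0.0]]], dtype=volume.dtype)
    grid = F.affine_grid(theta, list(volume.shape), align_corners=False)
    shrunk = F.grid_sample(volume, grid, align_corners=False)  # zero padding outside
    return shrunk, keypoints * s                    # keypoints shrink with the fetus
```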
Fetuses Made Simple: Modeling and Tracking of Fetal Shape and Pose.
Yingcheng Liu, Peiqi Wang, Sebastian Diaz, Esra Abaci Turk, Benjamin Billot, P Ellen Grant, Polina Golland
DOI: 10.1007/978-3-032-05141-7_19 | Vol. 15970, pp. 189-198 | Published: 2026-01-01
Abstract: Analyzing fetal body motion and shape is paramount in prenatal diagnostics and monitoring. Existing methods for fetal MRI analysis mainly rely on anatomical keypoints or volumetric body segmentations. Keypoints simplify body structure to facilitate motion analysis, but may ignore important details of full-body shape. Body segmentations capture complete shape information but complicate temporal analysis due to large non-local fetal movements. To address these limitations, we construct a 3D articulated statistical fetal body model based on the Skinned Multi-Person Linear Model (SMPL). Our algorithm iteratively estimates body pose in the image space and body shape in the canonical pose space. This approach improves robustness to MRI motion artifacts and intensity distortions, and reduces the impact of incomplete surface observations due to challenging fetal poses. We train our model on segmentations and keypoints derived from 19,816 MRI volumes across 53 subjects. Our model captures body shape and motion across time series and provides intuitive visualization. Furthermore, it enables automated anthropometric measurements traditionally difficult to obtain from segmentations and keypoints. When tested on unseen fetal body shapes, our method yields a surface alignment error of 3.2 mm for 3 mm MRI voxel size. To our knowledge, this represents the first 3D articulated statistical fetal body model, paving the way for enhanced fetal motion and shape analysis in prenatal diagnostics. The code is available at https://github.com/MedicalVisionGroup/fetal-smpl.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13036607/pdf/
Citations: 0
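The abstract's alternating scheme, pose fitted per frame in image space and shape refined in the canonical pose space, can be schematized as below. The body_model interface (data_term, canonical_term) is a hypothetical placeholder standing in for the released fetal-smpl code, and the iteration counts and learning rate are arbitrary.

```python
# Schematic alternation: per-frame pose fitting, then shared shape refinement.
import torch

def fit_subject(body_model, observations, n_outer=5, lr=1e-2):
    shape = torch.zeros(10, requires_grad=True)                  # SMPL-style betas
    poses = [torch.zeros(24 * 3, requires_grad=True) for _ in observations]
    for _ in range(n_outer):
        # Step 1: fit each frame's pose in image space, shape frozen.
        opt = torch.optim.Adam(poses, lr=lr)
        for _ in range(50):
            opt.zero_grad()
            loss = sum(body_model.data_term(shape.detach(), p, obs)
                       for p, obs in zip(poses, observations))
            loss.backward(); opt.step()
        # Step 2: refine the shared shape in canonical pose space, poses frozen.
        opt = torch.optim.Adam([shape], lr=lr)
        for _ in range(50):
            opt.zero_grad()
            loss = sum(body_model.canonical_term(shape, p.detach(), obs)
                       for p, obs in zip(poses, observations))
            loss.backward(); opt.step()
    return shape, poses
```

Freezing one block of variables while optimizing the other mirrors the paper's stated benefit: shape is aggregated across all frames, so incomplete observations in any single challenging pose matter less.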