Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention最新文献

筛选
英文 中文
Image2SSM: Reimagining Statistical Shape Models from Images with Radial Basis Functions. Image2SSM:利用径向基函数从图像中重塑统计形状模型。
Hong Xu, Shireen Y Elhabian
{"title":"Image2SSM: Reimagining Statistical Shape Models from Images with Radial Basis Functions.","authors":"Hong Xu, Shireen Y Elhabian","doi":"10.1007/978-3-031-43907-0_49","DOIUrl":"10.1007/978-3-031-43907-0_49","url":null,"abstract":"<p><p>Statistical shape modeling (SSM) is an essential tool for analyzing variations in anatomical morphology. In a typical SSM pipeline, 3D anatomical images, gone through segmentation and rigid registration, are represented using lower-dimensional shape features, on which statistical analysis can be performed. Various methods for constructing compact shape representations have been proposed, but they involve laborious and costly steps. We propose Image2SSM, a novel deep-learning-based approach for SSM that leverages image-segmentation pairs to learn a radial-basis-function (RBF)-based representation of shapes directly from images. This RBF-based shape representation offers a rich self-supervised signal for the network to estimate a continuous, yet compact representation of the underlying surface that can adapt to complex geometries in a data-driven manner. Image2SSM can characterize populations of biological structures of interest by constructing statistical landmark-based shape models of ensembles of anatomical shapes while requiring minimal parameter tuning and no user assistance. Once trained, Image2SSM can be used to infer low-dimensional shape representations from new unsegmented images, paving the way toward scalable approaches for SSM, especially when dealing with large cohorts. Experiments on synthetic and real datasets show the efficacy of the proposed method compared to the state-of-art correspondence-based method for SSM.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"14220 ","pages":"508-517"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11555643/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142635487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Can point cloud networks learn statistical shape models of anatomies? 点云网络能否学习解剖学的统计形状模型?
Jadie Adams, Shireen Elhabian
{"title":"Can point cloud networks learn statistical shape models of anatomies?","authors":"Jadie Adams, Shireen Elhabian","doi":"10.1007/978-3-031-43907-0_47","DOIUrl":"10.1007/978-3-031-43907-0_47","url":null,"abstract":"<p><p>Statistical Shape Modeling (SSM) is a valuable tool for investigating and quantifying anatomical variations within populations of anatomies. However, traditional correspondence-based SSM generation methods have a prohibitive inference process and require complete geometric proxies (e.g., high-resolution binary volumes or surface meshes) as input shapes to construct the SSM. Unordered 3D point cloud representations of shapes are more easily acquired from various medical imaging practices (e.g., thresholded images and surface scanning). Point cloud deep networks have recently achieved remarkable success in learning permutation-invariant features for different point cloud tasks (e.g., completion, semantic segmentation, classification). However, their application to learning SSM from point clouds is to-date unexplored. In this work, we demonstrate that existing point cloud encoder-decoder-based completion networks can provide an untapped potential for SSM, capturing population-level statistical representations of shapes while reducing the inference burden and relaxing the input requirement. We discuss the limitations of these techniques to the SSM application and suggest future improvements. Our work paves the way for further exploration of point cloud deep learning for SSM, a promising avenue for advancing shape analysis literature and broadening SSM to diverse use cases.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"14220 ","pages":"486-496"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11534086/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142577292","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Fully Bayesian VIB-DeepSSM. 完全贝叶斯 VIB-DeepSSM
Jadie Adams, Shireen Y Elhabian
{"title":"Fully Bayesian VIB-DeepSSM.","authors":"Jadie Adams, Shireen Y Elhabian","doi":"10.1007/978-3-031-43898-1_34","DOIUrl":"10.1007/978-3-031-43898-1_34","url":null,"abstract":"<p><p>Statistical shape modeling (SSM) enables population-based quantitative analysis of anatomical shapes, informing clinical diagnosis. Deep learning approaches predict correspondence-based SSM directly from unsegmented 3D images but require calibrated uncertainty quantification, motivating Bayesian formulations. Variational information bottleneck DeepSSM (VIB-DeepSSM) is an effective, principled framework for predicting probabilistic shapes of anatomy from images with aleatoric uncertainty quantification. However, VIB is only half-Bayesian and lacks epistemic uncertainty inference. We derive a fully Bayesian VIB formulation and demonstrate the efficacy of two scalable implementation approaches: concrete dropout and batch ensemble. Additionally, we introduce a novel combination of the two that further enhances uncertainty calibration via multimodal marginalization. Experiments on synthetic shapes and left atrium data demonstrate that the fully Bayesian VIB network predicts SSM from images with improved uncertainty reasoning without sacrificing accuracy.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"14222 ","pages":"346-356"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11536909/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142585366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Generating Realistic Brain MRIs via a Conditional Diffusion Probabilistic Model. 通过条件扩散概率模型生成逼真的大脑 MRI 图像
Wei Peng, Ehsan Adeli, Tomas Bosschieter, Sang Hyun Park, Qingyu Zhao, Kilian M Pohl
{"title":"Generating Realistic Brain MRIs via a Conditional Diffusion Probabilistic Model.","authors":"Wei Peng, Ehsan Adeli, Tomas Bosschieter, Sang Hyun Park, Qingyu Zhao, Kilian M Pohl","doi":"10.1007/978-3-031-43993-3_2","DOIUrl":"10.1007/978-3-031-43993-3_2","url":null,"abstract":"<p><p>As acquiring MRIs is expensive, neuroscience studies struggle to attain a sufficient number of them for properly training deep learning models. This challenge could be reduced by MRI synthesis, for which Generative Adversarial Networks (GANs) are popular. GANs, however, are commonly unstable and struggle with creating diverse and high-quality data. A more stable alternative is Diffusion Probabilistic Models (DPMs) with a fine-grained training strategy. To overcome their need for extensive computational resources, we propose a conditional DPM (cDPM) with a memory-efficient process that generates realistic-looking brain MRIs. To this end, we train a 2D cDPM to generate an MRI subvolume conditioned on another subset of slices from the same MRI. By generating slices using arbitrary combinations between condition and target slices, the model only requires limited computational resources to learn interdependencies between slices even if they are spatially far apart. After having learned these dependencies via an attention network, a new anatomy-consistent 3D brain MRI is generated by repeatedly applying the cDPM. Our experiments demonstrate that our method can generate high-quality 3D MRIs that share a similar distribution to real MRIs while still diversifying the training set. The code is available at https://github.com/xiaoiker/mask3DMRI_diffusion and also will be released as part of MONAI, at https://github.com/Project-MONAI/GenerativeModels.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"14227 ","pages":"14-24"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10758344/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139089834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Learning Expected Appearances for Intraoperative Registration during Neurosurgery. 学习神经外科手术中术中注册的预期外观
Nazim Haouchine, Reuben Dorent, Parikshit Juvekar, Erickson Torio, William M Wells, Tina Kapur, Alexandra J Golby, Sarah Frisken
{"title":"Learning Expected Appearances for Intraoperative Registration during Neurosurgery.","authors":"Nazim Haouchine, Reuben Dorent, Parikshit Juvekar, Erickson Torio, William M Wells, Tina Kapur, Alexandra J Golby, Sarah Frisken","doi":"10.1007/978-3-031-43996-4_22","DOIUrl":"10.1007/978-3-031-43996-4_22","url":null,"abstract":"<p><p>We present a novel method for intraoperative patient-to-image registration by learning Expected Appearances. Our method uses preoperative imaging to synthesize patient-specific expected views through a surgical microscope for a predicted range of transformations. Our method estimates the camera pose by minimizing the dissimilarity between the intraoperative 2D view through the optical microscope and the synthesized expected texture. In contrast to conventional methods, our approach transfers the processing tasks to the preoperative stage, reducing thereby the impact of low-resolution, distorted, and noisy intraoperative images, that often degrade the registration accuracy. We applied our method in the context of neuronavigation during brain surgery. We evaluated our approach on synthetic data and on retrospective data from 6 clinical cases. Our method outperformed state-of-the-art methods and achieved accuracies that met current clinical standards.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"14228 ","pages":"227-237"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10870253/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139901119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Incremental Learning for Heterogeneous Structure Segmentation in Brain Tumor MRI. 脑肿瘤磁共振成像中异质结构分割的增量学习
Xiaofeng Liu, Helen A Shih, Fangxu Xing, Emiliano Santarnecchi, Georges El Fakhri, Jonghye Woo
{"title":"Incremental Learning for Heterogeneous Structure Segmentation in Brain Tumor MRI.","authors":"Xiaofeng Liu, Helen A Shih, Fangxu Xing, Emiliano Santarnecchi, Georges El Fakhri, Jonghye Woo","doi":"10.1007/978-3-031-43895-0_5","DOIUrl":"https://doi.org/10.1007/978-3-031-43895-0_5","url":null,"abstract":"<p><p>Deep learning (DL) models for segmenting various anatomical structures have achieved great success via a static DL model that is trained in a single source domain. Yet, the static DL model is likely to perform poorly in a continually evolving environment, requiring appropriate model updates. In an incremental learning setting, we would expect that well-trained static models are updated, following continually evolving target domain data-e.g., additional lesions or structures of interest-collected from different sites, without catastrophic forgetting. This, however, poses challenges, due to distribution shifts, additional structures not seen during the initial model training, and the absence of training data in a source domain. To address these challenges, in this work, we seek to progressively evolve an \"off-the-shelf\" trained segmentation model to diverse datasets with additional anatomical categories in a unified manner. Specifically, we first propose a divergence-aware dual-flow module with balanced rigidity and plasticity branches to decouple old and new tasks, which is guided by continuous batch renormalization. Then, a complementary pseudo-label training scheme with self-entropy regularized momentum MixUp decay is developed for adaptive network optimization. We evaluated our framework on a brain tumor segmentation task with continually changing target domains-i.e., new MRI scanners/modalities with incremental structures. Our framework was able to well retain the discriminability of previously learned structures, hence enabling the realistic life-long segmentation model extension along with the widespread accumulation of big medical data.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"14221 ","pages":"46-56"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11045038/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140869740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
An AI-Ready Multiplex Staining Dataset for Reproducible and Accurate Characterization of Tumor Immune Microenvironment. 用于肿瘤免疫微环境再现和精确表征的AI就绪多重染色数据集。
Parmida Ghahremani, Joseph Marino, Juan Hernandez-Prera, Janis V de la Iglesia, Robbert Jc Slebos, Christine H Chung, Saad Nadeem
{"title":"An AI-Ready Multiplex Staining Dataset for Reproducible and Accurate Characterization of Tumor Immune Microenvironment.","authors":"Parmida Ghahremani,&nbsp;Joseph Marino,&nbsp;Juan Hernandez-Prera,&nbsp;Janis V de la Iglesia,&nbsp;Robbert Jc Slebos,&nbsp;Christine H Chung,&nbsp;Saad Nadeem","doi":"10.1007/978-3-031-43987-2_68","DOIUrl":"10.1007/978-3-031-43987-2_68","url":null,"abstract":"<p><p>We introduce a new AI-ready computational pathology dataset containing restained and co-registered digitized images from eight head-and-neck squamous cell carcinoma patients. Specifically, the same tumor sections were stained with the expensive multiplex immunofluorescence (mIF) assay first and then restained with cheaper multiplex immunohistochemistry (mIHC). This is a first public dataset that demonstrates the equivalence of these two staining methods which in turn allows several use cases; due to the equivalence, our cheaper mIHC staining protocol can offset the need for expensive mIF staining/scanning which requires highly-skilled lab technicians. As opposed to subjective and error-prone immune cell annotations from individual pathologists (disagreement > 50%) to drive SOTA deep learning approaches, this dataset provides objective immune and tumor cell annotations via mIF/mIHC restaining for more reproducible and accurate characterization of tumor immune microenvironment (e.g. for immunotherapy). We demonstrate the effectiveness of this dataset in three use cases: (1) IHC quantification of CD3/CD8 tumor-infiltrating lymphocytes via style transfer, (2) virtual translation of cheap mIHC stains to more expensive mIF stains, and (3) virtual tumor/immune cellular phenotyping on standard hematoxylin images. The dataset is available at https://github.com/nadeemlab/DeepLIIF.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"14225 ","pages":"704-713"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10571229/pdf/nihms-1933600.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41242890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Bidirectional Mapping with Contrastive Learning on Multimodal Neuroimaging Data. 多模态神经成像数据的双向映射与对比学习
Kai Ye, Haoteng Tang, Siyuan Dai, Lei Guo, Johnny Yuehan Liu, Yalin Wang, Alex Leow, Paul M Thompson, Heng Huang, Liang Zhan
{"title":"Bidirectional Mapping with Contrastive Learning on Multimodal Neuroimaging Data.","authors":"Kai Ye, Haoteng Tang, Siyuan Dai, Lei Guo, Johnny Yuehan Liu, Yalin Wang, Alex Leow, Paul M Thompson, Heng Huang, Liang Zhan","doi":"10.1007/978-3-031-43898-1_14","DOIUrl":"10.1007/978-3-031-43898-1_14","url":null,"abstract":"<p><p>The modeling of the interaction between brain structure and function using deep learning techniques has yielded remarkable success in identifying potential biomarkers for different clinical phenotypes and brain diseases. However, most existing studies focus on one-way mapping, either projecting brain function to brain structure or inversely. This type of unidirectional mapping approach is limited by the fact that it treats the mapping as a one-way task and neglects the intrinsic unity between these two modalities. Moreover, when dealing with the same biological brain, mapping from structure to function and from function to structure yields dissimilar outcomes, highlighting the likelihood of bias in one-way mapping. To address this issue, we propose a novel bidirectional mapping model, named Bidirectional Mapping with Contrastive Learning (BMCL), to reduce the bias between these two unidirectional mappings via ROI-level contrastive learning. We evaluate our framework on clinical phenotype and neurodegenerative disease predictions using two publicly available datasets (HCP and OASIS). Our results demonstrate the superiority of BMCL compared to several state-of-the-art methods.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"14222 ","pages":"138-148"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11245326/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141617890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
ACTION++: Improving Semi-supervised Medical Image Segmentation with Adaptive Anatomical Contrast. ACTION++:利用自适应解剖对比度改进半监督医学图像分割。
Chenyu You, Weicheng Dai, Yifei Min, Lawrence Staib, Jas Sekhon, James S Duncan
{"title":"ACTION++: Improving Semi-supervised Medical Image Segmentation with Adaptive Anatomical Contrast.","authors":"Chenyu You, Weicheng Dai, Yifei Min, Lawrence Staib, Jas Sekhon, James S Duncan","doi":"10.1007/978-3-031-43901-8_19","DOIUrl":"10.1007/978-3-031-43901-8_19","url":null,"abstract":"<p><p>Medical data often exhibits long-tail distributions with heavy class imbalance, which naturally leads to difficulty in classifying the minority classes (<i>i.e</i>., boundary regions or rare objects). Recent work has significantly improved semi-supervised medical image segmentation in long-tailed scenarios by equipping them with unsupervised contrastive criteria. However, it remains unclear how well they will perform in the labeled portion of data where class distribution is also highly imbalanced. In this work, we present <b>ACTION++</b>, an improved contrastive learning framework with adaptive anatomical contrast for semi-supervised medical segmentation. Specifically, we propose an adaptive supervised contrastive loss, where we first compute the optimal locations of class centers uniformly distributed on the embedding space (<i>i.e</i>., off-line), and then perform online contrastive matching training by encouraging different class features to adaptively match these distinct and uniformly distributed class centers. Moreover, we argue that blindly adopting a <i>constant</i> temperature <math><mi>τ</mi></math> in the contrastive loss on long-tailed medical data is not optimal, and propose to use a <i>dynamic</i> <math><mi>τ</mi></math> via a simple cosine schedule to yield better separation between majority and minority classes. Empirically, we evaluate ACTION++ on ACDC and LA benchmarks and show that it achieves state-of-the-art across two semi-supervised settings. Theoretically, we analyze the performance of adaptive anatomical contrast and confirm its superiority in label efficiency.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"14223 ","pages":"194-205"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11136572/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141177034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Speech Audio Synthesis from Tagged MRI and Non-Negative Matrix Factorization via Plastic Transformer. 通过塑料变压器从标记磁共振成像和非负矩阵因式分解合成语音音频
Xiaofeng Liu, Fangxu Xing, Maureen Stone, Jiachen Zhuo, Sidney Fels, Jerry L Prince, Georges El Fakhri, Jonghye Woo
{"title":"Speech Audio Synthesis from Tagged MRI and Non-Negative Matrix Factorization via Plastic Transformer.","authors":"Xiaofeng Liu, Fangxu Xing, Maureen Stone, Jiachen Zhuo, Sidney Fels, Jerry L Prince, Georges El Fakhri, Jonghye Woo","doi":"10.1007/978-3-031-43990-2_41","DOIUrl":"https://doi.org/10.1007/978-3-031-43990-2_41","url":null,"abstract":"<p><p>The tongue's intricate 3D structure, comprising localized functional units, plays a crucial role in the production of speech. When measured using tagged MRI, these functional units exhibit cohesive displacements and derived quantities that facilitate the complex process of speech production. Non-negative matrix factorization-based approaches have been shown to estimate the functional units through motion features, yielding a set of building blocks and a corresponding weighting map. Investigating the link between weighting maps and speech acoustics can offer significant insights into the intricate process of speech production. To this end, in this work, we utilize two-dimensional spectrograms as a proxy representation, and develop an end-to-end deep learning framework for translating weighting maps to their corresponding audio waveforms. Our proposed plastic light transformer (PLT) framework is based on directional product relative position bias and single-level spatial pyramid pooling, thus enabling flexible processing of weighting maps with variable size to fixed-size spectrograms, without input information loss or dimension expansion. Additionally, our PLT framework efficiently models the global correlation of wide matrix input. To improve the realism of our generated spectrograms with relatively limited training samples, we apply pair-wise utterance consistency with Maximum Mean Discrepancy constraint and adversarial training. Experimental results on a dataset of 29 subjects speaking two utterances demonstrated that our framework is able to synthesize speech audio waveforms from weighting maps, outperforming conventional convolution and transformer models.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"14226 ","pages":"435-445"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11034915/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140863045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信