ToothMaker: Realistic Panoramic Dental Radiograph Generation via Disentangled Control.

IF 9.8 | CAS Medicine Region 1 | JCR Q1, Computer Science, Interdisciplinary Applications
Weihao Yu, Xiaoqing Guo, Wuyang Li, Xinyu Liu, Hui Chen, Yixuan Yuan
{"title":"ToothMaker: Realistic Panoramic Dental Radiograph Generation via Disentangled Control.","authors":"Weihao Yu,Xiaoqing Guo,Wuyang Li,Xinyu Liu,Hui Chen,Yixuan Yuan","doi":"10.1109/tmi.2025.3588466","DOIUrl":null,"url":null,"abstract":"Generating high-fidelity dental radiographs is essential for training diagnostic models. Despite the development of numerous methods for other medical data, generative approaches in dental radiology remain unexplored. Due to the intricate tooth structures and specialized terminology, these methods often yield ambiguous tooth regions and incorrect dental concepts when applied to dentistry. In this paper, we take the first attempt to investigate diffusion-based teeth X-ray image generation and propose ToothMaker, a novel framework specifically designed for the dental domain. Firstly, to synthesize X-ray images that possess accurate tooth structures and realistic radiological styles simultaneously, we design control-disentangled fine-tuning (CDFT) strategy. Specifically, we present two separate controllers to handle style and layout control respectively, and introduce a gradient-based decoupling method that optimizes each using their corresponding disentangled gradients. Secondly, to enhance model's understanding of dental terminology, we propose prior-disentangled guidance module (PDGM), enabling precise synthesis of dental concepts. It utilizes large language model to decompose dental terminology into a series of meta-knowledge elements and performs interactions and refinements through hypergraph neural network. These elements are then fed into the network to guide the generation of dental concepts. Extensive experiments demonstrate the high fidelity and diversity of the images synthesized by our approach. By incorporating the generated data, we achieve substantial performance improvements on downstream segmentation and visual question answering tasks, indicating that our method can greatly reduce the reliance on manually annotated data. Code will be public available at https://github.com/CUHK-AIM-Group/ToothMaker.","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"90 1","pages":""},"PeriodicalIF":9.8000,"publicationDate":"2025-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Medical Imaging","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.1109/tmi.2025.3588466","RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
引用次数: 0

Abstract

Generating high-fidelity dental radiographs is essential for training diagnostic models. Despite the development of numerous generative methods for other medical data, generative approaches in dental radiology remain unexplored. Due to the intricate tooth structures and specialized terminology, these methods often yield ambiguous tooth regions and incorrect dental concepts when applied to dentistry. In this paper, we make the first attempt to investigate diffusion-based teeth X-ray image generation and propose ToothMaker, a novel framework specifically designed for the dental domain. First, to synthesize X-ray images that possess accurate tooth structures and realistic radiological styles simultaneously, we design a control-disentangled fine-tuning (CDFT) strategy. Specifically, we present two separate controllers to handle style and layout control respectively, and introduce a gradient-based decoupling method that optimizes each controller using its corresponding disentangled gradient. Second, to enhance the model's understanding of dental terminology, we propose a prior-disentangled guidance module (PDGM), enabling precise synthesis of dental concepts. It uses a large language model to decompose dental terminology into a series of meta-knowledge elements and performs interactions and refinements through a hypergraph neural network. These elements are then fed into the network to guide the generation of dental concepts. Extensive experiments demonstrate the high fidelity and diversity of the images synthesized by our approach. By incorporating the generated data, we achieve substantial performance improvements on downstream segmentation and visual question answering tasks, indicating that our method can greatly reduce the reliance on manually annotated data. Code will be publicly available at https://github.com/CUHK-AIM-Group/ToothMaker.
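The abstract describes CDFT only at a high level, so the following is a minimal conceptual sketch, in PyTorch, of what a dual-controller, gradient-decoupled fine-tuning step could look like: two lightweight controller branches (stand-ins for the paper's style and layout controllers) condition a frozen denoiser, and each controller is updated only from the gradient computed through its own branch with its own optimizer. All names here (TinyController, TinyDenoiser, train_step) and the specific update rule are illustrative assumptions, not the authors' implementation; the actual CDFT decoupling is defined in the paper and its repository.

```python
import torch
import torch.nn as nn

class TinyController(nn.Module):
    """Stand-in for a ControlNet-style branch (style or layout controller)."""
    def __init__(self, cond_ch: int, feat_ch: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(cond_ch, feat_ch, 3, padding=1), nn.SiLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1),
        )

    def forward(self, cond):
        return self.net(cond)

class TinyDenoiser(nn.Module):
    """Stand-in for the frozen diffusion backbone; consumes residual control features."""
    def __init__(self, feat_ch: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, feat_ch, 3, padding=1), nn.SiLU(),
            nn.Conv2d(feat_ch, 1, 3, padding=1),
        )
        self.mix = nn.Conv2d(feat_ch, 1, 3, padding=1)

    def forward(self, x_noisy, control_feats):
        return self.net(x_noisy) + self.mix(control_feats)

style_ctrl = TinyController(cond_ch=1)    # radiological-style condition branch
layout_ctrl = TinyController(cond_ch=1)   # tooth-layout / mask condition branch
denoiser = TinyDenoiser()
for p in denoiser.parameters():           # the diffusion backbone stays frozen
    p.requires_grad_(False)

opt_style = torch.optim.AdamW(style_ctrl.parameters(), lr=1e-4)
opt_layout = torch.optim.AdamW(layout_ctrl.parameters(), lr=1e-4)

def train_step(x_noisy, noise, style_cond, layout_cond):
    """One decoupled update: each controller receives only the gradient
    computed through its own conditioning branch and its own optimizer."""
    pred = denoiser(x_noisy, style_ctrl(style_cond) + layout_ctrl(layout_cond))
    loss = nn.functional.mse_loss(pred, noise)   # standard denoising objective

    # Compute per-controller gradients separately instead of one joint backward().
    g_style = torch.autograd.grad(loss, list(style_ctrl.parameters()), retain_graph=True)
    g_layout = torch.autograd.grad(loss, list(layout_ctrl.parameters()))
    for p, g in zip(style_ctrl.parameters(), g_style):
        p.grad = g
    for p, g in zip(layout_ctrl.parameters(), g_layout):
        p.grad = g
    opt_style.step(); opt_style.zero_grad()
    opt_layout.step(); opt_layout.zero_grad()
    return loss.item()

# Toy usage with random tensors standing in for a noisy panoramic X-ray batch.
x = torch.randn(2, 1, 64, 64)
noise = torch.randn_like(x)
print(train_step(x, noise, torch.randn_like(x), torch.randn_like(x)))
```

A real setup would fine-tune ControlNet-style copies of a pretrained diffusion UNet, and the paper's gradient-disentanglement rule presumably differs from the plain per-branch gradients shown here; the sketch only illustrates the structure of keeping the two control paths and their updates separate.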
Source Journal
IEEE Transactions on Medical Imaging (Medicine - Imaging Science & Photographic Technology)
CiteScore: 21.80
Self-citation rate: 5.70%
Articles published: 637
Review time: 5.6 months
Journal introduction: The IEEE Transactions on Medical Imaging (T-MI) is a journal that welcomes the submission of manuscripts focusing on various aspects of medical imaging. The journal encourages the exploration of body structure, morphology, and function through different imaging techniques, including ultrasound, X-rays, magnetic resonance, radionuclides, microwaves, and optical methods. It also promotes contributions related to cell and molecular imaging, as well as all forms of microscopy. T-MI publishes original research papers that cover a wide range of topics, including but not limited to novel acquisition techniques, medical image processing and analysis, visualization and performance, pattern recognition, machine learning, and other related methods. The journal particularly encourages highly technical studies that offer new perspectives. By emphasizing the unification of medicine, biology, and imaging, T-MI seeks to bridge the gap between instrumentation, hardware, software, mathematics, physics, biology, and medicine by introducing new analysis methods. While the journal welcomes strong application papers that describe novel methods, it directs papers that focus solely on important applications using medically adopted or well-established methods without significant innovation in methodology to other journals. T-MI is indexed in PubMed® and Medline®, which are products of the United States National Library of Medicine.