Leveraging Pretrained Transformers for Efficient Segmentation and Lesion Detection in Cone-Beam Computed Tomography Scans

IF 3.5 | Zone 2, Medicine | Q1, Dentistry, Oral Surgery & Medicine

Journal of Endodontics. Pub Date: 2024-10-01. DOI: 10.1016/j.joen.2024.07.012

Rui Qi Chen PhD, Yeonju Lee PhD, Hao Yan MS, PhD, Muralidhar Mupparapu DMD, MDS, DipABOMR, Fleming Lure PhD, Jing Li PhD, Frank C. Setzer DMD, PhD, MS

Journal of Endodontics, Volume 50, Issue 10, Pages 1505-1514.e1. Publication type: Journal Article. Not open access. Article page: https://www.sciencedirect.com/science/article/pii/S0099239924004084

Citations: 0

Abstract

Introduction

Cone-beam computed tomography (CBCT) is widely used to detect jaw lesions, although CBCT interpretation is time-consuming and challenging. Artificial intelligence for CBCT segmentation may improve lesion detection accuracy. However, consistent automated lesion detection remains difficult, especially with limited training data. This study aimed to assess the applicability of pretrained transformer-based architectures for semantic segmentation of CBCT volumes when applied to periapical lesion detection.

Methods

CBCT volumes (n = 138) were collected and annotated by expert clinicians using 5 labels – "lesion," "restorative material," "bone," "tooth structure," and "background." U-Net (convolutional neural network-based) and Swin-UNETR (transformer-based) models, pretrained (Swin-UNETR-PRETRAIN), and from scratch (Swin-UNETR-SCRATCH), were trained with subsets of the annotated CBCTs. These models were then evaluated for semantic segmentation performance using the Sørensen–Dice coefficient (DICE), lesion detection performance using sensitivity and specificity, and training sample size requirements by comparing models trained with 20, 40, 60, or 103 samples.
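As a rough illustration of the segmentation metric named above, the following sketch computes a per-label Sørensen–Dice coefficient, 2|A ∩ B| / (|A| + |B|), on a toy example. The integer label ids, the flattened-list representation of a volume, and the convention of scoring an absent label as 1.0 are assumptions made for this example; the abstract does not describe the authors' implementation.

```python
from typing import Dict, Sequence

# Hypothetical integer encoding of the five annotation labels; the abstract
# names the classes but does not say how they are encoded.
LABELS = {
    0: "background",
    1: "bone",
    2: "tooth structure",
    3: "restorative material",
    4: "lesion",
}


def dice_per_label(truth: Sequence[int], pred: Sequence[int]) -> Dict[str, float]:
    """Return the Dice coefficient 2|A ∩ B| / (|A| + |B|) for each label."""
    if len(truth) != len(pred):
        raise ValueError("truth and pred must contain the same number of voxels")
    scores = {}
    for label_id, name in LABELS.items():
        truth_count = sum(1 for t in truth if t == label_id)
        pred_count = sum(1 for p in pred if p == label_id)
        overlap = sum(1 for t, p in zip(truth, pred) if t == label_id and p == label_id)
        denom = truth_count + pred_count
        # Assumed convention: a label absent from both masks scores 1.0.
        scores[name] = 1.0 if denom == 0 else 2.0 * overlap / denom
    return scores


# Toy "volume" flattened to eight voxels (not data from the study).
truth = [0, 0, 1, 1, 2, 2, 4, 4]
pred = [0, 0, 1, 2, 2, 2, 4, 0]
scores = dice_per_label(truth, pred)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

For real CBCT volumes an array library would normally replace the pure-Python loops; the flattened lists here only keep the example dependency-free.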

Results

Trained with 103 samples, Swin-UNETR-PRETRAIN achieved a DICE of 0.8512 for "lesion," 0.8282 for "restorative material," 0.9178 for "bone," 0.9029 for "tooth structure," and 0.9901 for "background." "Lesion" DICE was statistically similar between Swin-UNETR-PRETRAIN trained with 103 and 60 images (P > .05), with the latter achieving 1.00 sensitivity and 0.94 specificity in lesion detection. With small training sets, Swin-UNETR-PRETRAIN outperformed Swin-UNETR-SCRATCH in DICE over all labels (P < .001 [n = 20], P < .001 [n = 40]), and U-Net in lesion detection specificity (P = .006 [n = 20], P = .031 [n = 40]).
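To make the detection metrics concrete, here is a minimal sketch of how sensitivity and specificity could be derived from segmentation output. It assumes that a case counts as lesion-positive when at least `min_voxels` voxels carry the lesion label; the abstract does not state the detection criterion, so that rule, the label id, and the toy masks below are all hypothetical.

```python
from typing import Sequence, Tuple

LESION = 4  # hypothetical label id for "lesion"


def has_lesion(voxel_labels: Sequence[int], min_voxels: int = 1) -> bool:
    """Assumed rule: a case is positive if enough voxels are labeled lesion."""
    return sum(1 for v in voxel_labels if v == LESION) >= min_voxels


def sensitivity_specificity(
    truth: Sequence[bool], pred: Sequence[bool]
) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(truth, pred) if t and p)
    fn = sum(1 for t, p in zip(truth, pred) if t and not p)
    tn = sum(1 for t, p in zip(truth, pred) if not t and not p)
    fp = sum(1 for t, p in zip(truth, pred) if not t and p)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Six toy cases, each a flattened label mask (not data from the study).
truth_masks = [[0, 4, 4], [0, 1, 4], [0, 1, 2], [0, 0, 1], [1, 2, 2], [0, 2, 3]]
pred_masks = [[0, 4, 1], [4, 1, 4], [0, 1, 2], [0, 0, 1], [1, 2, 2], [0, 4, 3]]

truth_flags = [has_lesion(m) for m in truth_masks]
pred_flags = [has_lesion(m) for m in pred_masks]
sensitivity, specificity = sensitivity_specificity(truth_flags, pred_flags)
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```

In this toy set both true lesion cases are found, and one of the four lesion-free cases is flagged by mistake, which is the kind of error the specificity figure penalizes.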

Conclusions

Transformer-based Swin-UNETR architectures allowed for excellent semantic segmentation and periapical lesion detection. When pretrained, Swin-UNETR may provide an alternative to classic U-Net architectures when only smaller training datasets are available.

Source journal

Journal of Endodontics (Medicine: Dentistry and Oral Surgery)

CiteScore: 8.80
Self-citation rate: 9.50%
Articles published: 224
Review time: 42 days

About the journal: The Journal of Endodontics, the official journal of the American Association of Endodontists, publishes scientific articles, case reports and comparison studies evaluating materials and methods of pulp conservation and endodontic treatment. Endodontists and general dentists can learn about new concepts in root canal treatment and the latest advances in techniques and instrumentation in the one journal that helps them keep pace with rapid changes in this field.