Smartphone-Based Oral Lesion Image Segmentation Using Deep Learning.

Tapabrat Thakuria, Lipi B Mahanta, Sanjib Kumar Khataniar, Rahul Dev Goswami, Nevica Baruah, Trailokya Bharali
Journal: Journal of imaging informatics in medicine
DOI: 10.1007/s10278-025-01455-0 (https://doi.org/10.1007/s10278-025-01455-0)
Published: 2025-03-03 (Journal Article)
Citations: 0

Abstract

Early detection of oral diseases, both cancerous and non-cancerous, is essential for improved outcomes. Segmenting these lesions from the background is a crucial step in diagnosis, helping clinicians isolate affected areas and enhancing the accuracy of deep learning (DL) models. This study aims to develop a DL-based solution for segmenting oral lesions in smartphone-captured images. We designed a novel UNet-based model, OralSegNet, incorporating EfficientNetV2L as the encoder, along with Atrous Spatial Pyramid Pooling (ASPP) and residual blocks to enhance segmentation accuracy. The dataset consisted of 538 raw images with an average resolution of 1394 × 1524 pixels, along with corresponding annotated images of oral lesions. These images were pre-processed and resized to 256 × 256 pixels, and data augmentation techniques were applied to improve the model's robustness. Our model achieved Dice coefficients of 0.9530 and 0.8518 and IoU scores of 0.9104 and 0.7550 in the validation and test phases, respectively, outperforming traditional and state-of-the-art models. The architecture is computationally efficient, requiring the fewest FLOPs (34.30 GFLOPs) despite being the most parameter-heavy model (104.46 million parameters). Given the widespread availability of smartphones, OralSegNet offers a cost-effective, non-invasive CNN model for clinicians, making early diagnosis accessible even in rural areas.
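The Dice coefficient and IoU reported above are the standard overlap metrics for binary segmentation masks. As a minimal illustration (not the authors' implementation), they can be computed from predicted and ground-truth masks as follows; the epsilon term and the toy 4 × 4 masks are assumptions added for the example:

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice coefficient (2*|A∩B| / (|A|+|B|)) between two binary masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def iou_score(pred, target, eps=1e-7):
    """Intersection over Union (|A∩B| / |A∪B|) between two binary masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (intersection + eps) / (union + eps)

# Toy example: a 4x4 predicted mask vs. its ground truth
pred = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
gt = np.array([[1, 1, 0, 0],
               [1, 0, 0, 0],
               [0, 0, 0, 0],
               [0, 0, 0, 0]])
print(round(dice_coefficient(pred, gt), 4))  # 2*3/(4+3) ≈ 0.8571
print(round(iou_score(pred, gt), 4))         # 3/4 = 0.75
```

Note that Dice is always at least as large as IoU for the same pair of masks, which is consistent with the paper's reported values (e.g., 0.8518 Dice vs. 0.7550 IoU on the test set).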
