Tapabrat Thakuria, Lipi B Mahanta, Sanjib Kumar Khataniar, Rahul Dev Goswami, Nevica Baruah, Trailokya Bharali
{"title":"Smartphone-Based Oral Lesion Image Segmentation Using Deep Learning.","authors":"Tapabrat Thakuria, Lipi B Mahanta, Sanjib Kumar Khataniar, Rahul Dev Goswami, Nevica Baruah, Trailokya Bharali","doi":"10.1007/s10278-025-01455-0","DOIUrl":null,"url":null,"abstract":"<p><p>Early detection of oral diseases, including and excluding cancer, is essential for improved outcomes. Segmentation of these lesions from the background is a crucial step in diagnosis, aiding clinicians in isolating affected areas and enhancing the accuracy of deep learning (DL) models. This study aims to develop a DL-based solution for segmenting oral lesions using smartphone-captured images. We designed a novel UNet-based model, OralSegNet, incorporating EfficientNetV2L as the encoder, along with Atrous Spatial Pyramid Pooling (ASPP) and residual blocks to enhance segmentation accuracy. The dataset consisted of 538 raw images with an average resolution of 1394 × 1524 pixels, along with corresponding annotated images of oral lesions. These images were pre-processed and resized to 256 × 256 pixels, and data augmentation techniques were applied to enhance the model's robustness. Our model achieved Dice coefficients of 0.9530 and 0.8518 and IoU scores of 0.9104 and 0.7550 in the validation and test phases, respectively, outperforming traditional and state-of-the-art models. The efficient architecture achieves the lowest FLOPS (34.30 GFLOPs) despite being the most parameter-heavy model (104.46 million). 
Given the widespread availability of smartphones, OralSegNet offers a cost-effective, non-invasive CNN model for clinicians, making early diagnosis accessible even in rural areas.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2025-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of imaging informatics in medicine","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s10278-025-01455-0","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Early detection of oral diseases, both cancerous and non-cancerous, is essential for improved outcomes. Segmentation of these lesions from the background is a crucial step in diagnosis, aiding clinicians in isolating affected areas and enhancing the accuracy of deep learning (DL) models. This study aims to develop a DL-based solution for segmenting oral lesions in smartphone-captured images. We designed a novel UNet-based model, OralSegNet, incorporating EfficientNetV2L as the encoder along with Atrous Spatial Pyramid Pooling (ASPP) and residual blocks to enhance segmentation accuracy. The dataset consisted of 538 raw images with an average resolution of 1394 × 1524 pixels, along with corresponding annotated images of the oral lesions. These images were pre-processed and resized to 256 × 256 pixels, and data augmentation techniques were applied to enhance the model's robustness. Our model achieved Dice coefficients of 0.9530 and 0.8518 and IoU scores of 0.9104 and 0.7550 in the validation and test phases, respectively, outperforming traditional and state-of-the-art models. The architecture has the lowest computational cost of the compared models (34.30 GFLOPs) despite having the most parameters (104.46 million). Given the widespread availability of smartphones, OralSegNet offers clinicians a cost-effective, non-invasive CNN-based tool, making early diagnosis accessible even in rural areas.
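The ASPP module mentioned above is built from atrous (dilated) convolutions, which space the kernel taps apart to widen the receptive field without adding parameters. The following is a minimal 1-D illustration of that idea only; it is not the paper's implementation, and the function name and shapes are illustrative assumptions:

```python
import numpy as np

def dilated_conv1d(signal, kernel, rate):
    """1-D atrous (dilated) convolution: kernel taps are spaced `rate`
    samples apart, so the effective receptive field grows with `rate`
    while the number of kernel weights stays the same (valid padding)."""
    k = len(kernel)
    span = (k - 1) * rate + 1          # effective receptive field width
    out = np.zeros(len(signal) - span + 1)
    for i in range(len(out)):
        out[i] = sum(kernel[j] * signal[i + j * rate] for j in range(k))
    return out

x = np.arange(10, dtype=float)         # toy input signal 0..9
k = np.array([1.0, 1.0, 1.0])          # 3-tap averaging-style kernel

# rate=1 behaves like an ordinary convolution (taps on adjacent samples);
# rate=2 uses the same 3 weights but covers a 5-sample window.
print(dilated_conv1d(x, k, rate=1))
print(dilated_conv1d(x, k, rate=2))
```

In ASPP, several such convolutions with different rates run in parallel on the same feature map and their outputs are concatenated, capturing lesion context at multiple scales.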
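The Dice coefficient and IoU scores reported above are standard overlap metrics for binary segmentation masks. A minimal sketch of how they are computed (standard definitions, with toy masks that are not from the paper's dataset):

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice coefficient: 2*|A∩B| / (|A| + |B|) for binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def iou_score(pred, target, eps=1e-7):
    """Intersection over Union (Jaccard index): |A∩B| / |A∪B|."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (intersection + eps) / (union + eps)

# Toy 4x4 masks: a predicted lesion region vs. a ground-truth annotation
pred = np.array([[0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
gt   = np.array([[0, 1, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
print(round(dice_coefficient(pred, gt), 4))  # 2*3/(4+3) ≈ 0.8571
print(round(iou_score(pred, gt), 4))         # 3/4 = 0.75
```

Note that Dice is always at least as large as IoU on the same pair of masks, which is consistent with the validation and test figures quoted in the abstract.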