Single-View Contrastive Learning for Laryngeal Leukoplakia Classification With NBI Laryngoscopy Images
Zhenzhen You, Botao Han, Zhenghao Shi, Shuangli Du, Minghua Zhao, Zhiyong Lv, Xinhong Hei, Haiqin Liu, Xiaoyong Ren, Yan Yan
Head and Neck-Journal for the Sciences and Specialties of the Head and Neck. Published 2025-04-27. DOI: 10.1002/hed.28157 (https://doi.org/10.1002/hed.28157). Impact factor: 2.3. JCR: Q1 (Otorhinolaryngology).
Citations: 0
Abstract
Background: Laryngeal cancer is the second most common cancer of the upper respiratory tract. Early and accurate diagnosis can improve patients' cure rates. Laryngoscopy with narrow-band imaging (NBI) is a commonly used tool that helps endoscopists diagnose laryngeal diseases. However, the fine-grained classification of laryngeal leukoplakia from NBI images remains challenging for computer-aided diagnosis.
Methods: We propose a single-view contrastive learning network that locates lesion regions, constructs sample pairs for contrastive learning, and assigns pseudo-labels to unlabeled data, in order to achieve fine-grained classification from small samples. First, we pretrain the backbone network on the original NBI images. Second, to increase the number of samples available for contrastive learning, we design several patch generation methods based on an attention-guided network: the original NBI images are cropped into small patches to produce lesion-related regions and complementary samples, and the pseudo-labels of these patches are obtained by applying the pretrained backbone network. Finally, we combine the contrastive loss and the cross-entropy loss to jointly train the backbone network and the contrastive learning network. Our NBI dataset is divided into six categories: normal tissue, inflammatory keratosis, mild dysplasia, moderate dysplasia, severe dysplasia, and squamous cell carcinoma.
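The joint objective described in the final step can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation. The abstract does not state the form of the contrastive loss, so the sketch assumes a supervised-contrastive-style loss in which patches sharing a pseudo-label are treated as positives. The function names and the `weight` and `temperature` parameters are hypothetical.

```python
import math


def cross_entropy(logits, label):
    """Cross-entropy of one sample from raw logits (log-sum-exp for stability)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]


def normalize(vec):
    """Scale a vector to unit length (a zero vector is left unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def contrastive_loss(embeddings, labels, temperature=0.1):
    """Assumed loss form: patches with the same (pseudo-)label are positives."""
    z = [normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled cosine similarity to every other patch.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n) if j != i
        }
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def joint_loss(logits, embeddings, labels, weight=0.5, temperature=0.1):
    """Mean cross-entropy plus a weighted contrastive term (weight is assumed)."""
    ce = sum(cross_entropy(l, y) for l, y in zip(logits, labels)) / len(labels)
    return ce + weight * contrastive_loss(embeddings, labels, temperature)
```

In this sketch `labels` would hold the pseudo-labels that the pretrained backbone assigns to the patches, and `embeddings` the patch features from the contrastive branch. How the two terms are actually weighted in the paper is not stated in the abstract.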
Results and conclusion: Experimental results show that our model achieves an accuracy of 96.12%, higher than that of current mainstream models. Our model also achieves high specificity and sensitivity. The code is available at https://github.com/hans-bbt/single-view-contrastive-learning.
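For a multi-class problem such as the six-category task above, accuracy, sensitivity, and specificity are commonly derived from a confusion matrix, with sensitivity and specificity computed per class in a one-vs-rest fashion. The sketch below assumes that convention; the abstract does not say how the authors aggregated these metrics, and the function names are hypothetical.

```python
def accuracy(confusion):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / total if total else 0.0


def per_class_metrics(confusion):
    """Return (sensitivity, specificity) per class, one-vs-rest.

    confusion[i][j] is the number of samples of true class i
    that were predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    metrics = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                          # class k missed
        fp = sum(confusion[i][k] for i in range(n)) - tp     # others called k
        tn = total - tp - fn - fp
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        metrics.append((sensitivity, specificity))
    return metrics
```

Calling `per_class_metrics` on a 6x6 matrix whose rows and columns follow the six categories would give one (sensitivity, specificity) pair per category.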
About the journal:
Head & Neck is an international multidisciplinary publication of original contributions concerning the diagnosis and management of diseases of the head and neck. This area involves the overlapping interests and expertise of several surgical and medical specialties, including general surgery, neurosurgery, otolaryngology, plastic surgery, oral surgery, dermatology, ophthalmology, pathology, radiotherapy, medical oncology, and the corresponding basic sciences.