Deep learning approach with ConvNeXt-SE-attn model for in vitro oral squamous cell carcinoma and chemotherapy analysis

IF 1.9 Q2 MULTIDISCIPLINARY SCIENCES

MethodsX Pub Date : 2025-07-17 DOI:10.1016/j.mex.2025.103519

Abhay Nath, Om Roy, Priyanka Silveri, Sanskruti Patel


Citations: 0

Abstract



Oral squamous cell carcinoma (OSCC) remains a major global healthcare problem because patients face poor survival outcomes and frequent recurrence. According to GLOBOCAN estimates, OSCC accounted for 389,846 new cases and 188,438 deaths worldwide in 2022, and its 5-year survival rate remains poor at about 50%. Our method combines residual connections and Squeeze-and-Excitation (SE) blocks with hybrid attention mechanisms, enhanced activation functions, and optimization algorithms to improve gradient flow during feature extraction. Compared against established conventional CNN backbones (VGG16, ResNet50, DenseNet121, and more), the proposed ConvNeXt-SE-Attn model outperformed them in all aspects of discrimination and calibration, including precision 97.88% (vs. ≤94.2%), sensitivity 96.82% (vs. ≤92.5%), specificity 95.94% (vs. ≤93.1%), F1 score 97.31% (vs. ≤93.8%), AUC 0.9644 (vs. ≤0.945), and MCC 0.9397 (vs. ≤0.910). These results reflect the architecture's increased feature-representation power and classification robustness.
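For readers less familiar with the reported metrics, the following is a minimal sketch of how precision, sensitivity, specificity, F1 score, and MCC are conventionally derived from binary confusion-matrix counts. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper. AUC is omitted because it depends on ranked prediction scores rather than on a single confusion matrix.

```python
import math


def _safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard threshold-based metrics from confusion-matrix counts."""
    precision = _safe_div(tp, tp + fp)
    sensitivity = _safe_div(tp, tp + fn)   # recall, true-positive rate
    specificity = _safe_div(tn, tn + fp)   # true-negative rate
    f1 = _safe_div(2 * precision * sensitivity, precision + sensitivity)
    # Matthews correlation coefficient: uses all four cells of the matrix,
    # which makes it informative on imbalanced data.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = _safe_div(tp * tn - fp * fn, mcc_den)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not the paper's data).
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Reporting MCC alongside F1 is a common choice when classes are imbalanced, since F1 ignores true negatives.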

The proposed architecture employs a ConvNeXt backbone with SE blocks and hybrid attention to extract essential details within class boundaries that standard models usually miss.
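As background on the SE blocks mentioned above, here is a minimal pure-Python sketch of the generic Squeeze-and-Excitation operation (squeeze by global average pooling, excite through two small fully connected layers, then rescale channels). It is not the authors' implementation: the function name `se_block`, the weight shapes, the omission of biases, and the toy inputs are all assumptions made for illustration.

```python
import math


def se_block(x, w1, w2):
    """Generic Squeeze-and-Excitation on one feature map.

    x:  feature map as nested lists with shape [C][H][W]
    w1: reduction weights, shape [C_reduced][C]   (biases omitted)
    w2: expansion weights, shape [C][C_reduced]   (biases omitted)
    Returns (rescaled feature map, per-channel gates).
    """
    # Squeeze: global average pooling gives one scalar per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation, step 1: fully connected layer followed by ReLU.
    h = [max(0.0, sum(w * zi for w, zi in zip(row, z))) for row in w1]
    # Excitation, step 2: fully connected layer followed by sigmoid,
    # producing a gate in (0, 1) for every channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * hi for w, hi in zip(row, h))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    out = [[[v * g for v in row] for row in ch] for ch, g in zip(x, gates)]
    return out, gates


# Toy input: 2 channels of size 2x2, reduced to 1 hidden unit (hypothetical).
x = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
w1 = [[1.0, 1.0]]
w2 = [[1.0], [-1.0]]
out, gates = se_block(x, w1, w2)
print(gates)
```

In a real network the two weight matrices are learned, so the gates let the model emphasize informative channels and suppress less useful ones.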

A Gaussian-based GReLU activation is combined with Swish activation and DropPath regularization to produce smooth gradients, yielding features that generalize across imbalanced datasets.
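To make two of these ingredients concrete, the sketch below shows the standard definitions of Swish (x · sigmoid(x)) and DropPath (stochastic depth: randomly dropping an entire residual branch per sample during training and rescaling the survivors). The abstract does not define the Gaussian-based GReLU, so it is deliberately left out. The function names, the list-based data layout, and the drop probability are assumptions for illustration, not the authors' code.

```python
import math
import random


def swish(x: float) -> float:
    """Swish activation, x * sigmoid(x). Not hardened against overflow
    for very large negative inputs."""
    return x / (1.0 + math.exp(-x))


def drop_path(branch, drop_prob, training, rng):
    """Stochastic depth over a batch of residual-branch outputs.

    branch: list of per-sample outputs, each a list of floats.
    In training mode each sample's branch is zeroed with probability
    drop_prob; surviving branches are divided by the keep probability
    so the expected value is unchanged. In eval mode it is the identity.
    """
    if not training or drop_prob == 0.0:
        return [list(sample) for sample in branch]
    keep = 1.0 - drop_prob
    out = []
    for sample in branch:
        if rng.random() < keep:
            out.append([v / keep for v in sample])
        else:
            out.append([0.0] * len(sample))
    return out


def residual_step(x, branch_fn, drop_prob, training, rng):
    """y = x + DropPath(Swish(branch_fn(x))) for each sample in the batch."""
    branch = [[swish(v) for v in branch_fn(sample)] for sample in x]
    branch = drop_path(branch, drop_prob, training, rng)
    return [[a + b for a, b in zip(s, br)] for s, br in zip(x, branch)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
batch = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(residual_step(batch, lambda s: s, drop_prob=0.2, training=True, rng=rng))
```

Because a dropped branch leaves only the identity path, each sample effectively sees a shallower network during training, which acts as a regularizer.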

Grad-CAM enhances interpretability by showing which image regions drive each prediction, supporting clinical decision-making.
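The core Grad-CAM computation can be sketched as follows, assuming the activations of the last convolutional layer and the gradients of the target class score with respect to them are already available (in practice a framework's autograd supplies these). The function name `grad_cam` and the toy tensors are hypothetical; this illustrates the general technique rather than the paper's pipeline.

```python
def grad_cam(activations, gradients):
    """Grad-CAM heatmap from one layer's activations and gradients.

    activations: [K][H][W] feature maps of the chosen conv layer
    gradients:   [K][H][W] d(class score) / d(activations)
    Returns an [H][W] heatmap scaled to [0, 1] (all zeros if nothing
    contributes positively).
    """
    height, width = len(activations[0]), len(activations[0][0])
    heat = [[0.0] * width for _ in range(height)]
    for fmap, grad in zip(activations, gradients):
        # Channel weight: global average of the gradients for this map.
        alpha = sum(sum(row) for row in grad) / (height * width)
        for i in range(height):
            for j in range(width):
                heat[i][j] += alpha * fmap[i][j]
    # ReLU keeps only regions that push the class score up.
    heat = [[max(0.0, v) for v in row] for row in heat]
    peak = max(max(row) for row in heat)
    if peak > 0.0:
        heat = [[v / peak for v in row] for row in heat]
    return heat


# Toy example with 2 channels of size 2x2 (hypothetical values).
acts = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 2.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
print(grad_cam(acts, grads))
```

The resulting low-resolution map is typically upsampled to the input image size and overlaid on it for visual inspection.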

The model proves effective at detecting subtle variations in oral cells, which supports precise, non-invasive treatment approaches for OSCC.
