Abhay Nath , Om Roy , Priyanka Silveri , Sanskruti Patel
MethodsX, Volume 15, Article 103519. Published 2025-07-17. DOI: 10.1016/j.mex.2025.103519. https://www.sciencedirect.com/science/article/pii/S2215016125003632
Deep learning approach with ConvNeXt-SE-attn model for in vitro oral squamous cell carcinoma and chemotherapy analysis
Oral squamous cell carcinoma (OSCC) remains a major global healthcare problem because of poor survival outcomes and frequent recurrence. GLOBOCAN estimates 389,846 new OSCC cases and 188,438 deaths worldwide for 2022, with a 5-year survival rate of only about 50%. Our method combines residual connections, Squeeze-and-Excitation (SE) blocks, hybrid attention mechanisms, enhanced activation functions, and optimization algorithms to improve gradient flow during feature extraction. Compared with established CNN backbones (VGG16, ResNet50, DenseNet121, and others), the proposed ConvNeXt-SE-Attn model performed better on every reported measure of discrimination and calibration: precision 97.88% (vs. ≤94.2%), sensitivity 96.82% (vs. ≤92.5%), specificity 95.94% (vs. ≤93.1%), F1 score 97.31% (vs. ≤93.8%), AUC 0.9644 (vs. ≤0.945), and MCC 0.9397 (vs. ≤0.910). These results reflect the architecture's stronger feature representation and more robust classification.
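For readers unfamiliar with the metric types listed above, the following is a minimal sketch of how precision, sensitivity, specificity, F1 score, and MCC are derived from binary confusion-matrix counts. The function name `binary_metrics` and the example counts are illustrative assumptions; they are not taken from the paper and do not reproduce its results. AUC is omitted because it requires per-sample scores rather than counts.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts."""

    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    sensitivity = ratio(tp, tp + fn)      # recall, true-positive rate
    specificity = ratio(tn, tn + fp)      # true-negative rate
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    # Matthews correlation coefficient uses all four cells of the matrix.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to exercise the formulas.
example = binary_metrics(tp=90, fp=10, tn=85, fn=15)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

MCC is informative on imbalanced data because it stays near zero for a classifier that only predicts the majority class, whereas precision or sensitivity alone can still look high.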
The proposed architecture uses a ConvNeXt backbone with SE blocks and hybrid attention to extract essential details within class boundaries that standard models usually miss.
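As a rough illustration of what an SE block does, here is a minimal pure-Python sketch of the standard squeeze-and-excitation recalibration: global average pooling, a two-layer bottleneck with ReLU and sigmoid, then per-channel scaling. The function name `se_recalibrate`, the toy weights, and the omission of biases are assumptions made for brevity. The abstract does not say where the blocks sit inside the ConvNeXt stages or which reduction ratio is used, so this should not be read as the authors' implementation.

```python
import math


def se_recalibrate(feature_map, w_reduce, w_expand):
    """Apply squeeze-and-excitation to a feature map.

    feature_map: list of C channels, each an H x W nested list of floats.
    w_reduce:    (C // r) x C weight matrix of the bottleneck layer.
    w_expand:    C x (C // r) weight matrix of the expansion layer.
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: one scalar per channel via global average pooling.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(weights, squeezed)))
        for weights in w_reduce
    ]
    # Excitation, step 2: expansion layer followed by sigmoid gives gates in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(weights, hidden))))
        for weights in w_expand
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Toy input: 2 channels of size 2 x 2, reduction to a single hidden unit.
toy_map = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[3.0, 3.0], [3.0, 3.0]],
]
toy_scaled, toy_gates = se_recalibrate(
    toy_map, w_reduce=[[0.5, 0.5]], w_expand=[[1.0], [-1.0]]
)
print(toy_gates)
```

The gates let the network emphasise informative channels and suppress others using only global context, which adds few parameters compared with the convolutional layers around it.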
A Gaussian-based GReLU activation is combined with Swish activation and DropPath regularization to produce smooth gradients, which yield features that generalize across imbalanced datasets.
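The following sketch shows the two components here that have widely used standard definitions: Swish, computed as x · sigmoid(x), and DropPath (also called stochastic depth), which randomly skips a residual branch per sample during training and rescales the kept branches. The abstract does not define the Gaussian-based GReLU, so it is deliberately left out. The function names and the toy residual block are assumptions for illustration, not the authors' code.

```python
import math
import random


def swish(x: float) -> float:
    """Swish activation x * sigmoid(x), written to avoid overflow for large |x|."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)


def drop_path(branch_batch, drop_prob, training=True, rng=random):
    """Zero the whole residual branch of a sample with probability drop_prob.

    Kept samples are scaled by 1 / (1 - drop_prob) so the expected value of
    the branch output is unchanged. At inference time the input is returned as is.
    """
    if not training or drop_prob == 0.0:
        return [list(sample) for sample in branch_batch]
    keep_prob = 1.0 - drop_prob
    out = []
    for sample in branch_batch:
        keep = rng.random() < keep_prob
        scale = 1.0 / keep_prob if keep else 0.0
        out.append([value * scale for value in sample])
    return out


def residual_block(x_batch, drop_prob, training=True, rng=random):
    """Toy residual block: output = x + drop_path(swish(x))."""
    branch = [[swish(value) for value in sample] for sample in x_batch]
    branch = drop_path(branch, drop_prob, training, rng)
    return [
        [x + b for x, b in zip(x_sample, b_sample)]
        for x_sample, b_sample in zip(x_batch, branch)
    ]


# Usage with a seeded generator so the run is repeatable.
batch = [[1.0, -2.0], [0.5, 3.0]]
print(residual_block(batch, drop_prob=0.2, training=True, rng=random.Random(0)))
print(residual_block(batch, drop_prob=0.2, training=False))
```

Because a dropped sample still passes through the identity path, the block output is never zeroed entirely, which is what makes this form of regularization usable in deep residual networks.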
Grad-CAM improves interpretability by highlighting the image regions that drive each prediction, which supports clinical decision-making.
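The core Grad-CAM computation can be sketched in a few lines, assuming the activations of a chosen convolutional layer and the gradients of the class score with respect to those activations are already available. Each channel is weighted by its spatially averaged gradient, the weighted channels are summed, and a ReLU keeps only positive evidence. The function name `grad_cam`, the toy inputs, and the final scaling to [0, 1] (a common visualization step, not part of the core definition) are assumptions. Which layer the authors visualize is not stated in the abstract.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap.

    activations: list of K channels, each an H x W nested list of floats.
    gradients:   same shape, d(class score) / d(activations).
    Returns an H x W heatmap scaled to [0, 1] when any value is positive.
    """
    height = len(activations[0])
    width = len(activations[0][0])
    heat = [[0.0] * width for _ in range(height)]
    for act, grad in zip(activations, gradients):
        # Channel weight: global average of the gradient over all positions.
        alpha = sum(sum(row) for row in grad) / (height * width)
        for i in range(height):
            for j in range(width):
                heat[i][j] += alpha * act[i][j]
    # ReLU: keep only regions that increase the class score.
    heat = [[max(0.0, value) for value in row] for row in heat]
    # Optional scaling for display.
    peak = max(max(row) for row in heat)
    if peak > 0:
        heat = [[value / peak for value in row] for row in heat]
    return heat


# Toy example: channel 0 supports the class, channel 1 opposes it.
toy_activations = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
toy_gradients = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[-1.0, -1.0], [-1.0, -1.0]],
]
for row in grad_cam(toy_activations, toy_gradients):
    print(row)
```

In practice the heatmap is upsampled to the input resolution and overlaid on the image so that a reviewer can check whether the highlighted regions are clinically plausible.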
The model effectively detects subtle variations in oral cells, which supports precise, non-invasive treatment approaches for OSCC.