ViSwNeXtNet Deep Patch-Wise Ensemble of Vision Transformers and ConvNeXt for Robust Binary Histopathology Classification.

IF 3.0 · Zone 3 (Medicine) · JCR Q1 (MEDICINE, GENERAL & INTERNAL)

Diagnostics · Pub Date: 2025-06-13 · DOI: 10.3390/diagnostics15121507

Özgen Arslan Solmaz, Burak Tasci

Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12192024/pdf/

Citations: 0

Abstract

ViSwNeXtNet Deep Patch-Wise Ensemble of Vision Transformers and ConvNeXt for Robust Binary Histopathology Classification.

Background: Intestinal metaplasia (IM) is a precancerous gastric condition that requires accurate histopathological diagnosis to enable early intervention and cancer prevention. Traditional evaluation of H&E-stained tissue slides can be labor-intensive and prone to interobserver variability. Recent advances in deep learning, particularly transformer-based models, offer promising tools for improving diagnostic accuracy.

Methods: We propose ViSwNeXtNet, a novel patch-wise ensemble framework that integrates three transformer-based architectures (ConvNeXt-Tiny, Swin-Tiny, and ViT-Base) for deep feature extraction. Features from each model (12,288 per model) were concatenated into a 36,864-dimensional vector and refined using iterative neighborhood component analysis (INCA) to select the most discriminative 565 features. A quadratic SVM classifier was trained using these selected features. The model was evaluated on two datasets: (1) a custom-collected dataset consisting of 516 intestinal metaplasia cases and 521 control cases, and (2) the public GasHisSDB dataset, which includes 20,160 normal and 13,124 abnormal H&E-stained image patches of size 160 × 160 pixels.

Results: On the collected dataset, the proposed method achieved 94.41% accuracy, 94.63% sensitivity, and 94.40% F1 score. On the GasHisSDB dataset, it reached 99.20% accuracy, 99.39% sensitivity, and 99.16% F1 score, outperforming individual backbone models and demonstrating strong generalizability across datasets.

Conclusions: ViSwNeXtNet successfully combines local, regional, and global representations of tissue structure through an ensemble of transformer-based models. The addition of INCA-based feature selection significantly enhances classification performance while reducing dimensionality. These findings suggest the method's potential for integration into clinical pathology workflows. Future work will focus on multiclass classification, multicenter validation, and integration of explainable AI techniques.
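The feature bookkeeping and the reported metric types in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`concatenate`, `keep_columns`, `binary_metrics`) are hypothetical, the feature values are zero placeholders, the index selection is a stand-in for an INCA ranking that is not reproduced here, and the confusion-matrix counts are invented for illustration. Only the three constants come from the abstract.

```python
# Illustrative bookkeeping only; not the authors' implementation.

# Figures quoted in the abstract.
BACKBONES = ("ConvNeXt-Tiny", "Swin-Tiny", "ViT-Base")
FEATURES_PER_BACKBONE = 12_288
SELECTED_FEATURES = 565


def concatenate(feature_vectors):
    """Join per-backbone feature vectors end to end into one flat list."""
    fused = []
    for vector in feature_vectors:
        fused.extend(vector)
    return fused


def keep_columns(features, indices):
    """Keep only the features at the given indices (a selector's output)."""
    return [features[i] for i in indices]


def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, F1) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)        # recall on the positive class
    f1 = 2 * tp / (2 * tp + fp + fn)    # harmonic mean of precision and recall
    return accuracy, sensitivity, f1


# Placeholder features: zeros stand in for real backbone activations.
per_backbone = [[0.0] * FEATURES_PER_BACKBONE for _ in BACKBONES]
fused = concatenate(per_backbone)

# Placeholder selection: the first 565 indices stand in for an INCA ranking,
# which is not reproduced here.
selected = keep_columns(fused, range(SELECTED_FEATURES))

# Hypothetical confusion counts, NOT taken from the paper.
accuracy, sensitivity, f1 = binary_metrics(tp=90, fp=5, tn=95, fn=10)

print(len(fused), len(selected))
print(f"{accuracy:.3f} {sensitivity:.3f} {f1:.3f}")
```

The fused vector has 3 × 12,288 = 36,864 entries, and keeping 565 of them retains roughly 1.5% of the original dimensions. In a real pipeline the selected columns would feed a classifier such as the quadratic SVM named in the abstract.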

Source journal

Diagnostics · Biochemistry, Genetics and Molecular Biology: Clinical Biochemistry

CiteScore

4.70

Self-citation rate

8.30%

Articles published

2699

Review time

19.64 days

About the journal: Diagnostics (ISSN 2075-4418) is an international scholarly open access journal on medical diagnostics. It publishes original research articles, reviews, communications and short notes on the research and development of medical diagnostics. There is no restriction on the length of the papers. Our aim is to encourage scientists to publish their experimental and theoretical research in as much detail as possible. Full experimental and/or methodological details must be provided for research articles.