Principal component analysis and fine-tuned vision transformation integrating model explainability for breast cancer prediction.

Impact factor: 6 | Zone 4 (Computer Science) | JCR Q2 (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS)

Visual Computing for Industry Biomedicine and Art | Pub Date: 2025-03-10 | DOI: 10.1186/s42492-025-00186-x

Huong Hoang Luong, Phuc Phan Hong, Dat Vo Minh, Thinh Nguyen Le Quang, Anh Dinh The, Nguyen Thai-Nghe, Hai Thanh Nguyen

Visual Computing for Industry Biomedicine and Art, volume 8, issue 1, page 5. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11893953/pdf/

Citations: 0

Abstract

Breast cancer, the most commonly diagnosed cancer among women, is a notable global health issue. It results from abnormal cells in the breast tissue growing out of control. Histopathology, the detection and study of tissue diseases, has emerged as an important tool in breast cancer treatment because it plays a vital role in diagnosis and classification. Consequently, considerable research on histopathology has been conducted in medicine and computer science to develop effective methods for breast cancer treatment. In this study, a vision Transformer (ViT) was employed to classify tumors in the Breast Cancer Histopathological Database (BreakHis) into two classes, benign and malignant. To enhance model performance, we introduced a novel multi-head locality large kernel self-attention during fine-tuning, achieving an accuracy of 95.94% at 100× magnification, an improvement of 3.34% over a standard ViT (which uses multi-head self-attention). In addition, applying principal component analysis for dimensionality reduction led to an accuracy improvement of 3.34%, highlighting its role in mitigating overfitting and reducing computational complexity. In the final phase, SHapley Additive exPlanations (SHAP), Local Interpretable Model-agnostic Explanations (LIME), and Gradient-weighted Class Activation Mapping (Grad-CAM) were used to interpret and explain the model, aiding in understanding feature importance, providing local explanations, and visualizing model attention. In another experiment, ensemble learning with VGGIN further boosted the accuracy to 97.13%. Our approach exhibited a 0.98% to 17.13% improvement in accuracy compared with state-of-the-art methods, establishing a new benchmark for breast cancer histopathological image classification.
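The abstract does not say how principal component analysis was wired into the pipeline, so the following is only a minimal sketch of PCA-based dimensionality reduction in general, not the authors' implementation. The function name `pca_reduce`, the synthetic feature matrix, and the choice of 5 components are all hypothetical and chosen for illustration.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project samples (rows) onto their top principal components.

    Illustrative helper only; not taken from the paper.
    """
    X = np.asarray(features, dtype=float)
    centered = X - X.mean(axis=0)  # PCA requires zero-mean columns
    # Rows of vt are the principal directions, ordered by singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    # Variance captured along each direction, as a fraction of the total.
    variances = singular_values ** 2 / (X.shape[0] - 1)
    ratio = variances[:n_components] / variances.sum()
    return centered @ components.T, components, ratio


# Synthetic stand-in for image features (assumption): 200 samples with
# 64 features that lie close to a 5-dimensional subspace.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 5))
mixing = rng.normal(size=(5, 64))
features = latent @ mixing + 0.01 * rng.normal(size=(200, 64))

reduced, components, ratio = pca_reduce(features, n_components=5)
print(reduced.shape)   # (200, 5)
print(ratio.sum())     # fraction of total variance kept by 5 components
```

In this toy setting the five retained components carry almost all of the variance, which is the sense in which PCA can shrink the input to a downstream classifier while discarding mostly noise. Whether that helps accuracy on real histopathology features depends on the data.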




Source journal

Visual Computing for Industry Biomedicine and Art Multiple-

CiteScore

5.60

Self-citation rate

0.00%

Publication count