Ensemble Architecture of Vision Transformer and CNNs for Breast Cancer Tumor Detection From Mammograms

Impact Factor: 3.0 · CAS Quartile 4 (Computer Science) · JCR Q2 (Engineering, Electrical & Electronic)
Saif Ur Rehman Khan, Sohaib Asif, Omair Bilal
DOI: 10.1002/ima.70090 (https://onlinelibrary.wiley.com/doi/10.1002/ima.70090)
Journal: International Journal of Imaging Systems and Technology, Vol. 35, No. 3
Published: 2025-04-18 (Journal Article)
Citations: 0

Abstract


Classifying distinct object classes in computer vision presents several challenges, including effectively capturing features such as color, form, and tissue size for each class, correlating class vulnerabilities, capturing features for individual classes, and predicting class labels accurately. To tackle these issues, we introduce a novel hybrid deep dense learning technique that combines deep transfer learning with a transformer architecture. Our approach utilizes ResNet50, EfficientNetB1, and our proposed ProDense block as the backbone models. By integrating the ViT-L16 transformer, we can focus on relevant features in mammography and extract high-value pair features, offering two alternative methods for feature extraction. This allows our model to adaptively shift the region of interest towards the class type in slides.

The transformer architecture, particularly ViT-L16, enhances feature extraction by efficiently capturing long-range dependencies in the data, enabling the model to better understand the context and the relationships between features. This aids in more accurate classification, especially when fine-tuning pretrained models, as it helps the model adapt to specific characteristics of the target dataset while retaining valuable information learned during pretraining. Furthermore, we employ a stacked ensemble technique to leverage both the deep transfer learning models and the ProDense block extension for training extensive features for breast cancer classification. The fine-tuning process employed by our hybrid model helps refine the dense layers, enhancing classification accuracy. Evaluating our method on the INbreast dataset, we observe a significant improvement in predicting the binary cancer category, achieving 98.08% accuracy and outperforming current state-of-the-art classifiers.
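The stacked-ensemble idea described in the abstract can be illustrated with a minimal sketch. This is a hypothetical toy example, not the authors' implementation: two base models (stand-ins for the CNN branch and the ViT-L16 branch) each emit a malignancy probability per image, and a level-1 logistic-regression meta-learner is trained on the stacked outputs. The probabilities here are simulated with NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Simulated per-image malignancy probabilities from two base models
# (stand-ins for the CNN branch and the ViT-L16 branch).
n = 200
labels = rng.integers(0, 2, size=n)
cnn_prob = np.clip(labels * 0.7 + rng.normal(0.15, 0.15, n), 0.0, 1.0)
vit_prob = np.clip(labels * 0.6 + rng.normal(0.20, 0.15, n), 0.0, 1.0)

# Level-0 outputs are stacked as features for the level-1 meta-learner.
X = np.column_stack([cnn_prob, vit_prob])
w = np.zeros(2)
b = 0.0

# Train the logistic-regression meta-learner by plain gradient descent.
for _ in range(2000):
    p = sigmoid(X @ w + b)
    w -= 0.5 * (X.T @ (p - labels)) / n
    b -= 0.5 * np.mean(p - labels)

pred = (sigmoid(X @ w + b) >= 0.5).astype(int)
acc = np.mean(pred == labels)
print(f"stacked-ensemble training accuracy: {acc:.2f}")
```

In a real pipeline the meta-learner would be fit on held-out predictions (out-of-fold) rather than training outputs, so that the stacking weights do not simply memorize base-model overfitting.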

Source journal

International Journal of Imaging Systems and Technology (Engineering & Technology — Imaging Science & Photographic Technology)

CiteScore: 6.90 · Self-citation rate: 6.10% · Articles per year: 138 · Review time: 3 months
Journal description: The International Journal of Imaging Systems and Technology (IMA) is a forum for the exchange of ideas and results relevant to imaging systems, including imaging physics and informatics. The journal covers all imaging modalities in humans and animals. IMA accepts technically sound and scientifically rigorous research in the interdisciplinary field of imaging, including relevant algorithmic research and hardware and software development, and their applications relevant to medical research. The journal provides a platform to publish original research in structural and functional imaging. The journal is also open to imaging studies of the human body and of animals that describe novel diagnostic imaging and analysis methods. Technical, theoretical, and clinical research in both normal and clinical populations is encouraged. Submissions describing methods, software, databases, replication studies, and negative results are also considered. The scope of the journal includes, but is not limited to, the following in the context of biomedical research: imaging and neuroimaging modalities (structural MRI, functional MRI, PET, SPECT, CT, ultrasound, EEG, MEG, NIRS, etc.); neuromodulation and brain stimulation techniques such as TMS and tDCS; software and hardware for imaging, especially related to human and animal health; image segmentation in normal and clinical populations; pattern analysis and classification using machine learning techniques; computational modeling and analysis; brain connectivity and connectomics; systems-level characterization of brain function; neural networks and neurorobotics; computer vision based on human/animal physiology; brain-computer interface (BCI) technology; big data, databasing, and data mining.