Transformer-based multiple instance learning network with 2D positional encoding for histopathology image classification

Impact factor 4.6; CAS Zone 2 (Computer Science); JCR Q1 (Computer Science, Artificial Intelligence)
Bin Yang, Lei Ding, Jianqiang Li, Yong Li, Guangzhi Qu, Jingyi Wang, Qiang Wang, Bo Liu
Journal: Complex & Intelligent Systems
Published: 2025-03-21
DOI: 10.1007/s40747-025-01779-y (https://doi.org/10.1007/s40747-025-01779-y)

Abstract

Digital medical imaging, particularly pathology imaging, is essential for cancer diagnosis, but its extremely high resolution makes direct model training difficult. Although weakly supervised learning has reduced the need for manual annotations, many multiple instance learning (MIL) methods struggle to capture crucial spatial relationships in histopathological images. Existing methods that incorporate positional information often overlook nuanced spatial correlations or use positional encoding strategies that do not fully capture the spatial structure of pathology images. To address this issue, we propose a new framework named TMIL (Transformer-based Multiple Instance Learning Network with 2D positional encoding), which leverages multiple instance learning for weakly supervised classification of histopathological images. TMIL incorporates a Transformer-based 2D positional encoding module to model positional information and explore correlations between instances. Furthermore, TMIL divides histopathological images into pseudo-bags and trains patch-level feature vectors with deep metric learning to enhance classification performance. Finally, the proposed approach is evaluated on a public colorectal adenoma dataset. The experimental results show that TMIL outperforms existing MIL methods, achieving an AUC of 97.28% and an ACC of 95.19%. These findings suggest that TMIL’s integration of deep metric learning and positional encoding offers a promising approach for improving the efficiency and accuracy of pathology image analysis in cancer diagnosis.
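
The abstract names two ingredients, a 2D positional encoding over patch locations and a split of each slide into pseudo-bags, without giving their exact form. The sketch below is only an illustration of those two ideas and is not the authors' implementation: it assumes the widely used sinusoidal scheme applied separately to the row and column index of each patch, and a simple random split into pseudo-bags. The function names (`sincos_1d`, `positional_encoding_2d`, `split_into_pseudo_bags`) and all parameters are hypothetical.

```python
import math
import random


def sincos_1d(pos, dim):
    """Sinusoidal encoding of one scalar position into `dim` values (dim must be even)."""
    out = []
    for i in range(0, dim, 2):
        freq = 1.0 / (10000 ** (i / dim))
        out.append(math.sin(pos * freq))
        out.append(math.cos(pos * freq))
    return out


def positional_encoding_2d(row, col, dim):
    """Encode a patch's (row, col) grid position.

    Assumption: half of the channels encode the row, the other half the column.
    """
    if dim % 4 != 0:
        raise ValueError("dim must be divisible by 4")
    half = dim // 2
    return sincos_1d(row, half) + sincos_1d(col, half)


def split_into_pseudo_bags(patch_ids, num_bags, seed=0):
    """Randomly partition the patches of one slide into `num_bags` pseudo-bags.

    Assumption: a uniform random split; the paper may use a different rule.
    """
    rng = random.Random(seed)
    shuffled = list(patch_ids)
    rng.shuffle(shuffled)
    return [shuffled[i::num_bags] for i in range(num_bags)]


# Toy slide: a 4 x 4 grid of patches identified by (row, col).
coords = [(r, c) for r in range(4) for c in range(4)]
encodings = {rc: positional_encoding_2d(rc[0], rc[1], dim=8) for rc in coords}
bags = split_into_pseudo_bags(coords, num_bags=3)
```

In a pipeline of this kind, each patch's encoding would typically be added to its feature vector before the Transformer layers, so that attention can take the patch's location on the slide into account; whether TMIL does exactly this is not stated in the abstract.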

Source journal: Complex & Intelligent Systems (Computer Science, Artificial Intelligence)
CiteScore: 9.60
Self-citation rate: 10.30%
Articles published: 297
About the journal: Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.