IP-MIML: A multi-instance multi-label learning framework for predicting protein subcellular localization from biological images

IF 3.4 | Zone 2 (Engineering & Technology) | JCR Q1 (Computer Science, Hardware & Architecture)
Xinyue Chen, Hang Shi, Shi-bing Guan, Wei Shao
Journal: Displays, vol. 91, Article 103220
Published: 2025-09-18
DOI: 10.1016/j.displa.2025.103220
URL: https://www.sciencedirect.com/science/article/pii/S0141938225002574
Citations: 0

Abstract

Recent studies indicate that where a protein localizes within a cell is essential to determining its function and to understanding various cellular processes. With advances in microscopic imaging, accurate classification of protein subcellular localization patterns from bioimages has attracted growing attention. However, most bioimage-based predictors assign each protein image to a single location, overlooking the case in which a protein localizes to several cellular compartments, a case that deserves special attention. Moreover, although a protein may be observed in multiple biological images derived from different tissues, summarizing its localization patterns across all related images remains a challenge. Based on these considerations, we propose IP-MIML, a multi-instance multi-label learning framework for determining the subcellular localization of proteins from biological images. Specifically, we first treat each protein as a bag and all images belonging to it as instances, and introduce a self-attention mechanism to learn instance-level representations that account for correlations among instances. Then, a bag-concept layer is developed to discover the latent relations between the inputs and the output semantic labels. In addition, we incorporate an optimal transport (OT) based formulation to learn the label distribution and exploit label correlations simultaneously. Finally, a dynamic threshold method is used to adjust the multi-label prediction results. We evaluated our method on normal and cancer protein bioimages. The experimental results indicate that IP-MIML not only achieves higher accuracy in predicting the cellular compartments of proteins with multiple localizations, but also detects potential cancer biomarker proteins whose localization differs significantly between normal and cancer tissues.
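To make the bag-and-instance framing concrete, the following is a minimal sketch of a generic multi-instance multi-label forward pass, not the authors' implementation. It assumes toy 2-dimensional instance features, identity query/key/value projections in the self-attention step, mean pooling in place of the bag-concept layer, and fixed per-label thresholds in place of the dynamic threshold method. The OT-based component is omitted, and all function names, weights, and numbers are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def self_attention(instances):
    """Scaled dot-product self-attention over the instances of one bag.

    Assumption: queries, keys, and values are the raw instance vectors
    (identity projections); a trained model would learn these projections.
    """
    scale = math.sqrt(len(instances[0]))
    attended = []
    for query in instances:
        weights = softmax([dot(query, key) / scale for key in instances])
        attended.append([
            sum(w * value[d] for w, value in zip(weights, instances))
            for d in range(len(query))
        ])
    return attended


def bag_scores(instances, label_weights):
    """Mean-pool the attended instances into one bag vector, then score each label."""
    attended = self_attention(instances)
    n = len(attended)
    bag = [sum(vec[d] for vec in attended) / n for d in range(len(attended[0]))]
    # One sigmoid score per label, so several labels can be active at once.
    return [1.0 / (1.0 + math.exp(-dot(w, bag))) for w in label_weights]


def predict_labels(scores, thresholds):
    """Return the indices of labels whose score reaches that label's threshold."""
    return [i for i, (s, t) in enumerate(zip(scores, thresholds)) if s >= t]


# One protein (bag) with three images (instances), each a toy feature vector.
bag = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
# Hypothetical weights for three localization labels.
label_weights = [[2.0, -2.0], [-2.0, 2.0], [1.0, 1.0]]

scores = bag_scores(bag, label_weights)
labels = predict_labels(scores, thresholds=[0.5, 0.5, 0.6])
print(labels)
```

Scoring each label with its own sigmoid and threshold, rather than taking a single argmax, is what lets one protein receive several compartment labels at once.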
Source journal: Displays (Engineering & Technology: Electrical & Electronic Engineering)
CiteScore: 4.60
Self-citation rate: 25.60%
Articles published: 138
Review time: 92 days
Journal description: Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including the display-human interface. Technical papers on practical developments in display technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the displays community. Original research papers solving ergonomics issues at the display-human interface advance the effective presentation of information. Tutorial papers covering fundamentals, intended for display technologists and human factors engineers new to the field, will also occasionally be featured.