IP-MIML: A multi-instance multi-label learning framework for predicting protein subcellular localization from biological images

IF 3.4, Zone 2 (Engineering & Technology), Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Displays Pub Date : 2025-09-18 DOI:10.1016/j.displa.2025.103220

Xinyue Chen, Hang Shi, Shi-bing Guan, Wei Shao

Displays, volume 91, Article 103220. Full text: https://www.sciencedirect.com/science/article/pii/S0141938225002574

Citations: 0

Abstract

Recent studies indicate that the localization of proteins within a cell is essential for determining their functions and for gaining insight into various cellular processes. With advances in microscopic imaging, accurate classification of bioimage-based protein subcellular localization patterns has attracted growing attention. However, most bioimage-based predictors of protein subcellular location assign each protein image to a single location, overlooking the case in which a protein colocalizes in several cellular compartments, a case that deserves special attention. Moreover, a protein may be observed in multiple biological images derived from different tissues, and summarizing its localization patterns across all related images remains a challenge. Based on these considerations, we propose IP-MIML, a multi-instance multi-label learning framework for determining the subcellular localization of proteins from biological images. Specifically, we first treat each protein as a bag and all images belonging to it as instances, and introduce a self-attention mechanism to learn instance-level representations that account for correlations among the instances. Then, a bag-concept layer is developed to discover the latent relations between the inputs and the output semantic labels. In addition, we incorporate an optimal transport (OT) based formulation to learn the label distribution and exploit label correlations simultaneously. Finally, a dynamic threshold method is used to adjust the multi-label prediction results. We evaluated our method on normal and cancer protein bioimages. The experimental results indicate that IP-MIML not only achieves higher accuracy in predicting the cellular compartments of proteins with multiple localizations, but also detects potential cancer biomarker proteins whose localization differs significantly between normal and cancer tissues.
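To make the bag-and-instance setup in the abstract concrete, the following is a minimal NumPy sketch and not the authors' implementation. It treats one protein as a bag of image feature vectors, applies scaled dot-product self-attention across the instances, pools them into a single bag embedding, and converts per-label scores into a multi-label prediction. Several parts are assumptions made for illustration only: the random weights, the feature dimensions, mean pooling in place of the paper's bag-concept layer, and the `dynamic_threshold` rule (keep every label scoring within a fixed ratio of the top score). The abstract does not specify the actual threshold rule, and the OT-based loss is omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(instances, w_q, w_k, w_v):
    """Scaled dot-product self-attention over the instances of one bag.

    instances: (n_images, d) feature matrix, one row per image of the protein.
    Returns an (n_images, d_v) matrix in which each row mixes information
    from all images in the bag.
    """
    q, k, v = instances @ w_q, instances @ w_k, instances @ w_v
    weights = softmax(q @ k.T / np.sqrt(k.shape[1]), axis=-1)
    return weights @ v


def bag_scores(instances, params):
    """Per-label scores in (0, 1) for one protein (bag)."""
    attended = self_attention(
        instances, params["w_q"], params["w_k"], params["w_v"]
    )
    # Assumption: mean pooling stands in for the paper's bag-concept layer.
    bag_embedding = attended.mean(axis=0)
    logits = bag_embedding @ params["w_out"] + params["b_out"]
    return 1.0 / (1.0 + np.exp(-logits))


def dynamic_threshold(scores, ratio=0.8):
    """Hypothetical rule: keep labels scoring at least ratio * top score.

    The threshold adapts to each protein, so at least one label is always
    predicted and proteins with several strong scores get several labels.
    """
    return (scores >= ratio * scores.max()).astype(int)


# Toy example with random, untrained weights (illustrative values only).
rng = np.random.default_rng(0)
d, d_k, n_labels = 16, 8, 6
params = {
    "w_q": rng.normal(size=(d, d_k)),
    "w_k": rng.normal(size=(d, d_k)),
    "w_v": rng.normal(size=(d, d_k)),
    "w_out": rng.normal(size=(d_k, n_labels)),
    "b_out": np.zeros(n_labels),
}
bag = rng.normal(size=(5, d))  # one protein observed in 5 images
scores = bag_scores(bag, params)
labels = dynamic_threshold(scores)
print(scores.round(3), labels)
```

Because the threshold is relative to each protein's top score, the sketch never returns an empty label set, which is one plausible motivation for a per-sample threshold in multi-label localization.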

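Source journal

Displays (Engineering & Technology - Engineering: Electrical & Electronic)

CiteScore: 4.60

Self-citation rate: 25.60%

Articles published: 138

Review time: 92 days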

Journal description: Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including display-human interface. Technical papers on practical developments in Displays technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance effective presentation of information. Tutorial papers covering fundamentals intended for display technologies and human factors engineers new to the field will also occasionally be featured.