PMLocMSCAM: Predicting miRNA Subcellular Localisations by miRNA Similarities and Cross-Attention Mechanism
Authors: Jipu Jiang, Cheng Yan
Journal: IET Systems Biology, 19(1) (Journal Article)
Published: 2025-06-08
DOI: 10.1049/syb2.70023
URL: https://onlinelibrary.wiley.com/doi/10.1049/syb2.70023
Impact factor: 1.9 (JCR Q4, Cell Biology)
Citations: 0
Abstract
Many studies have shown that microRNAs (miRNAs) play key roles in important biological processes and complex human diseases. They also have specific physiological roles at different cellular sites. Identifying their subcellular localisation is therefore essential for a systematic understanding of their physiological functions. In this study, we propose a computational method, called PMLocMSCAM, to predict miRNA subcellular localisation based on miRNA similarities and a cross-attention mechanism. PMLocMSCAM implements a multimodal integration framework that systematically processes miRNA sequence data, miRNA-mRNA association networks with mRNA subcellular localisation annotations, miRNA-disease associations, and miRNA-drug association networks. The architecture begins with intrinsic feature extraction: Smith-Waterman alignment for sequence similarity computation and disease ontology-based derivation of functional similarity. Subsequent heterogeneous network embedding employs Node2vec for topological feature learning across three interaction modalities (miRNA-disease, miRNA-drug, and miRNA-mRNA networks), enhanced by hypergraph convolution to capture higher-order relationships through incidence matrix decomposition. Localisation-specific patterns are propagated via miRNA-mRNA interaction weights, culminating in a multi-head attention mechanism that dynamically fuses five feature matrices: miRNA sequence features, miRNA-disease association features, miRNA-drug association features, miRNA-mRNA association features, and miRNA-mRNA localisation features. These integrated representations are processed through residual-connected multilayer perceptrons to generate probabilistic predictions across seven subcellular compartments, establishing an end-to-end computational paradigm for multimodal miRNA localisation analysis.
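The Smith-Waterman step mentioned in the abstract can be illustrated with a minimal sketch of local alignment scoring. The scoring parameters (`match=2`, `mismatch=-1`, linear `gap=-1`), the function names, and the normalisation by the geometric mean of the self-alignment scores are all assumptions made for illustration; the abstract does not state which settings or normalisation PMLocMSCAM uses. The input strings are toy sequences, not real miRNAs.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between strings a and b.

    Uses a linear gap penalty and keeps only two rows of the
    dynamic-programming table. Scoring values are illustrative assumptions.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def normalised_similarity(a, b, **scoring):
    """Scale the local alignment score to [0, 1].

    Divides by the geometric mean of the two self-alignment scores
    (an assumed normalisation, not taken from the paper).
    """
    self_a = smith_waterman_score(a, a, **scoring)
    self_b = smith_waterman_score(b, b, **scoring)
    denom = (self_a * self_b) ** 0.5
    return smith_waterman_score(a, b, **scoring) / denom if denom else 0.0


if __name__ == "__main__":
    print(smith_waterman_score("ACGU", "CG"))             # 4
    print(normalised_similarity("UGAGGUAG", "UGAGGUAG"))  # 1.0
    print(normalised_similarity("UGAGGUAG", "UGAGCUAG"))
```

Applying `normalised_similarity` to every pair of sequences would give a symmetric similarity matrix with ones on the diagonal, which is the usual shape of a sequence-similarity feature matrix.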
To assess the prediction performance of our method and compare it with other computational methods for miRNA subcellular localisation, we conduct 10-fold cross-validation (10-CV) and an evaluation on an independent test dataset. The AUC (area under the receiver operating characteristic curve) and AUPR (area under the precision-recall curve) are used as metrics. The experimental results show that the average AUC and AUPR values exceed 0.9182 and 0.8487 on 10-CV, respectively. The AUC and AUPR values also reach 0.9157 and 0.8469 on the independent test dataset, respectively. PMLocMSCAM outperforms the compared methods. The ablation experiments further confirm that PMLocMSCAM can effectively predict miRNA subcellular localisations and help in understanding their physiological functions.
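For readers unfamiliar with the two metrics, the following sketch computes them from scratch. It makes several assumptions: AUC is computed as the pairwise ranking statistic, AUPR is approximated by average precision (one common step-wise estimator of the area under the precision-recall curve), and per-compartment scores are combined by a simple macro-average. The abstract does not say how the authors computed or averaged the metrics, and all function names and data below are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive example.

    Tied scores keep their input order because sorted() is stable.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return total / hits


def macro_average(metric, label_rows, score_rows):
    """Average a per-class metric over columns (one column per compartment)."""
    label_cols = list(zip(*label_rows))
    score_cols = list(zip(*score_rows))
    return sum(metric(y, s) for y, s in zip(label_cols, score_cols)) / len(label_cols)


if __name__ == "__main__":
    y = [1, 0, 1, 0]                  # toy labels for one compartment
    p = [0.9, 0.8, 0.7, 0.1]          # toy predicted probabilities
    print(roc_auc(y, p))                       # 0.75
    print(round(average_precision(y, p), 4))   # 0.8333
```

In a 10-fold cross-validation setting, these functions would be applied to the held-out predictions of each fold and the ten results averaged.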
About the journal:
IET Systems Biology covers intra- and inter-cellular dynamics, using systems- and signal-oriented approaches. Papers that analyse genomic data in order to identify variables and basic relationships between them are considered if the results provide a basis for mathematical modelling and simulation of cellular dynamics. Manuscripts on molecular and cell biological studies are encouraged if the aim is a systems approach to dynamic interactions within and between cells.
The scope includes the following topics:
Genomics, transcriptomics, proteomics, metabolomics, cells, tissue and the physiome; molecular and cellular interaction, gene, cell and protein function; networks and pathways; metabolism and cell signalling; dynamics, regulation and control; systems, signals, and information; experimental data analysis; mathematical modelling, simulation and theoretical analysis; biological modelling, simulation, prediction and control; methodologies, databases, tools and algorithms for modelling and simulation; modelling, analysis and control of biological networks; synthetic biology and bioengineering based on systems biology.