{"title":"MIF-DTI: a multimodal information fusion method for drug-target interaction prediction.","authors":"Jiehong Shan, Jinchen Sun, Haoran Zheng","doi":"10.1093/bib/bbaf474","DOIUrl":"10.1093/bib/bbaf474","url":null,"abstract":"<p><p>Drug-target interaction (DTI) prediction is essential for drug discovery and repurposing. To overcome the limitations of current DTI prediction methods that rely on single-source encoding and inadequately fuse multimodal information, this study proposes a DTI prediction method based on multimodal information fusion (MIF-DTI) and further designs an ensemble version (MIF-DTI-B). MIF-DTI encodes the SMILES sequences of drugs and the amino acid sequences of targets via a sequence encoding module to extract their 1D sequence features. It conducts dual-view representation encoding on the hierarchical molecular graphs of drugs and the contact graphs of targets through a graph encoding module, aiming to capture their 2D topological structure information. A decoding module is utilized to fuse information from different modalities. MIF-DTI-B ensembles several MIF-DTI models through a cross-validation strategy to improve predictive accuracy. This study evaluates the proposed models on three publicly accessible DTI datasets. 
Experimental results demonstrate that fully integrating multimodal information enables both MIF-DTI and MIF-DTI-B to consistently outperform state-of-the-art methods.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"26 5","pages":""},"PeriodicalIF":7.7,"publicationDate":"2025-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12448477/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145085097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Autoregressive enzyme function prediction with multi-scale multi-modality fusion.","authors":"Dingyi Rong, Bozitao Zhong, Wenzhuo Zheng, Liang Hong, Ning Liu","doi":"10.1093/bib/bbaf476","DOIUrl":"10.1093/bib/bbaf476","url":null,"abstract":"<p><p>Accurate prediction of enzyme function is crucial for elucidating biological mechanisms and driving innovation across various sectors. Existing deep learning methods tend to rely solely on either sequence data or structural data and predict the Enzyme Commission (EC) number as a whole, neglecting the intrinsic hierarchical structure of EC numbers. To address these limitations, we introduce Multi-scale multi-modality Autoregressive Predictor (MAPred), a novel multi-modality and multi-scale model designed to autoregressively predict the EC number of proteins. MAPred integrates both the primary amino acid sequence and the 3D tokens of proteins, employing a dual-pathway approach to capture comprehensive protein characteristics and essential local functional sites. Additionally, MAPred utilizes an autoregressive prediction network to sequentially predict the digits of the EC number, leveraging the hierarchical organization of EC classifications. 
Evaluations on benchmark datasets, including New-392, Price, and New-815, demonstrate that our method outperforms existing models, marking a significant advance in the reliability and granularity of protein function prediction within bioinformatics.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"26 5","pages":""},"PeriodicalIF":7.7,"publicationDate":"2025-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12448393/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145085083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Transfer learning reveals the mediating mechanisms of cross-ethnic lipid metabolic pathways in the association between APOE gene and Alzheimer's disease.","authors":"Lulu Pan, Yahang Liu, Chen Huang, Ruilang Lin, Yongfu Yu, Guoyou Qin","doi":"10.1093/bib/bbaf460","DOIUrl":"10.1093/bib/bbaf460","url":null,"abstract":"<p><p>Lipid-mediated effects play a crucial role in elucidating the pathological mechanisms linking the ε4 allele of the apolipoprotein E gene (APOE ε4) to Alzheimer's disease (AD). However, traditional mediation analysis methods often suffer from insufficient statistical power in studies involving minority populations due to limited sample sizes. This study innovatively develops a high-dimensional mediation analysis model (TransHDM) based on a transfer learning framework. By leveraging information from source data with large-scale samples, it significantly enhances the ability to identify potential mediators in small sample target data. The method first constructs a high-dimensional regression model using aggregated data from the source data and target data, then applies transfer regularization to adjust for heterogeneity between the source and target domains, correcting for estimation bias in high-dimensional Lasso. Ultimately, it achieves parameter transfer across domains, addressing statistical bias and inferential uncertainty caused by small sample sizes. Simulation results demonstrate that, compared to traditional methods, this approach significantly improves the power in identifying true mediator variables while effectively controlling the family-wise error rate in multiple testing. When applied to the Alzheimer's Disease Neuroimaging Initiative cohort, TransHDM transferred large-scale data from white and other ethnic groups, identifying additional lipid metabolic pathways mediating the influence of the APOE ε4 allele on AD pathological progression in African American populations compared to pre-transfer analysis. 
These pathways include glycerophospholipid metabolism, glycerolipid metabolism, sphingolipid metabolism, and ether lipid metabolism (false discovery rate < 0.05). The TransHDM framework not only provides a powerful methodological tool for small sample population research but also offers valuable insights for future research in exploring disease mechanisms and developing biomarkers for disease prediction.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"26 5","pages":""},"PeriodicalIF":7.7,"publicationDate":"2025-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12445873/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145085172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DeepPhosPPI: a deep learning framework with attention-CNN and transformer for predicting phosphorylation effects on protein-protein interactions.","authors":"Yinyin Gong, Rui Li, Yan Liu, Jilong Wang, Danny Z Chen, Chee Keong Kwoh","doi":"10.1093/bib/bbaf462","DOIUrl":"10.1093/bib/bbaf462","url":null,"abstract":"<p><p>Protein phosphorylation regulates protein function and cellular signaling pathways, and is strongly associated with diseases, including neurodegenerative disorders and cancer. Phosphorylation plays a critical role in regulating protein activity and cellular signaling by modulating protein-protein interactions (PPIs). It alters binding affinities and interaction networks, thereby influencing biological processes and maintaining cellular homeostasis. Experimental validation of these effects is labor-intensive and expensive, highlighting the need for efficient computational approaches. We propose DeepPhosPPI, the first sequence-based deep learning framework for predicting phosphorylation effects on PPIs, which employs a pre-trained protein language model for feature embedding, with ProtBERT and ESM-2 as alternative backbone encoders. By combining an attention-based convolutional neural network with Transformer models, DeepPhosPPI accurately predicts phosphorylation effects. 
The experimental results show that DeepPhosPPI consistently outperforms state-of-the-art methods in multiple tasks, including functional sites identification and regulatory effect classification.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"26 5","pages":""},"PeriodicalIF":7.7,"publicationDate":"2025-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12414479/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145013754","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DeepMobilome: predicting mobile genetic elements using sequencing reads of microbiomes.","authors":"Youna Cho, Erin Kim, Minyoung Kim, Mina Rho","doi":"10.1093/bib/bbaf450","DOIUrl":"10.1093/bib/bbaf450","url":null,"abstract":"<p><strong>Motivation: </strong>Mobile genetic elements (MGEs) play an important role in facilitating the acquisition of antibiotic resistance genes (ARGs) within microbial communities, significantly impacting the evolution of antibiotic resistance. Understanding the mechanism and trajectory of ARG acquisition requires a comprehensive analysis of the ARG-carrying mobilome-a collective set of MGEs carrying ARGs. However, identifying the mobilome within complex microbiomes poses considerable challenges. Existing MGE prediction methods, designed primarily for single genomes, exhibit substantial limitations when applied to metagenomic data, often producing high false positive rates in identifying target MGEs from metagenome sequencing data.</p><p><strong>Results: </strong>To address these challenges, we developed DeepMobilome, a novel approach for accurately identifying target MGEs within the microbiome. DeepMobilome leverages a convolutional neural network trained on read alignment data derived from sequence alignment map (SAM) files, providing superior accuracy in detecting MGEs. Trained on 364 647 cases, DeepMobilome achieved a high validation accuracy of 0.99. DeepMobilome consistently outperformed existing methods in discerning the presence of target MGE sequences across diverse test sets. In single-genome test scenarios, DeepMobilome showed an F1-score of 0.935, compared to 0.755 and 0.670 for MGEfinder and ISMapper, respectively, demonstrating its substantial improvements in prediction accuracy. Extensive evaluations across simulated microbiomes further validated the robustness and reliability of DeepMobilome in practical applications. 
In real microbiome data, DeepMobilome successfully identified six ARG-carrying MGEs across diverse populations. By addressing the limitations of current methods, DeepMobilome offers a powerful tool for advancing our understanding of ARG dissemination and supports targeted interventions in combating antibiotic resistance.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"26 5","pages":""},"PeriodicalIF":7.7,"publicationDate":"2025-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12414478/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145013767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Energy entropy vector: a novel approach for efficient microbial genomic sequence analysis and classification.","authors":"Hao Wang, Guoqing Hu, Stephen S-T Yau","doi":"10.1093/bib/bbaf459","DOIUrl":"10.1093/bib/bbaf459","url":null,"abstract":"<p><p>With the rapid development of genomic sequencing technologies, there is an increasing demand for efficient and accurate sequence analysis methods. However, existing methods face challenges in handling long, variable-length sequences and large-scale datasets. To address these issues, we propose a novel encoding method-Energy Entropy Vector (EEV). This method encodes gene sequences of arbitrary length into fixed-dimensional vector representations by modeling nucleotide energy characteristics based on information entropy. Experiments conducted on five microbial datasets demonstrate that, compared to traditional alignment-free methods, EEV achieves higher accuracy in convex hull classification and species classification tasks, with improvements of 15% to 30% in family-level classification. In phylogenetic tree construction, EEV significantly accelerates the process relative to multiple sequence alignment methods while maintaining high tree quality, enabling rapid and accurate phylogenetic reconstruction. Moreover, EEV supports flexible dimensional expansion by superimposing nucleotide energies, enhancing its ability to represent complex genomic sequences while effectively alleviating sparsity issues in high-dimensional representations. 
This study provides an efficient gene encoding strategy for large-scale genomic analysis and evolutionary research.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"26 5","pages":""},"PeriodicalIF":7.7,"publicationDate":"2025-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12414480/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145013743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"abCAN: a practical and novel attention network for predicting mutant antibody affinity.","authors":"Chen Gong, Nan Weng, Hongjia Liu, Ziyuan Qian, Yunyao Shen, Hongde Liu, Wenlong Ming","doi":"10.1093/bib/bbaf464","DOIUrl":"https://doi.org/10.1093/bib/bbaf464","url":null,"abstract":"<p><p>Accurate prediction of mutation effects on antibody-antigen interactions is critical for antibody engineering and drug design. In this study, we present abCAN, a practical and novel attention network designed to predict changes in binding affinity caused by mutations. abCAN requires only the pre-mutant antibody-antigen complex structure and mutation information to perform its predictions. abCAN introduces an innovative approach, Progressive Encoding, which progressively integrates structural, residue-level, and sequential information to construct the complex representation in a systematic manner, effectively capturing both the topological features of the structure and contextual features of the sequence. During this process, extra weight is also applied to interface residues through attention mechanisms. These learned representations are then transferred to a predictor that estimates changes in antibody-antigen binding affinity induced by mutations. On the benchmark test set, abCAN achieved a root-mean-square error of 1.460 (kcal/mol) and a Pearson correlation coefficient of 0.731, setting a new state-of-the-art benchmark for prediction accuracy in the field of antibody affinity prediction. 
Our code and datasets are available at https://github.com/ChenGong57/abCAN.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"26 5","pages":""},"PeriodicalIF":7.7,"publicationDate":"2025-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145085121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CircCode3: integrating deep learning to mine and evaluate translatable circular RNAs from ribosome profiling sequencing and mass spectrometry data.","authors":"Zonghui Zhu, Xiaojuan Liang, Rui Ma, Shuwei Yin, Meng Xu, Guanglin Li","doi":"10.1093/bib/bbaf458","DOIUrl":"https://doi.org/10.1093/bib/bbaf458","url":null,"abstract":"<p><p>Translatable circular RNAs (circRNAs), distinguished by their capacity to encode proteins or peptides, rely on cap-independent mechanisms such as m6A-mediated or internal ribosome entry site (IRES)-driven translation initiation. Currently, accurately identifying translatable circRNAs and their open reading frames is challenging. In this study, we developed an integrated analysis pipeline, CircCode3, to mine translatable circRNAs from high-throughput sequencing data, building upon existing tools and significantly enhancing their functionalities. CircCode3 also introduces new capabilities, including the identification and assessment of open reading frames (ORFs) spanning back-splice junction sites. To evaluate IRES potential, we incorporated IRESfinder into the pipeline. Furthermore, we developed two deep learning tools: DeepCircm6A for predicting m6A modification sites in circRNAs, and DLMSC for assessing the reliability of stop codons. These enhancements make CircCode3 a comprehensive solution for analyzing ribosome profiling sequencing and mass spectrometry data, identifying and evaluating ORFs, and visualizing results. 
The CircCode3 tool is publicly available and can be downloaded from https://github.com/Lilab-SNNU/CircCode3.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"26 5","pages":""},"PeriodicalIF":7.7,"publicationDate":"2025-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145085119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graph neural network integrated with pretrained protein language model for predicting human-virus protein-protein interactions.","authors":"Linyang Jiang, Xiaodi Yang, Xiaokun Guo, Dianke Li, Jiajun Li, Stefan Wuchty, Wenyu Shi, Ziding Zhang","doi":"10.1093/bib/bbaf461","DOIUrl":"10.1093/bib/bbaf461","url":null,"abstract":"<p><p>The systematic identification of human-virus protein-protein interactions (PPIs) is a critical step toward elucidating the underlying mechanisms of viral infection, directly informing the development of targeted interventions against existing and emerging viral threats. In this work, we presented DeepGNHV, an end-to-end framework that integrated a pretrained protein language model with structural features derived from AlphaFold2 and leveraged graph attention networks to predict human-virus PPIs. In comparison to other state-of-the-art approaches, DeepGNHV exhibited superior predictive performance, especially when applied to viral proteins absent from the training process, indicating its strong generalization capability for detecting newly emerging virus-related PPIs. We further demonstrated DeepGNHV's robustness across diverse perturbations and its practical application under high-confidence thresholds. Additionally, we conducted extensive predictions of human-HPV PPIs, which were supported by multiple lines of evidence and identified several host factors that specifically interact with high-risk HPV. To further explore the biological significance of DeepGNHV, we provided a case study to pinpoint specific residues that play critical roles in facilitating the corresponding PPIs. 
The source code of DeepGNHV and related data is publicly available on GitHub (https://github.com/bioboy0415/DeepGNHV).</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"26 5","pages":""},"PeriodicalIF":7.7,"publicationDate":"2025-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12415850/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145013805","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting nucleic acid binding sites by attention map-guided graph convolutional network with protein language embeddings and physicochemical information.","authors":"Xiang Li, Wei Peng, Xiaolei Zhu","doi":"10.1093/bib/bbaf457","DOIUrl":"10.1093/bib/bbaf457","url":null,"abstract":"<p><p>Protein-nucleic acid binding sites play a crucial role in biological processes such as gene expression, signal transduction, replication, and transcription. In recent years, with the development of artificial intelligence, protein language models, graph neural networks, and transformer architectures have been adopted to develop both structure-based and sequence-based predictive models. Structure-based methods benefit from the spatial relationship between residues and have shown promising performance. However, structure-based information requires 3D protein structures, which is a challenge for large-scale protein sequence spaces. To address this limitation, researchers have attempted to use predicted protein structure information to guide binding site prediction. While this strategy has improved accuracy, it still depends on the quality of structure predictions. Thus, some studies have returned to prediction methods based solely on protein sequences, particularly those using protein language models, which have greatly enhanced the prediction accuracy. This paper proposes a novel protein-nucleic acid binding site prediction framework, ATtention Maps and Graph convolutional neural networks to predict nucleic acid-protein Binding sites (ATMGBs), which first fuses protein language embeddings with physicochemical properties to obtain multiview information, then leverages the attention map of a protein language model to simulate the relationship between residues, and then utilizes graph convolutional networks for enhancing the feature representations for final prediction. ATMGBs was evaluated on several different independent test sets. 
The results indicate that the proposed approach significantly improves sequence-based prediction performance, even achieving prediction accuracy comparable to structure-based frameworks. The dataset and code used in this study are available at https://github.com/lixiangli01/ATMGBs.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"26 5","pages":""},"PeriodicalIF":7.7,"publicationDate":"2025-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12415854/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145013839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}