Briefings in Bioinformatics: latest articles, page 9

Accurate structure prediction of cyclic peptides containing unnatural amino acids using HighFold3.

IF 7.7, Zone 2 (Biology)

Briefings in Bioinformatics. Pub Date: 2025-08-31. DOI: 10.1093/bib/bbaf488

Sen Cao, Cheng Zhu, Qingyi Mao, Jingjing Guo, Ning Zhu, Hongliang Duan

Abstract: Cyclic peptides have emerged as a research hotspot in drug development in recent years due to their excellent stability, specificity, and cell penetration. However, existing computational models face challenges in accurately predicting the three-dimensional structures of cyclic peptides containing unnatural amino acids (unAAs), thereby limiting their use in drug design. The release of AlphaFold 3 has significantly enhanced the modeling capability of biomolecular complexes and enabled the inclusion of unAAs through definitions provided by the Chemical Component Dictionary (CCD). Nevertheless, its reliance on training data limits its ability to accurately predict cyclic peptide structures, so it fails to meet the demand for precise cyclic peptide structure prediction. Based on the AlphaFold 3 framework, we developed HighFold3 by introducing the Cyclic Position Offset Encoding Matrix (CycPOEM). HighFold3 comprises two submodels: HighFold3-Linear and HighFold3-Cyclic, designed for predicting the structures of linear and cyclic peptides, respectively. Our results demonstrate that HighFold3 outperforms existing models (HighFold, HighFold2, CyclicBoltz1, NCPepFold, CABS-flex, ESMFold, and HelixFold) in cyclic peptide structure prediction. It achieves atomic-level precision in predicting cyclic peptide monomers while demonstrating enhanced accuracy and generalization capability for cyclic peptide complexes containing unAAs. This offers unprecedented technical support for the structural design and optimization of cyclic peptide-based therapeutics.

Volume 26, issue 5. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12450345/pdf/
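The abstract names a Cyclic Position Offset Encoding Matrix (CycPOEM) but does not define it. One common way to tell a structure predictor that a peptide is head-to-tail cyclized is to replace the linear residue offset j - i with the shortest signed offset around the ring. The sketch below illustrates that general idea only; the function name and the tie-breaking rule are assumptions, not the paper's definition.

```python
def cyclic_offset_matrix(n: int) -> list[list[int]]:
    """Signed shortest ring offset from residue i to residue j in an n-residue cycle.

    Hypothetical illustration: not taken from the HighFold3 paper.
    """
    matrix = []
    for i in range(n):
        row = []
        for j in range(n):
            d = (j - i) % n      # forward distance around the ring, 0..n-1
            if d > n // 2:       # walking backwards is shorter
                d -= n
            row.append(d)
        matrix.append(row)
    return matrix


# In a 5-residue ring the last residue is one step behind the first,
# whereas a linear encoding would place it four steps ahead.
offsets = cyclic_offset_matrix(5)
print(offsets[0])
```

With a linear encoding the first and last residues look maximally distant; with the ring offset they are neighbours, which is the property a cyclic position encoding needs to convey.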

Citations: 0

MVRBind: multi-view learning for RNA-small molecule binding site prediction.

IF 7.7, Zone 2 (Biology)

Briefings in Bioinformatics. Pub Date: 2025-08-31. DOI: 10.1093/bib/bbaf489

Song Chen, Zhijian Huang, Yucheng Wang, Yahan Li, Yaw Sing Tan, Lei Deng, Min Wu

Abstract: RNA plays a critical role in cellular processes, and its dysregulation is linked to many diseases, positioning RNA-targeted drugs as an important area of research. Accurate prediction of RNA-small molecule binding sites is crucial for advancing RNA-targeted therapies. Although deep learning has shown promise in this area, challenges remain in integrating and processing multi-dimensional data, such as RNA sequences and structural features, particularly given the inherent flexibility of RNA structures. In this study, we present MVRBind, a multi-view graph convolutional network designed to predict RNA-small molecule binding sites. MVRBind generates feature representations of RNA nucleotides across different structural levels. To effectively integrate these features, we developed a multi-view feature fusion module that constructs graphs based on RNA's primary, secondary, and tertiary structural views, enabling the model to capture diverse aspects of RNA structure. In addition, we fuse embeddings from multiple scales to obtain a comprehensive representation of RNA nucleotides, which is then used to predict RNA-small molecule binding sites. Extensive experiments demonstrate that MVRBind consistently outperforms baseline methods in various experimental settings. MVRBind shows exceptional performance in predicting binding sites for both the holo and apo forms of RNA, even when RNA adopts multiple conformations. These results suggest that MVRBind offers a robust model for structure-based RNA analysis, contributing toward accurate prediction and analysis of RNA-small molecule binding sites. All datasets and source code are available at https://github.com/cschen-y/MVRBind.

Volume 26, issue 5. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12451103/pdf/
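The abstract says graphs are built from primary, secondary, and tertiary structural views but gives no construction details. A minimal sketch of what three such edge sets could look like follows, assuming sequence adjacency for the primary view, dot-bracket base pairs for the secondary view, and a distance cutoff on one representative atom per nucleotide for the tertiary view. The function names, the input formats, and the cutoff are hypothetical and are not taken from the MVRBind repository.

```python
from math import dist


def primary_edges(n: int) -> set[tuple[int, int]]:
    """Backbone view: link each nucleotide to its successor in the sequence."""
    return {(i, i + 1) for i in range(n - 1)}


def secondary_edges(dot_bracket: str) -> set[tuple[int, int]]:
    """Base-pair view: link positions matched by '(' and ')' in dot-bracket notation."""
    stack: list[int] = []
    edges: set[tuple[int, int]] = set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            edges.add((stack.pop(), i))
    return edges


def tertiary_edges(coords: list[tuple[float, float, float]],
                   cutoff: float) -> set[tuple[int, int]]:
    """Spatial view: link nucleotides whose representative atoms lie within cutoff."""
    n = len(coords)
    return {(i, j)
            for i in range(n) for j in range(i + 1, n)
            if dist(coords[i], coords[j]) <= cutoff}


# Toy hairpin: six nucleotides, two base pairs, made-up coordinates.
views = {
    "primary": primary_edges(6),
    "secondary": secondary_edges("((..))"),
    "tertiary": tertiary_edges([(0, 0, 0), (3, 0, 0), (10, 0, 0)], cutoff=5.0),
}
print(views)
```

Each edge set could then feed its own graph convolution before the per-view embeddings are fused, which is the role the abstract assigns to the multi-view fusion module.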

Citations: 0

A multi-omics integration framework using multi-label guided learning and multi-scale fusion.

IF 7.7, Zone 2 (Biology)

Briefings in Bioinformatics. Pub Date: 2025-08-31. DOI: 10.1093/bib/bbaf493

Yuze Li, Yinghe Wang, Tao Liang, Ying Li, Wei Du

Abstract: The rapid development of high-throughput sequencing technologies has generated vast amounts of omics data, making multi-omics integration a crucial approach for understanding complex diseases. Despite the introduction of various multi-omics integration methods in recent years, existing approaches still have limitations, primarily in their reliance on manual feature selection, restricted applicability, and inability to comprehensively capture both inter-sample and cross-omics interactions. To address these challenges, we propose mmMOI, an end-to-end multi-omics integration framework that incorporates multi-label guided learning and multi-scale attention fusion. mmMOI directly processes raw high-dimensional omics data without requiring manual feature selection, thereby enhancing model interpretability and eliminating biases introduced by feature preselection. First, we introduce a multi-label guided multi-view graph neural network, which enables the model to adaptively learn omics data representations across different datasets, thereby improving generalizability and stability. Second, we design a multi-scale attention fusion network, which integrates global attention and local attention. This dual-attention mechanism allows mmMOI to more accurately integrate multi-omics data, enhance cross-omics feature representations, and improve classification performance. Experimental results demonstrate that mmMOI significantly outperforms state-of-the-art methods in classification tasks, exhibiting high stability and adaptability across diverse biological contexts and sequencing technologies. Additionally, mmMOI successfully identifies key disease-associated biomarkers, further enhancing its biological interpretability and practical relevance. The source code, datasets, and detailed hyperparameter configurations for mmMOI are available at https://github.com/mlcb-jlu/mmMOI.

Volume 26, issue 5. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12461718/pdf/

Citations: 0

Streamline automated biomedical discoveries with agentic bioinformatics.

IF 7.7, Zone 2 (Biology)

Briefings in Bioinformatics. Pub Date: 2025-08-31. DOI: 10.1093/bib/bbaf505

Juexiao Zhou, Jindong Jiang, Zhongyi Han, Zijian Wang, Xin Gao

Abstract: The emergence of artificial intelligence agents powered by large language models marks a transformative shift in computational biology. In this new paradigm, autonomous, adaptive, and intelligent agents are deployed to tackle complex biological challenges, leading to a new research field named agentic bioinformatics. Here, we explore the core principles, evolving methodologies, and diverse applications of agentic bioinformatics. We examine how agentic bioinformatics systems work synergistically to facilitate data-driven decision-making and enable self-directed exploration of biological datasets. Furthermore, we highlight the integration of agentic frameworks in key areas such as personalized medicine, drug discovery, and synthetic biology, illustrating their potential to revolutionize healthcare and biotechnology. In addition, we address the ethical, technical, and scalability challenges associated with agentic bioinformatics, identifying key opportunities for future advancements. By emphasizing the importance of interdisciplinary collaboration and innovation, we envision agentic bioinformatics as a major force in overcoming the grand challenges of modern biology, ultimately advancing both research and clinical applications.

Volume 26, issue 5. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12476841/pdf/

Citations: 0

Advancing ADMET prediction through multiscale fragment-aware pretraining with MSformer-ADMET.

IF 7.7, Zone 2 (Biology)

Briefings in Bioinformatics. Pub Date: 2025-08-31. DOI: 10.1093/bib/bbaf506

Huihui Liu, Bingjie Zhu, Shuyang Nie, Haoran Li, Yugang Lin, Tianyi Ma, Xin Shao, Qian Chen, Minjie Shen, Yanrong Zheng, Xiaohui Fan, Jie Liao

Abstract: Absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties are critical determinants of the pharmacokinetic and safety profiles of drug candidates. Accurate and early-stage prediction of ADMET characteristics is essential for reducing late-stage attrition rates, lowering development costs, and accelerating the drug discovery process. Recent advances in deep learning have shown great promise in molecular property prediction, especially with the emergence of Transformer-based architectures that can effectively model long-range dependencies in molecular representations. However, most existing methods rely heavily on atom-level encodings (e.g. SMILES or molecular graphs), which often lack structural interpretability and generalization across heterogeneous tasks. Previously, we developed a de novo and flexible molecular representation framework named MSformer (available at https://github.com/ZJUFanLab/MSformer), which demonstrated success in bioactivity prediction. We have now adapted and specialized this architecture for ADMET property prediction. This adapted implementation, designated as MSformer-ADMET, extends the framework's capabilities to pharmacokinetic and toxicity endpoints while maintaining its flexible, fragmentation-based approach to molecular representation learning. MSformer-ADMET is fine-tuned on 22 tasks collected from the Therapeutics Data Commons (TDC), covering both classification and regression settings. Results demonstrate that MSformer-ADMET achieves superior performance across a wide range of ADMET endpoints, consistently outperforming conventional SMILES-based and graph-based models. Notably, we further conducted interpretability analyses by leveraging the model's attention distributions and fragment-to-atom mappings, allowing the identification of key structural fragments that are highly associated with molecular properties. This post hoc interpretability provides more transparent insights into the structure-property relationship. Collectively, results demonstrate that MSformer-ADMET is a highly effective and broadly applicable model for ADMET prediction.

Volume 26, issue 5. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12478026/pdf/

Citations: 0

A technical review of multi-omics data integration methods: from classical statistical to deep generative approaches.

IF 7.7, Zone 2 (Biology)

Briefings in Bioinformatics. Pub Date: 2025-07-02. DOI: 10.1093/bib/bbaf355

Ana R Baião, Zhaoxiang Cai, Rebecca C Poulos, Phillip J Robinson, Roger R Reddel, Qing Zhong, Susana Vinga, Emanuel Gonçalves

Abstract: The rapid advancement of high-throughput sequencing and other assay technologies has resulted in the generation of large and complex multi-omics datasets, offering unprecedented opportunities for advancing precision medicine. However, multi-omics data integration remains challenging due to the high-dimensionality, heterogeneity, and frequency of missing values across data types. Computational methods leveraging statistical and machine learning approaches have been developed to address these issues and uncover complex biological patterns, improving our understanding of disease mechanisms. Here, we comprehensively review state-of-the-art multi-omics integration methods with a focus on deep generative models, particularly variational autoencoders (VAEs) that have been widely used for data imputation, augmentation, and batch effect correction. We explore the technical aspects of VAE loss functions and regularisation techniques, including adversarial training, disentanglement, and contrastive learning. Moreover, we highlight recent advancements in foundation models and multimodal data integration, outlining future directions in precision medicine research.

Volume 26, issue 4. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12315550/pdf/
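For orientation, the VAE loss functions the abstract refers to are usually variations on the standard negative evidence lower bound. The textbook form is given below, not a formula quoted from the review; the weight $\beta$ is the common disentanglement-style generalisation, and $\beta = 1$ recovers the plain VAE.

```latex
\mathcal{L}(\theta, \phi; x)
  = -\,\mathbb{E}_{q_\phi(z \mid x)}\bigl[\log p_\theta(x \mid z)\bigr]
  + \beta \, D_{\mathrm{KL}}\bigl(q_\phi(z \mid x) \,\|\, p(z)\bigr)
```

Here $q_\phi(z \mid x)$ is the encoder, $p_\theta(x \mid z)$ the decoder, and $p(z)$ the prior over the latent variable. The first term penalises poor reconstruction of the omics profile $x$; the second keeps the latent code close to the prior.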

Citations: 0

Genetic/epigenetic DNA markers for linking suspects and tissues in complex crime scenes.

IF 7.7, Zone 2 (Biology)

Briefings in Bioinformatics. Pub Date: 2025-07-02. DOI: 10.1093/bib/bbaf395

Roy Mizrachi, Daniel Neiman, Jonathan Rosenski, Netanel Loyfer, Danielle Share, Cindy Adjedj, Benjamin Glaser, Moshe Shpitzen, Yuval Dor, Ruth Shemer, Tommy Kaplan

Abstract: Analysis of DNA found at a crime scene can provide crucial information on the identity of the individual who has left the DNA, as well as the tissue origin of the DNA. However, the current methods used for DNA profiling and for the identification of cellular origin are separate: they do not associate genetic profiles found in the DNA evidence with epigenetic information about the biological origins of that same DNA. In this study, we developed a method based on joint genetic/epigenetic analysis of the same DNA molecule, allowing us to concurrently identify both the donor and the cellular origin at a single DNA molecule resolution. For this, we created a forensic body fluid methylation atlas, containing blood, semen, skin, and urine, and identified 800 tissue-specific methylation markers. Of these, nearly one hundred markers capture genetic information of nearby single nucleotide polymorphisms. We sequenced and tested 53 of these markers, including 12 for blood, 12 for skin, 14 for semen, and 15 for urine, in a joint genetic/epigenetic analysis. This allowed us to associate specific DNA fragments to their origin, by concurrently identifying their body fluid or tissue type and donor. Using our method, it is possible to disentangle complex crime scenes, composed of mixed biological materials and donors, and identify which donor contributed which tissue type. The method will allow forensics laboratories around the world to better understand the origin of DNA mixtures found at complex crime scenes, and help to check the testimonies of parties involved in criminal cases.

Volume 26, issue 4. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12341875/pdf/

Citations: 0

Prioritizing pathway signature using deep learning approach: a novel strategy for traditional Chinese medicine formula generation and optimization.

IF 7.7, Zone 2 (Biology)

Briefings in Bioinformatics. Pub Date: 2025-07-02. DOI: 10.1093/bib/bbaf403

Zheng Wu, Zihan Wang, Xiyue Chang, Xingyu Chen, Qian Ding, Rong Fu, Cheong-Meng Chong, Jianyuan Tang, Chen Huang

Abstract: The advancement of traditional Chinese medicine (TCM) faces challenges due to the absence of a deep understanding of TCM mechanisms from the perspective of modern biomedical practice. As a result, the selection of herbs to treat diseases or symptoms relies mainly on clinicians' experience or ancient TCM books, and at least in part lacks a scientific basis. Herein, we present a novel deep learning-based approach, named Negative-Correlation-based TCM Architecture for Reversal (NeCTAR), to optimize the generation and combination of TCM formulas for guiding empiric therapy, by which we could, to some degree, narrow the gap between TCM and modern biomedical science. Our approach builds on the hypothesis that pathway alterations may serve as a proxy for the corresponding physiological changes induced by a certain disease, and that 'inverse-fitting' those alterations would provide a feasible therapeutic strategy to treat the disease. We leveraged ribonucleic acid sequencing (RNA-seq) data with Gene Set Enrichment Analysis to establish herb-pathway associations, integrating these insights into a multilayer perceptron model that incorporates top-k sparse projection and pathway reconstruction loss to predict the most therapeutically promising herbal components. NeCTAR demonstrated high concordance with experimental data across various disease models, including fatty liver disease, type 2 diabetes mellitus, and premature ovarian failure. Notably, NeCTAR could equally apply to single cell RNA-seq data. Overall, our study puts forward a novel interpretive framework for TCM mechanisms on a modern biomedical foundation, by which we could prioritize herbal components based on existing TCM formulas for treating diseases.

Volume 26, issue 4. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12342745/pdf/
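The two ideas named in the abstract, negative correlation between disease and herb pathway profiles and a top-k sparse selection, can be illustrated with a small sketch. It is not the NeCTAR model, which is a multilayer perceptron trained with a reconstruction loss. The pathway scores, herb names, and both helper functions below are made up for illustration.

```python
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Plain Pearson correlation between two equal-length score vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def top_k_sparse(scores: list[float], k: int) -> list[float]:
    """Keep the k largest scores and set every other entry to zero."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(ranked[:k])
    return [s if i in keep else 0.0 for i, s in enumerate(scores)]


# Hypothetical pathway enrichment scores (one value per pathway).
disease = [2.0, -1.0, 1.5, 0.5]
herbs = {
    "herb_A": [-2.0, 1.0, -1.5, -0.5],  # mirrors the disease profile
    "herb_B": [2.0, -1.0, 1.5, 0.5],    # reproduces the disease profile
    "herb_C": [0.5, 0.5, -1.0, 1.0],
}

# "Inverse-fit": a herb scores highly when its profile anti-correlates
# with the disease profile.
reversal = {name: -pearson(disease, profile) for name, profile in herbs.items()}
best = max(reversal, key=reversal.get)
print(best, top_k_sparse(list(reversal.values()), k=2))
```

The sparse step reflects the practical constraint that a formula contains only a handful of herbs, so only the strongest candidates carry non-zero weight.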

Citations: 0

IMOP-Cancer: identifying mutation order pairs impacting cancer phenotypes.

IF 7.7, Zone 2 (Biology)

Briefings in Bioinformatics. Pub Date: 2025-07-02. DOI: 10.1093/bib/bbaf362

Yijing Zhang, Shaobo Kang, Renjie Dou, Wanmei Zhang, Yuanyuan Liu, Yang Wu, Dongxue Li, Fangfang Fan, Yanyan Ping

Abstract: Cancer development and progression are driven by the accumulation of somatic genetic alterations, which occur in a specific temporal order. However, how the order of mutations impacts cancer phenotypes of solid tumors remains poorly understood. To address this, we developed a novel computational framework, IMOP-Cancer (Identifying Mutation Order Pairs in Cancer), to identify mutation gene pairs whose order influences cancer phenotypes. We applied IMOP-Cancer to The Cancer Genome Atlas-Lung Adenocarcinoma (TCGA-LUAD) cohort and identified 446 key mutation order pairs, with 34 pairs significantly associated with prognosis. Mutation order impacts cancer phenotypes, as demonstrated by CSMD3 and PTPRD (tumor proliferation) and TP53 and NAV3 (immune modulation), with effects validated in four independent datasets. We further presented the impact of mutation pairs on cancer phenotypes through case studies in the TCGA cohorts of bladder urothelial carcinoma (BLCA) and colon adenocarcinoma (COAD). We extended this analysis to 33 cancer cohorts from TCGA portal, identifying 106 034 critical mutation pairs across 17 cancers, with 3036 pairs co-occurring in multiple cancers. Shared mutation pairs across cancers also showed distinct effects on cancer phenotype. Our study highlights the importance of mutation order in cancer progression and diversity, offering new insights into the temporal dynamics of co-occurring mutations and paving the way for personalized treatment strategies and improved diagnosis.

Volume 26, issue 4. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12284763/pdf/

Citations: 0

Ensemble learning methods and heterogeneous graph network fusion: building drug-gene-disease triple association prediction models.

IF 7.7, Zone 2 (Biology)

Briefings in Bioinformatics. Pub Date: 2025-07-02. DOI: 10.1093/bib/bbaf369

Keichin N G

Abstract: The potential association data between drugs, genes, and diseases is sparse and complex. Existing models find it difficult to effectively handle the problem of heterogeneous relationships and multi-source data fusion simultaneously, resulting in limited accuracy and generalization of association prediction. To address this problem, we propose a fusion method of relational graph convolutional network (R-GCN) and eXtreme Gradient Boosting (XGBoost). First, a heterogeneous graph containing drug, gene, and disease nodes and their relationships is constructed. The features of different types of nodes are aggregated and represented by R-GCN to generate high-quality node embeddings. Then, the embedded features of the drug-gene-disease triples are input into the XGBoost model for training to achieve the association prediction task. The findings demonstrate that the model's area under the curve reaches 0.92, and the F1 score reaches 0.85, indicating strong predictive ability. This method solves the problem of association prediction in complex biological networks and brings new technological support for precision medicine.

Volume 26, issue 4. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12286780/pdf/
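The hand-off between the two stages described in the abstract can be pictured as follows: each drug-gene-disease triple becomes one feature row built from the three node embeddings, and those rows are what a gradient-boosted classifier would be trained on. The abstract does not say how the embeddings are combined, so the concatenation below is an assumption, and the embedding values and node names are invented. The R-GCN and XGBoost stages themselves are left out.

```python
# Hypothetical node embeddings, standing in for the output of an R-GCN encoder.
embeddings: dict[str, list[float]] = {
    "drug:D1": [0.1, 0.9],
    "gene:G1": [0.4, 0.2],
    "disease:S1": [0.7, 0.3],
    "disease:S2": [0.0, 0.5],
}


def triple_features(drug: str, gene: str, disease: str) -> list[float]:
    """Concatenate the three node embeddings into one feature row (assumed scheme)."""
    return embeddings[drug] + embeddings[gene] + embeddings[disease]


# Candidate triples with made-up labels: 1 = known association, 0 = negative sample.
triples = [
    ("drug:D1", "gene:G1", "disease:S1", 1),
    ("drug:D1", "gene:G1", "disease:S2", 0),
]

X = [triple_features(d, g, s) for d, g, s, _ in triples]
y = [label for *_, label in triples]
print(X, y)
```

The resulting `X` and `y` have the row-per-sample shape that tabular classifiers expect, which is why a tree ensemble can be attached after a graph encoder without further restructuring.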

Citations: 0