Pham Nhat Duy , Nguyen Phuong Thao , Thanh Le , Le Van Trinh
{"title":"Leveraging mutual information in Variational Autoencoders for improved dimensionality reduction of single-cell RNA sequencing data: The scInfoMaxVAE approach","authors":"Pham Nhat Duy , Nguyen Phuong Thao , Thanh Le , Le Van Trinh","doi":"10.1016/j.compbiolchem.2025.108637","DOIUrl":"10.1016/j.compbiolchem.2025.108637","url":null,"abstract":"<div><div>Single-cell RNA-seq (scRNA-seq) analysis demands representations that are robust to sparsity and technical noise. We present scInfoMaxVAE, a mutual-information–maximizing variational autoencoder with a zero-inflated count likelihood tailored for scRNA-seq, designed for dimensionality reduction and cell-type classification. We evaluated the model on 12 public scRNA-seq datasets spanning multiple tissues and platforms using a unified pipeline with cell- and gene-level quality control (minimum detected genes), library-size normalization, log-transform, and reference-based cell-type annotation. Against established methods (VASC, DREAM, scVI, scDeepCluster) and conventional embeddings (e.g., t-SNE, UMAP), scInfoMaxVAE delivered competitive clustering and structure preservation across all datasets; for representative cohorts, it achieved normalized mutual information (NMI) of 0.94, matching VASC (0.94) and exceeding t-SNE (0.66), with notable gains in homogeneity (0.89 vs. 0.58 for scVI) and adjusted Rand index (0.81 vs. 0.38 for scVI). Strengths include consistent performance across heterogeneous datasets and improved preservation of neighborhood structure, attributable to information-theoretic training and explicit modeling of zero inflation. Limitations observed in our study include sensitivity to hyperparameters and modest run-to-run variance, suggesting benefits from automated tuning and further large-scale validation. 
Overall, scInfoMaxVAE offers a robust, reproducible alternative for representation learning in scRNA-seq workflows.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"120 ","pages":"Article 108637"},"PeriodicalIF":3.1,"publicationDate":"2025-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144907693","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yue Ying , Nan Wu , Jinhao Huo , Yutao Wang , Wei Jin
{"title":"scUCAF: An uncertainty-aware cross-omics alignment and fusion network for single-cell multi-omics data clustering","authors":"Yue Ying , Nan Wu , Jinhao Huo , Yutao Wang , Wei Jin","doi":"10.1016/j.compbiolchem.2025.108631","DOIUrl":"10.1016/j.compbiolchem.2025.108631","url":null,"abstract":"<div><div>The development of single-cell multi-omics sequencing technologies provides new insights into cell heterogeneity. Cell clustering is a crucial step in the analysis of multi-omics data. However, existing methods often overlook variations in data quality across omics, leading to unreliable feature representations. To address this issue, we propose scUCAF, an uncertainty-aware network for multi-omics clustering. Specifically, to mitigate the impact of noise on cell feature extraction, we introduce a variational autoencoder with a negative binomial distribution. After extracting each omics feature, we propose a high-confidence cluster-guided contrastive learning method to ensure cross-omics feature consistency. Finally, an uncertainty-aware fusion and gating network dynamically integrates the omics features to mitigate biases from low-quality data and produce reliable cell representations for clustering. Clustering results on eight real single-cell multi-omics datasets demonstrate that scUCAF outperforms existing multi-omics clustering methods. 
We also conduct downstream analyses to validate the effectiveness of scUCAF for cell type annotation and biomarker identification in liver cancer.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"120 ","pages":"Article 108631"},"PeriodicalIF":3.1,"publicationDate":"2025-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144913986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gulam Rabbani , Mohammad Ehtisham Khan , Mohammad Aslam , Waleed Zakri , Mohammad Fareed , Glowi Alasiri , Wahid Ali , Syed Kashif Ali , Mohd Imran , Abdulrahman Khamaj , Jintae Lee
{"title":"Utilizing a combined approach of machine learning and structure-based drug design principles to identify potential hits targeting SphK1","authors":"Gulam Rabbani , Mohammad Ehtisham Khan , Mohammad Aslam , Waleed Zakri , Mohammad Fareed , Glowi Alasiri , Wahid Ali , Syed Kashif Ali , Mohd Imran , Abdulrahman Khamaj , Jintae Lee","doi":"10.1016/j.compbiolchem.2025.108648","DOIUrl":"10.1016/j.compbiolchem.2025.108648","url":null,"abstract":"<div><div>Sphingosine kinase (SphK1) is a crucial enzyme that aids in the processing of sphingolipids by adding a phosphate group to sphingosine, converting it into sphingosine-1-phosphate. A recent study has suggested that dysregulation of SphK1 is linked to tumor progression and metastasis in lung and bladder cancers, making SphK1 a promising therapeutic target for these diseases. In this study, we employed machine learning-based virtual screening along with structure-based drug design to identify potential SphK1 inhibitors with diverse chemical scaffolds. A total of 16 machine learning models were generated using molecular fingerprints, and the most effective models were employed to conduct virtual screening of the Maybridge library. The screened compounds were then subjected to molecular docking to determine a suitable docked pose against the SphK1 protein. Upon visualization of the best docked compounds, we found that six compounds exhibited strong interactions with the SphK1 protein compared to the control (SQS). To further support our findings, we conducted 100 ns long molecular dynamics (MD) simulations of all six compounds to analyze conformational changes and stability. Two compounds (SCR00139 and SCR00133) demonstrated promising stability and fit well within the binding pocket of the SphK1 protein. Furthermore, MM-PBSA and MM-GBSA studies were carried out on these two compounds, providing favorable relative binding estimations. 
This study introduces an integrated pipeline of machine learning-based virtual screening for the identification of new scaffolds targeting cancer progression. However, <em>in vitro</em> evaluations are necessary to assess the efficacy of these compounds.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"120 ","pages":"Article 108648"},"PeriodicalIF":3.1,"publicationDate":"2025-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144904133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring vulnerable building blocks in protein-protein interaction networks of breast tumor and adjacent normal tissues","authors":"Swapnil Kumar, Avantika Agrawal, Vaibhav Vindal","doi":"10.1016/j.compbiolchem.2025.108647","DOIUrl":"10.1016/j.compbiolchem.2025.108647","url":null,"abstract":"<div><div>Tumor-adjacent normal tissues (TANTs) histologically and morphologically look normal and are commonly used as a control in patient-based cancer studies. Previous studies have revealed that TANTs present a unique transitional state between healthy normal and tumor tissues. However, little or no knowledge exists about the landscape of protein-protein interactions (PPIs) in TANTs and how they differ from the tumor tissues. Herein, we integrated the PPI data mapped onto the differentially expressed genes in TANTs and tumor tissues compared to healthy normal tissues. This led to the reconstruction of six tissue-specific PPI networks, including TANTs and breast tumor tissues (viz., Luminal A, Luminal B, Her2, Basal, and Normal-Like). First, these PPI networks were analyzed using network influence and vulnerability analyses from the NetVA R package. Consequently, it revealed 134 vulnerable proteins (VPs), 21 vulnerable protein pairs (VPPs), and 94 influential proteins (IPs) that were present across all six tissue networks. Further, we identified a set of 34 proteins as common hubs and another set of seven proteins as common bottlenecks across all six tissue networks. Next, all VPs, IPs, hubs, and bottlenecks were investigated for their associations with various diseases, including cancers, and found sharing a significant number of well-known cancer-associated proteins, viz., AR, BRCA1, ERBB2, FN1, FOXA1, JUN, MKI67, and NRAS. 
Thus, by applying network vulnerability, influence, and gene-disease association-based analyses, we suggest lists of known and candidate proteins along with their associated protein complexes potentially involved in breast cancer tumorigenesis and present across TANTs and different breast cancer subtypes.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"120 ","pages":"Article 108647"},"PeriodicalIF":3.1,"publicationDate":"2025-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144888996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Awaz M. Abbas , Maiwan Bahjat Abdulrazaq , Adel AL-Zebari
{"title":"Efficient lightweight CNN for automated classification of B-cell acute lymphoblastic leukemia","authors":"Awaz M. Abbas , Maiwan Bahjat Abdulrazaq , Adel AL-Zebari","doi":"10.1016/j.compbiolchem.2025.108645","DOIUrl":"10.1016/j.compbiolchem.2025.108645","url":null,"abstract":"<div><div>B-cell acute lymphoblastic leukemia (B-ALL) is an aggressive hematological malignancy that primarily affects children but can also occur in adults, progressing rapidly and requiring urgent clinical intervention. Late-stage diagnosis often results in reduced survival rates and typically depends on costly, time-intensive diagnostic procedures. Peripheral blood smear (PBS) imaging plays a central role in the preliminary screening of B-ALL and provides an accessible foundation for computer-assisted diagnosis. To support early and efficient classification, this study proposes a lightweight convolutional neural network (CNN) designed to classify B-ALL subtypes directly from PBS images without the need for pre-segmentation. The model is computationally efficient, comprising only 986,126 trainable parameters, and integrates Squeeze-and-Excitation (SE) modules within Inverted Residual Blocks to strengthen feature representation. Experimental results demonstrated excellent classification performance, achieving 100 % accuracy, precision, sensitivity, specificity, F1-score, and Matthews correlation coefficient (MCC). To further assess generalizability, cross-dataset validation was performed on the independent Blood Cells Cancer (ALL) dataset without retraining or fine-tuning, yielding a robust accuracy of 99.85 %. Model interpretability was performed using Gradient-weighted Class Activation Mapping (Grad-CAM) and Local Interpretable Model-agnostic Explanations (LIME), which provided visual explanations and highlighted key discriminative cellular features, respectively. 
Taken together, these results demonstrate that the proposed framework delivers a highly accurate, resource-efficient, and interpretable solution for B-ALL classification, underscoring its strong potential for integration into real-world clinical practice. Additionally, the implementation code for this study is publicly available at: <span><span>https://github.com/awazabbas/Efficient-Lightweight-CNN-for-Automated-Classification-of-B-cell-Acute-Lymphoblastic-Leukemia</span></span>-.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"120 ","pages":"Article 108645"},"PeriodicalIF":3.1,"publicationDate":"2025-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144886393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unveiling therapeutic potential: In Silico discovery of prognostic markers and potential inhibitors for TGFßR1 in pancreatic cancer","authors":"Samvedna Singh , Himanshi Gupta , Subhav Sinha , Aman Chandra Kaushik , Shraddha Kapoor , Amit Kumar Awasthi , Imteyaz Qamar , Shakti Sahi","doi":"10.1016/j.compbiolchem.2025.108646","DOIUrl":"10.1016/j.compbiolchem.2025.108646","url":null,"abstract":"<div><div>Pancreatic cancer remains one of the most lethal malignancies, characterised by low survival rates, resistance to conventional chemotherapy and a lack of early detection markers. Differentially expressed genes AHNAK2, TSC2, LAMC2, C3orf52 and IGFBP3 were identified as significant prognostic markers based on their expression pattern and poor patient survival. Mutational analysis of the TCGA-PAAD data showed a 20.93 % mutation frequency in SMAD4, which is a key regulator of TGF-ß signaling. Consequently, TGFßR1 was selected as a potential therapeutic target. A structure-based virtual screening approach was employed on a small molecule library of 101,324 compounds. Based on pharmacokinetic properties, binding affinity, non-bonded interactions, and stereochemical considerations, Compound 6, Compound 7, and Compound 8 were shortlisted. To further understand the dynamic behaviour and binding mechanism of these shortlisted compounds with TGFßR1, molecular dynamics simulations were performed. Analysis revealed critical residues ASP351, LYS232, LYS337, and LYS213, essential for receptor stability. Additionally, umbrella sampling revealed the unbinding mechanism. These hits exhibited lower free energies (ΔG) as compared to the benchmark inhibitors, Galunisertib and Vactosertib. 
The results offer valuable insights into the binding mechanism of protein TGFßR1 and its role in the disease, suggesting that targeting the TGF-ß signaling pathway may represent a promising therapeutic strategy.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"120 ","pages":"Article 108646"},"PeriodicalIF":3.1,"publicationDate":"2025-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144892847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lightweight self-attention and deep gated neural network (LSA-DGNet) for multiple neurological disease detection","authors":"Shraddha Jain , Rajeev Srivastava , Sukomal Pal","doi":"10.1016/j.compbiolchem.2025.108621","DOIUrl":"10.1016/j.compbiolchem.2025.108621","url":null,"abstract":"<div><div>Detecting neurological diseases is an important task in modern medicine, for which it is crucial to accurately model the temporal distributions of disease genesis. In prior methodologies, temporal patterns are used in feature effects and limiting assumptions such as proportionate risks. We introduce a new methodology for neural disease diagnosis, known as LSA-DGNet (Lightweight Self-Attention based on Deep Gated Network). LSA-DGNet utilizes a deep gated neural network module to model nonlinear and time-lagged effects of variables on disease outcomes. We combined multi-scale time-aware self-attention modules with scaled dot-product self-attention modules so that the parallel structures could provide an integrated self-attention mechanism to improve data perception. LSA-DGNet addresses both issues and, hence, sets a new benchmark for real-time, accurate detection of neurological diseases. Unlike existing approaches, LSA-DGNet integrates a lightweight multi-scale time-aware self-attention mechanism with deep gated neural networks, enabling improved modeling of temporal dependencies in noisy EEG data. This design allows for accurate and efficient detection of multiple neurological diseases, validated on five real-world datasets, setting new benchmarks in classification performance. Running at up to 250 frames per second, it indicates substantial progress in computational efficiency, with potential for clinical applications. 
The entire framework opens up new opportunities for early diagnosis and more tailored treatment strategies and simply revolutionizes how neurological diseases are detected and treated.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"120 ","pages":"Article 108621"},"PeriodicalIF":3.1,"publicationDate":"2025-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144895874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zixun Wang , Yimeng Sun , Xiaoling Zhang , Luqiang Wang , Desheng Song , Jingtao Yu , Xiaoxue Hu , Weiping Lin , Ruihua Wei
{"title":"Exploring potential therapeutic targets for myopia: Causal analysis and biological annotation with gut microbiota","authors":"Zixun Wang , Yimeng Sun , Xiaoling Zhang , Luqiang Wang , Desheng Song , Jingtao Yu , Xiaoxue Hu , Weiping Lin , Ruihua Wei","doi":"10.1016/j.compbiolchem.2025.108634","DOIUrl":"10.1016/j.compbiolchem.2025.108634","url":null,"abstract":"<div><h3>Purpose</h3><div>This study investigates the causal relationship between gut microbiota (GM) composition and myopia development through genetic instruments, aiming to identify specific microbial taxa with therapeutic potential and elucidate their underlying biological pathways.</div></div><div><h3>Methods</h3><div>We performed bidirectional two-sample Mendelian randomization (MR) using summary statistics from GWAS of 473 GM taxa (n = 5959) and myopia (26,184 cases). Inverse variance weighted (IVW) and four complementary methods assessed causality (F-statistics>10), with sensitivity analyses to validate robustness. Biological annotation integrates protein-protein interaction networks and pathway enrichment to decode mechanisms.</div></div><div><h3>Results</h3><div>Our inverse-variance weighted Mendelian randomization analysis identified 15 microbial features exhibiting causal associations with myopia (FDR < 0.05). Protective taxa included Family <em>Dysgonomonadaceae</em> (OR = 0.947, 95 % CI: 0.910–0.986) and species <em>Megamonas funiformis</em> (OR = 0.979, 0.964–0.995), while risk-associated taxa comprised Class <em>Omnitrophota</em> (OR = 1.144, 1.022–1.280) and species <em>Bacillus velezensis</em> (OR = 1.072, 1.017–1.129). Sensitivity analyses demonstrated robustness through nonsignificant heterogeneity (Q > 0.05), absence of horizontal pleiotropy (Egger intercept P > 0.1), and no influential outliers (MR-PRESSO P > 0.3). Host genetic variants were significantly enriched in PI3K-Akt (P = 9.4 ×10⁻⁵) and Ras signaling pathways (P = 3.7 ×10⁻³). 
Three hub genes (<em>PIK3R1</em>, <em>KITLG</em>, and IL2RB) may mediate scleral pathogenesis through TGF-β/Smad-regulated extracellular matrix degradation and dopaminergic deficiency via downregulation of tyrosine hydroxylase. Microbial metabolic interaction analyses revealed that <em>Megamonas</em>-derived short-chain fatty acids suppressed PI3K-Akt/HDAC signaling (β = −0.27 ± 0.08, P = 0.002). In contrast, the risk-associated taxon <em>Prevotella massilia</em> elevated oxidative stress markers via indole-3-acetate/AhR activation (β = 0.34 ± 0.12, P = 0.009).</div></div><div><h3>Conclusion</h3><div>This first MR-biological annotation study revealed a degree of congruence between microbiota-associated host genes and the PI3K-Akt/Ras-driven scleral-immune dysregulation in ocular signaling pathways. The findings of Megamonas-derived SCFAs as therapeutic targets provide a viable approach for addressing myopia through microbiome intervention.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"120 ","pages":"Article 108634"},"PeriodicalIF":3.1,"publicationDate":"2025-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144880038","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unlocking the anti-aging potential: In silico analysis of astaxanthin, curcumin, quercetin, and resveratrol in modulating skin aging pathways","authors":"Debora Gonçalves Barbosa , Karen Ruth Michio Barbosa , Yasmin Moreto Guaitolini , Matheus Correia Casotti , Rahna Gonçalves Coutinho da Cruz , Lorena Souza Castro Altoé , Isabele Pagani Pavan , Elizeu Fagundes de Carvalho , Iúri Drumond Louro , Débora Dummer Meira","doi":"10.1016/j.compbiolchem.2025.108633","DOIUrl":"10.1016/j.compbiolchem.2025.108633","url":null,"abstract":"<div><div>Multiple studies have linked aging to a result of the inflammatory response. Thus, there is a recognized need for cosmeceuticals that modulate inflammation pathways to prevent and treat aging. In this sense, four bioactive compounds were selected for their documented anti-inflammatory/antioxidant properties. Therefore, this study aimed to evaluate whether the bioactive compounds astaxanthin, curcumin, quercetin, and resveratrol are effective in treating the effects of skin aging, using <em>in silico</em> analyses. Protein-protein interaction networks (PPINs) related to skin aging and the bioactive compounds astaxanthin, curcumin, quercetin, and resveratrol were generated using the <em>Cytoscape plug-in</em> to analyze the functional enrichment of recovered proteins. From these main networks, clusters and bottleneck networks were generated. Initially, 5 main PPINs were generated. From the clusters recovered from the main networks, 3 were selected from the general network and 11 from the specific networks. Through functional enrichment of the clusters, the biological process of response to oxidative stress was identified. Blood and blood-forming tissue, vascular, and immune system abnormality phenotypes were also observed, along with an increase in inflammatory response. Additionally, Reactome pathways related to interleukin signaling and detoxification of reactive oxygen species were noted. 
Finally, the key genes for each network were identified from the bottleneck networks: <em>IL-6</em> (general and astaxanthin), <em>TAB1</em> (curcumin), <em>TNF-α</em> (quercetin), and <em>TP53</em> (resveratrol). Based on this research, the analyzed bioactive compounds suggest potential efficacy to be included in cosmetic products, as they are capable of reducing excessive oxidative stress and inflammatory processes, consequently preventing cellular aging.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"120 ","pages":"Article 108633"},"PeriodicalIF":3.1,"publicationDate":"2025-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144888995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yang Lv , Ting Liu , Chang Liu , Yuchen Ma , Yunfei Liu , Ze Liu , Yin Li
{"title":"LYnet: Computational identification of tumor T cell antigens using convolutional and recurrent neural networks","authors":"Yang Lv , Ting Liu , Chang Liu , Yuchen Ma , Yunfei Liu , Ze Liu , Yin Li","doi":"10.1016/j.compbiolchem.2025.108630","DOIUrl":"10.1016/j.compbiolchem.2025.108630","url":null,"abstract":"<div><h3>Background</h3><div>Immunotherapy represents a paradigm shift in oncology, offering advantages in efficacy and specificity over traditional therapies. Key to its success is the identification of T-cell antigens, which are essential for triggering an effective antitumor immune response. Current methodologies for antigen prediction, however, lack the precision required for optimal vaccine development.</div></div><div><h3>Purpose</h3><div>This study aims to address this gap by introducing a novel deep learning model for the accurate prediction of tumor T-cell antigens. It seeks to improve the identification process, thereby facilitating the creation of more effective therapeutic cancer vaccines.</div></div><div><h3>Methods</h3><div>A hybrid architecture, designated LYnet, was constructed by integrating one-dimensional Convolutional Neural Networks with bidirectional Long Short-Term Memory layers, thereby capturing both local motif patterns and long-range sequence dependencies. Nineteen complementary sequence-derived descriptors—including AAindex, AAK-mer, CKSAAP/CKSAAGP, and physicochemical composition vectors—were concatenated to form the input feature space. Class imbalance in the training set was mitigated with the SMOTE-Tomek resampling strategy. Model robustness was quantified by stratified 10-fold cross-validation, and generalisation was verified on two independent benchmark collections (TAP 1.0 and iTTCA-RF).</div></div><div><h3>Results</h3><div>Across 10-fold validation on the LYnet benchmark, the proposed model achieved an AUC of 0.992, together with a sensitivity of 0.863, specificity of 0.925 and MCC of 0.776. 
Independent evaluation confirmed the advantage: LYnet yielded AUCs of 0.949 on the TAP 1.0 set and 0.836 on the iTTCA-RF set, surpassing the strongest competing method by 2.4–6.9 percentage points in AUC and up to 10.6 percentage points in MCC. These results demonstrate that LYnet attains state-of-the-art accuracy and balanced prediction for tumour T-cell antigen identification.</div></div><div><h3>Conclusions</h3><div>The introduction of this deep learning model represents a significant advancement in the prediction of tumor T-cell antigens. Its superior accuracy and robustness offer substantial potential to enhance the development and efficacy of cancer immunotherapies. This work not only underscores the importance of precise antigen identification in immunotherapy but also provides a powerful computational tool to aid in the design of next-generation cancer vaccines.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"120 ","pages":"Article 108630"},"PeriodicalIF":3.1,"publicationDate":"2025-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144904132","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}