A deep learning model for predicting systemic lupus erythematosus-associated epitopes
Authors: Jiale He, Zixia Liu, Xiaopo Tang
Journal: BMC Medical Informatics and Decision Making, 25(1):230
Published: 2025-07-01
DOI: https://doi.org/10.1186/s12911-025-03056-x
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12220259/pdf/
Abstract
Background: The accurate prediction of epitopes associated with Systemic Lupus Erythematosus (SLE) plays a vital role in advancing our understanding of autoimmune pathogenesis and in designing effective immunotherapeutics. Traditional bioinformatics methods often struggle to capture the intricate sequence patterns and high-dimensional signals characteristic of epitope data. Deep learning presents a compelling alternative, with its ability to perform automatic feature learning and model complex dependencies inherent in biological sequences. This study proposes a hybrid deep learning architecture that synergistically integrates handcrafted biochemical features with data-driven deep sequence modeling to improve the identification of SLE-associated epitopes.
Methods: The framework comprises six interconnected components: (1) handcrafted feature extraction encoding biochemical and physicochemical attributes; (2) an embedding layer for dense sequence representation; (3) a Convolutional Neural Network (CNN) branch that captures local patterns from handcrafted features; (4) a Long Short-Term Memory branch for learning temporal dependencies in sequence data; (5) a scaled dot-product attention-based fusion module that integrates complementary information from both branches; and (6) a Multi-Layer Perceptron for final classification. Model evaluation employed metrics such as Accuracy, Precision, Recall, F1-score, and the area under the receiver operating characteristic curve (ROCAUC).
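The fusion step in component (5) can be illustrated with a minimal sketch of scaled dot-product attention over two branch outputs. The abstract does not say how the module is wired, so everything here is an assumption: the function names (`scaled_dot_product_attention`, `fuse_branches`) are hypothetical, the two branch outputs are treated as two tokens of equal dimension, no learned projections are used, and the attended tokens are simply averaged.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


def fuse_branches(cnn_vec, lstm_vec):
    """Hypothetical fusion: self-attention over the two branch vectors,
    then an element-wise mean of the two attended vectors."""
    tokens = [cnn_vec, lstm_vec]  # assumption: both branches share one dimension
    attended = scaled_dot_product_attention(tokens, tokens, tokens)
    return [(a + b) / 2 for a, b in zip(*attended)]


if __name__ == "__main__":
    cnn_out = [0.2, 0.7, 0.1, 0.5]   # made-up CNN branch output
    lstm_out = [0.6, 0.1, 0.4, 0.3]  # made-up LSTM branch output
    print(fuse_branches(cnn_out, lstm_out))
```

In a trained model the queries, keys, and values would normally pass through learned linear projections, and the fused vector would feed the Multi-Layer Perceptron classifier. Both are omitted here to keep the attention arithmetic visible.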
Results: The hybrid model outperformed both baseline machine learning algorithms and ablated versions of itself. It achieved a ROCAUC of 0.9506 and an F1-score of 0.8333 on the SLE epitope prediction task. Notably, ablation studies revealed that the CNN component had the most substantial influence on performance, while the custom fusion mechanism yielded better integration of features than conventional strategies. These findings underscore the model's robustness and capacity to generalize across complex epitope prediction tasks.
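For readers less familiar with the reported metrics, the sketch below shows how an F1-score and a ROC AUC are computed from labels and predicted scores. The labels, scores, and the 0.5 threshold are made-up toy values, not data from the study, and the helper names are hypothetical. ROC AUC is computed here through its pairwise-ranking interpretation: the fraction of (positive, negative) pairs in which the positive example receives the higher score.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = epitope, 0 = non-epitope)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores):
    """ROC AUC as the share of positive/negative pairs ranked correctly
    (ties count as half a correct pair)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data only; not taken from the paper.
    y_true = [1, 1, 1, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold
    print(precision_recall_f1(y_true, y_pred))
    print(roc_auc(y_true, scores))
```

F1 depends on the chosen decision threshold, whereas ROC AUC summarizes ranking quality across all thresholds, which is why the two values can differ substantially for the same model.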
Conclusion: This work presents an interpretable, biologically informed deep learning approach for predicting SLE-associated epitopes. By merging domain-specific handcrafted features with dynamic deep learning representations, the model not only enhances predictive accuracy but also provides meaningful biological insights. The framework holds promise for broader applications in immunoinformatics and autoimmune disease research.
About the journal:
BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.