DeepUSPS: Deep Learning-Empowered Unconstrained-Structural Protein Sequence Design
Authors: Zhichong Ma, Jiawen Yang
Journal: Proteins-Structure Function and Bioinformatics (impact factor 3.2; JCR Q2, Biochemistry & Molecular Biology)
Publication type: Journal Article
Publication date: 2025-05-30
DOI: 10.1002/prot.26847 (https://doi.org/10.1002/prot.26847)
Citations: 0
Abstract
Current unconstrained-structural protein sequence design models suffer from low optimization efficiency, and the proteins they generate show significant similarity to natural proteins and low thermal stability. To address these challenges, we propose the Deep Learning-Empowered Unconstrained-Structural Protein Sequence Design (DeepUSPS) model. To improve thermal stability, we employ the Inverted Dense Residual Network (IDRNet). To reduce the similarity of the designed proteins to natural proteins, we construct the Sequence-Pairwise Features Extraction Synthetic Network (SPFESN). Furthermore, we introduce the Warm Restart AngularGrad (WRA) optimizer to optimize the 3D Position-Specific Scoring Matrix (3Dpssm) for unconstrained-structural protein sequences; generating idealization (IDE) protein sequences requires only 2100 iterations of updates (140.36 min). We obtained a total of 1000 IDE protein sequences and evaluated them with in silico experiments covering similarity, clarity and iterations, thermal stability, spatial distribution of similarity, and predicted local-distance difference test (pLDDT) confidence assessment. Notably, the mean lg(E-value) of the IDE protein sequences reached -0.051, the mean TM-score of the IDE protein structures reached 0.594, only 2100 iterations were needed, and the mean melting temperature (Tm) reached 74.78°C. The average pLDDT of the 3D structures reached 76. Additionally, the 3D structures of the IDE proteins are of diverse types. These in silico results demonstrate the superior performance of DeepUSPS compared with Hallucinate.
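The warm-restart idea named in the abstract can be illustrated with a minimal sketch. It assumes a generic cosine-annealing schedule with periodic restarts; it is not the authors' WRA optimizer, and the AngularGrad update rule is not reproduced here. The function name `warm_restart_lr`, the learning-rate bounds, and the cycle length of 700 steps (three cycles over the 2100 iterations reported) are hypothetical choices made only for illustration.

```python
import math


def warm_restart_lr(step, base_lr=0.1, min_lr=0.001, cycle_len=700):
    """Cosine-annealed learning rate that jumps back to base_lr at each restart.

    All default values are illustrative assumptions, not values from the paper.
    """
    pos = step % cycle_len  # position inside the current cycle
    cosine = 0.5 * (1.0 + math.cos(math.pi * pos / cycle_len))
    return min_lr + (base_lr - min_lr) * cosine


# Learning rate for each of the 2100 iterations mentioned in the abstract.
schedule = [warm_restart_lr(s) for s in range(2100)]
```

Within each cycle the rate decays smoothly toward `min_lr`, and at every multiple of `cycle_len` it returns to `base_lr`, which is the general mechanism a warm-restart scheme relies on to leave poor local optima.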
About the journal:
PROTEINS: Structure, Function, and Bioinformatics publishes original reports of significant experimental and analytic research in all areas of protein research: structure, function, computation, genetics, and design. The journal encourages reports that present new experimental or computational approaches for interpreting and understanding data from biophysical chemistry, structural studies of proteins and macromolecular assemblies, alterations of protein structure and function engineered through techniques of molecular biology and genetics, functional analyses under physiologic conditions, as well as the interactions of proteins with receptors, nucleic acids, or other specific ligands or substrates. Research in protein and peptide biochemistry directed toward synthesizing or characterizing molecules that simulate aspects of the activity of proteins, or that act as inhibitors of protein function, is also within the scope of PROTEINS. In addition to full-length reports, short communications (usually not more than 4 printed pages) and prediction reports are welcome. Reviews are typically by invitation; authors are encouraged to submit proposed topics for consideration.