GeoEvoBuilder: A deep learning framework for efficient functional and thermostable protein design
Authors: Jiale Liu, Hantian You, Zheng Guo, Qin Xu, Changsheng Zhang, Luhua Lai
Journal: Proceedings of the National Academy of Sciences of the United States of America
Published: 2025-10-10
DOI: https://doi.org/10.1073/pnas.2504117122
Citations: 0
Abstract
While deep learning has advanced protein sequence and function design, engineering highly active and stable proteins still requires labor-intensive iterative computational design and experimentation. There is a critical need for methods capable of directly generating protein sequences with the required properties. Here, we present GeoEvoBuilder, an advanced deep learning framework that adaptively integrates structural and evolutionary constraints for protein sequence design. GeoEvoBuilder accurately recapitulates functional sites and generates sequences that fold correctly with enhanced activity and thermal stability. GeoEvoBuilder has been applied to redesign green fluorescent protein, glutathione peroxidase 4 (GPX4), and dihydrofolate reductase (DHFR), yielding variants with significantly improved thermal stability and activity. Notably, the top DHFR design demonstrated a 20-fold increase in catalytic efficiency and a 10 °C gain in thermal stability. Crystal structure determination confirmed that the designed proteins form correct structures. Further analysis of residue dynamic correlations in GPX4 variants provides insights into how remote sites regulate enzymatic activity. Unlike conventional methods that focus on single mutations and their combinations through iterative design and experiment cycles, GeoEvoBuilder explores a large sequence space, enabling successful designs with over 30% residue changes in one run. GeoEvoBuilder not only provides a transformative tool for protein engineering but can also be applied to uncover the intricate relationships between protein sequence, structure, function, and evolution. GeoEvoBuilder is publicly available at https://github.com/PKUliujl/GeoEvoBuilder.
About the journal:
The Proceedings of the National Academy of Sciences (PNAS), a peer-reviewed journal of the National Academy of Sciences (NAS), serves as an authoritative source for high-impact, original research across the biological, physical, and social sciences. With a global scope, the journal welcomes submissions from researchers worldwide, making it an inclusive platform for advancing scientific knowledge.