QingLan Ma, YuHang Zhang, Lei Chen, YuShen Bao, Wei Guo, KaiYan Feng, Tao Huang, Yu-Dong Cai
Machine Learning-Driven Discovery of Essential Binding Preference in Anti-CRISPR Proteins
PROTEOMICS – Clinical Applications, article e70013. Published 2025-07-01 (Epub 2025-06-30). DOI: 10.1002/prca.70013 (https://doi.org/10.1002/prca.70013)
Abstract
Purpose: Anti-CRISPR (Acr) proteins can evade CRISPR-Cas immunity, yet their molecular determinants remain poorly understood. This study aimed to uncover key features driving Acr activity, thereby advancing both fundamental knowledge and the rational design of robust CRISPR-based tools.
Experimental design: We compiled a binary-encoded matrix of 761 InterPro-annotated domains and binding-site features for known Acr proteins. Seven feature ranking algorithms were applied to prioritize determinant features, and an incremental feature selection strategy, coupled with four distinct classifiers, was used to identify optimal subsets. Consensus key features were defined by intersecting the top subsets across all methods.
Results: Key identified features include the DUF2829 domain, the Lambda repressor-like domain and Sulfolobus islandicus virus proteins, the Cro/C1-type helix-turn-helix domain, phage protein, and replication initiator A. These findings illuminate novel structural modules and regulatory motifs that underpin Acr inhibition.
Conclusions: This study provides critical theoretical support for deciphering Acr mechanisms and offers actionable insights for engineering next-generation CRISPR-Cas applications in clinical and biotechnological settings.
Summary: The CRISPR system is part of an antiviral immune defense first discovered in bacteria and archaea. It has since become the cornerstone of genome editing technologies such as CRISPR-Cas9, which are widely used in clinical, agricultural, and biological research. Anti-CRISPR proteins inhibit the normal activity of the CRISPR-Cas system in certain bacteria or archaea, thereby protecting phage genomes from destruction by the prokaryotic host. The anti-CRISPR protein family is diverse, but its members share a similar function: helping exogenous DNA escape the host immune system. This study sought to uncover the molecular mechanisms of anti-CRISPR proteins.
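The experimental design in the abstract (rank features, run incremental feature selection with a classifier, then intersect the best subsets) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy 8 x 4 binary matrix, the hypothetical feature names, the class labels, the two ranking scores (frequency difference and mutual information), and the 1-nearest-neighbour classifier. The abstract does not name the seven ranking algorithms or the four classifiers, so these are stand-ins, not the authors' methods.

```python
import math

# Toy binary matrix: rows are proteins, columns are hypothetical features.
FEATURES = ["feat_A", "feat_B", "feat_C", "feat_D"]  # illustrative names only
X = [[1, 1, 0, 1], [1, 1, 1, 0], [1, 0, 0, 0], [1, 1, 1, 1],
     [0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 1, 1]]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # hypothetical class labels

def freq_diff_rank(X, y):
    """Rank features by |P(f=1 | y=1) - P(f=1 | y=0)|, best first."""
    pos = [r for r, lab in zip(X, y) if lab == 1]
    neg = [r for r, lab in zip(X, y) if lab == 0]
    def score(j):
        return abs(sum(r[j] for r in pos) / len(pos)
                   - sum(r[j] for r in neg) / len(neg))
    return sorted(range(len(X[0])), key=lambda j: (-score(j), j))

def mutual_info_rank(X, y):
    """Rank features by mutual information with the label, best first."""
    n = len(y)
    def mi(j):
        total = 0.0
        for a in (0, 1):
            for b in (0, 1):
                p_ab = sum(1 for r, lab in zip(X, y) if r[j] == a and lab == b) / n
                p_a = sum(1 for r in X if r[j] == a) / n
                p_b = sum(1 for lab in y if lab == b) / n
                if p_ab > 0:
                    total += p_ab * math.log(p_ab / (p_a * p_b))
        return total
    return sorted(range(len(X[0])), key=lambda j: (-mi(j), j))

def loo_accuracy(X, y, cols):
    """Leave-one-out accuracy of a 1-NN classifier using Hamming distance."""
    correct = 0
    for i, row in enumerate(X):
        nearest = min((k for k in range(len(X)) if k != i),
                      key=lambda k: (sum(row[c] != X[k][c] for c in cols), k))
        correct += y[nearest] == y[i]
    return correct / len(X)

def incremental_feature_selection(X, y, ranking):
    """Score the top-1, top-2, ... prefixes of a ranking; keep the best one."""
    best_acc, best_k = -1.0, 0
    for k in range(1, len(ranking) + 1):
        acc = loo_accuracy(X, y, ranking[:k])
        if acc > best_acc:  # strict '>' keeps the smallest best subset
            best_acc, best_k = acc, k
    return set(ranking[:best_k]), best_acc

# Consensus features: intersection of the best subsets across ranking methods.
subsets = [incremental_feature_selection(X, y, rank(X, y))[0]
           for rank in (freq_diff_rank, mutual_info_rank)]
consensus = set.intersection(*subsets)
print(sorted(FEATURES[j] for j in consensus))
```

On real data the same loop would run once per ranking method and classifier pair, and the intersection would be taken over all resulting subsets; the strict comparison in the selection loop is one possible tie-breaking choice that favours smaller subsets.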
About the journal:
PROTEOMICS - Clinical Applications has developed into a key source of information in the field of applying proteomics to the study of human disease and translation to the clinic. With 12 issues per year, the journal will publish papers in all relevant areas including:
- basic proteomic research designed to further understand the molecular mechanisms underlying dysfunction in human disease
- the results of proteomic studies dedicated to the discovery and validation of diagnostic and prognostic disease biomarkers
- the use of proteomics for the discovery of novel drug targets
- the application of proteomics in the drug development pipeline
- the use of proteomics as a component of clinical trials.