Machine Learning-Driven Discovery of Essential Binding Preference in Anti-CRISPR Proteins.

IF 2.5 | Zone 4, Biology | JCR Q3, BIOCHEMICAL RESEARCH METHODS
PROTEOMICS – Clinical Applications Pub Date : 2025-07-01 Epub Date: 2025-06-30 DOI:10.1002/prca.70013
QingLan Ma, YuHang Zhang, Lei Chen, YuShen Bao, Wei Guo, KaiYan Feng, Tao Huang, Yu-Dong Cai

Abstract

Purpose: Anti-CRISPR (Acr) proteins can evade CRISPR-Cas immunity, yet their molecular determinants remain poorly understood. This study aimed to uncover key features driving Acr activity, thereby advancing both fundamental knowledge and the rational design of robust CRISPR-based tools.

Experimental design: We compiled a binary-encoded matrix of 761 InterPro-annotated domains and binding-site features for known Acr proteins. Seven feature ranking algorithms were applied to prioritize determinant features, and an incremental feature selection strategy, coupled with four distinct classifiers, was used to identify optimal subsets. Consensus key features were defined by intersecting the top subsets across all methods.

Results: Key identified features include the DUF2829 domain, the Lambda repressor-like domain and Sulfolobus islandicus virus proteins, the Cro/C1-type helix-turn-helix domain, phage protein, and replication initiator A. These findings illuminate novel structural modules and regulatory motifs that underpin Acr inhibition.

Conclusions: This study provides critical theoretical support for deciphering Acr mechanisms and offers actionable insights for engineering next-generation CRISPR-Cas applications in clinical and biotechnological settings.

Summary: The CRISPR system is part of the antiviral immune defense first discovered in bacteria and archaea. It has since become the cornerstone of genome editing technologies such as CRISPR-Cas9, which are widely used in clinical, agricultural, and biological research. Anti-CRISPR proteins inhibit the normal activity of the CRISPR-Cas system in certain bacteria and archaea, preventing the host cell from destroying phage genomes. The anti-CRISPR protein family is diverse, but its members share a common function: helping exogenous DNA escape the host immune system. This study sought to uncover the molecular mechanisms of anti-CRISPR proteins.
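
The workflow outlined under "Experimental design" (binary encoding, feature ranking, incremental feature selection, consensus by intersection) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the seven ranking algorithms or the four classifiers, so two simple ranking scores (`freq_gap`, `info_gain`) and a 1-nearest-neighbour classifier are used as stand-ins. The feature names `dom_A` to `dom_D`, the toy annotations, and the binary labels are all hypothetical; the abstract does not state what the classification target is.

```python
from math import log2

def encode(annotations, vocabulary):
    """Binary-encode each protein: 1 if it carries the feature, else 0."""
    return [[1 if f in feats else 0 for f in vocabulary] for feats in annotations]

def freq_gap(X, y, j):
    """Stand-in ranking score 1: gap in feature frequency between the classes."""
    pos = [row[j] for row, label in zip(X, y) if label == 1]
    neg = [row[j] for row, label in zip(X, y) if label == 0]
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg))

def entropy(labels):
    """Shannon entropy (bits) of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return -sum(q * log2(q) for q in (p, 1 - p) if q > 0)

def info_gain(X, y, j):
    """Stand-in ranking score 2: information gain of the label given feature j."""
    on = [label for row, label in zip(X, y) if row[j] == 1]
    off = [label for row, label in zip(X, y) if row[j] == 0]
    n = len(y)
    return entropy(y) - len(on) / n * entropy(on) - len(off) / n * entropy(off)

def rank(X, y, score):
    """Feature indices sorted from highest to lowest score."""
    return sorted(range(len(X[0])), key=lambda j: -score(X, y, j))

def loo_accuracy(X, y, cols):
    """Leave-one-out accuracy of a 1-NN classifier using Hamming distance on cols."""
    hits = 0
    for i, row in enumerate(X):
        dist = lambda k: sum(row[c] != X[k][c] for c in cols)
        nearest = min((k for k in range(len(X)) if k != i), key=dist)
        hits += y[nearest] == y[i]
    return hits / len(X)

def incremental_selection(X, y, order):
    """IFS: evaluate every prefix of the ranking; keep the shortest best prefix."""
    sizes = range(1, len(order) + 1)
    best_k = max(sizes, key=lambda k: loo_accuracy(X, y, order[:k]))
    return set(order[:best_k])

def consensus(X, y, scores):
    """Intersect the optimal subsets found under each ranking method."""
    subsets = [incremental_selection(X, y, rank(X, y, s)) for s in scores]
    return set.intersection(*subsets)

# Hypothetical toy data: 8 proteins, 4 made-up features, made-up labels.
vocabulary = ["dom_A", "dom_B", "dom_C", "dom_D"]
annotations = [
    {"dom_A", "dom_B"}, {"dom_A"}, {"dom_A", "dom_C"}, {"dom_A", "dom_B", "dom_D"},
    {"dom_C"}, {"dom_D"}, {"dom_C", "dom_D"}, {"dom_B"},
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

X = encode(annotations, vocabulary)
key = consensus(X, labels, [freq_gap, info_gain])
print(sorted(vocabulary[j] for j in key))
```

On this toy data only `dom_A` survives the intersection, because it alone separates the two label groups. With real annotations, each stand-in would be replaced by the ranking algorithms and classifiers actually used in the study.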

Source journal
PROTEOMICS – Clinical Applications (Medicine: Biochemical Research Methods)
CiteScore: 5.20
Self-citation rate: 5.00%
Articles published: 50
Review time: 1 month
Journal description: PROTEOMICS - Clinical Applications has developed into a key source of information in the field of applying proteomics to the study of human disease and translation to the clinic. With 12 issues per year, the journal will publish papers in all relevant areas including:
- basic proteomic research designed to further understand the molecular mechanisms underlying dysfunction in human disease
- the results of proteomic studies dedicated to the discovery and validation of diagnostic and prognostic disease biomarkers
- the use of proteomics for the discovery of novel drug targets
- the application of proteomics in the drug development pipeline
- the use of proteomics as a component of clinical trials.