Lisa M Boatner, Jerome Eberhardt, Flowreen Shikwana, Matthew Holcomb, Peiyuan Lee, Kendall N Houk, Stefano Forli, Keriann M Backus
CIAA: Integrated Proteomics and Structural Modeling for Understanding Cysteine Reactivity with Iodoacetamide Alkyne
ACS Chemical Biology, pp. 1669-1682. Published 2025-07-18 (Epub 2025-06-29). DOI: 10.1021/acschembio.5c00225 (https://doi.org/10.1021/acschembio.5c00225)
Citations: 0
Abstract
Cysteine residues play key roles in protein structure and function and can serve as targets for chemical probes and even drugs. Chemoproteomic studies have revealed that heightened cysteine reactivity toward electrophilic probes, such as iodoacetamide alkyne (IAA), is indicative of likely residue functionality. However, while the cysteine coverage of chemoproteomic studies has increased substantially, these methods still provide only a partial assessment of proteome-wide cysteine reactivity, with cysteines from low-abundance proteins and tough-to-detect peptides still largely refractory to chemoproteomic analysis. Here, we integrate cysteine chemoproteomic reactivity data sets with structure-guided computational analysis to delineate key structural features of proteins that favor elevated cysteine reactivity toward IAA. We first generated and aggregated multiple descriptors of the cysteine microenvironment, including amino acid content, solvent accessibility, residue proximity, secondary structure, and predicted pKa. We find that no single feature is sufficient to accurately predict reactivity. Therefore, we developed the CIAA (Cysteine reactivity toward IodoAcetamide Alkyne) method, which utilizes a Random Forest model to assess cysteine reactivity by incorporating descriptors that characterize the three-dimensional (3D) structural properties of thiol microenvironments. We trained the CIAA model on existing and newly generated cysteine chemoproteomic reactivity data paired with high-resolution crystal structures from the Protein Data Bank (PDB), with cross-validation against an external data set. CIAA analysis reveals key features driving cysteine reactivity, such as backbone hydrogen bond donor atoms, and highlights still underserved needs in the area of computational predictions of cysteine reactivity, including challenges surrounding protein structure selection and data set curation. Thus, our work provides a strong foundation for deploying artificial intelligence (AI) on cysteine chemoproteomic data sets.
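To make the idea of a cysteine "microenvironment descriptor" concrete, the following is a minimal sketch of one such feature: the residue types found within a distance cutoff of the cysteine sulfur (SG) atom. It is not the authors' pipeline. The atom records are made-up toy coordinates, and the function name `neighbor_composition` and the 8 Å default cutoff are assumptions chosen only for illustration.

```python
import math
from collections import Counter

# Toy, made-up atom records: (residue name, residue number, atom name, x, y, z).
# These are NOT taken from any PDB entry; they only illustrate the bookkeeping.
ATOMS = [
    ("CYS", 10, "SG", 0.0, 0.0, 0.0),
    ("HIS", 11, "NE2", 3.0, 0.0, 0.0),
    ("LYS", 45, "NZ", 0.0, 4.0, 3.0),
    ("ASP", 72, "OD1", 6.0, 6.0, 6.0),
    ("LEU", 12, "CD1", 12.0, 0.0, 0.0),
]


def neighbor_composition(atoms, cys_resnum, cutoff=8.0):
    """Count residue types with at least one atom within `cutoff` angstroms
    of the SG atom of the cysteine numbered `cys_resnum`.

    The cutoff value is an illustrative assumption, not a published parameter.
    """
    sg = next(
        (a for a in atoms if a[0] == "CYS" and a[1] == cys_resnum and a[2] == "SG"),
        None,
    )
    if sg is None:
        raise ValueError(f"no CYS SG atom found for residue {cys_resnum}")

    nearby = {}  # residue number -> residue name, so each residue counts once
    for resname, resnum, _atom, x, y, z in atoms:
        if resnum == cys_resnum:
            continue  # skip the cysteine's own atoms
        if math.dist(sg[3:], (x, y, z)) <= cutoff:
            nearby[resnum] = resname
    return Counter(nearby.values())


if __name__ == "__main__":
    print(neighbor_composition(ATOMS, 10))
```

In a workflow of the kind the abstract describes, counts like these would be one column group among many (alongside solvent accessibility, secondary structure, and predicted pKa) in the feature table passed to a classifier such as a Random Forest.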
About the journal:
ACS Chemical Biology provides an international forum for the rapid communication of research that broadly embraces the interface between chemistry and biology.
The journal also serves as a forum to facilitate the communication between biologists and chemists that will translate into new research opportunities and discoveries. Results will be published in which molecular reasoning has been used to probe questions through in vitro investigations, cell biological methods, or organismic studies.
We welcome mechanistic studies on proteins, nucleic acids, sugars, lipids, and nonbiological polymers. The journal serves a large scientific community, exploring cellular function from both chemical and biological perspectives. It is understood that submitted work is based upon original results and has not been published previously.