{"title":"用机器学习从共价标记数据中提取残留溶剂:蛋白质结构预测的混合方法。","authors":"Elijah H Day, Steffen Lindert","doi":"10.1021/jasms.5c00053","DOIUrl":null,"url":null,"abstract":"<p><p>Hydroxyl radical protein footprinting (HRPF) coupled with mass spectrometry yields information about residue solvent exposure and protein topology. However, data from these experiments are sparse and require computational interpretation to generate useful structural insight. We previously implemented a Rosetta algorithm that uses experimental HRPF data to improve protein structure prediction. Modern structure prediction methods, such as AlphaFold2 (AF2), use machine learning (ML) to generate their predictions. Implementation of an HRPF-guided version of AF2 is challenging due to the substantial amount of training data required and the inherently abstract nature of ML networks. Thus, here we present a hybrid method that uses a light gradient boosting machine to predict residue solvent accessibility from experimental HRPF data. These predictions were subsequently used to improve Rosetta structure prediction. Our hybrid approach identified models with atomic-level detail for all four proteins in our benchmark set. These results illustrate that it is possible to successfully use ML in combination with HRPF data to accurately predict protein structures.</p>","PeriodicalId":672,"journal":{"name":"Journal of the American Society for Mass Spectrometry","volume":" ","pages":"1336-1347"},"PeriodicalIF":2.7000,"publicationDate":"2025-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Extracting Residue Solvent Exposure from Covalent Labeling Data with Machine Learning: A Hybrid Approach for Protein Structure Prediction.\",\"authors\":\"Elijah H Day, Steffen Lindert\",\"doi\":\"10.1021/jasms.5c00053\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Hydroxyl radical protein footprinting (HRPF) coupled with mass spectrometry yields information about residue solvent exposure and protein topology. However, data from these experiments are sparse and require computational interpretation to generate useful structural insight. We previously implemented a Rosetta algorithm that uses experimental HRPF data to improve protein structure prediction. Modern structure prediction methods, such as AlphaFold2 (AF2), use machine learning (ML) to generate their predictions. Implementation of an HRPF-guided version of AF2 is challenging due to the substantial amount of training data required and the inherently abstract nature of ML networks. Thus, here we present a hybrid method that uses a light gradient boosting machine to predict residue solvent accessibility from experimental HRPF data. These predictions were subsequently used to improve Rosetta structure prediction. Our hybrid approach identified models with atomic-level detail for all four proteins in our benchmark set. These results illustrate that it is possible to successfully use ML in combination with HRPF data to accurately predict protein structures.</p>\",\"PeriodicalId\":672,\"journal\":{\"name\":\"Journal of the American Society for Mass Spectrometry\",\"volume\":\" \",\"pages\":\"1336-1347\"},\"PeriodicalIF\":2.7000,\"publicationDate\":\"2025-06-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of the American Society for Mass Spectrometry\",\"FirstCategoryId\":\"92\",\"ListUrlMain\":\"https://doi.org/10.1021/jasms.5c00053\",\"RegionNum\":2,\"RegionCategory\":\"化学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/5/20 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q2\",\"JCRName\":\"BIOCHEMICAL RESEARCH METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of the American Society for Mass Spectrometry","FirstCategoryId":"92","ListUrlMain":"https://doi.org/10.1021/jasms.5c00053","RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/5/20 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
Extracting Residue Solvent Exposure from Covalent Labeling Data with Machine Learning: A Hybrid Approach for Protein Structure Prediction.
Hydroxyl radical protein footprinting (HRPF) coupled with mass spectrometry yields information about residue solvent exposure and protein topology. However, data from these experiments are sparse and require computational interpretation to generate useful structural insight. We previously implemented a Rosetta algorithm that uses experimental HRPF data to improve protein structure prediction. Modern structure prediction methods, such as AlphaFold2 (AF2), use machine learning (ML) to generate their predictions. Implementation of an HRPF-guided version of AF2 is challenging due to the substantial amount of training data required and the inherently abstract nature of ML networks. Thus, here we present a hybrid method that uses a light gradient boosting machine to predict residue solvent accessibility from experimental HRPF data. These predictions were subsequently used to improve Rosetta structure prediction. Our hybrid approach identified models with atomic-level detail for all four proteins in our benchmark set. These results illustrate that it is possible to successfully use ML in combination with HRPF data to accurately predict protein structures.
期刊介绍:
The Journal of the American Society for Mass Spectrometry presents research papers covering all aspects of mass spectrometry, incorporating coverage of fields of scientific inquiry in which mass spectrometry can play a role.
Comprehensive in scope, the journal publishes papers on both fundamentals and applications of mass spectrometry. Fundamental subjects include instrumentation principles, design, and demonstration, structures and chemical properties of gas-phase ions, studies of thermodynamic properties, ion spectroscopy, chemical kinetics, mechanisms of ionization, theories of ion fragmentation, cluster ions, and potential energy surfaces. In addition to full papers, the journal offers Communications, Application Notes, and Accounts and Perspectives