{"title":"Using deep learning and large protein language models to predict protein-membrane interfaces of peripheral membrane proteins.","authors":"Dimitra Paranou, Alexios Chatzigoulas, Zoe Cournia","doi":"10.1093/bioadv/vbae078","DOIUrl":null,"url":null,"abstract":"<p><strong>Motivation: </strong>Characterizing interactions at the protein-membrane interface is crucial as abnormal peripheral protein-membrane attachment is involved in the onset of many diseases. However, a limiting factor in studying and understanding protein-membrane interactions is that the membrane-binding domains of peripheral membrane proteins (PMPs) are typically unknown. By applying artificial intelligence techniques in the context of natural language processing (NLP), the accuracy and prediction time for protein-membrane interface analysis can be significantly improved compared to existing methods. Here, we assess whether NLP and protein language models (pLMs) can be used to predict membrane-interacting amino acids for PMPs.</p><p><strong>Results: </strong>We utilize available experimental data and generate protein embeddings from two pLMs (ProtTrans and ESM) to train classifier models. Overall, the results demonstrate the first proof of concept study and the promising potential of using deep learning and pLMs to predict protein-membrane interfaces for PMPs faster, with similar accuracy, and without the need for 3D structural data compared to existing tools.</p><p><strong>Availability and implementation: </strong>The code is available at https://github.com/zoecournia/pLM-PMI. All data are available in the Supplementary material.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"4 1","pages":"vbae078"},"PeriodicalIF":2.4000,"publicationDate":"2024-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11572487/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Bioinformatics advances","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/bioadv/vbae078","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/1/1 0:00:00","PubModel":"eCollection","JCR":"Q2","JCRName":"MATHEMATICAL & COMPUTATIONAL BIOLOGY","Score":null,"Total":0}
引用次数: 0
Abstract
Motivation: Characterizing interactions at the protein-membrane interface is crucial as abnormal peripheral protein-membrane attachment is involved in the onset of many diseases. However, a limiting factor in studying and understanding protein-membrane interactions is that the membrane-binding domains of peripheral membrane proteins (PMPs) are typically unknown. By applying artificial intelligence techniques in the context of natural language processing (NLP), the accuracy and prediction time for protein-membrane interface analysis can be significantly improved compared to existing methods. Here, we assess whether NLP and protein language models (pLMs) can be used to predict membrane-interacting amino acids for PMPs.
Results: We utilize available experimental data and generate protein embeddings from two pLMs (ProtTrans and ESM) to train classifier models. Overall, the results demonstrate the first proof of concept study and the promising potential of using deep learning and pLMs to predict protein-membrane interfaces for PMPs faster, with similar accuracy, and without the need for 3D structural data compared to existing tools.
Availability and implementation: The code is available at https://github.com/zoecournia/pLM-PMI. All data are available in the Supplementary material.