Unravelling cyclic peptide membrane permeability prediction: a study on data augmentation, architecture choices, and representation schemes

Impact factor 6.2; JCR Q1 (Chemistry, Multidisciplinary)
Alfonso Cabezón, Erik Otović, Daniela Kalafatovic, Ángel Piñeiro, Rebeca García-Fandiño and Goran Mauša
Journal: Digital Discovery, volume 5, pages 1259-1275. Published 2025-04-03. DOI: 10.1039/D4DD00375F
Article: https://pubs.rsc.org/en/content/articlelanding/2025/dd/d4dd00375f

Abstract

Cyclic peptides have emerged as promising candidates for drug development due to their unique structural properties and potential therapeutic benefits. However, clinical applications are limited by their low membrane permeability, which is difficult to predict. This study explores the impact of data augmentation and the inclusion of cyclic structure information in machine learning (ML) modeling to enhance the prediction of the membrane permeability of cyclic peptides from their amino acid sequence. Various peptide representation strategies, in combination with data augmentation techniques based on amino acid mutations and cyclic permutations, were investigated to address the limited availability of experimental data. Moreover, cyclic convolutional layers were explored to explicitly model the cyclic nature of the peptides. The results indicated that combining sequential and peptide properties yielded superior performance across multiple metrics. Model performance is highly sensitive to the number of amino acids involved in mutations and to their degree of similarity. Cyclic permutations improved model performance, particularly on a larger and more diverse dataset, and standard architectures captured most of the relevant cyclic information. Highlighting the complexity of peptide-membrane interactions, these results lay a foundation for future improvements in computational methods for the design of cyclic peptide drugs and offer practical guidelines for researchers in this field. The best-performing model was integrated into a user-friendly web-based tool, CYCLOPS: CYCLOpeptide Permeability Simulator (available at http://cyclopep.com/cyclops), to facilitate wider accessibility and application in the drug discovery community. This tool allows for rapid predictions of the membrane permeability of cyclic peptides, with a classification accuracy score of 0.824 and a regression mean absolute error of 0.477.
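
The two cyclic ideas named in the abstract, augmentation by cyclic permutation and convolution that respects ring closure, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes head-to-tail cyclization, so that every rotation of the residue list describes the same molecule and can reuse the same permeability label. The function names (cyclic_permutations, augment_dataset, circular_pad), the residue tokens, and the label value in the usage note are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Residues = List[str]


def cyclic_permutations(residues: Sequence[str]) -> List[Residues]:
    """Return the distinct rotations of a cyclic residue sequence.

    Assumption: the peptide is head-to-tail cyclized, so each rotation
    is an equally valid linear reading of the same ring.
    """
    seen = set()
    rotations: List[Residues] = []
    for i in range(len(residues)):
        rotated = tuple(residues[i:]) + tuple(residues[:i])
        if rotated not in seen:  # skip duplicates from repetitive sequences
            seen.add(rotated)
            rotations.append(list(rotated))
    return rotations


def augment_dataset(
    samples: Sequence[Tuple[Sequence[str], float]]
) -> List[Tuple[Residues, float]]:
    """Expand (residues, label) pairs with all rotations, keeping the label."""
    augmented: List[Tuple[Residues, float]] = []
    for residues, label in samples:
        for rotated in cyclic_permutations(residues):
            augmented.append((rotated, label))
    return augmented


def circular_pad(residues: Sequence[str], pad: int) -> Residues:
    """Wrap `pad` residues around each end of the sequence.

    A standard 1D convolution applied to the padded sequence then sees
    the last and first residues as neighbours, which is one simple way
    to mimic a cyclic convolutional layer.
    """
    if pad < 0 or pad > len(residues):
        raise ValueError("pad must be between 0 and the sequence length")
    items = list(residues)
    if pad == 0:
        return items
    return items[-pad:] + items + items[:pad]
```

For example, augment_dataset([(["A", "L", "P"], -6.0)]) returns three samples that share the label -6.0, one per rotation. In practice the rotations of a given peptide should be kept in the same train or test split, since otherwise the same molecule appears on both sides of the split.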
