Navigating Homogeneous Graph Paths Through Amyloidogenic and Non-Amyloidogenic Hexapeptides
László Keresztes, Evelin Szögi, Bálint Varga, Viktor Farkas, András Perczel, Vince Grolmusz
Journal of Computational Chemistry, 46(26). Published 2025-10-03. DOI: 10.1002/jcc.70238
https://onlinelibrary.wiley.com/doi/10.1002/jcc.70238
Citations: 0
Abstract
Hexapeptides are increasingly used as model systems for studying the amyloidogenic properties of oligo- and polypeptides. The twenty proteinogenic amino acid residues yield 20^6 = 64 million different hexapeptides. Today's experimental amyloid databases contain annotations for only a fraction of these hexapeptides. Several computational predictors with good accuracy exist for labeling all possible hexapeptides as “amyloidogenic” or “non-amyloidogenic.” It may be of interest to define and study a simple graph whose nodes are the 64 million hexapeptides, where two hexapeptides are connected by an edge if they differ in exactly one residue. For example, in this graph, HIKKLM is connected to AIKKLM, HIKKNM, and HIKKLC, but it is not connected by an edge to VVKKLM or HIKNPM. In the present contribution, we consider our previously published artificial intelligence-based tool, the Budapest Amyloid Predictor (BAP for short), and demonstrate a spectacular property of this predictor in the graph defined above. We show that for any two hexapeptides predicted to be “amyloidogenic” by BAP, there exists an easily constructible path of length at most six that passes only through neighboring hexapeptides that are all predicted to be “amyloidogenic” by BAP. For example, the predicted amyloidogenic hexapeptides ILVWIW and FWLCYL can be connected through the length-6 path ILVWIW-IWVWIW-IWVCIW-IWVCIL-FWVCIL-FWLCIL-FWLCYL, in which consecutive hexapeptides differ in exactly one residue and every hexapeptide on the path is predicted to be amyloidogenic by BAP. The symmetric statement also holds for hexapeptides predicted to be non-amyloidogenic: for any such pair, there exists a path of length at most six that traverses only predicted non-amyloidogenic hexapeptides.
This property of the Budapest Amyloid Predictor (https://pitgroup.org/bap) is not specific to BAP: it holds for any linear Support Vector Machine (SVM)-based predictor, and therefore for any future improvement of BAP that uses the linear SVM prediction technique.
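The graph and the path property described in the abstract can be illustrated with a short Python sketch. It assumes a position-wise additive linear score, score(p) = bias + sum over positions i of weights[i][p_i], which is what a linear SVM gives when each residue is one-hot encoded per position. Whether BAP uses exactly this encoding is an assumption here, and the weights are random toy integers, not BAP's trained parameters. The ordering rule used below (apply the most score-increasing substitutions first) is one construction that keeps every intermediate score at or above the lower of the two endpoint scores; it is not necessarily the authors' construction. All function names are hypothetical.

```python
import random
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the twenty proteinogenic residues
Weights = List[Dict[str, int]]        # one residue -> weight table per position


def are_neighbors(a: str, b: str) -> bool:
    """Edge test of the graph: equal length and exactly one differing residue."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def score(peptide: str, weights: Weights, bias: int = 0) -> int:
    """Assumed linear decision function; score > 0 is read as 'amyloidogenic'."""
    return bias + sum(weights[i][aa] for i, aa in enumerate(peptide))


def homogeneous_path(start: str, end: str, weights: Weights,
                     keep_positive: bool = True) -> List[str]:
    """Walk from start to end, changing one differing position per step.

    For two positive endpoints the substitutions are applied in order of
    decreasing score gain, so the score rises first and falls afterwards and
    never drops below min(score(start), score(end)). For two negative
    endpoints (keep_positive=False) the order is reversed.
    """
    differing = [i for i in range(len(start)) if start[i] != end[i]]
    differing.sort(key=lambda i: weights[i][end[i]] - weights[i][start[i]],
                   reverse=keep_positive)
    path, current = [start], list(start)
    for i in differing:
        current[i] = end[i]
        path.append("".join(current))
    return path


def toy_weights(seed: int = 0) -> Weights:
    """Random integer weights for illustration only (NOT the BAP weights)."""
    rng = random.Random(seed)
    return [{aa: rng.randint(-10, 10) for aa in AMINO_ACIDS} for _ in range(6)]


if __name__ == "__main__":
    w = toy_weights()
    rng = random.Random(1)
    positives: List[str] = []
    while len(positives) < 2:  # sample two distinct toy-positive hexapeptides
        p = "".join(rng.choice(AMINO_ACIDS) for _ in range(6))
        if score(p, w) > 0 and p not in positives:
            positives.append(p)
    for p in homogeneous_path(positives[0], positives[1], w):
        print(p, score(p, w))
```

Because at most six positions can differ, the path has length at most six. Integer weights are used in the toy model so that the score comparisons are exact; with floating-point SVM weights the same ordering applies, up to rounding.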
About the journal:
This distinguished journal publishes articles concerned with all aspects of computational chemistry: analytical, biological, inorganic, organic, physical, and materials. The Journal of Computational Chemistry presents original research, contemporary developments in theory and methodology, and state-of-the-art applications. Computational areas that are featured in the journal include ab initio and semiempirical quantum mechanics, density functional theory, molecular mechanics, molecular dynamics, statistical mechanics, cheminformatics, biomolecular structure prediction, molecular design, and bioinformatics.