Sequence-based prediction of protein-protein interaction using autocorrelation features and machine learning

Syahid Abdullah, W. Kusuma, S. Wijaya
Jurnal Teknologi dan Sistem Komputer. Published 2022-01-04. DOI: 10.14710/jtsiskom.2021.13984 (https://doi.org/10.14710/jtsiskom.2021.13984)

Abstract

A protein's position in the complex network of protein-protein interactions (PPIs) helps define its function, yet the number of PPIs identified so far is relatively small. Several studies have therefore predicted PPIs from protein sequence information. This research compares three autocorrelation methods (Moran, Geary, and Moreau-Broto) for extracting protein sequence features to predict PPIs. The features from each method are fed to three machine learning algorithms: k-Nearest Neighbor (KNN), Random Forest (RF), and Support Vector Machine (SVM). Models built on the autocorrelation features reach high average accuracy: 95.34% for Geary with KNN, 97.43% for Geary with RF, and 97.11% for both Geary and Moran with SVM. In addition, interacting protein pairs tend to have similar autocorrelation characteristics. Autocorrelation descriptors are thus well suited to predicting PPIs.
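To make the feature-extraction step concrete, here is a minimal Python sketch of the three autocorrelation descriptors, using their commonly cited textbook definitions. The abstract does not say which physicochemical properties, lag range, or pair encoding the authors used, so everything below is an assumption: a single property (the Kyte-Doolittle hydropathy index), lags 1 to 3, and a pair vector formed by concatenating the descriptors of both proteins. The function names `standardize`, `autocorrelation`, and `pair_features` are hypothetical.

```python
from statistics import mean, pstdev

# Kyte-Doolittle hydropathy index. Assumption: the paper's actual
# property set is not given in the abstract; this one is illustrative.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def standardize(scale):
    """Center and scale a property over the 20 standard amino acids."""
    mu = mean(scale.values())
    sigma = pstdev(scale.values())
    return {aa: (v - mu) / sigma for aa, v in scale.items()}


def autocorrelation(seq, scale, max_lag):
    """Return (moreau_broto, moran, geary) lists for lags 1..max_lag.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    z = standardize(scale)
    p = [z[aa] for aa in seq]
    n = len(p)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the sequence length")
    p_bar = sum(p) / n
    ss = sum((x - p_bar) ** 2 for x in p)  # sum of squared deviations
    if ss == 0:
        raise ValueError("constant property along the sequence")

    moreau_broto, moran, geary = [], [], []
    for d in range(1, max_lag + 1):
        pairs = list(zip(p, p[d:]))  # residues separated by lag d
        m = n - d                    # number of such pairs
        # Normalized Moreau-Broto: mean product of property values.
        moreau_broto.append(sum(a * b for a, b in pairs) / m)
        # Moran: lagged covariance divided by the sequence variance.
        cov = sum((a - p_bar) * (b - p_bar) for a, b in pairs) / m
        moran.append(cov / (ss / n))
        # Geary: mean squared difference divided by the sample variance.
        sq_diff = sum((a - b) ** 2 for a, b in pairs) / (2 * m)
        geary.append(sq_diff / (ss / (n - 1)))
    return moreau_broto, moran, geary


def pair_features(seq_a, seq_b, max_lag=3):
    """Concatenate the descriptors of both proteins (assumed encoding)."""
    feats = []
    for seq in (seq_a, seq_b):
        for values in autocorrelation(seq, HYDROPATHY, max_lag):
            feats.extend(values)
    return feats


if __name__ == "__main__":
    # Two made-up peptide fragments, used only to show the call.
    print(pair_features("MKTAYIAKQR", "GSHMLEDPVA"))
```

The resulting fixed-length vector (here 2 proteins × 3 descriptors × 3 lags = 18 values) is the kind of input a KNN, Random Forest, or SVM classifier would take, regardless of how long the two sequences are.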