Prediction of protein-protein interaction sites by means of ensemble learning and weighted feature descriptor.

IF 1.9, Zone 3 (Biology), Q2 BIOLOGY
Journal of Biological Research-Thessaloniki Pub Date : 2016-07-04 eCollection Date: 2016-05-01 DOI:10.1186/s40709-016-0046-7
Xiuquan Du, Shiwei Sun, Changlin Hu, Xinrui Li, Junfeng Xia
Citations: 7

Abstract

Background: Reliable prediction of protein-protein interaction sites is an important goal in the field of bioinformatics. Many computational methods have been explored for the large-scale prediction of protein-protein interaction sites based on various data types, including protein sequence, structural and genomic data. Although much progress has been achieved in recent years, the problem has not yet been satisfactorily solved.

Results: In this work, we present an efficient approach that uses an ensemble learning algorithm with a weighted feature descriptor (EL-WFD) to predict protein-protein interaction sites. The weighted feature descriptor is designed to describe the distance-dependent influence of neighboring residues on interaction sites. Results on two datasets (Hetero and Homo) show that the proposed method achieves 83.8 % recall and 96.3 % precision on the Hetero dataset and 84.2 % recall and 96.3 % precision on the Homo dataset. On both datasets, our method tends to obtain a higher Matthews correlation coefficient than the state-of-the-art random forest method.
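The recall, precision, and Matthews correlation coefficient reported here follow the standard confusion-matrix definitions. The following is a minimal Python sketch of those definitions only; the function names are hypothetical and the example labels are illustrative, not taken from the Hetero or Homo datasets.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = interaction site, 0 = non-site)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def precision(tp, fp):
    """Fraction of predicted sites that are true sites."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """Fraction of true sites that were predicted as sites."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Illustrative labels only (not data from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(precision(tp, fp), recall(tp, fn), mcc(tp, tn, fp, fn))
```

Because interaction sites are usually a minority of residues, the Matthews correlation coefficient is a useful complement to precision and recall: it uses all four confusion-matrix cells, so it is less easily inflated by class imbalance.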

Conclusions: The experimental results show that the EL-WFD method is quite effective in predicting protein-protein interaction sites. The novel weighted feature descriptor proved promising for discovering interaction sites. Overall, the proposed method can be considered a powerful new tool for predicting protein-protein interaction sites with excellent performance.
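The abstract does not specify how the weighted feature descriptor is computed. As a rough illustration of the general idea (neighboring residues contribute less the farther they are from the target residue), the sketch below builds a sliding-window descriptor in Python. Everything in it is an assumption made for illustration: the use of sequence distance, the window half-width, the weight function 1/(1+d), the zero padding at chain ends, and the function name. It is not the authors' EL-WFD formulation.

```python
def weighted_window_descriptor(features, center, half_width=2,
                               weight=lambda d: 1.0 / (1.0 + d)):
    """Concatenate per-residue feature vectors in a window around `center`,
    scaling each neighbor's features by a weight that decays with its
    sequence distance d from the central residue.

    features   : list of equal-length feature vectors, one per residue
    center     : index of the target residue
    half_width : number of neighbors taken on each side (assumed value)
    weight     : hypothetical decay function of the distance d >= 0
    """
    n_feat = len(features[0])
    descriptor = []
    for offset in range(-half_width, half_width + 1):
        pos = center + offset
        if 0 <= pos < len(features):
            w = weight(abs(offset))
            descriptor.extend(w * x for x in features[pos])
        else:
            # Positions beyond the chain ends are zero-padded (an assumption).
            descriptor.extend([0.0] * n_feat)
    return descriptor

# Toy example: three residues with one feature each.
toy = [[1.0], [2.0], [4.0]]
print(weighted_window_descriptor(toy, center=1, half_width=1))
```

A descriptor of this kind has fixed length (2 * half_width + 1) times the number of per-residue features, so it can be passed directly to any standard classifier or ensemble of classifiers.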

Source journal
CiteScore: 5.20
Self-citation rate: 0.00%
Articles published: 0
Review time: >12 weeks
Journal description: Journal of Biological Research-Thessaloniki is a peer-reviewed, open access, international journal that publishes articles providing novel insights into the major fields of biology. Topics covered in Journal of Biological Research-Thessaloniki include, but are not limited to: molecular biology, cytology, genetics, evolutionary biology, morphology, development and differentiation, taxonomy, bioinformatics, physiology, marine biology, behaviour, ecology and conservation.