CF-PPI: Centroid based new feature extraction approach for Protein-Protein Interaction Prediction
Authors: Gunjan Sahni, Bhawna Mewara, Soniya Lalwani, Rajesh Kumar
Journal: Journal of Experimental & Theoretical Artificial Intelligence, pages 1037-1057
Publication date: 2022-03-24
DOI: https://doi.org/10.1080/0952813X.2022.2052189
ABSTRACT Proteins are central to the biology of every organism, so it is essential to understand how they communicate at the molecular level, and the prediction of Protein-Protein Interactions (PPIs) plays a main role in that understanding. This article proposes a new probabilistic feature extraction technique, the Centroid-based Feature (CF) approach, abbreviated CF-PPI, which generates a new feature from the protein sequence; a random forest is then used as the classifier to predict PPIs. CF-PPI considers the residual energy of the protein bond to detect interactions between proteins, and it resolves the protein length variation issue by using probabilistic feature vectors. The PPI datasets used in this article are S. cerevisiae, H. pylori, and Human, on which CF-PPI with a random forest classifier achieved average accuracies of 96.25%, 97.68%, and 97.69%, respectively, outperforming the existing results it was compared against. The AUC score is also evaluated. Additionally, a blind test is performed with the same proposed feature approach on five other species' datasets that are independent of the training set. The experimental results indicate that CF-PPI is very promising and beneficial for upcoming proteomics research.
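The abstract does not spell out how the centroid-based feature is computed, so the following is only a minimal illustrative sketch of the general idea of mapping sequences of different lengths to a fixed-length, probability-style vector. It is not the authors' CF-PPI definition. The function names `sequence_features` and `pair_features`, the use of per-residue frequencies, and the "centroid" taken as the mean normalized position of each residue type are all assumptions made for illustration.

```python
# Hypothetical stand-in for a length-independent sequence feature.
# NOT the CF-PPI method from the paper; every name here is an assumption.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter residue codes


def sequence_features(seq):
    """Map a protein sequence of any length to a 40-dimensional vector.

    For each standard residue the vector holds:
      - its relative frequency in the sequence (a value in [0, 1]);
      - the mean position of its occurrences, normalized to [0, 1]
        (0.0 if the residue is absent or the sequence has one residue).
    Non-standard letters count toward the length but get no entry.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")

    features = []
    for aa in AMINO_ACIDS:
        positions = [i for i, ch in enumerate(seq) if ch == aa]
        freq = len(positions) / n
        if positions and n > 1:
            centroid = sum(positions) / len(positions) / (n - 1)
        else:
            centroid = 0.0
        features.extend([freq, centroid])
    return features


def pair_features(seq_a, seq_b):
    """Concatenate the two per-protein vectors into one 80-dimensional vector."""
    return sequence_features(seq_a) + sequence_features(seq_b)


if __name__ == "__main__":
    short = "MKT"
    longer = "ACDEFGHIKLMNPQRSTVWYAAAA"
    vec = pair_features(short, longer)
    print(len(vec))  # same length regardless of the input sequence lengths
```

Vectors of this kind have the same dimension for every protein pair, which is the property a classifier such as a random forest needs for its input; the actual features and training setup used in the paper may differ substantially.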
About the journal:
Journal of Experimental & Theoretical Artificial Intelligence (JETAI) is a world-leading journal dedicated to publishing high-quality, rigorously reviewed, original papers in artificial intelligence (AI) research.
The journal features work in all subfields of AI research and accepts both theoretical and applied research. Topics covered include, but are not limited to, the following:
• cognitive science
• games
• learning
• knowledge representation
• memory and neural system modelling
• perception
• problem-solving