New classification system for protein sequences
Authors: Fatima Kabli, R. M. Hamou, Abdelmalek Amine
Published in: 2017 First International Conference on Embedded & Distributed Systems (EDiS), 2017-12-01
DOI: 10.1109/EDIS.2017.8284029 (https://doi.org/10.1109/EDIS.2017.8284029)
Citations: 5
Abstract
Protein classification is an important task in bioinformatics. Several techniques have been developed to improve the prediction of the category of an unclassified protein, which in turn helps to predict its function. To this end, we present a global framework inspired by the process of knowledge extraction from biological data and based on association rules. The framework has three main steps. The first, pre-processing, extracts the descriptors using the N-gram technique. The second extracts association rules between protein components using the Apriori algorithm. The third selects the relevant rules and uses them to classify unclassified proteins. We tested this classifier on five protein classes extracted from the UniProt database and compared it with five classification methods available on the WEKA platform. According to the validation measures, the results were satisfactory and support the effectiveness of our protein classifier.
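The three steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the N-gram length, the support and confidence thresholds, or the rule-selection criterion, so the values below, the function names, the toy sequences, and the scoring scheme (summing the confidence of matching rules per class) are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def ngrams(sequence, n=2):
    """Step 1: describe a sequence by its set of overlapping n-grams."""
    return {sequence[i:i + n] for i in range(len(sequence) - n + 1)}


def frequent_itemsets(transactions, min_support, max_size=2):
    """Step 2: Apriori-style level-wise search, returning {itemset: support}."""
    total = len(transactions)
    counts = Counter(item for t in transactions for item in t)
    current = {frozenset([item]): c / total
               for item, c in counts.items() if c / total >= min_support}
    frequent = dict(current)
    size = 2
    while current and size <= max_size:
        items = sorted({item for itemset in current for item in itemset})
        # Apriori pruning: keep a candidate only if all its subsets are frequent.
        candidates = [frozenset(c) for c in combinations(items, size)
                      if all(frozenset(s) in current
                             for s in combinations(c, size - 1))]
        current = {}
        for candidate in candidates:
            hits = sum(1 for t in transactions if candidate <= t)
            if hits and hits / total >= min_support:
                current[candidate] = hits / total
        frequent.update(current)
        size += 1
    return frequent


def mine_class_rules(labelled, n=2, min_support=0.6, min_confidence=0.8):
    """Build rules of the form (n-gram set -> class) with their confidence."""
    transactions = [(ngrams(seq, n), label) for seq, label in labelled]
    rules = []
    for label in {lab for _, lab in transactions}:
        in_class = [t for t, lab in transactions if lab == label]
        for itemset in frequent_itemsets(in_class, min_support):
            covered = [lab for t, lab in transactions if itemset <= t]
            confidence = covered.count(label) / len(covered)
            if confidence >= min_confidence:
                rules.append((itemset, label, confidence))
    return rules


def classify(sequence, rules, n=2):
    """Step 3 (assumed scoring): sum the confidence of matching rules per class."""
    grams = ngrams(sequence, n)
    scores = Counter()
    for itemset, label, confidence in rules:
        if itemset <= grams:
            scores[label] += confidence
    return scores.most_common(1)[0][0] if scores else None


# Hypothetical toy data, not taken from UniProt.
training = [("MKVLAAG", "A"), ("MKVLGAG", "A"),
            ("GGSPPQ", "B"), ("TGSPPQ", "B")]
rules = mine_class_rules(training)
print(classify("MKVLTTT", rules))
print(classify("AGSPPA", rules))
```

A sequence that matches no rule returns `None`, which leaves the fallback policy (for example, a default class) open, since the abstract does not describe one.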