M. S. Khatun, M. Hasan, Md. Nurul Haque Mollah, H. Kurata
2018 IEEE 18th International Conference on Bioinformatics and Bioengineering (BIBE), published 2018-10-01. DOI: 10.1109/BIBE.2018.00030
SIPMA: A Systematic Identification of Protein-Protein Interactions in Zea mays Using Autocorrelation Features in a Machine-Learning Framework
Zea mays (maize) is one of the most important and widely grown crops in the world. Identifying protein-protein interactions (PPIs) is essential to understanding the molecular structures and functions of maize. Identifying PPIs through wet-lab experiments is time-consuming, expensive, and laborious, so in silico methods that accurately predict potential PPIs from protein sequence information are in high demand. Research on PPI prediction in maize is currently very limited, and no dedicated bioinformatics schemes are available. In this work, we propose a novel approach termed SIPMA (Systematic Identification of PPI in Maize using Autocorrelation). A random forest classifier was trained with autocorrelation features to build the prediction model. Tested on an experimentally verified maize PPI dataset, SIPMA yielded a prediction accuracy of 0.899 at a specificity of 0.969 on the training set, and it achieved promising performance on the test datasets. In comparison with other sequence-based encodings and statistical learning methods, SIPMA proved to be a powerful computational resource for identifying PPIs in maize.
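The abstract does not say which autocorrelation descriptor, physicochemical property, or lag range SIPMA uses, so the following is only a minimal sketch of the general idea under stated assumptions: a lag-based autocorrelation of a single property (the Kyte-Doolittle hydropathy index, chosen here purely for illustration), centered on the sequence mean and concatenated for the two proteins of a candidate pair. The function names, the max_lag default, and the example sequences are hypothetical. The resulting vectors are the kind of input a random forest classifier could be trained on, and the last helper shows how accuracy and specificity are computed from binary predictions.

```python
# Illustrative sketch only: the descriptor, property scale, and max_lag
# are assumptions, not the settings reported for SIPMA.

# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def autocorrelation(seq, max_lag=5, scale=KD):
    """Return [AC(1), ..., AC(max_lag)] for one protein sequence.

    AC(lag) is the mean product of mean-centered property values of
    residues that are `lag` positions apart. Residues missing from
    `scale` (e.g. 'X') are skipped.
    """
    values = [scale[aa] for aa in seq.upper() if aa in scale]
    n = len(values)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    mean = sum(values) / n
    centered = [v - mean for v in values]
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / (n - lag)
        for lag in range(1, max_lag + 1)
    ]


def pair_features(seq_a, seq_b, max_lag=5):
    """Concatenate the autocorrelation vectors of two proteins."""
    return autocorrelation(seq_a, max_lag) + autocorrelation(seq_b, max_lag)


def accuracy_and_specificity(y_true, y_pred):
    """Compute accuracy and specificity for binary labels (1 = interacting)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tn + fp == 0:
        raise ValueError("specificity is undefined without negative examples")
    accuracy = (tp + tn) / len(pairs)
    specificity = tn / (tn + fp)
    return accuracy, specificity


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from any maize dataset.
    features = pair_features("MKVLAAGIVGLLLAQ", "MSTNPKPQRKTKRNT", max_lag=3)
    print(len(features), features)
    print(accuracy_and_specificity([1, 1, 0, 0], [1, 0, 0, 0]))
```

Each protein pair maps to a fixed-length vector (2 x max_lag values here) regardless of sequence length, which is what lets a standard classifier consume proteins of different sizes. A realistic encoding would typically combine several physicochemical properties rather than one.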