Weijun Zhu, Xiaokai Liu, Zhenfei Wang, Yongwen Fan, Jianwei Wang
2019 IEEE 7th International Conference on Bioinformatics and Computational Biology (ICBCB). Published 2019-03-01. DOI: 10.1109/ICBCB.2019.8854665 (https://doi.org/10.1109/ICBCB.2019.8854665)
Predicting RNA Molecular Specific Hybridization via Random Forest
RNA hybridization is one of the most important operations in popular RNA simulation software used in bioinformatics. However, deciding within an acceptable time whether a specific RNA hybridization is effective is challenging, because the underlying combinatorial problem gives the task exponential computational complexity. We introduce a machine learning (ML)-based technique to address this problem. The Random Forest (RF) algorithm is employed: many groups of RNA molecular encodings, together with their classifications according to the hybridization results, are input to RF for training. The trained models are then applied to predict the classification of RNA hybridization results. The experimental results show that the average computational efficiency of the RF-based approach is 190,690 times higher than that of the existing approach, while its predictive accuracy, measured against the results of the existing approach, is 97.7%.
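The workflow described above (encode RNA molecules, label them by hybridization outcome, train a Random Forest, then predict) can be illustrated with a minimal sketch. The abstract does not specify the encoding, the labeling rule, or any hyperparameters, so everything below is an assumption made for illustration: the one-hot encoding, the function toy_label (a hypothetical stand-in for the slow simulation that would supply labels), the strand length, and the small pure-Python forest itself. It is not the authors' implementation.

```python
import random
from collections import Counter

BASES = "ACGU"
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def encode(a, b):
    """One-hot encode two strands into one flat 0/1 feature vector (assumed encoding)."""
    return [int(ch == base) for ch in a + b for base in BASES]

def toy_label(a, b, max_mismatch=1):
    """Hypothetical stand-in for the slow simulator, NOT the paper's rule:
    1 if b pairs antiparallel with a with at most max_mismatch mismatches."""
    mismatches = sum(COMPLEMENT[x] != y for x, y in zip(a, reversed(b)))
    return int(mismatches <= max_mismatch)

def make_dataset(n, length, rng):
    """Random strand pairs; about half are built as exact reverse complements."""
    X, y = [], []
    for _ in range(n):
        a = "".join(rng.choice(BASES) for _ in range(length))
        if rng.random() < 0.5:
            b = "".join(COMPLEMENT[ch] for ch in reversed(a))
        else:
            b = "".join(rng.choice(BASES) for _ in range(length))
        X.append(encode(a, b))
        y.append(toy_label(a, b))
    return X, y

def gini(labels):
    """Gini impurity of a non-empty list of 0/1 labels."""
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def majority(labels):
    """Most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def build_tree(X, y, rng, n_feats, depth=0, max_depth=10):
    """Grow one tree; each split considers a random subset of the features."""
    if depth == max_depth or len(set(y)) == 1:
        return majority(y)
    best = None
    for f in rng.sample(range(len(X[0])), n_feats):
        left = [lab for row, lab in zip(X, y) if row[f] == 0]
        right = [lab for row, lab in zip(X, y) if row[f] == 1]
        if left and right:
            score = len(left) * gini(left) + len(right) * gini(right)
            if best is None or score < best[0]:
                best = (score, f)
    if best is None:  # no sampled feature separates the rows
        return majority(y)
    f = best[1]
    lo = [i for i, row in enumerate(X) if row[f] == 0]
    hi = [i for i, row in enumerate(X) if row[f] == 1]
    return (f,
            build_tree([X[i] for i in lo], [y[i] for i in lo],
                       rng, n_feats, depth + 1, max_depth),
            build_tree([X[i] for i in hi], [y[i] for i in hi],
                       rng, n_feats, depth + 1, max_depth))

def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are (feature, low, high) tuples."""
    while isinstance(node, tuple):
        node = node[2] if row[node[0]] else node[1]
    return node

def train_forest(X, y, n_trees=25, seed=0):
    """Train n_trees trees, each on a bootstrap sample of the rows."""
    rng = random.Random(seed)
    n_feats = max(1, int(len(X[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 rng, n_feats))
    return forest

def predict(forest, row):
    """Majority vote over all trees."""
    return majority([predict_tree(tree, row) for tree in forest])

rng = random.Random(0)
X, y = make_dataset(600, 4, rng)
X_train, y_train, X_test, y_test = X[:450], y[:450], X[450:], y[450:]
forest = train_forest(X_train, y_train)
accuracy = sum(predict(forest, r) == lab
               for r, lab in zip(X_test, y_test)) / len(y_test)
print(f"held-out accuracy on toy data: {accuracy:.3f}")
```

The point of the sketch is the division of labor: the expensive labeling step runs only once, when the training set is built, and each later prediction costs a fixed number of tree traversals. The accuracy printed here refers to the toy rule only and says nothing about the figures reported in the abstract.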