Comparation Analysis of Ensemble Technique With Boosting(Xgboost) and Bagging (Randomforest) For Classify Splice Junction DNA Sequence Category

Iswaya Maalik Syahrani
Journal: Jurnal Penelitian Pos dan Informatika
Published: 2019-10-01
DOI: https://doi.org/10.17933/jppi.v9i1.249
Citations: 4

Abstract

Bioinformatics research is currently supported by the rapid growth of computing technology and algorithms. Ensembles of decision trees are a common method for classifying large and complex datasets such as DNA sequences. Applying two ensemble classification methods, XGBoost (boosting) and random forest (bagging), may improve accuracy when classifying the splice-junction type of a DNA sequence. With an accuracy of 96.24% for XGBoost and 95.11% for random forest, we conclude that both methods, given the right parameter settings, are highly effective tools for classifying small example datasets. Analyzing both methods and their characteristics gives an overview of how they work to meet the needs in DNA splicing.
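To make the bagging versus boosting distinction in the abstract concrete, here is a minimal standard-library sketch. It is not the paper's pipeline: one-feature decision stumps stand in for full trees, an AdaBoost-style reweighting loop stands in for XGBoost's gradient boosting, and plain bagging stands in for random forest (no per-split feature subsampling). The one-hot encoding, the binary labels, the toy sequences, and every function name are assumptions made for illustration only.

```python
import math
import random

BASES = "ACGT"


def one_hot(seq):
    """Flatten a DNA string into 0/1 features, four per position (A, C, G, T)."""
    return [1 if base == b else 0 for base in seq for b in BASES]


def fit_stump(X, y, weights):
    """Return (weighted error, feature index, polarity) of the best one-feature rule."""
    best = None
    for j in range(len(X[0])):
        for polarity in (1, -1):
            err = sum(w for xi, yi, w in zip(X, y, weights)
                      if (polarity if xi[j] else -polarity) != yi)
            if best is None or err < best[0]:
                best = (err, j, polarity)
    return best


def stump_predict(stump, x):
    """Predict +1 or -1 from the single feature the stump looks at."""
    _, j, polarity = stump
    return polarity if x[j] else -polarity


def bagging(X, y, n_estimators=15, seed=0):
    """Bagging: fit each stump on a bootstrap resample, then take a majority vote."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], [1.0 / n] * n))
    return lambda x: 1 if sum(stump_predict(s, x) for s in stumps) >= 0 else -1


def boosting(X, y, n_estimators=15):
    """AdaBoost-style boosting: reweight the samples after each stump is fitted."""
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(n_estimators):
        stump = fit_stump(X, y, weights)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # clamp to keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)     # vote strength of this stump
        ensemble.append((alpha, stump))
        weights = [w * math.exp(-alpha * yi * stump_predict(stump, xi))
                   for xi, yi, w in zip(X, y, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return lambda x: 1 if sum(a * stump_predict(s, x) for a, s in ensemble) >= 0 else -1


def accuracy(predict, X, y):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(predict(xi) == yi for xi, yi in zip(X, y)) / len(y)


# Hypothetical toy windows: +1 starts with the donor-like "GT", -1 starts with "AT".
sequences = ["GTAAGT", "GTGAGT", "GTAAGA", "GTATGC", "ATCCGA", "ATTGCA", "ATGCTT", "ATACGG"]
labels = [1, 1, 1, 1, -1, -1, -1, -1]
X = [one_hot(s) for s in sequences]

bagged = bagging(X, labels)
boosted = boosting(X, labels)
print("bagging accuracy: ", accuracy(bagged, X, labels))
print("boosting accuracy:", accuracy(boosted, X, labels))
```

Because the first base alone separates the two toy classes, both ensembles fit this data perfectly; the sketch shows only the mechanics (independent resampled learners with a vote versus sequential reweighted learners with weighted votes) and says nothing about the accuracies reported above. A real comparison would more likely use library implementations such as `xgboost.XGBClassifier` and `sklearn.ensemble.RandomForestClassifier` with held-out evaluation.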