Feature Selection Has a Large Impact on One-Class Classification Accuracy for MicroRNAs in Plants.

Q1 Biochemistry, Genetics and Molecular Biology
Advances in Bioinformatics Pub Date : 2016-01-01 Epub Date: 2016-04-12 DOI:10.1155/2016/5670851
Malik Yousef, Müşerref Duygu Saçar Demirci, Waleed Khalifa, Jens Allmer
Citations: 22

Abstract

MicroRNAs (miRNAs) are short RNA sequences involved in posttranscriptional gene regulation. Their experimental analysis is complicated and, therefore, needs to be supplemented with computational miRNA detection. Currently, computational miRNA detection is mainly performed using machine learning, in particular two-class classification. For machine learning, the miRNAs need to be parametrized, and more than 700 features have been described. Positive training examples for machine learning are readily available, but negative data are hard to come by. Therefore, it seems preferable to use one-class classification instead of two-class classification. Previously, we were able to almost reach two-class classification accuracy using one-class classifiers. In this work, we employ feature selection procedures in conjunction with one-class classification and show that accuracy differs by up to 36% among these feature selection methods. The best feature set allowed the training of a one-class classifier that achieved an average accuracy of ~95.6%, thereby outperforming previous two-class-based plant miRNA detection approaches by about 0.5%. We believe that this can be improved upon in the future by rigorous filtering of the positive training examples and by improving current feature clustering algorithms to better target pre-miRNA feature selection.
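To illustrate why the choice of features can matter so much for a one-class classifier, the following is a minimal sketch on synthetic data. It is not the authors' pipeline: the z-score distance classifier, the 95% acceptance quantile, the toy features, and all function names (`fit_one_class`, `accuracy`, `sample`) are assumptions made for illustration only. The classifier is trained on positive examples alone; negatives are used only to measure accuracy afterwards.

```python
import random
import statistics

def fit_one_class(positives, feature_idx, quantile=0.95):
    """Fit a toy one-class classifier on positive examples only.

    Each selected feature is standardized with the positive-class mean and
    standard deviation; an example is accepted if its summed squared z-score
    does not exceed the chosen quantile of the training scores.
    """
    cols = [[row[j] for row in positives] for j in feature_idx]
    means = [statistics.fmean(c) for c in cols]
    stdevs = [statistics.pstdev(c) or 1.0 for c in cols]  # guard against zero spread

    def score(row):
        return sum(((row[j] - m) / s) ** 2
                   for j, m, s in zip(feature_idx, means, stdevs))

    train_scores = sorted(score(r) for r in positives)
    cut = min(len(train_scores) - 1, int(quantile * len(train_scores)))
    threshold = train_scores[cut]
    return lambda row: score(row) <= threshold

def accuracy(classifier, positives, negatives):
    """Fraction of held-out examples labeled correctly."""
    correct = sum(classifier(r) for r in positives)
    correct += sum(not classifier(r) for r in negatives)
    return correct / (len(positives) + len(negatives))

def sample(rng, n, informative_mean):
    """Synthetic rows: feature 0 depends on the class, feature 1 is pure noise."""
    return [[rng.gauss(informative_mean, 1.0), rng.gauss(0.0, 1.0)]
            for _ in range(n)]

rng = random.Random(0)
train_pos = sample(rng, 200, 0.0)   # positives used for training
test_pos = sample(rng, 200, 0.0)    # held-out positives
test_neg = sample(rng, 200, 6.0)    # negatives, used for evaluation only

# Same classifier, two different "feature selections".
acc_informative = accuracy(fit_one_class(train_pos, [0]), test_pos, test_neg)
acc_noise = accuracy(fit_one_class(train_pos, [1]), test_pos, test_neg)

print(f"informative feature only: {acc_informative:.3f}")
print(f"noise feature only:       {acc_noise:.3f}")
```

In this toy setting the classifier built on the uninformative feature accepts most negatives, because the negatives look just like the positives along that feature. A one-class model never sees negatives during training, so it cannot learn to ignore such a feature by itself.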
