6mA-StackingCV: an improved stacking ensemble model for predicting DNA N6-methyladenine site.

IF 4.0 | Zone 3 (Biology) | Q1, Mathematical & Computational Biology

Biodata Mining Pub Date : 2023-11-27 DOI:10.1186/s13040-023-00348-8

Guohua Huang, Xiaohong Huang, Wei Luo

Biodata Mining, volume 16, issue 1, article 34. Publication type: Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10680251/pdf/

Citation count: 0

Abstract

DNA N6-adenine methylation (N6-methyladenine, 6mA) plays a key regulatory role in cellular processes. Precisely recognizing 6mA sites is important for further exploring its biological functions. Although many computational methods for 6mA site prediction have been developed over the past decades, there is still considerable room for improvement. We present a cross-validation-based stacking ensemble model for 6mA site prediction, called 6mA-StackingCV. 6mA-StackingCV is a type of meta-learning algorithm that uses the output of cross-validation as input to the final classifier. It achieved state-of-the-art performance in the Rosaceae independent test, and extensive tests demonstrated its stability and flexibility. We implemented 6mA-StackingCV as a user-friendly web application that lets users select among the supported representations and learning algorithms. The application is freely available at http://www.biolscience.cn/6mA-stackingCV/ . The source code and experimental data are available at https://github.com/Xiaohong-source/6mA-stackingCV .
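The core idea in the abstract, feeding cross-validation outputs to a final classifier, can be illustrated with a minimal sketch. This is not the authors' implementation: the two sequence encoders, the toy centroid learner, the fold assignment, and the made-up sequences below are all hypothetical stand-ins chosen to keep the example dependency-free. Refitting the base learners on all data for prediction time is likewise an assumption borrowed from common stacking practice.

```python
# Minimal stacking-with-cross-validation sketch (standard library only).
# All names and data here are illustrative assumptions, not from the paper.

def mono_freq(seq):
    """Hypothetical representation 1: frequency of A, C, G, T."""
    return [seq.count(b) / len(seq) for b in "ACGT"]

def flank_onehot(seq):
    """Hypothetical representation 2: one-hot of the bases flanking the centre."""
    mid = len(seq) // 2
    return [float(seq[i] == b) for i in (mid - 1, mid + 1) for b in "ACGT"]

class CentroidScorer:
    """Toy learner: positive score means closer to the positive-class centroid."""
    def fit(self, X, y):
        self.pos = self._mean([x for x, t in zip(X, y) if t == 1])
        self.neg = self._mean([x for x, t in zip(X, y) if t == 0])
        return self

    @staticmethod
    def _mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    @staticmethod
    def _dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    def score(self, x):
        return self._dist(x, self.neg) - self._dist(x, self.pos)

def out_of_fold_scores(X, y, k):
    """Score each sample with a model trained on the other k-1 folds."""
    n = len(X)
    oof = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        model = CentroidScorer().fit([X[i] for i in train], [y[i] for i in train])
        for i in range(fold, n, k):  # held-out samples of this fold
            oof[i] = model.score(X[i])
    return oof

def fit_stacking_cv(seqs, y, encoders, k=3):
    # Level 0: one column of out-of-fold scores per representation.
    columns = [out_of_fold_scores([enc(s) for s in seqs], y, k) for enc in encoders]
    meta_X = [list(row) for row in zip(*columns)]
    # Level 1: the final classifier is trained on the cross-validation outputs.
    meta = CentroidScorer().fit(meta_X, y)
    # Assumption: base learners are refit on all data for prediction time.
    bases = [CentroidScorer().fit([enc(s) for s in seqs], y) for enc in encoders]
    return bases, meta

def predict(seq, encoders, bases, meta):
    meta_x = [b.score(enc(seq)) for enc, b in zip(encoders, bases)]
    return 1 if meta.score(meta_x) > 0 else 0

# Made-up toy data: 5-mers centred on A; label 1 = "6mA site", 0 = "non-site".
SEQS = ["GGACC", "TTATT", "CCAGG", "ATAAT", "GCACG", "TAATA",
        "CGAGC", "TTAAA", "GGAGC", "AAATT", "CCACG", "ATATA"]
LABELS = [1, 0] * 6
ENCODERS = [mono_freq, flank_onehot]

if __name__ == "__main__":
    bases, meta = fit_stacking_cv(SEQS, LABELS, ENCODERS, k=3)
    for s in ("GGACC", "TTATT"):
        print(s, predict(s, ENCODERS, bases, meta))
```

Training the final classifier on out-of-fold scores, rather than on scores the base learners produce for their own training data, keeps the meta-level inputs from being optimistically biased.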


Source journal: Biodata Mining (Mathematical & Computational Biology)
CiteScore: 7.90
Self-citation rate: 0.00%
Articles published: (no value listed)
Review time: 23 weeks

About the journal: BioData Mining is an open access, open peer-reviewed journal encompassing research on all aspects of data mining applied to high-dimensional biological and biomedical data, focusing on computational aspects of knowledge discovery from large-scale genetic, transcriptomic, genomic, proteomic, and metabolomic data. Topical areas include, but are not limited to:
- Development, evaluation, and application of novel data mining and machine learning algorithms.
- Adaptation, evaluation, and application of traditional data mining and machine learning algorithms.
- Open-source software for the application of data mining and machine learning algorithms.
- Design, development and integration of databases, software and web services for the storage, management, retrieval, and analysis of data from large scale studies.
- Pre-processing, post-processing, modeling, and interpretation of data mining and machine learning results for biological interpretation and knowledge discovery.