A Supervised Machine Learning Approach for Distinguishing Between Additive and Replacing Horizontal Gene Transfers

Abhijit Mondal, Misagh Kordi, Mukul S. Bansal
Published in: Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, 2020-09-21.
DOI: https://doi.org/10.1145/3388440.3412428

Abstract

Horizontal gene transfer is one of the most important drivers of microbial gene and genome evolution. Despite its central role in microbial evolution, several aspects of horizontal gene transfer remain poorly understood. In particular, transfers can be either additive or replacing depending on whether the transferred gene adds itself as a new gene in the recipient genome or replaces an existing homologous gene. However, despite recent efforts, there do not yet exist effective computational approaches for classifying inferred transfers as being additive or replacing. In this work, we address this gap by devising a novel supervised machine learning approach for classifying transfers as being either additive or replacing. Our approach is based on phylogenetic reconciliation, a standard computational technique for inferring transfers. Our classifier, named ARTra, uses as features the classifications provided by several simple reconciliation-based classification rules, along with topological information from the gene tree, and ensembles them to produce a more accurate classification. ARTra is efficient and robust and significantly improves upon the classification accuracy of the only existing computational approach for this problem. We demonstrate the accuracy of ARTra by applying it to a wide range of simulated datasets and to a large biological dataset. Our results show that ARTra performs well over a broad range of evolutionary conditions and on real data, and that it does so even when trained only on a narrow range of such conditions and only using simulated data. An open-source implementation of ARTra is freely available from https://compbio.engr.uconn.edu/software/ARTra/.
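
The abstract describes ARTra as taking the outputs of several simple reconciliation-based rules, plus topological information from the gene tree, and combining them with supervised learning. It does not say which learning algorithm is used, so the following is only a minimal sketch of the general idea, not ARTra's implementation. It substitutes a simple accuracy-weighted vote: each binary feature receives a log-odds weight based on how often it agrees with the true label in the training data. The feature names (`rule_a`, `rule_b`, `rule_c`, `topo_flag`) and the toy training data are hypothetical.

```python
import math

ADDITIVE, REPLACING = 0, 1

# Hypothetical toy training set. Columns: rule_a, rule_b, rule_c, topo_flag.
# Each entry is one feature's vote (0 = additive, 1 = replacing) for one transfer.
TRAIN_X = [
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [1, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
]
# True labels, which in this setting would come from simulated data.
TRAIN_Y = [1, 1, 1, 0, 0, 0]


def fit_weights(features, labels):
    """Learn one log-odds weight per binary feature from its training accuracy."""
    n = len(labels)
    weights = []
    for j in range(len(features[0])):
        agree = sum(1 for row, y in zip(features, labels) if row[j] == y)
        # Laplace smoothing keeps the accuracy strictly between 0 and 1.
        acc = (agree + 1) / (n + 2)
        weights.append(math.log(acc / (1 - acc)))
    return weights


def classify(weights, row):
    """Weighted vote: a feature adds +w if it says replacing, -w if additive."""
    score = sum(w if v == REPLACING else -w for w, v in zip(weights, row))
    return REPLACING if score > 0 else ADDITIVE


weights = fit_weights(TRAIN_X, TRAIN_Y)

# One reliable feature can outvote three less reliable ones.
prediction = classify(weights, [1, 0, 0, 0])
print("replacing" if prediction == REPLACING else "additive")
```

In this toy data `rule_a` always matches the label, so its learned weight exceeds the combined weight of the other three features, and the ensemble follows it even where an unweighted majority vote would not. A real classifier would be trained on far more transfers and could use any standard supervised learner in place of this weighting scheme.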