用系统发育挖掘解决基因树和物种树问题

Xiaoxu Han
{"title":"用系统发育挖掘解决基因树和物种树问题","authors":"Xiaoxu Han","doi":"10.1142/9781860947292_0032","DOIUrl":null,"url":null,"abstract":"The gene tree and species tree problem remains a central problem in phylogenomics. To overcome this problem, gene concatenation approaches have been used to combine a certain number of genes randomly from a set of widely distributed orthologous genes selected from genome data to conduct phylogenetic analysis. The random concatenation mechanism prevents us from the further investigations of the inner structures of the gene data set employed to infer the phylogenetic trees and locates the most phylogenetically informative genes. In this work, a phylogenomic mining approach is described to gain knowledge from a gene data set by clustering genes in the gene set through a self-organizing map (SOM) to explore the gene dataset inner structures. From this, the most phylogenetically informative gene set is created by picking the maximum entropy gene from each cluster to infer phylogenetic trees by phylogenetic analysis. Using the same data set, the phylogenetic mining approach performs better than the random gene concatenation approach.","PeriodicalId":74513,"journal":{"name":"Proceedings of the ... Asia-Pacific bioinformatics conference","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2005-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Resolving the Gene Tree and Species Tree Problem by Phylogenetic Mining\",\"authors\":\"Xiaoxu Han\",\"doi\":\"10.1142/9781860947292_0032\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The gene tree and species tree problem remains a central problem in phylogenomics. To overcome this problem, gene concatenation approaches have been used to combine a certain number of genes randomly from a set of widely distributed orthologous genes selected from genome data to conduct phylogenetic analysis. The random concatenation mechanism prevents us from the further investigations of the inner structures of the gene data set employed to infer the phylogenetic trees and locates the most phylogenetically informative genes. In this work, a phylogenomic mining approach is described to gain knowledge from a gene data set by clustering genes in the gene set through a self-organizing map (SOM) to explore the gene dataset inner structures. From this, the most phylogenetically informative gene set is created by picking the maximum entropy gene from each cluster to infer phylogenetic trees by phylogenetic analysis. Using the same data set, the phylogenetic mining approach performs better than the random gene concatenation approach.\",\"PeriodicalId\":74513,\"journal\":{\"name\":\"Proceedings of the ... Asia-Pacific bioinformatics conference\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2005-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the ... Asia-Pacific bioinformatics conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1142/9781860947292_0032\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ... Asia-Pacific bioinformatics conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1142/9781860947292_0032","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

摘要

基因树和物种树问题仍然是系统基因组学的核心问题。为了克服这一问题,人们采用基因串联方法,从基因组数据中选择一组分布广泛的同源基因,随机组合一定数量的基因进行系统发育分析。随机连接机制使我们无法进一步研究用于推断系统发育树和定位最具系统发育信息基因的基因数据集的内部结构。在这项工作中,描述了一种系统基因组挖掘方法,通过自组织图谱(SOM)对基因集中的基因进行聚类,以探索基因数据集的内部结构,从而从基因数据集中获得知识。在此基础上,从每个聚类中选取熵值最大的基因,通过系统发育分析推断出系统发育树,从而得到系统发育信息量最大的基因集。使用相同的数据集,系统发育挖掘方法比随机基因连接方法性能更好。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Resolving the Gene Tree and Species Tree Problem by Phylogenetic Mining
The gene tree and species tree problem remains a central problem in phylogenomics. To overcome this problem, gene concatenation approaches have been used to combine a certain number of genes randomly from a set of widely distributed orthologous genes selected from genome data to conduct phylogenetic analysis. The random concatenation mechanism prevents us from the further investigations of the inner structures of the gene data set employed to infer the phylogenetic trees and locates the most phylogenetically informative genes. In this work, a phylogenomic mining approach is described to gain knowledge from a gene data set by clustering genes in the gene set through a self-organizing map (SOM) to explore the gene dataset inner structures. From this, the most phylogenetically informative gene set is created by picking the maximum entropy gene from each cluster to infer phylogenetic trees by phylogenetic analysis. Using the same data set, the phylogenetic mining approach performs better than the random gene concatenation approach.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信