On the implementation of a phylogenetic tree database

T. Yoshikawa, T. Tabe, R. Kishinami, H. Matsuda, A. Hashimoto
Published in: 1999 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM 1999). Conference Proceedings (Cat. No.99CH36368)
Publication date: 1999-08-22
DOI: 10.1109/PACRIM.1999.799473 (https://doi.org/10.1109/PACRIM.1999.799473)
Citations: 3

Abstract

A molecular phylogenetic tree is a tree-structured graph that represents the evolutionary process of genes, and is constructed from sequence data (such as DNA sequences) obtained from several organisms. Although molecular phylogenetic trees are fundamental data structures in evolutionary analysis, no database system is available that can match trees in the database against a user-supplied tree by comparing tree structures. In this paper, we propose a phylogenetic tree database system with a retrieval function that matches trees having similar structure. The tree data stored in the database are transformed from document images published in biological journals using a pattern-recognition program that we developed. To retrieve phylogenetic trees from the database according to their structures, we propose a method of determining the structural similarity between trees that is based on the split distance method. Our structural similarity measure shows high correlation with the log-likelihood difference that is widely used for comparing phylogenetic trees, and the computation time of our measure is much shorter than that of the log-likelihood difference, which relies on sequence comparison.
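The abstract does not spell out the similarity measure itself, only that it is based on the split distance method. As a rough illustration of that underlying idea, the following sketch computes the standard split (Robinson-Foulds style) distance between two trees over the same leaf set: each internal edge divides the leaves into two groups, and the distance counts the splits found in one tree but not the other. The nested-tuple tree encoding, the function names, and the normalized `split_similarity` score are assumptions made for this example, not the authors' implementation.

```python
from typing import FrozenSet, Set, Tuple, Union

# Hypothetical encoding: a tree is a leaf label or a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> FrozenSet[str]:
    """Return the set of leaf labels under a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the non-trivial splits induced by the internal edges."""
    all_leaves = leaves(tree)
    anchor = min(all_leaves)  # fixed leaf used to choose one canonical side
    found: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        # Store each split as the side that does not contain the anchor leaf.
        side = all_leaves - below if anchor in below else below
        # Keep only splits with at least two leaves on both sides.
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return below

    visit(tree)
    return found


def split_distance(a: Tree, b: Tree) -> int:
    """Count the splits present in exactly one of the two trees."""
    if leaves(a) != leaves(b):
        raise ValueError("trees must have the same leaf labels")
    return len(splits(a) ^ splits(b))


def split_similarity(a: Tree, b: Tree) -> float:
    """Assumed normalization: 1.0 for identical splits, 0.0 for none shared."""
    total = len(splits(a)) + len(splits(b))
    if total == 0:
        return 1.0
    return 1.0 - split_distance(a, b) / total


query = (("A", "B"), "C", ("D", "E"))
stored = (("A", "C"), "B", ("D", "E"))
print(split_distance(query, stored))
print(split_similarity(query, stored))
```

Because the comparison uses only leaf labels and tree shape, it needs no sequence data, which is consistent with the abstract's point that a structural measure is much cheaper to compute than a log-likelihood difference.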