The k-Robinson-Foulds Dissimilarity Measures for Comparison of Labeled Trees
Authors: Elahe Khayatian, Gabriel Valiente, Louxin Zhang
Journal: Journal of Computational Biology, pages 328-344 (impact factor 1.4; JCR Q4, Biochemical Research Methods)
Published: 2024-04-01 (Epub 2024-01-25)
DOI: 10.1089/cmb.2023.0312 (https://doi.org/10.1089/cmb.2023.0312)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11057537/pdf/
Citation count: 0
Abstract
Understanding the mutational history of tumor cells is critical to unraveling the mechanisms that drive the onset and progression of cancer. Modeling tumor cell evolution with labeled trees has motivated researchers to develop measures for comparing such trees. Although the Robinson-Foulds (RF) distance is widely used for comparing species trees, it has limitations when applied to labeled trees. This study introduces the k-RF dissimilarity measures, tailored to address the challenges of labeled tree comparison. In the space of labeled trees with n nodes, the RF distance corresponds to the n-RF measure. Like the RF distance, the k-RF is a pseudometric for multiset-labeled trees and becomes a metric in the space of 1-labeled trees. With k set to a small value, the k-RF dissimilarity can capture analogous local regions in two labeled trees that differ in size or in labels.
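To make the role of k concrete, here is a minimal Python sketch of one plausible reading of the measure for unrooted 1-labeled trees. The abstract does not give the definition, so everything here is an assumption rather than the authors' method: each edge (u, v) is mapped to the unordered pair of label sets found within distance k of u and of v on their respective sides of the edge, and the dissimilarity is the size of the symmetric difference between the two trees' collections of such pairs. The function names (`build_adjacency`, `labels_within`, `edge_pairs`, `k_rf`) and the convention that k = 0 means "endpoint labels only" are hypothetical.

```python
from collections import Counter, deque


def build_adjacency(edges):
    """Return an undirected adjacency map for a tree given as (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def labels_within(adj, labels, start, blocked, k):
    """Labels of nodes at distance <= k from `start`, never crossing to `blocked`.

    Because the graph is a tree, refusing to step onto `blocked` keeps the
    search on `start`'s side of the edge (start, blocked).
    """
    seen = {start, blocked}
    found = {labels[start]}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == k:
            continue  # radius reached; do not expand further
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                found.add(labels[nxt])
                queue.append((nxt, dist + 1))
    return frozenset(found)


def edge_pairs(edges, labels, k):
    """Multiset (Counter) of unordered label-set pairs, one pair per edge."""
    adj = build_adjacency(edges)
    pairs = Counter()
    for u, v in edges:
        side_u = labels_within(adj, labels, u, v, k)
        side_v = labels_within(adj, labels, v, u, k)
        pairs[frozenset((side_u, side_v))] += 1
    return pairs


def k_rf(edges_s, labels_s, edges_t, labels_t, k):
    """Assumed k-RF: size of the symmetric difference of the edge-pair multisets."""
    ps = edge_pairs(edges_s, labels_s, k)
    pt = edge_pairs(edges_t, labels_t, k)
    return sum(((ps - pt) + (pt - ps)).values())


# Two 4-node paths that differ only by swapping the labels b and c.
path = [(0, 1), (1, 2), (2, 3)]
labels_s = {0: "a", 1: "b", 2: "c", 3: "d"}
labels_t = {0: "a", 1: "c", 2: "b", 3: "d"}

for k in (0, 1, 3):
    print(k, k_rf(path, labels_s, path, labels_t, k))
```

In this sketch, once k is at least the diameter of the tree, each side of an edge covers its whole component, so the pairs become the full label bipartitions that an RF-style comparison uses; smaller values of k restrict the comparison to the neighborhood of each edge, which is the local behavior the abstract describes.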
About the journal:
Journal of Computational Biology is the leading peer-reviewed journal in computational biology and bioinformatics, publishing in-depth statistical, mathematical, and computational analysis of methods, as well as their practical impact. Available only online, this is an essential journal for scientists and students who want to keep abreast of developments in bioinformatics.
Journal of Computational Biology coverage includes:
-Genomics
-Mathematical modeling and simulation
-Distributed and parallel biological computing
-Designing biological databases
-Pattern matching and pattern detection
-Linking disparate databases and data
-New tools for computational biology
-Relational and object-oriented database technology for bioinformatics
-Biological expert system design and use
-Reasoning by analogy, hypothesis formation, and testing by machine
-Management of biological databases