Authors: Michael Fuchs, Mike Steel
Journal: Journal of Computational Biology (impact factor 1.4; JCR Q4, Biochemical Research Methods)
Publication date: 2025-07-02
Publication type: Journal Article
DOI: 10.1089/cmb.2025.0093 (https://doi.org/10.1089/cmb.2025.0093)
The Asymptotic Distribution of the k-Robinson-Foulds Dissimilarity Measure on Labeled Trees.
Motivated by applications in medical bioinformatics, Khayatian et al. (2024) introduced a family of metrics on Cayley trees [the k-Robinson-Foulds (RF) distance, for $k = 0, \ldots, n-2$] and explored their distribution on pairs of random Cayley trees via simulations. In this article, we investigate this distribution mathematically and derive exact asymptotic descriptions of the distribution of the k-RF metric for the extreme values $k = 0$ and $k = n-2$, as n becomes large. We show that a linear transform of the 0-RF metric converges to a Poisson distribution (with mean 2), whereas a similar transform for the $(n-2)$-RF metric leads to a normal distribution (with mean $\sim n e^{-2}$). These results (together with the case $k = 1$ which behaves quite differently and $k = n-3$) shed light on the earlier simulation results and the predictions made concerning them.
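As a rough illustration of the Poisson claim, the sketch below samples pairs of uniformly random Cayley trees by decoding random Prüfer sequences and counts the edges they share. It rests on an assumption that the abstract does not state: that the 0-RF distance of Khayatian et al. (2024) reduces to the size of the symmetric difference of the two edge sets, so that the "linear transform" would be the shared-edge count (n - 1) - d/2. All function names are hypothetical and chosen only for this example.

```python
import heapq
import random


def random_cayley_tree(n, rng):
    """Edge set of a uniformly random labeled tree on {0, ..., n-1}, n >= 2.

    Decodes a uniformly random Pruefer sequence of length n - 2.
    """
    seq = [rng.randrange(n) for _ in range(n - 2)]
    degree = [1] * n
    for v in seq:
        degree[v] += 1
    # Min-heap of current leaves (vertices of remaining degree 1).
    leaves = [v for v in range(n) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = set()
    for v in seq:
        leaf = heapq.heappop(leaves)  # smallest remaining leaf
        edges.add(frozenset((leaf, v)))
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    # Exactly two vertices remain; they form the last edge.
    u = heapq.heappop(leaves)
    w = heapq.heappop(leaves)
    edges.add(frozenset((u, w)))
    return edges


def shared_edges(t1, t2):
    """Number of edges common to both trees."""
    return len(t1 & t2)


def zero_rf(t1, t2):
    """ASSUMPTION: 0-RF taken as the edge-set symmetric difference size."""
    return len(t1 ^ t2)


def mean_shared_edges(n, trials, seed=0):
    """Monte Carlo estimate of the mean shared-edge count of two random trees."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        t1 = random_cayley_tree(n, rng)
        t2 = random_cayley_tree(n, rng)
        total += shared_edges(t1, t2)
    return total / trials
```

Under the stated assumption, a given edge lies in a uniformly random Cayley tree with probability 2/n, so the expected number of shared edges is 2(n - 1)/n, which tends to 2 and is consistent with the Poisson mean quoted in the abstract. A call such as `mean_shared_edges(200, 5000)` should therefore return a value close to 2; this checks only the mean, not the full limiting distribution.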
About the journal:
Journal of Computational Biology is the leading peer-reviewed journal in computational biology and bioinformatics, publishing in-depth statistical, mathematical, and computational analysis of methods, as well as their practical impact. Available only online, this is an essential journal for scientists and students who want to keep abreast of developments in bioinformatics.
Journal of Computational Biology coverage includes:
-Genomics
-Mathematical modeling and simulation
-Distributed and parallel biological computing
-Designing biological databases
-Pattern matching and pattern detection
-Linking disparate databases and data
-New tools for computational biology
-Relational and object-oriented database technology for bioinformatics
-Biological expert system design and use
-Reasoning by analogy, hypothesis formation, and testing by machine
-Management of biological databases