Isolating - a new resampling method for gene order data
Jian Shi, W. Arndt, Fei Hu, Jijun Tang
2011 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), 2011-04-01
DOI: 10.1109/CIBCB.2011.5948464 (https://doi.org/10.1109/CIBCB.2011.5948464)
Citations: 4
Abstract
Resampling methods are applied to phylogenetic data to estimate the confidence value of each branch. Bootstrapping and jackknifing have in recent years been the two most widely used resampling schemes in biological research. Traditional bootstrap procedures, however, cannot be applied to gene order data, because a gene order is treated as a single character with many possible states. Experience in the biological community has shown that jackknifing is a useful means of determining the confidence values of a gene order phylogeny. When genomes are distant, however, jackknifing tends to give low confidence values to many valid branches, causing them to be mistakenly removed. In this paper, we propose a new method that overcomes this disadvantage of jackknifing and achieves better accuracy and confidence values for gene order data. Our experimental results show that, compared to jackknifing, the proposed method can produce phylogenies with lower error rates and much stronger support for good branches. We also establish a theoretical lower bound on how many genes should be isolated, and this bound is confirmed empirically.
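To make the jackknifing baseline mentioned in the abstract concrete, the following is a minimal sketch of how one jackknife replicate might be drawn from gene order data. It is an illustration under stated assumptions, not the authors' implementation: genomes are assumed to be signed gene orders stored as lists of nonzero integers, the same randomly chosen genes are assumed to be dropped from every genome, and the function name `jackknife_gene_orders` and the `removal_rate` parameter are hypothetical. The isolating method itself is not specified in the abstract and is not sketched here.

```python
import random


def jackknife_gene_orders(genomes, removal_rate, rng):
    """Draw one jackknife replicate from a set of signed gene orders.

    genomes      -- dict mapping genome name to a list of signed gene ids
                    (assumed representation, e.g. [1, -2, 3])
    removal_rate -- fraction of genes to drop (hypothetical parameter)
    rng          -- a random.Random instance, so replicates are reproducible
    """
    # Collect the gene ids present in any genome, ignoring orientation.
    genes = sorted({abs(g) for order in genomes.values() for g in order})
    n_remove = int(len(genes) * removal_rate)
    removed = set(rng.sample(genes, n_remove))
    # Drop the same genes from every genome; keep the order and signs of the rest.
    return {
        name: [g for g in order if abs(g) not in removed]
        for name, order in genomes.items()
    }


if __name__ == "__main__":
    genomes = {"A": [1, -2, 3, 4, 5], "B": [3, -1, 2, -5, 4]}
    replicate = jackknife_gene_orders(genomes, 0.4, random.Random(0))
    print(replicate)
```

In a full analysis, each replicate would be passed to a gene order tree reconstruction method, and the confidence value of a branch would be the fraction of replicate trees that contain it.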