P. Hanus, J. Dingel, Juergen Zecht, J. Hagenauer, Jakob C. Mueller
Title: Information Theoretic Distance Measures in Phylogenomics
Published in: 2007 Information Theory and Applications Workshop, 2007-10-22
DOI: https://doi.org/10.1109/ITA.2007.4357613
Citations: 4
Abstract
Information theory has produced a variety of distance measures that have proven useful in digital information systems. Because the information needed by a living organism is stored digitally on its carrier, DNA, it seems natural to apply these methods to genome analysis. We present two applications to genetics. First, a compression-based distance measure can be used to compute pairwise distances between genomic sequences of unequal length and thus to recognize the content of a DNA region. Second, the Kullback-Leibler distance serves as the basis for estimating evolutionary conservation across the genomes of different species, in order to identify regions of potentially important functionality. Moreover, we show that conclusions can be drawn about the biological properties of the sequences analyzed in this way.
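As a rough illustration of the two kinds of measure named in the abstract, the following Python sketch computes a compression-based distance and a Kullback-Leibler divergence for DNA strings. It is not the authors' method: the abstract does not say which compressor or which exact distance formula they use. The sketch assumes the normalized compression distance with the general-purpose zlib compressor as a stand-in, and single-nucleotide frequencies with a pseudocount for the Kullback-Leibler part. The function names `compressed_size`, `ncd`, `base_distribution`, and `kl_divergence` are hypothetical.

```python
import math
import zlib
from collections import Counter


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (assumed stand-in compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two sequences.

    Works for sequences of unequal length, because only compressed
    sizes enter the formula: (C(xy) - min(C(x), C(y))) / max(C(x), C(y)).
    """
    bx, by = x.encode("ascii"), y.encode("ascii")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)


def base_distribution(seq: str, alphabet: str = "ACGT",
                      pseudocount: float = 1.0) -> dict:
    """Relative nucleotide frequencies, smoothed so that no probability is zero."""
    counts = Counter(seq)
    total = sum(counts[b] for b in alphabet) + pseudocount * len(alphabet)
    return {b: (counts[b] + pseudocount) / total for b in alphabet}


def kl_divergence(p: dict, q: dict) -> float:
    """Kullback-Leibler divergence D(p || q) in bits; asymmetric in p and q."""
    return sum(p[b] * math.log2(p[b] / q[b]) for b in p)
```

A call such as `ncd(region, reference)` yields a value near 0 for highly similar sequences and near 1 for unrelated ones, while `kl_divergence(base_distribution(a), base_distribution(b))` is 0 only when the two smoothed distributions coincide. A general-purpose compressor captures DNA regularities poorly, so a specialized DNA compressor would likely be needed for real genomic data.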