Complexity and Similarity for Sequences using LZ77-based conditional information measure
Authors: François Cayre, N. L. Bihan
Venue: 2019 IEEE International Symposium on Information Theory (ISIT), pages 2454-2458
Publication date: 2019-07-07
DOI: https://doi.org/10.1109/ISIT.2019.8849610
This work concerns the definition of conditional mutual information in the framework of Algorithmic Information Theory (AIT), which is useful when no probabilistic model of the data is available or when one is hard to devise. We introduce a practical way to construct a conditional mutual information quantity that respects the chain rule and the data processing inequality. The proposed implementation, named SALZA, makes it possible to carry out various information-theoretic tasks on sequences. The algorithmic model of the data used in this work is that of the well-known Lempel-Ziv primitive: we assume new data is to be expressed in terms of references to prior data. SALZA enables a flexible specification of prior data and extracts information quantities based on the significance of the references to these prior data. The tool readily implements the computation of an information measure based on LZ77 and a universal classifier based on the Ziv-Merhav relative coder for the universal clustering of sequences. The proposed implementation is illustrated on clustering and causality inference examples.
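To make the Lempel-Ziv primitive concrete, the following is a minimal sketch of expressing new data as references to prior data, in the spirit of Ziv-Merhav cross-parsing. It is not SALZA's implementation: the function names `cross_parse` and `relative_cost` are hypothetical, the substring search is naive, and the phrase-count score is only a crude stand-in for the information quantities defined in the paper.

```python
from typing import List


def cross_parse(x: str, ref: str) -> List[str]:
    """Greedily split x into the longest phrases that occur in ref.

    A symbol that never appears in ref is emitted as a one-symbol
    literal phrase. (Illustrative helper, not part of SALZA.)
    """
    phrases = []
    i = 0
    while i < len(x):
        length = 1
        # Extend the candidate phrase while it is still a substring of ref.
        while i + length <= len(x) and x[i:i + length] in ref:
            length += 1
        # The loop stops at the first failing length; fall back to a literal.
        length = max(length - 1, 1)
        phrases.append(x[i:i + length])
        i += length
    return phrases


def relative_cost(x: str, ref: str) -> float:
    """Phrases per symbol needed to describe x from ref (assumed score).

    Fewer, longer references mean x is better explained by ref.
    """
    if not x:
        return 0.0
    return len(cross_parse(x, ref)) / len(x)


if __name__ == "__main__":
    prior = "the quick brown fox jumps over the lazy dog"
    print(cross_parse("the lazy fox", prior))
    print(relative_cost("the lazy fox", prior))
    print(relative_cost("zzzzqqqq", prior))
```

A sequence that shares long substrings with the prior data is parsed into few phrases and gets a low score, while an unrelated sequence degenerates into single-symbol phrases. A production tool would typically replace the naive `in` search with a suffix-based index and weight references by their length rather than merely counting them.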