Complexity and Similarity for Sequences using LZ77-based conditional information measure

François Cayre, N. L. Bihan
2019 IEEE International Symposium on Information Theory (ISIT), pp. 2454-2458. Published 2019-07-07. DOI: 10.1109/ISIT.2019.8849610 (https://doi.org/10.1109/ISIT.2019.8849610)
Citations: 1

Abstract

This work concerns the definition of conditional mutual information in the framework of Algorithmic Information Theory (AIT), which is useful when no probabilistic model of the data is available or when one is hard to devise. We introduce a practical way to construct a conditional mutual information quantity that respects the chain rule and the data processing inequality. The proposed implementation, named SALZA, makes it possible to carry out various information-theoretic tasks on sequences. The algorithmic model of the data used in this work is that of the well-known Lempel-Ziv primitive: we assume new data is to be expressed in terms of references to prior data. SALZA enables a flexible specification of prior data and extracts information quantities based on the significance of the references to these prior data. The tool readily implements the computation of an information measure based on LZ77 and a universal classifier based on the Ziv-Merhav relative coder for the universal clustering of sequences. The proposed implementation is illustrated on clustering and causality inference examples.
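To make the idea of "expressing new data in terms of references to prior data" concrete, here is a minimal sketch of a greedy relative parse in the spirit of the Ziv-Merhav coder. It is not SALZA's implementation and does not reproduce the paper's information measure: the function names `cross_parse` and `relative_phrase_count` are hypothetical, the quadratic substring search is chosen for readability only, and the rule that an unseen symbol forms a phrase of its own is an assumption of this sketch.

```python
def cross_parse(x: str, y: str) -> list[str]:
    """Greedily split x into the longest phrases that occur in the prior data y.

    Illustrative sketch only (hypothetical helper, not the SALZA API).
    """
    phrases = []
    i = 0
    while i < len(x):
        length = 0
        # Extend the candidate phrase while it still occurs somewhere in y.
        while i + length < len(x) and x[i:i + length + 1] in y:
            length += 1
        # Assumption: a symbol absent from y is emitted as a one-symbol phrase.
        step = max(length, 1)
        phrases.append(x[i:i + step])
        i += step
    return phrases


def relative_phrase_count(x: str, y: str) -> int:
    """Number of phrases needed to describe x using references into y."""
    return len(cross_parse(x, y))


if __name__ == "__main__":
    print(cross_parse("abcabd", "abc"))
    print(relative_phrase_count("abab", "abab"))
```

Fewer phrases mean that x is well explained by y, which is the intuition behind using such a parse as a similarity cue for clustering; how SALZA weighs the significance of each reference is described in the paper itself and is not modelled here.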