Estimation of Statistical Manifold Properties of Natural Sequences using Information Topology
A. Back, Janet Wiles
2023 IEEE Statistical Signal Processing Workshop (SSP), published 2023-07-02
DOI: 10.1109/SSP53291.2023.10207948 (https://doi.org/10.1109/SSP53291.2023.10207948)
Citation count: 0
Abstract
Modeling unknown natural sequences is challenging. Here we consider an information-theoretic approach to analyzing probabilistic natural sequences in the context of synthetic languages, which are characterized by the absence of any available language model. Building on efficient short-term entropy estimators, we examine extending information geometry to information topology as a method of characterizing natural sequences. We describe a normalized relative difference entropy method, which is required to apply the technique to sub-word models derived from natural sequences. Visualization of information topological spaces is considered, and some aspects are identified for future work. The approach shows potential as a new method for modeling the probabilistic structure of synthetic language sequences.