Conditional NML Universal Models

J. Rissanen, Teemu Roos
2007 Information Theory and Applications Workshop, published 2007-10-22. DOI: 10.1109/ITA.2007.4357600. Citations: 47.

Abstract

The NML (normalized maximum likelihood) universal model has certain minmax optimal properties but it has two shortcomings: the normalizing coefficient can be evaluated in a closed form only for special model classes, and it does not define a random process so that it cannot be used for prediction. We present a universal conditional NML model, which has minmax optimal properties similar to those of the regular NML model. However, unlike NML, the conditional NML model defines a random process which can be used for prediction. It also admits a recursive evaluation for data compression. The conditional normalizing coefficient is much easier to evaluate, for instance, for tree machines than the integral of the square root of the Fisher information in the NML model. For Bernoulli distributions, the conditional NML model gives a predictive probability, which behaves like the Krichevsky-Trofimov predictive probability, actually slightly better for extremely skewed strings. For some model classes, it agrees with the predictive probability found earlier by Takimoto and Warmuth, as the solution to a different more restrictive minmax problem. We also calculate the CNML models for the generalized Gaussian regression models, and in particular for the cases where the loss function is quadratic, and show that the CNML model achieves asymptotic optimality in terms of the mean ideal code length. Moreover, the quadratic loss, which represents fitting errors as noise rather than prediction errors, can be shown to be smaller than what can be achieved with the NML as well as with the so-called plug-in or the predictive MDL model.
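The Bernoulli case mentioned in the abstract can be made concrete. A minimal sketch follows, assuming the standard form of the conditional (sequential) NML predictor for Bernoulli models: the probability of the next symbol is proportional to the maximized likelihood of the extended string, normalized over the two possible continuations. The function names are illustrative, not from the paper.

```python
def cnml_prob_one(k, n):
    """CNML probability that the next symbol is 1, given k ones in n symbols.

    Each continuation is scored by the maximized Bernoulli likelihood of the
    length-(n+1) string it produces, then the two scores are normalized.
    """
    def max_likelihood(j, m):
        # Maximized likelihood of a length-m binary string with j ones,
        # attained at the ML parameter p = j/m (0**0 == 1 handles j in {0, m}).
        p = j / m
        return (p ** j) * ((1 - p) ** (m - j))

    score_one = max_likelihood(k + 1, n + 1)   # next symbol is 1
    score_zero = max_likelihood(k, n + 1)      # next symbol is 0
    return score_one / (score_one + score_zero)


def kt_prob_one(k, n):
    """Krichevsky-Trofimov predictive probability of 1 (add-1/2 rule)."""
    return (k + 0.5) / (n + 1)
```

For example, after observing a single 1, the CNML predictor gives 0.8 while KT gives 0.75; for an all-ones string of length 10 it gives roughly 0.966 versus KT's 0.955, illustrating the abstract's remark that CNML behaves like KT but assigns slightly higher probability on extremely skewed strings.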