How to use expert advice

N. Cesa-Bianchi, Y. Freund, D. Helmbold, D. Haussler, R. Schapire, Manfred K. Warmuth
{"title":"How to use expert advice","authors":"N. Cesa-Bianchi, Y. Freund, D. Helmbold, D. Haussler, R. Schapire, Manfred K. Warmuth","doi":"10.1145/167088.167198","DOIUrl":null,"url":null,"abstract":"We analyze algorithms that predict a binary value by combining the predictions of several prediction strategies, called `experts''. Our analysis is for worst-case situations, i.e., we make no assumptions about the way the sequence of bits to be predicted is generated. We measure the performance of the algorithm by the difference between the expected number of mistakes it makes on the bit sequence and the expected number of mistakes made by the best expert on this sequence, where the expectation is taken with respect to the randomization in the predictions. We show that the minimum achievable difference is on the order of the square root of the number of mistakes of the best expert, and we give efficient algorithms that achieve this. Our upper and lower bounds have matching leading constants in most cases. We then show how this leads to certain kinds of pattern recognition/learning algorithms with performance bounds that improve on the best results currently known in this context. We also extend our analysis to the case in which log loss is used instead of the expected number of mistakes.","PeriodicalId":280602,"journal":{"name":"Proceedings of the twenty-fifth annual ACM symposium on Theory of Computing","volume":"37 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1993-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"656","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the twenty-fifth annual ACM symposium on Theory of Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/167088.167198","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 656

Abstract

We analyze algorithms that predict a binary value by combining the predictions of several prediction strategies, called `experts''. Our analysis is for worst-case situations, i.e., we make no assumptions about the way the sequence of bits to be predicted is generated. We measure the performance of the algorithm by the difference between the expected number of mistakes it makes on the bit sequence and the expected number of mistakes made by the best expert on this sequence, where the expectation is taken with respect to the randomization in the predictions. We show that the minimum achievable difference is on the order of the square root of the number of mistakes of the best expert, and we give efficient algorithms that achieve this. Our upper and lower bounds have matching leading constants in most cases. We then show how this leads to certain kinds of pattern recognition/learning algorithms with performance bounds that improve on the best results currently known in this context. We also extend our analysis to the case in which log loss is used instead of the expected number of mistakes.
如何使用专家建议
我们分析通过结合几种预测策略(称为“专家”)的预测来预测二值的算法。我们的分析是针对最坏的情况,也就是说,我们对要预测的比特序列的生成方式不做任何假设。我们通过它在比特序列上所犯的期望错误数与最好的专家在该序列上所犯的期望错误数之间的差来衡量算法的性能,其中期望是相对于预测中的随机化的。我们表明,最小可实现的差异是在最好的专家的错误数量的平方根的数量级上,并且我们给出了实现这一目标的有效算法。在大多数情况下,上界和下界的前导常数是匹配的。然后,我们展示了这如何导致某些具有性能界限的模式识别/学习算法,这些算法可以改进当前已知的最佳结果。我们还将分析扩展到使用日志损失而不是预期错误数量的情况。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信