Polishing the Right Apple: Anytime Classification Also Benefits Data Streams with Constant Arrival Times

Jin Shieh, Eamonn J. Keogh
{"title":"Polishing the Right Apple: Anytime Classification Also Benefits Data Streams with Constant Arrival Times","authors":"Jin Shieh, Eamonn J. Keogh","doi":"10.1109/ICDM.2010.120","DOIUrl":null,"url":null,"abstract":"Classification of items taken from data streams requires algorithms that operate in time sensitive and computationally constrained environments. Often, the available time for classification is not known a priori and may change as a consequence of external circumstances. Many traditional algorithms are unable to provide satisfactory performance while supporting the highly variable response times that exemplify such applications. In such contexts, anytime algorithms, which are amenable to trading time for accuracy, have been found to be exceptionally useful and constitute an area of increasing research activity. Previous techniques for improving anytime classification have generally been concerned with optimizing the probability of correctly classifying individual objects. However, as we shall see, serially optimizing the probability of correctly classifying individual objects K times, generally gives inferior results to batch optimizing the probability of correctly classifying K objects. In this work, we show that this simple observation can be exploited to improve overall classification performance by using an anytime framework to allocate resources among a set of objects buffered from a fast arriving stream. Our ideas are independent of object arrival behavior, and, perhaps unintuitively, even in data streams with constant arrival rates our technique exhibits a marked improvement in performance. The utility of our approach is demonstrated with extensive experimental evaluations conducted on a wide range of diverse datasets.","PeriodicalId":294061,"journal":{"name":"2010 IEEE International Conference on Data Mining","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"21","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 IEEE International Conference on Data Mining","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDM.2010.120","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 21

Abstract

Classification of items taken from data streams requires algorithms that operate in time sensitive and computationally constrained environments. Often, the available time for classification is not known a priori and may change as a consequence of external circumstances. Many traditional algorithms are unable to provide satisfactory performance while supporting the highly variable response times that exemplify such applications. In such contexts, anytime algorithms, which are amenable to trading time for accuracy, have been found to be exceptionally useful and constitute an area of increasing research activity. Previous techniques for improving anytime classification have generally been concerned with optimizing the probability of correctly classifying individual objects. However, as we shall see, serially optimizing the probability of correctly classifying individual objects K times, generally gives inferior results to batch optimizing the probability of correctly classifying K objects. In this work, we show that this simple observation can be exploited to improve overall classification performance by using an anytime framework to allocate resources among a set of objects buffered from a fast arriving stream. Our ideas are independent of object arrival behavior, and, perhaps unintuitively, even in data streams with constant arrival rates our technique exhibits a marked improvement in performance. The utility of our approach is demonstrated with extensive experimental evaluations conducted on a wide range of diverse datasets.
抛光正确的苹果:随时分类也有利于数据流与恒定的到达时间
从数据流中提取的项目分类需要在时间敏感和计算受限的环境中运行的算法。通常,可用的分类时间不是先验的,可能会因外部环境而改变。许多传统算法无法提供令人满意的性能,同时支持高度可变的响应时间,例如这类应用程序。在这种情况下,随时算法被发现是非常有用的,它可以用时间来换取准确性,并构成了一个日益增加的研究活动领域。以前用于改进任意时间分类的技术通常关注于优化正确分类单个对象的概率。然而,正如我们将看到的,连续优化正确分类单个对象K次的概率,通常会得到比批量优化正确分类K个对象的概率更差的结果。在这项工作中,我们展示了可以利用这个简单的观察来提高整体分类性能,方法是使用随时框架在一组从快速到达流中缓冲的对象之间分配资源。我们的想法是独立于对象到达行为的,并且,也许不直观,即使在具有恒定到达率的数据流中,我们的技术在性能上也有显着的改进。我们的方法的效用通过在广泛的不同数据集上进行的广泛的实验评估来证明。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信