Supervised Learning from Data Streams: An Overview and Update

IF 23.8 | Zone 1, Computer Science | JCR Q1: Computer Science, Theory & Methods

ACM Computing Surveys | Pub Date: 2025-05-27 | DOI: 10.1145/3737279

Jesse Read, Indre Zliobaite


Citations: 0

Abstract



Supervised Learning from Data Streams: An Overview and Update

The literature on machine learning in the context of data streams is vast and growing. This indicates not only an ongoing interest, but also an ongoing need for a synthesis of new developments in this area. Here we reformulate the definitions of supervised data-stream learning, alongside consideration of contemporary concept drift and temporal dependence. Equipped with this, we carry out a fresh discussion of what constitutes a supervised data-stream learning task, including continual and reinforcement learning, and highlighting major assumptions and constraints. We carry out a fresh reconsideration of approaches and methods, with regard to their suitability to modern settings. But more than a categorization of state-of-the-art streaming methods, we provide a re-introduction to what supervised stream learning is, and our emphasis here is a survey of settings and algorithmic settings. Our main goal is to pull theory and practice of supervised learning over data streams closer together. We conclude that practical stream learning does not mandate an online-learning regime. In the modern context, learning regimes should be selected and developed according to the factual data arrival mode, resource constraints, and maximum robustness and trustworthiness. We finish with a set of recommendations to this effect.


Source journal

ACM Computing Surveys (Engineering & Technology, Computer Science: Theory & Methods)

CiteScore: 33.20
Self-citation rate: 0.60%
Publications: 372
Review time: 12 months

About the journal: ACM Computing Surveys (CSUR) is an academic journal that publishes surveys and tutorials on areas of computing research and practice. It aims to provide comprehensive, accessible articles that guide readers through the literature and help them understand topics outside their specialties. CSUR had a 2022 Impact Factor of 16.6 and is ranked 3rd out of 111 journals in Computer Science, Theory & Methods. It is indexed and abstracted in services including AI2 Semantic Scholar, Baidu, Clarivate/ISI: JCR, CNKI, DeepDyve, DTU, EBSCO: EDS/HOST, and IET Inspec, among others.